OpenCV
4.0.0rc
Open Source Computer Vision

Classes
class  cv::DenseOpticalFlow
class  cv::DISOpticalFlow
DIS optical flow algorithm.
class  cv::FarnebackOpticalFlow
Class computing a dense optical flow using Gunnar Farneback's algorithm.
class  cv::KalmanFilter
Kalman filter class.
class  cv::SparseOpticalFlow
Base interface for sparse optical flow algorithms.
class  cv::SparsePyrLKOpticalFlow
Class used for calculating a sparse optical flow.
class  cv::VariationalRefinement
Variational optical flow refinement.
Enumerations  
enum  { cv::OPTFLOW_USE_INITIAL_FLOW = 4, cv::OPTFLOW_LK_GET_MIN_EIGENVALS = 8, cv::OPTFLOW_FARNEBACK_GAUSSIAN = 256 } 
enum  { cv::MOTION_TRANSLATION = 0, cv::MOTION_EUCLIDEAN = 1, cv::MOTION_AFFINE = 2, cv::MOTION_HOMOGRAPHY = 3 } 
Functions  
int  cv::buildOpticalFlowPyramid (InputArray img, OutputArrayOfArrays pyramid, Size winSize, int maxLevel, bool withDerivatives=true, int pyrBorder=BORDER_REFLECT_101, int derivBorder=BORDER_CONSTANT, bool tryReuseInputImage=true)
Constructs the image pyramid which can be passed to calcOpticalFlowPyrLK.
void  cv::calcOpticalFlowFarneback (InputArray prev, InputArray next, InputOutputArray flow, double pyr_scale, int levels, int winsize, int iterations, int poly_n, double poly_sigma, int flags)
Computes a dense optical flow using Gunnar Farneback's algorithm.
void  cv::calcOpticalFlowPyrLK (InputArray prevImg, InputArray nextImg, InputArray prevPts, InputOutputArray nextPts, OutputArray status, OutputArray err, Size winSize=Size(21, 21), int maxLevel=3, TermCriteria criteria=TermCriteria(TermCriteria::COUNT+TermCriteria::EPS, 30, 0.01), int flags=0, double minEigThreshold=1e-4)
Calculates an optical flow for a sparse feature set using the iterative Lucas-Kanade method with pyramids.
RotatedRect  cv::CamShift (InputArray probImage, Rect &window, TermCriteria criteria)
Finds an object center, size, and orientation.
Mat  cv::estimateRigidTransform (InputArray src, InputArray dst, bool fullAffine)
Computes an optimal affine transformation between two 2D point sets.
double  cv::findTransformECC (InputArray templateImage, InputArray inputImage, InputOutputArray warpMatrix, int motionType=MOTION_AFFINE, TermCriteria criteria=TermCriteria(TermCriteria::COUNT+TermCriteria::EPS, 50, 0.001), InputArray inputMask=noArray())
Finds the geometric transform (warp) between two images in terms of the ECC criterion [53].
int  cv::meanShift (InputArray probImage, Rect &window, TermCriteria criteria)
Finds an object on a back projection image.
Mat  cv::readOpticalFlow (const String &path)
Read a .flo file.
bool  cv::writeOpticalFlow (const String &path, InputArray flow)
Write a .flo file to disk.
int cv::buildOpticalFlowPyramid  (  InputArray  img, 
OutputArrayOfArrays  pyramid,  
Size  winSize,  
int  maxLevel,  
bool  withDerivatives = true , 

int  pyrBorder = BORDER_REFLECT_101 , 

int  derivBorder = BORDER_CONSTANT , 

bool  tryReuseInputImage = true 

) 
Python:  

retval, pyramid  =  cv.buildOpticalFlowPyramid(  img, winSize, maxLevel[, pyramid[, withDerivatives[, pyrBorder[, derivBorder[, tryReuseInputImage]]]]]  ) 
Constructs the image pyramid which can be passed to calcOpticalFlowPyrLK.
img  8-bit input image.
pyramid  output pyramid.
winSize  window size of the optical flow algorithm. Must be no smaller than the winSize argument of calcOpticalFlowPyrLK. It is needed to calculate the required padding for pyramid levels.
maxLevel  0-based maximal pyramid level number.
withDerivatives  set to precompute gradients for every pyramid level. If the pyramid is constructed without the gradients, then calcOpticalFlowPyrLK will calculate them internally.
pyrBorder  the border mode for pyramid layers. 
derivBorder  the border mode for gradients. 
tryReuseInputImage  put ROI of input image into the pyramid if possible. You can pass false to force data copying. 
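As a rough illustration of what a 0-based maxLevel means, the sketch below lists the image size at each pyramid level. It assumes each level halves the previous one with round-up, which is an assumption made for this example; the library may also stop at fewer levels for small images. The helper name pyramid_level_sizes is hypothetical.

```python
def pyramid_level_sizes(width, height, max_level):
    """Return (width, height) for levels 0..max_level (hypothetical helper).

    Assumes every level is half the previous one, rounded up.
    """
    sizes = [(width, height)]
    for _ in range(max_level):
        width, height = (width + 1) // 2, (height + 1) // 2
        sizes.append((width, height))
    return sizes


# maxLevel=3 gives four levels: level 0 is the input image itself.
sizes = pyramid_level_sizes(640, 480, 3)
```

With maxLevel=0 the list contains only the original size, which matches the "single level" case.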
void cv::calcOpticalFlowFarneback  (  InputArray  prev, 
InputArray  next,  
InputOutputArray  flow,  
double  pyr_scale,  
int  levels,  
int  winsize,  
int  iterations,  
int  poly_n,  
double  poly_sigma,  
int  flags  
) 
Python:  

flow  =  cv.calcOpticalFlowFarneback(  prev, next, flow, pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags  ) 
Computes a dense optical flow using Gunnar Farneback's algorithm.
prev  first 8-bit single-channel input image.
next  second input image of the same size and the same type as prev.
flow  computed flow image that has the same size as prev and type CV_32FC2.
pyr_scale  parameter specifying the image scale (<1) used to build pyramids for each image; pyr_scale=0.5 means a classical pyramid, where each next layer is half the size of the previous one.
levels  number of pyramid layers including the initial image; levels=1 means that no extra layers are created and only the original images are used.
winsize  averaging window size; larger values increase the algorithm's robustness to image noise and improve the chance of detecting fast motion, but yield a more blurred motion field.
iterations  number of iterations the algorithm does at each pyramid level.
poly_n  size of the pixel neighborhood used to find the polynomial expansion in each pixel; larger values mean that the image will be approximated with smoother surfaces, yielding a more robust algorithm and a more blurred motion field; typically poly_n=5 or 7.
poly_sigma  standard deviation of the Gaussian that is used to smooth derivatives used as a basis for the polynomial expansion; for poly_n=5, you can set poly_sigma=1.1; for poly_n=7, a good value would be poly_sigma=1.5.
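flags  operation flags that can be a combination of the following: OPTFLOW_USE_INITIAL_FLOW uses the input flow as an initial flow approximation; OPTFLOW_FARNEBACK_GAUSSIAN uses a Gaussian winsize x winsize filter instead of a box filter of the same size for optical flow estimation, which usually gives a more accurate flow at the cost of lower speed.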

The function finds an optical flow for each prev pixel using the [55] algorithm so that
\[\texttt{prev} (y,x) \sim \texttt{next} ( y + \texttt{flow} (y,x)[1], x + \texttt{flow} (y,x)[0])\]
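To make the indexing in the relation above concrete, here is a minimal pure-Python sketch that samples next at the flow-displaced position for every pixel. Channel 0 of the flow is treated as the x displacement and channel 1 as the y displacement, as in the formula. The nearest-neighbour lookup, the border clamping, and the helper name warp_next_to_prev are assumptions of this example, not part of the library.

```python
def warp_next_to_prev(next_img, flow):
    """Sample next_img at (y + flow[y][x][1], x + flow[y][x][0]) per pixel.

    next_img: 2D list of intensities; flow: 2D list of (dx, dy) pairs.
    Nearest-neighbour sampling with clamping is an illustrative choice.
    """
    h, w = len(next_img), len(next_img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            sx = min(max(int(round(x + dx)), 0), w - 1)
            sy = min(max(int(round(y + dy)), 0), h - 1)
            out[y][x] = next_img[sy][sx]
    return out


prev_img = [[0, 9, 0, 0],
            [0, 0, 0, 0]]
next_img = [[0, 0, 9, 0],   # the bright pixel moved one step to the right
            [0, 0, 0, 0]]
flow = [[(1.0, 0.0)] * 4 for _ in range(2)]  # every pixel moves by dx=1, dy=0

reconstructed = warp_next_to_prev(next_img, flow)
```

For this toy input the warped image reproduces prev_img, which is what the relation expresses.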
void cv::calcOpticalFlowPyrLK  (  InputArray  prevImg, 
InputArray  nextImg,  
InputArray  prevPts,  
InputOutputArray  nextPts,  
OutputArray  status,  
OutputArray  err,  
Size  winSize = Size(21, 21) , 

int  maxLevel = 3 , 

TermCriteria  criteria = TermCriteria(TermCriteria::COUNT+TermCriteria::EPS, 30, 0.01) , 

int  flags = 0 , 

double  minEigThreshold = 1e-4 

) 
Python:  

nextPts, status, err  =  cv.calcOpticalFlowPyrLK(  prevImg, nextImg, prevPts, nextPts[, status[, err[, winSize[, maxLevel[, criteria[, flags[, minEigThreshold]]]]]]]  ) 
Calculates an optical flow for a sparse feature set using the iterative Lucas-Kanade method with pyramids.
prevImg  first 8-bit input image or pyramid constructed by buildOpticalFlowPyramid.
nextImg  second input image or pyramid of the same size and the same type as prevImg.
prevPts  vector of 2D points for which the flow needs to be found; point coordinates must be single-precision floating-point numbers.
nextPts  output vector of 2D points (with single-precision floating-point coordinates) containing the calculated new positions of input features in the second image; when the OPTFLOW_USE_INITIAL_FLOW flag is passed, the vector must have the same size as the input.
status  output status vector (of unsigned chars); each element of the vector is set to 1 if the flow for the corresponding feature has been found, otherwise it is set to 0.
err  output vector of errors; each element of the vector is set to an error for the corresponding feature; the type of error measure can be set in the flags parameter; if the flow wasn't found, then the error is not defined (use the status parameter to find such cases).
winSize  size of the search window at each pyramid level.
maxLevel  0-based maximal pyramid level number; if set to 0, pyramids are not used (single level); if set to 1, two levels are used, and so on; if pyramids are passed as input, the algorithm uses as many levels as the pyramids have, but no more than maxLevel.
criteria  parameter specifying the termination criteria of the iterative search algorithm (after the specified maximum number of iterations criteria.maxCount or when the search window moves by less than criteria.epsilon).
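flags  operation flags: OPTFLOW_USE_INITIAL_FLOW uses initial estimations stored in nextPts; if the flag is not set, then prevPts is copied to nextPts and is considered the initial estimate. OPTFLOW_LK_GET_MIN_EIGENVALS uses minimum eigenvalues as an error measure (see the minEigThreshold description); if the flag is not set, then the L1 distance between patches around the original and the moved point, divided by the number of pixels in a window, is used as the error measure.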

minEigThreshold  the algorithm calculates the minimum eigenvalue of a 2x2 normal matrix of optical flow equations (this matrix is called a spatial gradient matrix in [22]), divided by the number of pixels in a window; if this value is less than minEigThreshold, the corresponding feature is filtered out and its flow is not processed, which removes bad points and improves performance.
The function implements a sparse iterative version of the Lucas-Kanade optical flow in pyramids. See [22]. The function is parallelized with the TBB library.
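The quantity compared against minEigThreshold can be sketched as follows: build the 2x2 spatial gradient matrix of a window, take its smaller eigenvalue, and divide by the number of pixels used. This is an illustration only. The central-difference gradients, the raw intensity scale, and the helper name min_eig_per_pixel are assumptions of this example, so the resulting numbers are not directly comparable to the library's threshold.

```python
import math


def min_eig_per_pixel(window):
    """Smaller eigenvalue of the 2x2 gradient matrix, divided by pixel count.

    window: 2D list of intensities. Gradients use central differences on the
    interior pixels only (an assumption made for this sketch).
    """
    h, w = len(window), len(window[0])
    gxx = gxy = gyy = 0.0
    count = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = (window[y][x + 1] - window[y][x - 1]) / 2.0
            iy = (window[y + 1][x] - window[y - 1][x]) / 2.0
            gxx += ix * ix
            gxy += ix * iy
            gyy += iy * iy
            count += 1
    # Closed-form smaller eigenvalue of [[gxx, gxy], [gxy, gyy]].
    min_eig = (gxx + gyy - math.sqrt((gxx - gyy) ** 2 + 4.0 * gxy * gxy)) / 2.0
    return min_eig / count


flat = [[5] * 5 for _ in range(5)]                         # no texture
edge = [[0, 0, 10, 10, 10] for _ in range(5)]              # gradient in x only
textured = [[x * y for x in range(5)] for y in range(5)]   # gradient in x and y

flat_score = min_eig_per_pixel(flat)
edge_score = min_eig_per_pixel(edge)
textured_score = min_eig_per_pixel(textured)
```

Flat regions and straight edges both score zero, because the gradient matrix is rank-deficient there; only windows with gradients in two directions score above zero, which is why thresholding this value rejects poorly trackable points.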
RotatedRect cv::CamShift  (  InputArray  probImage, 
Rect &  window,  
TermCriteria  criteria  
) 
Python:  

retval, window  =  cv.CamShift(  probImage, window, criteria  ) 
Finds an object center, size, and orientation.
probImage  Back projection of the object histogram. See calcBackProject. 
window  Initial search window. 
criteria  Stop criteria for the underlying meanShift.
Returns (in old interfaces) the number of iterations CAMSHIFT took to converge.
The function implements the CAMSHIFT object tracking algorithm [25]. First, it finds an object center using meanShift and then adjusts the window size and finds the optimal rotation. The function returns the rotated rectangle structure that includes the object position, size, and orientation. The next position of the search window can be obtained with RotatedRect::boundingRect().
See the OpenCV sample camshiftdemo.c that tracks colored objects.
Mat cv::estimateRigidTransform  (  InputArray  src, 
InputArray  dst,  
bool  fullAffine  
) 
Computes an optimal affine transformation between two 2D point sets.
src  First input 2D point set stored in std::vector or Mat, or an image stored in Mat. 
dst  Second input 2D point set of the same size and the same type as src, or another image. 
fullAffine  If true, the function finds an optimal affine transformation with no additional restrictions (6 degrees of freedom). Otherwise, the class of transformations to choose from is limited to combinations of translation, rotation, and uniform scaling (4 degrees of freedom). 
The function finds an optimal affine transform [A|b] (a 2 x 3 floating-point matrix) that best approximates the affine transformation between:
Two point sets.
Two raster images. In this case, the function first finds some features in the src image and finds the corresponding features in the dst image. After that, the problem is reduced to the first case.
In case of point sets, the problem is formulated as follows: you need to find a 2x2 matrix A and 2x1 vector b so that:
\[[A^*|b^*] = \arg \min _{[A|b]} \sum _i \| \texttt{dst}[i] - A { \texttt{src}[i]}^T - b \| ^2\]
where src[i] and dst[i] are the i-th points in src and dst, respectively. \([A|b]\) can be either arbitrary (when fullAffine=true) or have the form
\[\begin{bmatrix} a_{11} & a_{12} & b_1 \\ -a_{12} & a_{11} & b_2 \end{bmatrix}\]
when fullAffine=false.
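For the restricted case (fullAffine=false), the least-squares problem has a closed-form solution when the matrix is constrained to rows (a11, a12, b1) and (-a12, a11, b2). The sketch below solves it in pure Python for matched point lists. It is plain least squares under that assumption and does not attempt to reproduce any feature matching or outlier handling the library may perform; the helper name fit_similarity is hypothetical.

```python
def fit_similarity(src, dst):
    """Least-squares fit of [[a11, a12, b1], [-a12, a11, b2]] mapping src to dst.

    src, dst: equal-length lists of (x, y) tuples with at least two distinct
    source points. Hypothetical helper for illustration.
    """
    n = len(src)
    mx = sum(p[0] for p in src) / n
    my = sum(p[1] for p in src) / n
    mu = sum(p[0] for p in dst) / n
    mv = sum(p[1] for p in dst) / n
    num_a11 = num_a12 = denom = 0.0
    for (x, y), (u, v) in zip(src, dst):
        xc, yc, uc, vc = x - mx, y - my, u - mu, v - mv
        num_a11 += xc * uc + yc * vc
        num_a12 += yc * uc - xc * vc
        denom += xc * xc + yc * yc
    a11 = num_a11 / denom
    a12 = num_a12 / denom
    # The translation follows from matching the centroids of both sets.
    b1 = mu - a11 * mx - a12 * my
    b2 = mv + a12 * mx - a11 * my
    return [[a11, a12, b1], [-a12, a11, b2]]


# Unit square rotated by 90 degrees, scaled by 2, then shifted by (3, 4).
src = [(0, 0), (1, 0), (0, 1), (1, 1)]
dst = [(3, 4), (3, 6), (1, 4), (1, 6)]
transform = fit_similarity(src, dst)
```

Because the example points are an exact similarity transform of each other, the fit recovers a11=0, a12=-2, b1=3, b2=4.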
double cv::findTransformECC  (  InputArray  templateImage, 
InputArray  inputImage,  
InputOutputArray  warpMatrix,  
int  motionType = MOTION_AFFINE , 

TermCriteria  criteria = TermCriteria(TermCriteria::COUNT+TermCriteria::EPS, 50, 0.001) , 

InputArray  inputMask = noArray() 

) 
Python:  

retval, warpMatrix  =  cv.findTransformECC(  templateImage, inputImage, warpMatrix[, motionType[, criteria[, inputMask]]]  ) 
Finds the geometric transform (warp) between two images in terms of the ECC criterion [53].
templateImage  single-channel template image; CV_8U or CV_32F array.
inputImage  single-channel input image which should be warped with the final warpMatrix in order to provide an image similar to templateImage, same type as templateImage.
warpMatrix  floating-point \(2\times 3\) or \(3\times 3\) mapping matrix (warp).
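motionType  parameter specifying the type of motion: MOTION_TRANSLATION sets a translational motion model, where warpMatrix is \(2\times 3\) with the first \(2\times 2\) part being the identity matrix and the remaining two parameters being estimated; MOTION_EUCLIDEAN sets a Euclidean (rigid) transformation as the motion model, where three parameters are estimated and warpMatrix is \(2\times 3\); MOTION_AFFINE sets an affine motion model (the default), where six parameters are estimated and warpMatrix is \(2\times 3\); MOTION_HOMOGRAPHY sets a homography as the motion model, where eight parameters are estimated and warpMatrix is \(3\times 3\).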

criteria  parameter specifying the termination criteria of the ECC algorithm; criteria.epsilon defines the threshold of the increment in the correlation coefficient between two iterations (a negative criteria.epsilon makes criteria.maxCount the only termination criterion). Default values are shown in the declaration above. 
inputMask  An optional mask to indicate valid values of inputImage. 
The function estimates the optimum transformation (warpMatrix) with respect to the ECC criterion ([53]), that is
\[\texttt{warpMatrix} = \arg\max_{W} \texttt{ECC}(\texttt{templateImage}(x,y),\texttt{inputImage}(x',y'))\]
where
\[\begin{bmatrix} x' \\ y' \end{bmatrix} = W \cdot \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}\]
(the equation holds with homogeneous coordinates for homography). It returns the final enhanced correlation coefficient, that is, the correlation coefficient between the template image and the final warped input image. When a \(3\times 3\) matrix is given with motionType=0, 1 or 2, the third row is ignored.
Unlike findHomography and estimateRigidTransform, the function findTransformECC implements an area-based alignment that builds on intensity similarities. In essence, the function updates the initial transformation that roughly aligns the images. If this information is missing, the identity warp (unity matrix) is used as an initialization. Note that if the images undergo strong displacements or rotations, an initial transformation that roughly aligns the images is necessary (e.g., a simple Euclidean/similarity transform that brings the images to show approximately the same content). Use inverse warping on the second image to obtain an image close to the first one, i.e. use the flag WARP_INVERSE_MAP with warpAffine or warpPerspective. See also the OpenCV sample image_alignment.cpp that demonstrates the use of the function. Note that the function throws an exception if the algorithm does not converge.
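The returned value is described above as the correlation coefficient between the template and the final warped input. A minimal sketch of such a coefficient, computed on zero-mean intensities of two equally sized images, is shown here. It only illustrates the similarity measure; it does not perform any warp estimation, and the helper name correlation_coefficient is hypothetical.

```python
import math


def correlation_coefficient(template, warped):
    """Correlation coefficient between two equally sized 2D intensity lists.

    Both images must contain some intensity variation; a constant image
    would lead to a division by zero in this sketch.
    """
    t = [v for row in template for v in row]
    w = [v for row in warped for v in row]
    mean_t = sum(t) / len(t)
    mean_w = sum(w) / len(w)
    tz = [v - mean_t for v in t]   # zero-mean template intensities
    wz = [v - mean_w for v in w]   # zero-mean warped intensities
    norm_t = math.sqrt(sum(v * v for v in tz))
    norm_w = math.sqrt(sum(v * v for v in wz))
    return sum(a * b for a, b in zip(tz, wz)) / (norm_t * norm_w)


template = [[1, 2], [3, 5]]
brighter = [[2 * v + 10 for v in row] for row in template]  # gain and bias change
inverted = [[-v for v in row] for row in template]

same_content = correlation_coefficient(template, brighter)
opposite = correlation_coefficient(template, inverted)
```

The measure is unaffected by a change of contrast and brightness, which is one reason an intensity-based alignment can use it as its objective.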
int cv::meanShift  (  InputArray  probImage, 
Rect &  window,  
TermCriteria  criteria  
) 
Python:  

retval, window  =  cv.meanShift(  probImage, window, criteria  ) 
Finds an object on a back projection image.
probImage  Back projection of the object histogram. See calcBackProject for details. 
window  Initial search window. 
criteria  Stop criteria for the iterative search algorithm.
Returns the number of iterations meanShift took to converge.
The function implements the iterative object search algorithm. It takes the input back projection of an object and the initial position. The mass center in window of the back projection image is computed and the search window center shifts to the mass center. The procedure is repeated until the specified number of iterations criteria.maxCount is done or until the window center shifts by less than criteria.epsilon. The algorithm is used inside CamShift and, unlike CamShift, the search window size and orientation do not change during the search.
You can simply pass the output of calcBackProject to this function, but better results can be obtained if you pre-filter the back projection and remove the noise. For example, you can do this by retrieving connected components with findContours, throwing away contours with small area (contourArea), and rendering the remaining contours with drawContours.
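The iteration described above (compute the mass center inside the window, recenter the window, repeat until the shift is small) can be sketched in pure Python. This is an illustration under stated assumptions, not the library's implementation: the rounding, the clamping to the image, and the names mean_shift, max_count and epsilon are choices made for this example.

```python
import math


def mean_shift(prob, window, max_count=10, epsilon=1.0):
    """Shift an (x, y, w, h) window toward the local mass center of prob.

    prob: 2D list of non-negative weights (a back projection).
    Returns (iterations, window).
    """
    x, y, w, h = window
    rows, cols = len(prob), len(prob[0])
    iterations = 0
    for _ in range(max_count):
        # Zeroth and first moments of the back projection inside the window.
        m00 = m10 = m01 = 0.0
        for yy in range(max(y, 0), min(y + h, rows)):
            for xx in range(max(x, 0), min(x + w, cols)):
                p = prob[yy][xx]
                m00 += p
                m10 += p * xx
                m01 += p * yy
        if m00 == 0:
            break  # empty window: there is no mass center to move toward
        iterations += 1
        # Top-left corner that centers the window on the mass center.
        new_x = int(math.floor(m10 / m00 - (w - 1) / 2.0 + 0.5))
        new_y = int(math.floor(m01 / m00 - (h - 1) / 2.0 + 0.5))
        # Keep the window inside the image; its size never changes.
        new_x = min(max(new_x, 0), cols - w)
        new_y = min(max(new_y, 0), rows - h)
        dx, dy = new_x - x, new_y - y
        x, y = new_x, new_y
        if dx * dx + dy * dy < epsilon * epsilon:
            break  # the window moved by less than epsilon
    return iterations, (x, y, w, h)


# 7x7 back projection with a 3x3 blob centered at row 4, column 4.
prob = [[0] * 7 for _ in range(7)]
for r in range(3, 6):
    for c in range(3, 6):
        prob[r][c] = 1

result = mean_shift(prob, (0, 0, 5, 5))
```

Starting from the top-left corner, the window first sees only part of the blob, moves toward it, and stops on the second iteration once it is centered on the blob.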
Mat cv::readOpticalFlow  (  const String &  path  ) 
Python:  

retval  =  cv.readOpticalFlow(  path  ) 
Read a .flo file.
path  Path to the file to be loaded 
The function readOpticalFlow loads a flow field from a file and returns it as a single matrix. The resulting Mat has type CV_32FC2 (floating-point, 2-channel). The first channel corresponds to the flow in the horizontal direction (u), the second to the flow in the vertical direction (v).
bool cv::writeOpticalFlow  (  const String &  path, 
InputArray  flow  
) 
Python:  

retval  =  cv.writeOpticalFlow(  path, flow  ) 
Write a .flo file to disk.
path  Path to the file to be written 
flow  Flow field to be stored 
The function stores a flow field in a file and returns true on success, false otherwise. The flow field must be a 2-channel, floating-point matrix (CV_32FC2). The first channel corresponds to the flow in the horizontal direction (u), the second to the flow in the vertical direction (v).
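For readers who need to inspect such files without the library, the sketch below writes and reads a two-channel flow field in pure Python. It assumes the commonly used Middlebury .flo layout (a float tag 202021.25, then width and height as 32-bit integers, then interleaved u, v floats in row-major order, all little-endian). That layout is an assumption of this example and is not stated in this reference; the names write_flo, read_flo and FLO_MAGIC are hypothetical.

```python
import os
import struct
import tempfile

FLO_MAGIC = 202021.25  # assumed sanity-check tag at the start of a .flo file


def write_flo(path, flow):
    """Write a 2D list of (u, v) pairs using the assumed .flo layout."""
    height, width = len(flow), len(flow[0])
    with open(path, "wb") as f:
        f.write(struct.pack("<f", FLO_MAGIC))
        f.write(struct.pack("<ii", width, height))
        for row in flow:
            for u, v in row:
                f.write(struct.pack("<ff", u, v))


def read_flo(path):
    """Read a file written with the assumed layout into a 2D list of (u, v)."""
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<f", f.read(4))
        if magic != FLO_MAGIC:
            raise ValueError("unexpected tag, not a .flo file in this layout")
        width, height = struct.unpack("<ii", f.read(8))
        count = 2 * width * height
        values = struct.unpack("<%df" % count, f.read(4 * count))
    return [[(values[2 * (r * width + c)], values[2 * (r * width + c) + 1])
             for c in range(width)] for r in range(height)]


# Round trip with values that are exactly representable as 32-bit floats.
flow = [[(1.5, -2.0), (0.0, 0.25)],
        [(3.0, 4.0), (-0.5, 8.0)]]
path = os.path.join(tempfile.mkdtemp(), "example.flo")
write_flo(path, flow)
loaded = read_flo(path)
```

Channel 0 of each pair is the horizontal component (u) and channel 1 the vertical component (v), matching the description above.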