OpenCV  4.5.2
Open Source Computer Vision
Geometric Transformations of Images

Goals

Transformations

Scaling

Scaling resizes the image. OpenCV provides the function cv.resize() for this purpose. You can specify the output size explicitly, or give a scaling factor along each axis. Several interpolation methods are available: cv.INTER_AREA is preferable for shrinking, and cv.INTER_CUBIC (slow) or cv.INTER_LINEAR for zooming.

We use the function: cv.resize (src, dst, dsize, fx = 0, fy = 0, interpolation = cv.INTER_LINEAR)

Parameters
src: input image.
dst: output image; it has the size dsize (when it is non-zero) or the size computed from src.size(), fx, and fy; the type of dst is the same as that of src.
dsize: output image size; if it equals zero, it is computed as:

\[\texttt{dsize} = \texttt{Size}(\texttt{round}(\texttt{fx} \cdot \texttt{src.cols}), \texttt{round}(\texttt{fy} \cdot \texttt{src.rows}))\]

Either dsize or both fx and fy must be non-zero.
fx: scale factor along the horizontal axis; when it equals 0, it is computed as

\[(\texttt{double})\,\texttt{dsize.width} / \texttt{src.cols}\]

fy: scale factor along the vertical axis; when it equals 0, it is computed as

\[(\texttt{double})\,\texttt{dsize.height} / \texttt{src.rows}\]

interpolation: interpolation method (see cv.InterpolationFlags).
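The size rule and the interpolation guidance above can be sketched in plain JavaScript. The helper names computeDsize, pickInterpolation, and resizeByFactor are hypothetical and not part of OpenCV.js; Math.round stands in for the round in the formula, and resizeByFactor assumes that cv is an already loaded OpenCV.js module and src is a cv.Mat.

```javascript
// Sketch of the dsize rule quoted above, for the case where dsize is zero
// and the output size is derived from fx and fy.
// Assumption: Math.round stands in for the "round" in the formula.
function computeDsize(srcCols, srcRows, fx, fy) {
  if (!(fx > 0 && fy > 0)) {
    throw new Error("fx and fy must both be positive when dsize is zero");
  }
  return {
    width: Math.round(fx * srcCols),
    height: Math.round(fy * srcRows),
  };
}

// Hypothetical helper: returns the name of the interpolation flag suggested
// by the text (INTER_AREA for shrinking, INTER_LINEAR otherwise).
function pickInterpolation(fx, fy) {
  return fx < 1 && fy < 1 ? "INTER_AREA" : "INTER_LINEAR";
}

// Usage with OpenCV.js. Assumption: `cv` is the loaded module and `src` is
// a cv.Mat. The caller is responsible for calling dst.delete().
function resizeByFactor(cv, src, fx, fy) {
  const dst = new cv.Mat();
  // A zero dsize asks cv.resize to compute the output size from fx and fy.
  cv.resize(src, dst, new cv.Size(0, 0), fx, fy, cv[pickInterpolation(fx, fy)]);
  return dst;
}
```

Passing an explicit size instead would look like new cv.Size(width, height) with fx and fy left at 0.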


Translation

Translation shifts an object's location. If you know the shift in the (x, y) direction, say \((t_x,t_y)\), you can create the transformation matrix \(\textbf{M}\) as follows:

\[M = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix}\]

We use the function: cv.warpAffine (src, dst, M, dsize, flags = cv.INTER_LINEAR, borderMode = cv.BORDER_CONSTANT, borderValue = new cv.Scalar())

Parameters
src: input image.
dst: output image that has the size dsize and the same type as src.
M: 2 × 3 transformation matrix (cv.CV_64FC1 type).
dsize: size of the output image.
flags: combination of interpolation methods (see cv.InterpolationFlags) and the optional flag WARP_INVERSE_MAP, which means that M is the inverse transformation (\(\texttt{dst} \rightarrow \texttt{src}\)).
borderMode: pixel extrapolation method (see cv.BorderTypes); when borderMode = BORDER_TRANSPARENT, it means that the pixels in the destination image corresponding to the "outliers" in the source image are not modified by the function.
borderValue: value used in case of a constant border; by default, it is 0.
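As a minimal sketch, the translation matrix above can be written as a row-major array and passed to cv.warpAffine. The helper names translationMatrixData, applyAffine, and translate are hypothetical; translate assumes that cv is a loaded OpenCV.js module and src is a cv.Mat, and it keeps the output the same size as the input, which is only one possible choice.

```javascript
// Row-major data for the 2 x 3 matrix M = [[1, 0, tx], [0, 1, ty]].
function translationMatrixData(tx, ty) {
  return [1, 0, tx, 0, 1, ty];
}

// Applies a 2 x 3 affine matrix (row-major array) to the point (x, y).
function applyAffine(m, x, y) {
  return [m[0] * x + m[1] * y + m[2], m[3] * x + m[4] * y + m[5]];
}

// Usage with OpenCV.js. Assumption: `cv` is the loaded module and `src` is
// a cv.Mat. The caller is responsible for calling dst.delete().
function translate(cv, src, tx, ty) {
  const M = cv.matFromArray(2, 3, cv.CV_64FC1, translationMatrixData(tx, ty));
  const dst = new cv.Mat();
  const dsize = new cv.Size(src.cols, src.rows); // width first, then height
  cv.warpAffine(src, dst, M, dsize, cv.INTER_LINEAR,
                cv.BORDER_CONSTANT, new cv.Scalar());
  M.delete();
  return dst;
}
```

With tx = 50 and ty = 100, applyAffine maps the point (10, 20) to (60, 120), which is the shift the matrix encodes.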



Rotation

Rotation of an image by an angle \(\theta\) is achieved by a transformation matrix of the form

\[M = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\]

OpenCV, however, provides scaled rotation with an adjustable center of rotation, so that you can rotate about any location you prefer. The modified transformation matrix is given by

\[\begin{bmatrix} \alpha & \beta & (1- \alpha ) \cdot center.x - \beta \cdot center.y \\ - \beta & \alpha & \beta \cdot center.x + (1- \alpha ) \cdot center.y \end{bmatrix}\]

where:

\[\begin{array}{l} \alpha = scale \cdot \cos \theta , \\ \beta = scale \cdot \sin \theta \end{array}\]

We use the function: cv.getRotationMatrix2D (center, angle, scale)

Parameters
center: center of the rotation in the source image.
angle: rotation angle in degrees. Positive values mean counter-clockwise rotation (the coordinate origin is assumed to be the top-left corner).
scale: isotropic scale factor.
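The modified matrix above can be evaluated directly from center, angle, and scale. The following is a sketch of that formula in plain JavaScript, not OpenCV's own implementation; the names rotationMatrixData, mapPoint, and rotateAboutCenter are hypothetical, and rotateAboutCenter assumes that cv is a loaded OpenCV.js module and src is a cv.Mat.

```javascript
// Row-major data for the 2 x 3 scaled-rotation matrix given above:
// alpha = scale * cos(theta), beta = scale * sin(theta).
function rotationMatrixData(cx, cy, angleDeg, scale) {
  const theta = (angleDeg * Math.PI) / 180;
  const alpha = scale * Math.cos(theta);
  const beta = scale * Math.sin(theta);
  return [
    alpha, beta, (1 - alpha) * cx - beta * cy,
    -beta, alpha, beta * cx + (1 - alpha) * cy,
  ];
}

// Applies a 2 x 3 matrix (row-major array) to the point (x, y).
function mapPoint(m, x, y) {
  return [m[0] * x + m[1] * y + m[2], m[3] * x + m[4] * y + m[5]];
}

// Usage with OpenCV.js. Assumption: `cv` is the loaded module and `src` is
// a cv.Mat. Rotates about the image center; the caller must call dst.delete().
function rotateAboutCenter(cv, src, angleDeg, scale) {
  const center = new cv.Point(src.cols / 2, src.rows / 2);
  const M = cv.getRotationMatrix2D(center, angleDeg, scale);
  const dst = new cv.Mat();
  cv.warpAffine(src, dst, M, new cv.Size(src.cols, src.rows), cv.INTER_LINEAR,
                cv.BORDER_CONSTANT, new cv.Scalar());
  M.delete();
  return dst;
}
```

With a 90 degree angle and scale 1, the center maps to itself and the point one pixel to the right of the center moves one pixel up, which matches the counter-clockwise convention with the origin at the top-left corner.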


Affine Transformation

In an affine transformation, all parallel lines in the original image remain parallel in the output image. To find the transformation matrix, we need three points from the input image and their corresponding locations in the output image. cv.getAffineTransform then creates a 2x3 matrix, which is passed to cv.warpAffine.

We use the function: cv.getAffineTransform (src, dst)

Parameters
src: three points ([3, 1] size and cv.CV_32FC2 type) from the input image.
dst: three corresponding points ([3, 1] size and cv.CV_32FC2 type) in the output image.
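To illustrate why three point pairs determine the 2x3 matrix, the six unknowns can be solved for with Cramer's rule. This is a sketch in plain JavaScript under that assumption, not how cv.getAffineTransform is implemented; the names det3, affineFromThreePoints, and warpByThreePoints are hypothetical, and warpByThreePoints assumes that cv is a loaded OpenCV.js module and src is a cv.Mat.

```javascript
// Determinant of a 3 x 3 matrix given as a row-major array of 9 numbers.
function det3([a, b, c, d, e, f, g, h, i]) {
  return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
}

// Solves for the 2 x 3 affine matrix (row-major) that maps each src[i] to
// dst[i]. Points are [x, y] pairs; the three source points must not be
// collinear.
function affineFromThreePoints(src, dst) {
  const [[x0, y0], [x1, y1], [x2, y2]] = src;
  const det = det3([x0, y0, 1, x1, y1, 1, x2, y2, 1]);
  if (Math.abs(det) < 1e-12) {
    throw new Error("source points are collinear");
  }
  // Solve a*x + b*y + c = t at the three points for one output coordinate.
  const solveRow = ([t0, t1, t2]) => [
    det3([t0, y0, 1, t1, y1, 1, t2, y2, 1]) / det,
    det3([x0, t0, 1, x1, t1, 1, x2, t2, 1]) / det,
    det3([x0, y0, t0, x1, y1, t1, x2, y2, t2]) / det,
  ];
  return [
    ...solveRow(dst.map((p) => p[0])),
    ...solveRow(dst.map((p) => p[1])),
  ];
}

// Usage with OpenCV.js. Assumption: `cv` is the loaded module and `src` is
// a cv.Mat. The caller is responsible for calling dst.delete().
function warpByThreePoints(cv, src, srcPts, dstPts) {
  const srcTri = cv.matFromArray(3, 1, cv.CV_32FC2, srcPts.flat());
  const dstTri = cv.matFromArray(3, 1, cv.CV_32FC2, dstPts.flat());
  const M = cv.getAffineTransform(srcTri, dstTri);
  const dst = new cv.Mat();
  cv.warpAffine(src, dst, M, new cv.Size(src.cols, src.rows), cv.INTER_LINEAR,
                cv.BORDER_CONSTANT, new cv.Scalar());
  srcTri.delete(); dstTri.delete(); M.delete();
  return dst;
}
```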


Perspective Transformation

A perspective transformation requires a 3x3 transformation matrix. Straight lines remain straight after the transformation. To find this matrix, you need 4 points on the input image and the corresponding points on the output image. No three of these 4 points should be collinear. The transformation matrix can then be found with the function cv.getPerspectiveTransform and applied with cv.warpPerspective.

We use the functions: cv.warpPerspective (src, dst, M, dsize, flags = cv.INTER_LINEAR, borderMode = cv.BORDER_CONSTANT, borderValue = new cv.Scalar())

Parameters
src: input image.
dst: output image that has the size dsize and the same type as src.
M: 3 × 3 transformation matrix (cv.CV_64FC1 type).
dsize: size of the output image.
flags: combination of interpolation methods (cv.INTER_LINEAR or cv.INTER_NEAREST) and the optional flag WARP_INVERSE_MAP, which sets M as the inverse transformation (\(\texttt{dst} \rightarrow \texttt{src}\)).
borderMode: pixel extrapolation method (cv.BORDER_CONSTANT or cv.BORDER_REPLICATE).
borderValue: value used in case of a constant border; by default, it is 0.

cv.getPerspectiveTransform (src, dst)

Parameters
src: coordinates of quadrangle vertices in the source image.
dst: coordinates of the corresponding quadrangle vertices in the destination image.
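The collinearity requirement and the effect of a 3x3 matrix on a single point can be sketched as follows. The names hasCollinearTriple, applyPerspective, and warpQuad are hypothetical; warpQuad assumes that cv is a loaded OpenCV.js module, src is a cv.Mat, and the quadrangles are arrays of four [x, y] pairs.

```javascript
// Returns true if any three of the given [x, y] points lie on one line,
// using the cross product of the two edge vectors as the test.
function hasCollinearTriple(pts, eps = 1e-9) {
  for (let i = 0; i < pts.length; i++) {
    for (let j = i + 1; j < pts.length; j++) {
      for (let k = j + 1; k < pts.length; k++) {
        const [ax, ay] = pts[i], [bx, by] = pts[j], [cx, cy] = pts[k];
        const cross = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
        if (Math.abs(cross) <= eps) return true;
      }
    }
  }
  return false;
}

// Applies a 3 x 3 matrix (row-major array) to (x, y), including the
// division by the third homogeneous coordinate.
function applyPerspective(m, x, y) {
  const w = m[6] * x + m[7] * y + m[8];
  return [(m[0] * x + m[1] * y + m[2]) / w, (m[3] * x + m[4] * y + m[5]) / w];
}

// Usage with OpenCV.js. Assumption: `cv` is the loaded module and `src` is
// a cv.Mat. The caller is responsible for calling dst.delete().
function warpQuad(cv, src, srcQuad, dstQuad, width, height) {
  if (hasCollinearTriple(srcQuad) || hasCollinearTriple(dstQuad)) {
    throw new Error("three of the four points are collinear");
  }
  const srcPts = cv.matFromArray(4, 1, cv.CV_32FC2, srcQuad.flat());
  const dstPts = cv.matFromArray(4, 1, cv.CV_32FC2, dstQuad.flat());
  const M = cv.getPerspectiveTransform(srcPts, dstPts);
  const dst = new cv.Mat();
  cv.warpPerspective(src, dst, M, new cv.Size(width, height), cv.INTER_LINEAR,
                     cv.BORDER_CONSTANT, new cv.Scalar());
  srcPts.delete(); dstPts.delete(); M.delete();
  return dst;
}
```

The division by w in applyPerspective is what distinguishes a perspective transformation from an affine one, where the last row is fixed and w is always 1.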
