Prev Tutorial: Motion Deblur Filter
Next Tutorial: Periodic Noise Removing Filter
| |
Original author | Karpushin Vladislav |
Compatibility | OpenCV >= 3.0 |
Goal
In this tutorial you will learn:
- what the gradient structure tensor is
- how to estimate orientation and coherency of an anisotropic image by a gradient structure tensor
- how to segment an anisotropic image with a single local orientation by a gradient structure tensor
Theory
- Note
- The explanation is based on the books [135], [27] and [284]. Good physical explanation of a gradient structure tensor is given in [309]. Also, you can refer to a wikipedia page Structure tensor.
-
A anisotropic image on this page is a real world image.
What is the gradient structure tensor?
In mathematics, the gradient structure tensor (also referred to as the second-moment matrix, the second order moment tensor, the inertia tensor, etc.) is a matrix derived from the gradient of a function. It summarizes the predominant directions of the gradient in a specified neighborhood of a point, and the degree to which those directions are coherent (coherency). The gradient structure tensor is widely used in image processing and computer vision for 2D/3D image segmentation, motion detection, adaptive filtration, local image features detection, etc.
Important features of anisotropic images include orientation and coherency of a local anisotropy. In this paper we will show how to estimate orientation and coherency, and how to segment an anisotropic image with a single local orientation by a gradient structure tensor.
The gradient structure tensor of an image is a 2x2 symmetric matrix. Eigenvectors of the gradient structure tensor indicate local orientation, whereas eigenvalues give coherency (a measure of anisotropism).
The gradient structure tensor \(J\) of an image \(Z\) can be written as:
\[J = \begin{bmatrix}
J_{11} & J_{12} \\
J_{12} & J_{22}
\end{bmatrix}\]
where \(J_{11} = M[Z_{x}^{2}]\), \(J_{22} = M[Z_{y}^{2}]\), \(J_{12} = M[Z_{x}Z_{y}]\) - components of the tensor, \(M[]\) is a symbol of mathematical expectation (we can consider this operation as averaging in a window w), \(Z_{x}\) and \(Z_{y}\) are partial derivatives of an image \(Z\) with respect to \(x\) and \(y\).
The eigenvalues of the tensor can be found in the below formula:
\[\lambda_{1,2} = \frac{1}{2} \left [ J_{11} + J_{22} \pm \sqrt{(J_{11} - J_{22})^{2} + 4J_{12}^{2}} \right ] \]
where \(\lambda_1\) - largest eigenvalue, \(\lambda_2\) - smallest eigenvalue.
How to estimate orientation and coherency of an anisotropic image by gradient structure tensor?
The orientation of an anisotropic image:
\[\alpha = 0.5arctg\frac{2J_{12}}{J_{22} - J_{11}}\]
Coherency:
\[C = \frac{\lambda_1 - \lambda_2}{\lambda_1 + \lambda_2}\]
The coherency ranges from 0 to 1. For ideal local orientation ( \(\lambda_2\) = 0, \(\lambda_1\) > 0) it is one, for an isotropic gray value structure ( \(\lambda_1\) = \(\lambda_2\) > 0) it is zero.
Source code
You can find source code in the samples/cpp/tutorial_code/ImgProc/anisotropic_image_segmentation/anisotropic_image_segmentation.cpp
of the OpenCV source code library.
C++
#include <iostream>
using namespace std;
void calcGST(
const Mat& inputImg,
Mat& imgCoherencyOut,
Mat& imgOrientationOut,
int w);
{
int W = 52;
double C_Thr = 0.43;
int LowThr = 35;
int HighThr = 57;
samples::addSamplesDataSearchSubDirectory("doc/tutorials/imgproc/anisotropic_image_segmentation/images");
Mat imgIn = imread(samples::findFile(
"gst_input.jpg"), IMREAD_GRAYSCALE);
{
cout << "ERROR : Image cannot be loaded..!!" << endl;
return -1;
}
Mat imgCoherency, imgOrientation;
calcGST(imgIn, imgCoherency, imgOrientation, W);
imgCoherencyBin = imgCoherency > C_Thr;
inRange(imgOrientation,
Scalar(LowThr),
Scalar(HighThr), imgOrientationBin);
imgBin = imgCoherencyBin & imgOrientationBin;
normalize(imgCoherency, imgCoherency, 0, 255, NORM_MINMAX,
CV_8U);
normalize(imgOrientation, imgOrientation, 0, 255, NORM_MINMAX,
CV_8U);
imshow("Original", imgIn);
imshow("Result", 0.5 * (imgIn + imgBin));
imshow("Coherency", imgCoherency);
imshow("Orientation", imgOrientation);
imwrite("result.jpg", 0.5*(imgIn + imgBin));
imwrite("Coherency.jpg", imgCoherency);
imwrite("Orientation.jpg", imgOrientation);
waitKey(0);
return 0;
}
void calcGST(
const Mat& inputImg,
Mat& imgCoherencyOut,
Mat& imgOrientationOut,
int w)
{
Mat imgDiffX, imgDiffY, imgDiffXY;
Sobel(img, imgDiffX,
CV_32F, 1, 0, 3);
Sobel(img, imgDiffY,
CV_32F, 0, 1, 3);
multiply(imgDiffX, imgDiffY, imgDiffXY);
Mat imgDiffXX, imgDiffYY;
multiply(imgDiffX, imgDiffX, imgDiffXX);
multiply(imgDiffY, imgDiffY, imgDiffYY);
Mat tmp1, tmp2, tmp3, tmp4;
tmp1 = J11 + J22;
tmp2 = J11 - J22;
multiply(tmp2, tmp2, tmp2);
multiply(J12, J12, tmp3);
sqrt(tmp2 + 4.0 * tmp3, tmp4);
lambda1 = tmp1 + tmp4;
lambda1 = 0.5*lambda1;
lambda2 = tmp1 - tmp4;
lambda2 = 0.5*lambda2;
divide(lambda1 - lambda2, lambda1 + lambda2, imgCoherencyOut);
phase(J22 - J11, 2.0*J12, imgOrientationOut, true);
imgOrientationOut = 0.5*imgOrientationOut;
}
n-dimensional dense array class
Definition mat.hpp:950
bool empty() const
Returns true if the array has no elements.
void convertTo(OutputArray m, int rtype, double alpha=1, double beta=0) const
Converts an array to another data type with optional scaling.
Template class for specifying the size of an image or rectangle.
Definition types.hpp:338
#define CV_8U
Definition interface.h:76
#define CV_32F
Definition interface.h:81
int main(int argc, char *argv[])
Definition highgui_qt.cpp:3
Python
import cv2 as cv
import numpy as np
import argparse
W = 52
C_Thr = 0.43
LowThr = 35
HighThr = 57
def calcGST(inputIMG, w):
img = inputIMG.astype(np.float32)
imgDiffX =
cv.Sobel(img, cv.CV_32F, 1, 0, 3)
imgDiffY =
cv.Sobel(img, cv.CV_32F, 0, 1, 3)
tmp1 = J11 + J22
tmp2 = J11 - J22
tmp4 = np.sqrt(tmp2 + 4.0 * tmp3)
lambda1 = 0.5*(tmp1 + tmp4)
lambda2 = 0.5*(tmp1 - tmp4)
imgCoherencyOut =
cv.divide(lambda1 - lambda2, lambda1 + lambda2)
imgOrientationOut =
cv.phase(J22 - J11, 2.0 * J12, angleInDegrees =
True)
imgOrientationOut = 0.5 * imgOrientationOut
return imgCoherencyOut, imgOrientationOut
parser = argparse.ArgumentParser(description='Code for Anisotropic image segmentation tutorial.')
parser.add_argument('-i', '--input', help='Path to input image.', required=True)
args = parser.parse_args()
imgIn =
cv.imread(args.input, cv.IMREAD_GRAYSCALE)
if imgIn is None:
print('Could not open or find the image: {}'.format(args.input))
exit(0)
imgCoherency, imgOrientation = calcGST(imgIn, W)
_, imgCoherencyBin =
cv.threshold(imgCoherency, C_Thr, 255, cv.THRESH_BINARY)
_, imgOrientationBin =
cv.threshold(imgOrientation, LowThr, HighThr, cv.THRESH_BINARY)
imgCoherency =
cv.normalize(imgCoherency,
None, alpha=0, beta=1, norm_type=cv.NORM_MINMAX, dtype=cv.CV_32F)
imgOrientation =
cv.normalize(imgOrientation,
None, alpha=0, beta=1, norm_type=cv.NORM_MINMAX, dtype=cv.CV_32F)
cv.imshow(
'result.jpg', np.uint8(0.5*(imgIn + imgBin)))
void bitwise_and(InputArray src1, InputArray src2, OutputArray dst, InputArray mask=noArray())
computes bitwise conjunction of the two arrays (dst = src1 & src2) Calculates the per-element bit-wis...
void divide(InputArray src1, InputArray src2, OutputArray dst, double scale=1, int dtype=-1)
Performs per-element division of two arrays or a scalar by an array.
void normalize(InputArray src, InputOutputArray dst, double alpha=1, double beta=0, int norm_type=NORM_L2, int dtype=-1, InputArray mask=noArray())
Normalizes the norm or value range of an array.
void multiply(InputArray src1, InputArray src2, OutputArray dst, double scale=1, int dtype=-1)
Calculates the per-element scaled product of two arrays.
void phase(InputArray x, InputArray y, OutputArray angle, bool angleInDegrees=false)
Calculates the rotation angle of 2D vectors.
void imshow(const String &winname, InputArray mat)
Displays an image in the specified window.
int waitKey(int delay=0)
Waits for a pressed key.
CV_EXPORTS_W Mat imread(const String &filename, int flags=IMREAD_COLOR_BGR)
Loads an image from a file.
void Sobel(InputArray src, OutputArray dst, int ddepth, int dx, int dy, int ksize=3, double scale=1, double delta=0, int borderType=BORDER_DEFAULT)
Calculates the first, second, third, or mixed image derivatives using an extended Sobel operator.
void boxFilter(InputArray src, OutputArray dst, int ddepth, Size ksize, Point anchor=Point(-1,-1), bool normalize=true, int borderType=BORDER_DEFAULT)
Blurs an image using the box filter.
double threshold(InputArray src, OutputArray dst, double thresh, double maxval, int type)
Applies a fixed-level threshold to each array element.
Explanation
An anisotropic image segmentation algorithm consists of a gradient structure tensor calculation, an orientation calculation, a coherency calculation and an orientation and coherency thresholding:
C++
Mat imgCoherency, imgOrientation;
calcGST(imgIn, imgCoherency, imgOrientation, W);
imgCoherencyBin = imgCoherency > C_Thr;
inRange(imgOrientation,
Scalar(LowThr),
Scalar(HighThr), imgOrientationBin);
imgBin = imgCoherencyBin & imgOrientationBin;
Python
imgCoherency, imgOrientation = calcGST(imgIn, W)
_, imgCoherencyBin =
cv.threshold(imgCoherency, C_Thr, 255, cv.THRESH_BINARY)
_, imgOrientationBin =
cv.threshold(imgOrientation, LowThr, HighThr, cv.THRESH_BINARY)
A function calcGST() calculates orientation and coherency by using a gradient structure tensor. An input parameter w defines a window size:
C++
void calcGST(
const Mat& inputImg,
Mat& imgCoherencyOut,
Mat& imgOrientationOut,
int w)
{
Mat imgDiffX, imgDiffY, imgDiffXY;
Sobel(img, imgDiffX,
CV_32F, 1, 0, 3);
Sobel(img, imgDiffY,
CV_32F, 0, 1, 3);
multiply(imgDiffX, imgDiffY, imgDiffXY);
Mat imgDiffXX, imgDiffYY;
multiply(imgDiffX, imgDiffX, imgDiffXX);
multiply(imgDiffY, imgDiffY, imgDiffYY);
Mat tmp1, tmp2, tmp3, tmp4;
tmp1 = J11 + J22;
tmp2 = J11 - J22;
multiply(tmp2, tmp2, tmp2);
multiply(J12, J12, tmp3);
sqrt(tmp2 + 4.0 * tmp3, tmp4);
lambda1 = tmp1 + tmp4;
lambda1 = 0.5*lambda1;
lambda2 = tmp1 - tmp4;
lambda2 = 0.5*lambda2;
divide(lambda1 - lambda2, lambda1 + lambda2, imgCoherencyOut);
phase(J22 - J11, 2.0*J12, imgOrientationOut, true);
imgOrientationOut = 0.5*imgOrientationOut;
}
Python
def calcGST(inputIMG, w):
img = inputIMG.astype(np.float32)
imgDiffX =
cv.Sobel(img, cv.CV_32F, 1, 0, 3)
imgDiffY =
cv.Sobel(img, cv.CV_32F, 0, 1, 3)
tmp1 = J11 + J22
tmp2 = J11 - J22
tmp4 = np.sqrt(tmp2 + 4.0 * tmp3)
lambda1 = 0.5*(tmp1 + tmp4)
lambda2 = 0.5*(tmp1 - tmp4)
imgCoherencyOut =
cv.divide(lambda1 - lambda2, lambda1 + lambda2)
imgOrientationOut =
cv.phase(J22 - J11, 2.0 * J12, angleInDegrees =
True)
imgOrientationOut = 0.5 * imgOrientationOut
return imgCoherencyOut, imgOrientationOut
The below code applies a thresholds LowThr and HighThr to image orientation and a threshold C_Thr to image coherency calculated by the previous function. LowThr and HighThr define orientation range:
C++
imgCoherencyBin = imgCoherency > C_Thr;
inRange(imgOrientation,
Scalar(LowThr),
Scalar(HighThr), imgOrientationBin);
Python
_, imgCoherencyBin =
cv.threshold(imgCoherency, C_Thr, 255, cv.THRESH_BINARY)
_, imgOrientationBin =
cv.threshold(imgOrientation, LowThr, HighThr, cv.THRESH_BINARY)
And finally we combine thresholding results:
C++
imgBin = imgCoherencyBin & imgOrientationBin;
Python
Result
Below you can see the real anisotropic image with single direction:
Below you can see the orientation and coherency of the anisotropic image:
Below you can see the segmentation result:
The result has been computed with w = 52, C_Thr = 0.43, LowThr = 35, HighThr = 57. We can see that the algorithm selected only the areas with one single direction.
References