OpenCV
3.3.0dev
Open Source Computer Vision

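Learn to find histograms using both OpenCV and Numpy functions, and to plot them using Matplotlib and OpenCV functions.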
So what is a histogram? You can think of a histogram as a graph or plot that gives you an overall idea of the intensity distribution of an image. It is a plot with pixel values (usually ranging from 0 to 255, though not always) on the X-axis and the corresponding number of pixels in the image on the Y-axis.
It is just another way of understanding the image. By looking at the histogram of an image, you get an intuition about the contrast, brightness, intensity distribution etc. of that image. Almost all image processing tools today provide histogram features. Below is an image from the Cambridge in Colour website, and I recommend you visit the site for more details.
You can see the image and its histogram. (Remember, this histogram is drawn for a grayscale image, not a color image.) The left region of the histogram shows the amount of darker pixels in the image and the right region shows the amount of brighter pixels. From the histogram, you can see that the dark region is larger than the bright region, and that there are very few midtones (pixel values in the mid-range, say around 127).
Now that we have an idea of what a histogram is, we can look into how to find one. Both OpenCV and Numpy come with a built-in function for this. Before using those functions, we need to understand some terminology related to histograms.
BINS: The above histogram shows the number of pixels for every pixel value, i.e. from 0 to 255, so you need 256 values to show it. But what if you do not need the number of pixels for each pixel value separately, but the number of pixels in an interval of pixel values? Say, for example, you need to find the number of pixels lying between 0 and 15, then 16 and 31, ..., 240 and 255. You will need only 16 values to represent the histogram. And that is what is shown in the example given in the OpenCV Tutorials on histograms.
So what you do is simply split the whole histogram into 16 sub-parts, and the value of each sub-part is the sum of all the pixel counts in it. Each sub-part is called a "BIN". In the first case, the number of bins was 256 (one for each pixel value), while in the second case it is only 16. BINS is represented by the term histSize in the OpenCV docs.
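The relationship between the 256-bin and 16-bin cases can be checked numerically. The following is a minimal sketch, assuming a synthetic random grayscale array in place of a real image and using np.histogram (covered later in this tutorial) to do the counting:

```python
import numpy as np

# Synthetic 8-bit grayscale "image" (an assumption; any uint8 array would do).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# 256 bins: one count per pixel value 0..255.
hist256, _ = np.histogram(img.ravel(), bins=256, range=(0, 256))

# 16 bins: one count per interval 0-15, 16-31, ..., 240-255.
hist16, _ = np.histogram(img.ravel(), bins=16, range=(0, 256))

# Each 16-wide bin should equal the sum of the 16 single-value bins it covers.
summed = hist256.reshape(16, 16).sum(axis=1)
print(np.array_equal(hist16, summed))
```

Both histograms count every pixel exactly once, so their totals are the same; only the resolution along the X-axis changes.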
DIMS: It is the number of parameters for which we collect the data. In this case, we collect data about only one thing, the intensity value. So here it is 1.
RANGE: It is the range of intensity values you want to measure. Normally it is [0,256], i.e. all intensity values (the upper bound is exclusive).
So now we use the cv2.calcHist() function to find the histogram. Let's familiarize ourselves with the function and its parameters:
So let's start with a sample image. Simply load an image in grayscale mode and find its full histogram.
hist is a 256x1 array; each value corresponds to the number of pixels in the image that have the corresponding pixel value.
Numpy also provides a function for this, np.histogram(). So instead of the calcHist() function, you can try the line below:
hist is the same as we calculated before. But bins will have 257 elements, because Numpy calculates bins as 0-0.99, 1-1.99, 2-2.99 etc. So the final range would be 255-255.99. To represent that, it also adds 256 at the end of bins. But we don't need that 256. Up to 255 is sufficient.
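The 257 bin edges can be seen directly. This is a minimal sketch, assuming the same kind of synthetic gradient image as a stand-in for a real grayscale image:

```python
import numpy as np

# Synthetic grayscale image (assumption): every value 0..255 appears 100 times.
img = np.tile(np.arange(256, dtype=np.uint8), (100, 1))

# np.histogram expects flat data, hence ravel().
hist, bins = np.histogram(img.ravel(), bins=256, range=(0, 256))

print(hist.shape)         # (256,)
print(bins.shape)         # (257,)
print(bins[0], bins[-1])  # 0.0 256.0
```

Note that np.histogram() returns the counts as a flat array of 256 integers, whereas cv2.calcHist() returns a 256x1 float32 array; the counts themselves agree.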
Now we should plot histograms, but how?
There are two ways to do this: use Matplotlib plotting functions, or use OpenCV drawing functions.
Matplotlib comes with a histogram plotting function: matplotlib.pyplot.hist()
It finds the histogram and plots it directly. You need not use the calcHist() or np.histogram() function to find the histogram first. See the code below:
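A minimal sketch of such a call, assuming a synthetic random image instead of a photo and the non-interactive Agg backend so that it also runs on a machine without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Synthetic grayscale image standing in for a real one (assumption).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# plt.hist counts and draws in one step; it needs flat data, hence ravel().
counts, edges, _ = plt.hist(img.ravel(), bins=256, range=(0, 256))
plt.xlim([0, 256])
# In an interactive session, call plt.show() here to display the figure.
```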
You will get a plot as below:
Or you can use the normal plot function of Matplotlib, which works well for a BGR plot. For that, you need to find the histogram data first. Try the code below:
Result:
You can deduce from the above graph that blue has some high-value areas in the image (obviously, this should be due to the sky).
With OpenCV drawing functions, you adjust the histogram values along with their bin values to look like x,y coordinates, so that you can draw them using the cv2.line() or cv2.polylines() function to generate the same image as above. This is already available in the OpenCV-Python2 official samples. Check the code at samples/python/hist.py.
We used cv2.calcHist() to find the histogram of the full image. What if you want to find the histogram of some region of an image? Just create a mask image that is white in the region whose histogram you want and black everywhere else. Then pass this as the mask.
See the result. In the histogram plot, the blue line shows the histogram of the full image, while the green line shows the histogram of the masked region.