OpenCV  4.6.0
Open Source Computer Vision
Cascade Classifier Training

Prev Tutorial: Cascade Classifier

Next Tutorial: Introduction to Support Vector Machines

Introduction

Working with a boosted cascade of weak classifiers includes two major stages: the training and the detection stage. The detection stage using either HAAR or LBP based models, is described in the object detection tutorial. This documentation gives an overview of the functionality needed to train your own boosted cascade of weak classifiers. The current guide will walk through all the different stages: collecting training data, preparation of the training data and executing the actual model training.

To support this tutorial, several official OpenCV applications will be used: opencv_createsamples, opencv_annotation, opencv_traincascade and opencv_visualisation.

Note
Createsamples and traincascade are disabled since OpenCV 4.0. Consider using these apps for training from 3.4 branch for Cascade Classifier. Model format is the same between 3.4 and 4.x.

Important notes

Preparation of the training data

For training a boosted cascade of weak classifiers we need a set of positive samples (containing actual objects you want to detect) and a set of negative images (containing everything you do not want to detect). The set of negative samples must be prepared manually, whereas set of positive samples is created using the opencv_createsamples application.

Negative Samples

Negative samples are taken from arbitrary images, not containing objects you want to detect. These negative images, from which the samples are generated, should be listed in a special negative image file containing one image path per line (can be absolute or relative). Note that negative samples and sample images are also called background samples or background images, and are used interchangeably in this document.

Described images may be of different sizes. However, each image should be equal or larger than the desired training window size (which corresponds to the model dimensions, most of the times being the average size of your object), because these images are used to subsample a given negative image into several image samples having this training window size.

An example of such a negative description file:

Directory structure:

/img
img1.jpg
img2.jpg
bg.txt

File bg.txt:

img/img1.jpg
img/img2.jpg

Your set of negative window samples will be used to tell the machine learning step, boosting in this case, what not to look for, when trying to find your objects of interest.

Positive Samples

Positive samples are created by the opencv_createsamples application. They are used by the boosting process to define what the model should actually look for when trying to find your objects of interest. The application supports two ways of generating a positive sample dataset.

  1. You can generate a bunch of positives from a single positive object image.
  2. You can supply all the positives yourself and only use the tool to cut them out, resize them and put them in the opencv needed binary format.

While the first approach works decently for fixed objects, like very rigid logo's, it tends to fail rather soon for less rigid objects. In that case we do suggest to use the second approach. Many tutorials on the web even state that 100 real object images, can lead to a better model than 1000 artificially generated positives, by using the opencv_createsamples application. If you however do decide to take the first approach, keep some things in mind:

The first approach takes a single object image with for example a company logo and creates a large set of positive samples from the given object image by randomly rotating the object, changing the image intensity as well as placing the image on arbitrary backgrounds. The amount and range of randomness can be controlled by command line arguments of the opencv_createsamples application.

Command line arguments:

When running opencv_createsamples in this way, the following procedure is used to create a sample object instance: The given source image is rotated randomly around all three axes. The chosen angle is limited by -maxxangle, -maxyangle and -maxzangle. Then pixels having the intensity from the [bg_color-bg_color_threshold; bg_color+bg_color_threshold] range are interpreted as transparent. White noise is added to the intensities of the foreground. If the -inv key is specified then foreground pixel intensities are inverted. If -randinv key is specified then algorithm randomly selects whether inversion should be applied to this sample. Finally, the obtained image is placed onto an arbitrary background from the background description file, resized to the desired size specified by -w and -h and stored to the vec-file, specified by the -vec command line option.

Positive samples also may be obtained from a collection of previously marked up images, which is the desired way when building robust object models. This collection is described by a text file similar to the background description file. Each line of this file corresponds to an image. The first element of the line is the filename, followed by the number of object annotations, followed by numbers describing the coordinates of the objects bounding rectangles (x, y, width, height).

An example of description file:

Directory structure:

/img
img1.jpg
img2.jpg
info.dat

File info.dat:

img/img1.jpg 1 140 100 45 45
img/img2.jpg 2 100 200 50 50 50 30 25 25

Image img1.jpg contains single object instance with the following coordinates of bounding rectangle: (140, 100, 45, 45). Image img2.jpg contains two object instances.

In order to create positive samples from such collection, -info argument should be specified instead of -img:

Note that in this case, parameters like -bg, -bgcolor, -bgthreshold, -inv, -randinv, -maxxangle, -maxyangle, -maxzangle are simply ignored and not used anymore. The scheme of samples creation in this case is as follows. The object instances are taken from the given images, by cutting out the supplied bounding boxes from the original images. Then they are resized to target samples size (defined by -w and -h) and stored in output vec-file, defined by the -vec parameter. No distortion is applied, so the only affecting arguments are -w, -h, -show and -num.

The manual process of creating the -info file can also been done by using the opencv_annotation tool. This is an open source tool for visually selecting the regions of interest of your object instances in any given images. The following subsection will discuss in more detail on how to use this application.

Extra remarks

Using OpenCV's integrated annotation tool

Since OpenCV 3.x the community has been supplying and maintaining a open source annotation tool, used for generating the -info file. The tool can be accessed by the command opencv_annotation if the OpenCV applications where build.

Using the tool is quite straightforward. The tool accepts several required and some optional parameters:

Note that the optional parameters can only be used together. An example of a command that could be used can be seen below

opencv_annotation --annotations=/path/to/annotations/file.txt --images=/path/to/image/folder/

This command will fire up a window containing the first image and your mouse cursor which will be used for annotation. A video on how to use the annotation tool can be found here. Basically there are several keystrokes that trigger an action. The left mouse button is used to select the first corner of your object, then keeps drawing until you are fine, and stops when a second left mouse button click is registered. After each selection you have the following choices:

Finally you will end up with a usable annotation file that can be passed to the -info argument of opencv_createsamples.

Cascade Training

The next step is the actual training of the boosted cascade of weak classifiers, based on the positive and negative dataset that was prepared beforehand.

Command line arguments of opencv_traincascade application grouped by purposes:

After the opencv_traincascade application has finished its work, the trained cascade will be saved in cascade.xml file in the -data folder. Other files in this folder are created for the case of interrupted training, so you may delete them after completion of training.

Training is finished and you can test your cascade classifier!

Visualising Cascade Classifiers

From time to time it can be useful to visualise the trained cascade, to see which features it selected and how complex its stages are. For this OpenCV supplies a opencv_visualisation application. This application has the following commands:

An example command can be seen below

opencv_visualisation --image=/data/object.png --model=/data/model.xml --data=/data/result/

Some limitations of the current visualisation tool

Example of the HAAR/LBP face model ran on a given window of Angelina Jolie, which had the same preprocessing as cascade classifier files –>24x24 pixel image, grayscale conversion and histogram equalisation:

A video is made with for each stage each feature visualised:

visualisation_video.png

Each stage is stored as an image for future validation of the features:

visualisation_single_stage.png

This work was created for OpenCV 3 Blueprints by StevenPuttemans but Packt Publishing agreed integration into OpenCV.