Object Categorization

This section describes approaches that use local 2D features to categorize objects.

BOWTrainer

class BOWTrainer

Abstract base class for training the bag of visual words vocabulary from a set of descriptors. For details, see, for example, Visual Categorization with Bags of Keypoints by Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, Cedric Bray, 2004.

class BOWTrainer
{
public:
    BOWTrainer(){}
    virtual ~BOWTrainer(){}

    void add( const Mat& descriptors );
    const vector<Mat>& getDescriptors() const;
    int descripotorsCount() const;

    virtual void clear();

    virtual Mat cluster() const = 0;
    virtual Mat cluster( const Mat& descriptors ) const = 0;

protected:
    ...
};

BOWTrainer::add

Adds descriptors to a training set.

C++: void BOWTrainer::add(const Mat& descriptors)
Parameters:
  • descriptors – Descriptors to add to a training set. Each row of the descriptors matrix is a descriptor.

The training set is clustered using the cluster() method to construct the vocabulary.

BOWTrainer::getDescriptors

Returns a training set of descriptors.

C++: const vector<Mat>& BOWTrainer::getDescriptors() const

BOWTrainer::descripotorsCount

Returns the count of all descriptors stored in the training set.

C++: int BOWTrainer::descripotorsCount() const
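The bookkeeping described by add, getDescriptors, and descripotorsCount can be illustrated with a minimal sketch. It assumes that the count is the number of rows summed over all added matrices, which matches "each row of the descriptors matrix is a descriptor". TinyTrainingSet and DescriptorMatrix are hypothetical stand-ins built on the standard library; they are not the OpenCV classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a descriptor matrix: each inner vector is one descriptor (one row).
typedef std::vector<std::vector<float> > DescriptorMatrix;

// Hypothetical miniature of the training-set bookkeeping; not the OpenCV class.
class TinyTrainingSet
{
public:
    TinyTrainingSet() : total_(0) {}

    // Stores one matrix of descriptors and updates the running row count.
    void add(const DescriptorMatrix& descriptors)
    {
        assert(!descriptors.empty());
        matrices_.push_back(descriptors);
        total_ += static_cast<int>(descriptors.size());
    }

    // Returns the matrices as they were added, one entry per add() call.
    const std::vector<DescriptorMatrix>& getDescriptors() const { return matrices_; }

    // Returns the number of descriptors (rows) summed over all stored matrices.
    // The method name mirrors the spelling used in the documented interface.
    int descripotorsCount() const { return total_; }

    // Empties the training set.
    void clear() { matrices_.clear(); total_ = 0; }

private:
    std::vector<DescriptorMatrix> matrices_;
    int total_;
};
```

In this sketch, adding a 3-row matrix and then a 5-row matrix leaves two stored matrices and a count of 8.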

BOWTrainer::cluster

Clusters the training descriptors.

C++: Mat BOWTrainer::cluster() const
C++: Mat BOWTrainer::cluster(const Mat& descriptors) const
Parameters:
  • descriptors – Descriptors to cluster. Each row of the descriptors matrix is a descriptor. These descriptors are not added to the internal training set.

The vocabulary consists of the cluster centers, so this method returns the vocabulary. The first variant clusters the training descriptors stored in the object. The second variant clusters the input descriptors.
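To make "the vocabulary consists of cluster centers" concrete, here is a rough sketch of clustering descriptors with plain Lloyd iterations and returning the centers. It is an illustration only: the names clusterSketch, nearestCenter, and squaredDistance are hypothetical, the centers are assumed to start at the first clusterCount descriptors for determinism, and the iteration count is fixed. A real trainer may initialize and terminate differently.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

typedef std::vector<float> Descriptor;

// Squared Euclidean distance between two descriptors of equal length.
static float squaredDistance(const Descriptor& a, const Descriptor& b)
{
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Index of the center closest to the given descriptor.
static std::size_t nearestCenter(const Descriptor& d, const std::vector<Descriptor>& centers)
{
    std::size_t best = 0;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t c = 0; c < centers.size(); ++c)
    {
        const float dist = squaredDistance(d, centers[c]);
        if (dist < bestDist) { bestDist = dist; best = c; }
    }
    return best;
}

// Returns clusterCount centers (the "vocabulary"), one descriptor per center.
// Assumption: centers are initialized from the first clusterCount descriptors.
std::vector<Descriptor> clusterSketch(const std::vector<Descriptor>& descriptors,
                                      std::size_t clusterCount, int iterations)
{
    assert(clusterCount > 0 && descriptors.size() >= clusterCount);
    std::vector<Descriptor> centers(descriptors.begin(), descriptors.begin() + clusterCount);
    const std::size_t dim = descriptors[0].size();

    for (int it = 0; it < iterations; ++it)
    {
        std::vector<Descriptor> sums(clusterCount, Descriptor(dim, 0.0f));
        std::vector<std::size_t> counts(clusterCount, 0);

        // Assignment step: accumulate each descriptor into its nearest center.
        for (std::size_t i = 0; i < descriptors.size(); ++i)
        {
            const std::size_t c = nearestCenter(descriptors[i], centers);
            for (std::size_t k = 0; k < dim; ++k) sums[c][k] += descriptors[i][k];
            ++counts[c];
        }

        // Update step: move each non-empty center to the mean of its members.
        for (std::size_t c = 0; c < clusterCount; ++c)
        {
            if (counts[c] == 0) continue;
            for (std::size_t k = 0; k < dim; ++k)
                centers[c][k] = sums[c][k] / static_cast<float>(counts[c]);
        }
    }
    return centers;
}
```

Each returned center plays the role of one row of the vocabulary matrix, that is, one visual word.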

BOWKMeansTrainer

class BOWKMeansTrainer

kmeans()-based class that trains a visual vocabulary using the bag of visual words approach.

class BOWKMeansTrainer : public BOWTrainer
{
public:
    BOWKMeansTrainer( int clusterCount, const TermCriteria& termcrit=TermCriteria(),
                      int attempts=3, int flags=KMEANS_PP_CENTERS );
    virtual ~BOWKMeansTrainer(){}

    // Returns trained vocabulary (i.e. cluster centers).
    virtual Mat cluster() const;
    virtual Mat cluster( const Mat& descriptors ) const;

protected:
    ...
};

BOWKMeansTrainer::BOWKMeansTrainer

The constructor.

C++: BOWKMeansTrainer::BOWKMeansTrainer(int clusterCount, const TermCriteria& termcrit=TermCriteria(), int attempts=3, int flags=KMEANS_PP_CENTERS )

See the parameters of the kmeans() function.

BOWImgDescriptorExtractor

class BOWImgDescriptorExtractor

Class to compute an image descriptor using the bag of visual words. Such a computation consists of the following steps:

  1. Compute descriptors for a given image and its keypoints set.
  2. Find the nearest visual words from the vocabulary for each keypoint descriptor.
  3. Compute the bag-of-words image descriptor as a normalized histogram of the vocabulary words encountered in the image. The i-th bin of the histogram is the frequency of the i-th vocabulary word in the given image.
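Steps 2 and 3 above can be sketched without OpenCV as follows. The sketch assumes that the nearest word is found by squared Euclidean distance and that the histogram is normalized by the number of descriptors, so the bins sum to 1. The names nearestWord and bowHistogram are hypothetical; in the class itself, the matching is delegated to the supplied descriptor matcher.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

typedef std::vector<float> Descriptor;

// Step 2: index of the vocabulary word closest to d (squared Euclidean distance).
std::size_t nearestWord(const Descriptor& d, const std::vector<Descriptor>& vocabulary)
{
    std::size_t best = 0;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t w = 0; w < vocabulary.size(); ++w)
    {
        assert(vocabulary[w].size() == d.size());
        float dist = 0.0f;
        for (std::size_t k = 0; k < d.size(); ++k)
        {
            const float diff = d[k] - vocabulary[w][k];
            dist += diff * diff;
        }
        if (dist < bestDist) { bestDist = dist; best = w; }
    }
    return best;
}

// Step 3: histogram of word occurrences, divided by the number of descriptors.
// Bin i is the fraction of descriptors whose nearest word is word i.
std::vector<float> bowHistogram(const std::vector<Descriptor>& descriptors,
                                const std::vector<Descriptor>& vocabulary)
{
    assert(!vocabulary.empty());
    std::vector<float> histogram(vocabulary.size(), 0.0f);
    if (descriptors.empty()) return histogram;  // nothing to count

    for (std::size_t i = 0; i < descriptors.size(); ++i)
        histogram[nearestWord(descriptors[i], vocabulary)] += 1.0f;

    const float n = static_cast<float>(descriptors.size());
    for (std::size_t w = 0; w < histogram.size(); ++w)
        histogram[w] /= n;
    return histogram;
}
```

The length of the resulting histogram equals the number of words in the vocabulary, regardless of how many keypoints the image has.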

The class declaration is the following:

class BOWImgDescriptorExtractor
{
public:
    BOWImgDescriptorExtractor( const Ptr<DescriptorExtractor>& dextractor,
                               const Ptr<DescriptorMatcher>& dmatcher );
    virtual ~BOWImgDescriptorExtractor(){}

    void setVocabulary( const Mat& vocabulary );
    const Mat& getVocabulary() const;
    void compute( const Mat& image, vector<KeyPoint>& keypoints,
                  Mat& imgDescriptor,
                  vector<vector<int> >* pointIdxsOfClusters=0,
                  Mat* descriptors=0 );
    int descriptorSize() const;
    int descriptorType() const;

protected:
    ...
};

BOWImgDescriptorExtractor::BOWImgDescriptorExtractor

The constructor.

C++: BOWImgDescriptorExtractor::BOWImgDescriptorExtractor(const Ptr<DescriptorExtractor>& dextractor, const Ptr<DescriptorMatcher>& dmatcher)
Parameters:
  • dextractor – Descriptor extractor that is used to compute descriptors for an input image and its keypoints.
  • dmatcher – Descriptor matcher that is used to find the nearest word of the trained vocabulary for each keypoint descriptor of the image.

BOWImgDescriptorExtractor::setVocabulary

Sets a visual vocabulary.

C++: void BOWImgDescriptorExtractor::setVocabulary(const Mat& vocabulary)
Parameters:
  • vocabulary – Vocabulary (can be trained using a class derived from BOWTrainer). Each row of the vocabulary is a visual word (cluster center).

BOWImgDescriptorExtractor::getVocabulary

Returns the set vocabulary.

C++: const Mat& BOWImgDescriptorExtractor::getVocabulary() const

BOWImgDescriptorExtractor::compute

Computes an image descriptor using the set visual vocabulary.

C++: void BOWImgDescriptorExtractor::compute(const Mat& image, vector<KeyPoint>& keypoints, Mat& imgDescriptor, vector<vector<int>>* pointIdxsOfClusters=0, Mat* descriptors=0 )
Parameters:
  • image – Image, for which the descriptor is computed.
  • keypoints – Keypoints detected in the input image.
  • imgDescriptor – Computed output image descriptor.
  • pointIdxsOfClusters – Indices of the keypoints that belong to each cluster. pointIdxsOfClusters[i] holds the indices of the keypoints assigned to the i-th cluster (vocabulary word). Returned only if the pointer is non-zero.
  • descriptors – Descriptors of the image keypoints. Returned only if the pointer is non-zero.
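The layout of pointIdxsOfClusters can be illustrated with a small sketch that groups keypoint indices by their assigned word. It assumes the word index of every keypoint is already known. The function groupKeypointsByWord and its input wordOfKeypoint are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// wordOfKeypoint[k] is the vocabulary word index assigned to keypoint k.
// The result has one entry per word; entry i lists the keypoints assigned to word i.
std::vector<std::vector<int> > groupKeypointsByWord(const std::vector<int>& wordOfKeypoint,
                                                    int vocabularySize)
{
    assert(vocabularySize > 0);
    std::vector<std::vector<int> > pointIdxsOfClusters(vocabularySize);
    for (std::size_t k = 0; k < wordOfKeypoint.size(); ++k)
    {
        const int word = wordOfKeypoint[k];
        assert(word >= 0 && word < vocabularySize);
        pointIdxsOfClusters[word].push_back(static_cast<int>(k));
    }
    return pointIdxsOfClusters;
}
```

Words that no keypoint maps to keep an empty index list, so the outer vector always has one entry per vocabulary word.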

BOWImgDescriptorExtractor::descriptorSize

Returns the image descriptor size if the vocabulary is set. Otherwise, it returns 0.

C++: int BOWImgDescriptorExtractor::descriptorSize() const

BOWImgDescriptorExtractor::descriptorType

Returns an image descriptor type.

C++: int BOWImgDescriptorExtractor::descriptorType() const