OpenCV  5.0.0alpha
Open Source Computer Vision
cv::ml::DTrees Class Reference (abstract)

The class represents a single decision tree or a collection of decision trees.

#include <opencv2/ml.hpp>

Classes

class  Node
 The class represents a decision tree node.
 
class  Split
 The class represents a split in a decision tree.
 

Public Types

enum  Flags {
  PREDICT_AUTO =0 ,
  PREDICT_SUM =(1<<8) ,
  PREDICT_MAX_VOTE =(2<<8) ,
  PREDICT_MASK =(3<<8)
}
 
- Public Types inherited from cv::ml::StatModel
enum  Flags {
  UPDATE_MODEL = 1 ,
  RAW_OUTPUT =1 ,
  COMPRESSED_INPUT =2 ,
  PREPROCESSED_INPUT =4
}
 

Public Member Functions

virtual int getCVFolds () const =0
 
virtual int getMaxCategories () const =0
 
virtual int getMaxDepth () const =0
 
virtual int getMinSampleCount () const =0
 
virtual const std::vector< Node > & getNodes () const =0
 Returns all the nodes.
 
virtual cv::Mat getPriors () const =0
 The array of a priori class probabilities, sorted by the class label value.
 
virtual float getRegressionAccuracy () const =0
 
virtual const std::vector< int > & getRoots () const =0
 Returns indices of root nodes.
 
virtual const std::vector< Split > & getSplits () const =0
 Returns all the splits.
 
virtual const std::vector< int > & getSubsets () const =0
 Returns all the bitsets for categorical splits.
 
virtual bool getTruncatePrunedTree () const =0
 
virtual bool getUse1SERule () const =0
 
virtual bool getUseSurrogates () const =0
 
virtual void setCVFolds (int val)=0
 
virtual void setMaxCategories (int val)=0
 
virtual void setMaxDepth (int val)=0
 
virtual void setMinSampleCount (int val)=0
 
virtual void setPriors (const cv::Mat &val)=0
 The array of a priori class probabilities, sorted by the class label value.
 
virtual void setRegressionAccuracy (float val)=0
 
virtual void setTruncatePrunedTree (bool val)=0
 
virtual void setUse1SERule (bool val)=0
 
virtual void setUseSurrogates (bool val)=0
 
- Public Member Functions inherited from cv::ml::StatModel
virtual float calcError (const Ptr< TrainData > &data, bool test, OutputArray resp) const
 Computes error on the training or test dataset.
 
virtual bool empty () const CV_OVERRIDE
 Returns true if the Algorithm is empty (e.g. in the very beginning or after an unsuccessful read).
 
virtual int getVarCount () const =0
 Returns the number of variables in training samples.
 
virtual bool isClassifier () const =0
 Returns true if the model is classifier.
 
virtual bool isTrained () const =0
 Returns true if the model is trained.
 
virtual float predict (InputArray samples, OutputArray results=noArray(), int flags=0) const =0
 Predicts response(s) for the provided sample(s)
 
virtual bool train (const Ptr< TrainData > &trainData, int flags=0)
 Trains the statistical model.
 
virtual bool train (InputArray samples, int layout, InputArray responses)
 Trains the statistical model.
 
- Public Member Functions inherited from cv::Algorithm
 Algorithm ()
 
virtual ~Algorithm ()
 
virtual void clear ()
 Clears the algorithm state.
 
virtual String getDefaultName () const
 
virtual void read (const FileNode &fn)
 Reads algorithm parameters from a file storage.
 
virtual void save (const String &filename) const
 
virtual void write (FileStorage &fs) const
 Stores algorithm parameters in a file storage.
 
void write (FileStorage &fs, const String &name) const
 

Static Public Member Functions

static Ptr< DTrees > create ()
 Creates the empty model.
 
static Ptr< DTrees > load (const String &filepath, const String &nodeName=String())
 Loads and creates a serialized DTrees from a file.
 
- Static Public Member Functions inherited from cv::ml::StatModel
template<typename _Tp >
static Ptr< _Tp > train (const Ptr< TrainData > &data, int flags=0)
 Create and train model with default parameters.
 
- Static Public Member Functions inherited from cv::Algorithm
template<typename _Tp >
static Ptr< _Tp > load (const String &filename, const String &objname=String())
 Loads algorithm from the file.
 
template<typename _Tp >
static Ptr< _Tp > loadFromString (const String &strModel, const String &objname=String())
 Loads algorithm from a String.
 
template<typename _Tp >
static Ptr< _Tp > read (const FileNode &fn)
 Reads algorithm from the file node.
 

Additional Inherited Members

- Protected Member Functions inherited from cv::Algorithm
void writeFormat (FileStorage &fs) const
 

Detailed Description

The class represents a single decision tree or a collection of decision trees.

The current public interface of the class allows the user to train only a single decision tree. However, the class is capable of storing multiple decision trees and using them for prediction (by summing responses or using a voting scheme), and the classes derived from DTrees (such as RTrees and Boost) use this capability to implement decision tree ensembles.

See also
Decision Trees
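
The two ways of combining per-tree outputs mentioned above (summing responses and voting) can be illustrated with a small standalone sketch. It does not use OpenCV and is not the library's implementation; the function names combineBySum and combineByVote and the tie-breaking rule are assumptions made for illustration only.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical helper: combines per-tree responses by summing them.
double combineBySum(const std::vector<double>& treeOutputs) {
    double sum = 0.0;
    for (double v : treeOutputs) sum += v;
    return sum;
}

// Hypothetical helper: combines per-tree class labels by majority vote.
// Ties go to the smallest label, an arbitrary choice made for this sketch.
int combineByVote(const std::vector<int>& treeLabels) {
    assert(!treeLabels.empty());
    std::map<int, int> counts;  // label -> number of votes, ordered by label
    for (int label : treeLabels) ++counts[label];
    int best = counts.begin()->first;
    int bestCount = counts.begin()->second;
    for (const auto& kv : counts) {
        if (kv.second > bestCount) {
            best = kv.first;
            bestCount = kv.second;
        }
    }
    return best;
}
```

Summing suits ensembles whose trees emit numeric scores, while voting suits ensembles whose trees emit class labels; which one a derived class uses is governed by the predict flags documented below.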

Member Enumeration Documentation

◆ Flags

Predict options

Enumerator
PREDICT_AUTO 
PREDICT_SUM 
PREDICT_MAX_VOTE 
PREDICT_MASK 
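
The enumerator values listed at the top of this page place the prediction mode in bits 8 and 9 of the flags word, with PREDICT_MASK covering both bits. The following standalone sketch re-declares those values (it does not include OpenCV) and shows how a mode could be read from, or written into, a flags value that also carries low-order flags such as RAW_OUTPUT. The helper names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Values copied from the Flags enum listed on this page.
enum PredictFlags {
    PREDICT_AUTO     = 0,
    PREDICT_SUM      = (1 << 8),
    PREDICT_MAX_VOTE = (2 << 8),
    PREDICT_MASK     = (3 << 8)
};

// Hypothetical helper: isolates the prediction-mode bits of a flags word.
int predictMode(int flags) {
    return flags & PREDICT_MASK;
}

// Hypothetical helper: replaces the prediction mode, keeping all other bits.
int withPredictMode(int flags, int mode) {
    assert((mode & ~PREDICT_MASK) == 0);  // mode must fit inside the mask
    return (flags & ~PREDICT_MASK) | mode;
}

// Hypothetical helper: names the mode, e.g. for logging.
std::string predictModeName(int flags) {
    switch (predictMode(flags)) {
        case PREDICT_AUTO:     return "auto";
        case PREDICT_SUM:      return "sum";
        case PREDICT_MAX_VOTE: return "max_vote";
        default:               return "unknown";
    }
}
```

Because the mode lives in its own bit range, it can be OR-ed with the StatModel flags without interfering with them.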

Member Function Documentation

◆ create()

static Ptr< DTrees > cv::ml::DTrees::create ( )
static
Python:
cv.ml.DTrees.create() -> retval
cv.ml.DTrees_create() -> retval

Creates the empty model.

The static method creates an empty decision tree. It should then be trained using the train method (see StatModel::train). Alternatively, you can load the model from a file using Algorithm::load<DTrees>(filename).

◆ getCVFolds()

virtual int cv::ml::DTrees::getCVFolds ( ) const
pure virtual
Python:
cv.ml.DTrees.getCVFolds() -> retval

If CVFolds > 1, the algorithm prunes the built decision tree using a K-fold cross-validation procedure, where K is equal to CVFolds. Default value is 10.

See also
setCVFolds

◆ getMaxCategories()

virtual int cv::ml::DTrees::getMaxCategories ( ) const
pure virtual
Python:
cv.ml.DTrees.getMaxCategories() -> retval

Cluster possible values of a categorical variable into K<=maxCategories clusters to find a suboptimal split. If a discrete variable on which the training procedure tries to make a split takes more than maxCategories values, finding the exact best subset may take a very long time because the algorithm is exponential. Instead, many decision tree engines (including this implementation) look for a suboptimal split in this case by clustering all the samples into maxCategories clusters, that is, some categories are merged together. The clustering is applied only in classification problems with more than two classes, for categorical variables with more than maxCategories possible values. For regression and 2-class classification the optimal split can be found efficiently without clustering, so the parameter is not used in these cases. Default value is 10.

See also
setMaxCategories
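
To see why an exact search is exponential, consider how many distinct ways N categories can be divided into two non-empty groups: 2^(N-1) - 1. The sketch below computes that count. It assumes an exhaustive search over unordered two-way partitions and is only an illustration of the growth rate, not a description of how OpenCV enumerates splits; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of ways to divide numCategories distinct
// categories into two non-empty, unordered groups, i.e. 2^(N-1) - 1.
std::uint64_t countBinarySplits(unsigned numCategories) {
    assert(numCategories >= 1 && numCategories <= 64);
    return (std::uint64_t{1} << (numCategories - 1)) - 1;
}
```

With 10 categories there are 511 candidate partitions, and each additional category roughly doubles the count, which is why merging categories into at most maxCategories clusters keeps the search tractable.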

◆ getMaxDepth()

virtual int cv::ml::DTrees::getMaxDepth ( ) const
pure virtual
Python:
cv.ml.DTrees.getMaxDepth() -> retval

The maximum possible depth of the tree. That is, the training algorithm attempts to split a node while its depth is less than maxDepth. The root node has zero depth. The actual depth may be smaller if other termination criteria are met (see the outline of the training procedure), and/or if the tree is pruned. Default value is INT_MAX.

See also
setMaxDepth

◆ getMinSampleCount()

virtual int cv::ml::DTrees::getMinSampleCount ( ) const
pure virtual
Python:
cv.ml.DTrees.getMinSampleCount() -> retval

If the number of samples in a node is less than this parameter then the node will not be split.

Default value is 10.

See also
setMinSampleCount
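
Taken together, maxDepth and minSampleCount act as a gate that a node must pass before the trainer even looks for a split. A minimal sketch of that gate, using only the two rules stated on this page, might look as follows. The struct and function names are hypothetical, and the real training procedure applies further termination criteria that are not modelled here.

```cpp
#include <cassert>

// Hypothetical container for the two stopping parameters.
struct StopParams {
    int maxDepth;        // a node is split only while depth < maxDepth
    int minSampleCount;  // a node with fewer samples is not split
};

// Returns true if neither parameter forbids trying to split the node.
// The root node has depth 0.
bool mayTrySplit(int nodeDepth, int nodeSampleCount, const StopParams& p) {
    assert(nodeDepth >= 0 && nodeSampleCount >= 0);
    return nodeDepth < p.maxDepth && nodeSampleCount >= p.minSampleCount;
}
```

For example, with maxDepth set to 1 only the root can be split, which yields a tree with at most two leaves.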

◆ getNodes()

virtual const std::vector< Node > & cv::ml::DTrees::getNodes ( ) const
pure virtual

Returns all the nodes.

All the node indices are indices in the returned vector.

◆ getPriors()

virtual cv::Mat cv::ml::DTrees::getPriors ( ) const
pure virtual
Python:
cv.ml.DTrees.getPriors() -> retval

The array of a priori class probabilities, sorted by the class label value.

The parameter can be used to tune the decision tree preferences toward a certain class. For example, if you want to detect some rare anomaly occurrence, the training base will likely contain many more normal cases than anomalies, so a very good classification performance will be achieved just by considering every case as normal. To avoid this, priors can be specified in which the anomaly probability is artificially increased (up to 0.5 or even greater), so that the weight of misclassified anomalies becomes much larger and the tree is adjusted accordingly.

You can also think of this parameter as weights of prediction categories, which determine the relative weights that you give to misclassification. That is, if the weight of the first category is 1 and the weight of the second category is 10, then each mistake in predicting the second category is equivalent to making 10 mistakes in predicting the first category. Default value is an empty Mat.

See also
setPriors
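
The "weights of prediction categories" reading can be made concrete with a weighted error count. The sketch below is standalone and does not reproduce how OpenCV uses priors internally. It assumes that the weight applied to a mistake is the weight of the sample's true class, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: total misclassification cost where every mistake on a
// sample whose true class is c costs classWeights[c] (an assumption of this
// sketch). Labels are expected to be 0-based class indices.
double weightedErrorCost(const std::vector<int>& trueLabels,
                         const std::vector<int>& predictedLabels,
                         const std::vector<double>& classWeights) {
    assert(trueLabels.size() == predictedLabels.size());
    double cost = 0.0;
    for (std::size_t i = 0; i < trueLabels.size(); ++i) {
        const int c = trueLabels[i];
        assert(c >= 0 && static_cast<std::size_t>(c) < classWeights.size());
        if (predictedLabels[i] != c) cost += classWeights[c];
    }
    return cost;
}
```

With weights 1 and 10, one missed sample of the second class costs as much as ten misclassified samples of the first class, matching the equivalence described above.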

◆ getRegressionAccuracy()

virtual float cv::ml::DTrees::getRegressionAccuracy ( ) const
pure virtual
Python:
cv.ml.DTrees.getRegressionAccuracy() -> retval

Termination criterion for regression trees. If all absolute differences between the estimated value in a node and the values of the training samples in this node are less than this parameter, the node will not be split further. Default value is 0.01f.

See also
setRegressionAccuracy
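
The criterion above can be written as a simple check over the responses that fall into a node. This sketch assumes that the node's estimated value is the mean of those responses, which this page does not state; the function name is hypothetical and the code is independent of OpenCV.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical helper: returns true if every response in the node lies within
// regressionAccuracy of the node estimate, so the node need not be split.
// Assumption: the node estimate is the mean of the responses.
bool isAccurateEnough(const std::vector<double>& nodeResponses,
                      double regressionAccuracy) {
    assert(!nodeResponses.empty());
    const double estimate =
        std::accumulate(nodeResponses.begin(), nodeResponses.end(), 0.0) /
        static_cast<double>(nodeResponses.size());
    for (double y : nodeResponses) {
        if (std::fabs(estimate - y) >= regressionAccuracy) return false;
    }
    return true;
}
```

A larger value stops splitting earlier and gives a coarser tree; a smaller value lets the tree keep refining nodes whose responses still differ noticeably.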

◆ getRoots()

virtual const std::vector< int > & cv::ml::DTrees::getRoots ( ) const
pure virtual

Returns indices of root nodes.

◆ getSplits()

virtual const std::vector< Split > & cv::ml::DTrees::getSplits ( ) const
pure virtual

Returns all the splits.

All the split indices are indices in the returned vector.

◆ getSubsets()

virtual const std::vector< int > & cv::ml::DTrees::getSubsets ( ) const
pure virtual

Returns all the bitsets for categorical splits.

Split::subsetOfs is an offset in the returned vector.
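
As an illustration of how such a bitset might be queried, the sketch below tests whether a category belongs to the subset of a categorical split. The layout is an assumption that this page does not confirm: 32 categories per int, with category c stored at bit (c % 32) of the word at index subsetOfs + c / 32. The function name is hypothetical, and the page does not say which branch a set bit selects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: tests the bit for `category` in the bitset that starts
// at offset `subsetOfs` inside `subsets`.
// Assumed layout: 32 categories per int, least significant bit first.
bool categoryInSubset(const std::vector<int>& subsets, int subsetOfs,
                      int category) {
    assert(subsetOfs >= 0 && category >= 0);
    const std::size_t word = static_cast<std::size_t>(subsetOfs) +
                             static_cast<std::size_t>(category >> 5);
    assert(word < subsets.size());
    // Cast to unsigned so that shifting a word with bit 31 set is well defined.
    return ((static_cast<unsigned>(subsets[word]) >> (category & 31)) & 1u) != 0u;
}
```

Storing every split's bitset in one shared vector and referring to it by offset keeps the Split records small and of fixed size.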

◆ getTruncatePrunedTree()

virtual bool cv::ml::DTrees::getTruncatePrunedTree ( ) const
pure virtual
Python:
cv.ml.DTrees.getTruncatePrunedTree() -> retval

If true then pruned branches are physically removed from the tree. Otherwise they are retained and it is possible to get results from the original unpruned (or pruned less aggressively) tree. Default value is true.

See also
setTruncatePrunedTree

◆ getUse1SERule()

virtual bool cv::ml::DTrees::getUse1SERule ( ) const
pure virtual
Python:
cv.ml.DTrees.getUse1SERule() -> retval

If true, pruning will be harsher. This makes the tree more compact and more resistant to noise in the training data, but slightly less accurate. Default value is true.

See also
setUse1SERule
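
This page does not spell out the rule itself. The parameter name suggests the textbook one-standard-error rule from CART-style pruning: instead of the pruned tree with the lowest cross-validated error, pick the smallest tree whose error is within one standard error of that minimum. The sketch below shows that textbook rule under this assumption; the struct, its fields, and the function name are hypothetical and are not OpenCV's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one pruning level of a tree.
struct PrunedCandidate {
    int leafCount;    // size of the tree at this pruning level
    double cvError;   // cross-validated error estimate
    double stdError;  // standard error of that estimate
};

// Returns the index of the selected candidate.
// Without the rule: the candidate with the minimum cvError.
// With the rule: the smallest tree whose cvError does not exceed
// (minimum cvError + its stdError).
std::size_t chooseCandidate(const std::vector<PrunedCandidate>& candidates,
                            bool use1SERule) {
    assert(!candidates.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < candidates.size(); ++i) {
        if (candidates[i].cvError < candidates[best].cvError) best = i;
    }
    if (!use1SERule) return best;

    const double threshold = candidates[best].cvError + candidates[best].stdError;
    std::size_t chosen = best;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        if (candidates[i].cvError <= threshold &&
            candidates[i].leafCount < candidates[chosen].leafCount) {
            chosen = i;
        }
    }
    return chosen;
}
```

Under this reading, enabling the rule trades a small increase in estimated error for a noticeably smaller tree, which is consistent with the "more compact but slightly less accurate" behaviour described above.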

◆ getUseSurrogates()

virtual bool cv::ml::DTrees::getUseSurrogates ( ) const
pure virtual
Python:
cv.ml.DTrees.getUseSurrogates() -> retval

If true, surrogate splits will be built. These splits make it possible to work with missing data and to compute variable importance correctly. Default value is false.

Note
Currently this is not implemented.
See also
setUseSurrogates

◆ load()

static Ptr< DTrees > cv::ml::DTrees::load ( const String & filepath,
const String & nodeName = String() )
static
Python:
cv.ml.DTrees.load(filepath[, nodeName]) -> retval
cv.ml.DTrees_load(filepath[, nodeName]) -> retval

Loads and creates a serialized DTrees from a file.

Use DTrees::save to serialize and store a DTrees model to disk. Load the model from this file again by calling this function with the path to the file. Optionally, specify the node in the file that contains the classifier.

Parameters
filepath  path to the serialized DTrees model
nodeName  name of the node containing the classifier

◆ setCVFolds()

virtual void cv::ml::DTrees::setCVFolds ( int val)
pure virtual
Python:
cv.ml.DTrees.setCVFolds(val) -> None

See also
getCVFolds

◆ setMaxCategories()

virtual void cv::ml::DTrees::setMaxCategories ( int val)
pure virtual
Python:
cv.ml.DTrees.setMaxCategories(val) -> None

◆ setMaxDepth()

virtual void cv::ml::DTrees::setMaxDepth ( int val)
pure virtual
Python:
cv.ml.DTrees.setMaxDepth(val) -> None

See also
getMaxDepth

◆ setMinSampleCount()

virtual void cv::ml::DTrees::setMinSampleCount ( int val)
pure virtual
Python:
cv.ml.DTrees.setMinSampleCount(val) -> None

◆ setPriors()

virtual void cv::ml::DTrees::setPriors ( const cv::Mat & val)
pure virtual
Python:
cv.ml.DTrees.setPriors(val) -> None

The array of a priori class probabilities, sorted by the class label value.

See also
getPriors

◆ setRegressionAccuracy()

virtual void cv::ml::DTrees::setRegressionAccuracy ( float val)
pure virtual
Python:
cv.ml.DTrees.setRegressionAccuracy(val) -> None

◆ setTruncatePrunedTree()

virtual void cv::ml::DTrees::setTruncatePrunedTree ( bool val)
pure virtual
Python:
cv.ml.DTrees.setTruncatePrunedTree(val) -> None

◆ setUse1SERule()

virtual void cv::ml::DTrees::setUse1SERule ( bool val)
pure virtual
Python:
cv.ml.DTrees.setUse1SERule(val) -> None

See also
getUse1SERule

◆ setUseSurrogates()

virtual void cv::ml::DTrees::setUseSurrogates ( bool val)
pure virtual
Python:
cv.ml.DTrees.setUseSurrogates(val) -> None
