The OCRTesseract class provides an interface to the tesseract-ocr API (v3.02.02) in C++.
#include <opencv2/text/ocr.hpp>
String | run (InputArray image, InputArray mask, int min_confidence, int component_level=0) |
String | run (InputArray image, int min_confidence, int component_level=0) |
virtual void | run (Mat &image, Mat &mask, std::string &output_text, std::vector< Rect > *component_rects=NULL, std::vector< std::string > *component_texts=NULL, std::vector< float > *component_confidences=NULL, int component_level=0) CV_OVERRIDE |
virtual void | run (Mat &image, std::string &output_text, std::vector< Rect > *component_rects=NULL, std::vector< std::string > *component_texts=NULL, std::vector< float > *component_confidences=NULL, int component_level=0) CV_OVERRIDE |
 | Recognize text using the tesseract-ocr API. |
virtual void | setWhiteList (const String &char_whitelist)=0 |
virtual | ~BaseOCR () |
Note that the OCRTesseract class is compiled only when tesseract-ocr is correctly installed.
◆ create()
static Ptr< OCRTesseract > cv::text::OCRTesseract::create (const char *datapath = NULL, const char *language = NULL, const char *char_whitelist = NULL, int oem = OEM_DEFAULT, int psmode = PSM_AUTO)
Python:
cv.text.OCRTesseract.create([, datapath[, language[, char_whitelist[, oem[, psmode]]]]]) -> retval
cv.text.OCRTesseract_create([, datapath[, language[, char_whitelist[, oem[, psmode]]]]]) -> retval
Creates an instance of the OCRTesseract class. Initializes Tesseract.
- Parameters
-
datapath | Path of the parent directory of tessdata, ending with "/", or NULL to use the system's default directory. |
language | An ISO 639-3 language code; NULL defaults to "eng". |
char_whitelist | The list of characters used for recognition. NULL defaults to "" (all characters are used for recognition). |
oem | tesseract-ocr offers different OCR Engine Modes (OEM); by default tesseract::OEM_DEFAULT is used. See the tesseract-ocr API documentation for other possible values. |
psmode | tesseract-ocr offers different Page Segmentation Modes (PSM); by default tesseract::PSM_AUTO (fully automatic layout analysis) is used. See the tesseract-ocr API documentation for other possible values. |
- Note
- The char_whitelist default is changed after OpenCV 4.7.0/3.19.0 from "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" to "".
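The two constraints above (a datapath that ends with "/", and the changed whitelist default) can be handled before calling create(). The following is a minimal sketch using only the standard library; `withTrailingSlash` and `legacyWhitelist` are hypothetical helper names, not part of OpenCV, and the whitelist string is copied from the note above.

```cpp
#include <string>

// Hypothetical helper (not part of OpenCV): make sure the tessdata parent
// directory ends with "/", as the datapath parameter expects.
// An empty string is returned unchanged so the caller can pass NULL instead.
inline std::string withTrailingSlash(std::string dir) {
    if (!dir.empty() && dir.back() != '/') {
        dir.push_back('/');
    }
    return dir;
}

// Hypothetical helper: the character list that the note above gives as the
// former default for char_whitelist (digits, lowercase, uppercase).
inline std::string legacyWhitelist() {
    return "0123456789"
           "abcdefghijklmnopqrstuvwxyz"
           "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
}
```

Assuming a `std::string dir` that holds the tessdata parent directory, the results could then be passed as `OCRTesseract::create(withTrailingSlash(dir).c_str(), "eng", legacyWhitelist().c_str())`, which should restrict recognition to the older default character set.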
◆ run() [1/4]
String cv::text::OCRTesseract::run (InputArray image, InputArray mask, int min_confidence, int component_level = 0)
Python:
cv.text.OCRTesseract.run(image, min_confidence[, component_level]) -> retval
cv.text.OCRTesseract.run(image, mask, min_confidence[, component_level]) -> retval
◆ run() [2/4]
String cv::text::OCRTesseract::run (InputArray image, int min_confidence, int component_level = 0)
Python:
cv.text.OCRTesseract.run(image, min_confidence[, component_level]) -> retval
cv.text.OCRTesseract.run(image, mask, min_confidence[, component_level]) -> retval
◆ run() [3/4]
virtual void cv::text::OCRTesseract::run (Mat &image, Mat &mask, std::string &output_text, std::vector< Rect > *component_rects = NULL, std::vector< std::string > *component_texts = NULL, std::vector< float > *component_confidences = NULL, int component_level = 0)
◆ run() [4/4]
virtual void cv::text::OCRTesseract::run (Mat &image, std::string &output_text, std::vector< Rect > *component_rects = NULL, std::vector< std::string > *component_texts = NULL, std::vector< float > *component_confidences = NULL, int component_level = 0)
Recognize text using the tesseract-ocr API.
Takes an image as input and returns the recognized text in the output_text parameter. Optionally, it also provides the Rects of the individual text elements found (e.g. words), and the list of those text elements with their confidence values.
- Parameters
-
image | Input image, CV_8UC1 or CV_8UC3. |
output_text | Output text of tesseract-ocr. |
component_rects | If provided, the method outputs a list of Rects for the individual text elements found (e.g. words or text lines). |
component_texts | If provided, the method outputs a list of text strings for the individual text elements found (e.g. words or text lines). |
component_confidences | If provided, the method outputs a list of confidence values for the recognition of the individual text elements found (e.g. words or text lines). |
component_level | Either OCR_LEVEL_WORD (the default) or OCR_LEVEL_TEXTLINE. |
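Once run() has filled component_texts and component_confidences, a caller may want to drop low-confidence elements. The sketch below shows one way to do that with the standard library only. `keepConfident` is a hypothetical helper, not part of OpenCV, and it assumes the two vectors are index-aligned (element i of one describes element i of the other), which the parameter descriptions suggest but do not state outright.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of OpenCV): keep only the text elements whose
// confidence is at least min_confidence.
// Assumption: texts and confidences are parallel vectors of equal length.
inline std::vector<std::string> keepConfident(
        const std::vector<std::string>& texts,
        const std::vector<float>& confidences,
        float min_confidence) {
    assert(texts.size() == confidences.size());
    std::vector<std::string> kept;
    for (std::size_t i = 0; i < texts.size(); ++i) {
        if (confidences[i] >= min_confidence) {
            kept.push_back(texts[i]);
        }
    }
    return kept;
}
```

The same index could be used to select the matching entries of component_rects if the bounding boxes are needed as well. The threshold value is left to the caller, since the confidence scale is not specified here.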
Implements cv::text::BaseOCR.
◆ setWhiteList()
virtual void cv::text::OCRTesseract::setWhiteList (const String &char_whitelist) = 0
Python:
cv.text.OCRTesseract.setWhiteList(char_whitelist) -> None