Package org.opencv.text
Class OCRTesseract
- java.lang.Object
  - org.opencv.text.BaseOCR
    - org.opencv.text.OCRTesseract
public class OCRTesseract extends BaseOCR
The OCRTesseract class provides an interface to the tesseract-ocr API (v3.02.02) in C++. Note that it is compiled only when tesseract-ocr is correctly installed.
Note:
- (C++) An example of OCRTesseract recognition combined with scene text detection can be found at the end_to_end_recognition demo: <https://github.com/opencv/opencv_contrib/blob/master/modules/text/samples/end_to_end_recognition.cpp>
- (C++) Another example of OCRTesseract recognition combined with scene text detection can be found at the webcam_demo: <https://github.com/opencv/opencv_contrib/blob/master/modules/text/samples/webcam_demo.cpp>
-
Constructor Summary
| Modifier | Constructor | Description |
| --- | --- | --- |
| protected | OCRTesseract(long addr) | |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| static OCRTesseract | __fromPtr__(long addr) | |
| static OCRTesseract | create() | Creates an instance of the OCRTesseract class. |
| static OCRTesseract | create(java.lang.String datapath) | Creates an instance of the OCRTesseract class. |
| static OCRTesseract | create(java.lang.String datapath, java.lang.String language) | Creates an instance of the OCRTesseract class. |
| static OCRTesseract | create(java.lang.String datapath, java.lang.String language, java.lang.String char_whitelist) | Creates an instance of the OCRTesseract class. |
| static OCRTesseract | create(java.lang.String datapath, java.lang.String language, java.lang.String char_whitelist, int oem) | Creates an instance of the OCRTesseract class. |
| static OCRTesseract | create(java.lang.String datapath, java.lang.String language, java.lang.String char_whitelist, int oem, int psmode) | Creates an instance of the OCRTesseract class. |
| protected void | finalize() | |
| java.lang.String | run(Mat image, int min_confidence) | Recognize text using the tesseract-ocr API. |
| java.lang.String | run(Mat image, int min_confidence, int component_level) | Recognize text using the tesseract-ocr API. |
| java.lang.String | run(Mat image, Mat mask, int min_confidence) | |
| java.lang.String | run(Mat image, Mat mask, int min_confidence, int component_level) | |
| void | setWhiteList(java.lang.String char_whitelist) | |
-
Methods inherited from class org.opencv.text.BaseOCR
getNativeObjAddr
-
-
-
-
Method Detail
-
__fromPtr__
public static OCRTesseract __fromPtr__(long addr)
-
create
public static OCRTesseract create(java.lang.String datapath, java.lang.String language, java.lang.String char_whitelist, int oem, int psmode)
Creates an instance of the OCRTesseract class. Initializes Tesseract.
Parameters:
- datapath - the name of the parent directory of tessdata, ending with "/", or NULL to use the system's default directory.
- language - an ISO 639-3 code; NULL defaults to "eng".
- char_whitelist - specifies the list of characters used for recognition. NULL defaults to "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".
- oem - tesseract-ocr offers different OCR Engine Modes (OEM); by default tesseract::OEM_DEFAULT is used. See the tesseract-ocr API documentation for other possible values.
- psmode - tesseract-ocr offers different Page Segmentation Modes (PSM); by default tesseract::PSM_AUTO (fully automatic layout analysis) is used. See the tesseract-ocr API documentation for other possible values.
Returns:
- automatically generated
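The datapath description asks for a directory name that ends with "/". A minimal sketch of a helper that enforces this before the value is handed to create is shown here; TessDataPath and withTrailingSlash are hypothetical names and are not part of OpenCV, and passing null or empty input through unchanged is an assumption of this sketch.

```java
final class TessDataPath {
    private TessDataPath() {}

    /**
     * Appends "/" to the given directory name when it is missing.
     * A null or empty input is returned unchanged (an assumption of this
     * sketch: such values are treated as "no directory given").
     */
    static String withTrailingSlash(String parentDir) {
        if (parentDir == null || parentDir.isEmpty()) {
            return parentDir;
        }
        return parentDir.endsWith("/") ? parentDir : parentDir + "/";
    }
}
```

The returned string could then serve as the datapath argument of any create overload that takes one; the directory itself still has to contain a tessdata folder.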
-
create
public static OCRTesseract create(java.lang.String datapath, java.lang.String language, java.lang.String char_whitelist, int oem)
Creates an instance of the OCRTesseract class. Initializes Tesseract. The omitted psmode takes its default, tesseract::PSM_AUTO (fully automatic layout analysis).
Parameters:
- datapath - the name of the parent directory of tessdata, ending with "/", or NULL to use the system's default directory.
- language - an ISO 639-3 code; NULL defaults to "eng".
- char_whitelist - specifies the list of characters used for recognition. NULL defaults to "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".
- oem - tesseract-ocr offers different OCR Engine Modes (OEM); by default tesseract::OEM_DEFAULT is used. See the tesseract-ocr API documentation for other possible values.
Returns:
- automatically generated
-
create
public static OCRTesseract create(java.lang.String datapath, java.lang.String language, java.lang.String char_whitelist)
Creates an instance of the OCRTesseract class. Initializes Tesseract. Omitted parameters take their defaults: oem is tesseract::OEM_DEFAULT and psmode is tesseract::PSM_AUTO (fully automatic layout analysis). See the tesseract-ocr API documentation for other possible values.
Parameters:
- datapath - the name of the parent directory of tessdata, ending with "/", or NULL to use the system's default directory.
- language - an ISO 639-3 code; NULL defaults to "eng".
- char_whitelist - specifies the list of characters used for recognition. NULL defaults to "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ".
Returns:
- automatically generated
-
create
public static OCRTesseract create(java.lang.String datapath, java.lang.String language)
Creates an instance of the OCRTesseract class. Initializes Tesseract. Omitted parameters take their defaults: char_whitelist is "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ", oem is tesseract::OEM_DEFAULT, and psmode is tesseract::PSM_AUTO (fully automatic layout analysis). See the tesseract-ocr API documentation for other possible values.
Parameters:
- datapath - the name of the parent directory of tessdata, ending with "/", or NULL to use the system's default directory.
- language - an ISO 639-3 code; NULL defaults to "eng".
Returns:
- automatically generated
-
create
public static OCRTesseract create(java.lang.String datapath)
Creates an instance of the OCRTesseract class. Initializes Tesseract. Omitted parameters take their defaults: language is "eng", char_whitelist is "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ", oem is tesseract::OEM_DEFAULT, and psmode is tesseract::PSM_AUTO (fully automatic layout analysis). See the tesseract-ocr API documentation for other possible values.
Parameters:
- datapath - the name of the parent directory of tessdata, ending with "/", or NULL to use the system's default directory.
Returns:
- automatically generated
-
create
public static OCRTesseract create()
Creates an instance of the OCRTesseract class. Initializes Tesseract. All parameters take their defaults: datapath is the system's default directory, language is "eng", char_whitelist is "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ", oem is tesseract::OEM_DEFAULT, and psmode is tesseract::PSM_AUTO (fully automatic layout analysis). See the tesseract-ocr API documentation for other possible values.
Returns:
- automatically generated
-
run
public java.lang.String run(Mat image, int min_confidence, int component_level)
Recognize text using the tesseract-ocr API. Takes an image as input and returns the recognized text. The underlying API can optionally also provide the Rects for individual text elements found (e.g. words) and the list of those text elements with their confidence values; this overload returns only the text.
Parameters:
- image - Input image, CV_8UC1 or CV_8UC3.
- component_level - OCR_LEVEL_WORD (by default), or OCR_LEVEL_TEXTLINE.
- min_confidence - automatically generated
Returns:
- automatically generated
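The min_confidence parameter is not described on this page. One plausible reading, stated here as an assumption rather than documented behaviour, is that it acts as a threshold on the per-element confidence values mentioned above, so that only sufficiently confident words or text lines contribute to the returned string. The sketch below illustrates that idea in plain Java; ConfidenceFilter and keepConfident are hypothetical names, and both the strict comparison and the concatenation without separators are choices of this sketch.

```java
import java.util.List;

final class ConfidenceFilter {
    private ConfidenceFilter() {}

    /**
     * Concatenates, in their original order, the text elements whose
     * confidence is strictly greater than minConfidence.
     * Assumption: texts.get(i) and confidences.get(i) describe the same element.
     */
    static String keepConfident(List<String> texts, List<Float> confidences, int minConfidence) {
        if (texts.size() != confidences.size()) {
            throw new IllegalArgumentException("texts and confidences must have the same length");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < texts.size(); i++) {
            if (confidences.get(i) > minConfidence) {
                out.append(texts.get(i));
            }
        }
        return out.toString();
    }
}
```

Under this reading, a higher min_confidence trades recall for precision: fewer elements survive, but those that do are less likely to be misreads.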
-
run
public java.lang.String run(Mat image, int min_confidence)
Recognize text using the tesseract-ocr API. Takes an image as input and returns the recognized text. The underlying API can optionally also provide the Rects for individual text elements found (e.g. words) and the list of those text elements with their confidence values; this overload returns only the text.
Parameters:
- image - Input image, CV_8UC1 or CV_8UC3.
- min_confidence - automatically generated
Returns:
- automatically generated
-
setWhiteList
public void setWhiteList(java.lang.String char_whitelist)
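setWhiteList carries no description here. Assuming it accepts the same kind of character list as the char_whitelist parameter of create, the following sketch builds such strings from character ranges instead of typing them out. Whitelists, alphanumeric and digits are hypothetical names, not part of OpenCV.

```java
final class Whitelists {
    private Whitelists() {}

    /** Appends every character from first to last, inclusive. */
    private static void appendRange(StringBuilder sb, char first, char last) {
        for (char c = first; c <= last; c++) {
            sb.append(c);
        }
    }

    /** Digits, lowercase letters, then uppercase letters: the default listed for char_whitelist. */
    static String alphanumeric() {
        StringBuilder sb = new StringBuilder();
        appendRange(sb, '0', '9');
        appendRange(sb, 'a', 'z');
        appendRange(sb, 'A', 'Z');
        return sb.toString();
    }

    /** Digits only, e.g. for numeric fields. */
    static String digits() {
        StringBuilder sb = new StringBuilder();
        appendRange(sb, '0', '9');
        return sb.toString();
    }
}
```

A string such as the one returned by digits() could be passed to setWhiteList on an existing instance to narrow recognition to numerals, if the method behaves as assumed.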
-
-