tess-two OCR not decoding correctly

I followed a tutorial to get Tesseract, specifically tess-two and eyes-two, installed as part of my Android application.

It runs, but the OCR text returned from baseApi.getUTF8Text(); is complete gibberish.

BitmapFactory.Options options = new BitmapFactory.Options();
options.inSampleSize = 4;
Bitmap bmp = BitmapFactory.decodeFile(path, options);
receipt.setImageBitmap(bmp);

try {
    // Read the EXIF orientation tag and convert it to a rotation in degrees
    ExifInterface exif = new ExifInterface(path);
    int exifOrientation = exif.getAttributeInt(ExifInterface.TAG_ORIENTATION, ExifInterface.ORIENTATION_NORMAL);
    int rotate = 0;
    switch (exifOrientation) {
        case ExifInterface.ORIENTATION_ROTATE_90: rotate = 90; break;
        case ExifInterface.ORIENTATION_ROTATE_180: rotate = 180; break;
        case ExifInterface.ORIENTATION_ROTATE_270: rotate = 270; break;
    }

    // Rotate the bitmap so the text is upright before OCR
    if (rotate != 0) {
        int w = bmp.getWidth();
        int h = bmp.getHeight();
        Matrix matrix = new Matrix();
        matrix.preRotate(rotate);
        bmp = Bitmap.createBitmap(bmp, 0, 0, w, h, matrix, false);
    }

    // Convert to ARGB_8888 before handing the bitmap to tess-two
    bmp = bmp.copy(Bitmap.Config.ARGB_8888, true);

    TessBaseAPI baseApi = new TessBaseAPI();
    baseApi.init(DATA_PATH, "eng");
    baseApi.setImage(bmp);
    String OCRText = baseApi.getUTF8Text();
    baseApi.end();

    Log.i("OCR Text", "rotate " + rotate);
    Log.i("OCR Text", "OCR ");
    Log.i("OCR Text", OCRText);
    Log.i("OCR Text", "=======================================================================================");
} catch (IOException e) {
    Log.e("OCR Text", "Failed to read EXIF data", e);
}

Taking a picture that contains OCR characters returns:

05-14 11:01:59.131: I/OCR Text(18199): rotate 90
05-14 11:01:59.131: I/OCR Text(18199): OCR
05-14 11:01:59.131: I/OCR Text(18199): 4— ‘ ‘
05-14 11:01:59.131: I/OCR Text(18199): \Dxfi ‘
05-14 11:01:59.131: I/OCR Text(18199): I W man"! no Accounv
05-14 11:01:59.131: I/OCR Text(18199): 1’
05-14 11:01:59.131: I/OCR Text(18199): my... «unblm m. mm.
05-14 11:01:59.131: I/OCR Text(18199): :~A
05-14 11:01:59.131: I/OCR Text(18199): «Ln.
05-14 11:01:59.131: I/OCR Text(18199): ‘ “w “IN. N I “H‘M‘
05-14 11:01:59.131: I/OCR Text(18199): mmnwnmw- .; k. '
05-14 11:01:59.131: I/OCR Text(18199): Wilt-run”. uni” nl
05-14 11:01:59.131: I/OCR Text(18199): mam. I
05-14 11:01:59.131: I/OCR Text(18199): =======================================================================================

Any advice on how to clean up and correct the OCR recognition? The device I am using is a Samsung Galaxy 7".

Answer:

You can use something like:

OCRText = OCRText.replaceAll("[^a-zA-Z0-9]+", " "); 

OCRText = OCRText.trim();

It is based on a Tesseract implementation I found here: SimpleAndroidOCRActivity.java
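To see what the replacement above does to a line of the logged output, here is a minimal sketch as a plain Java helper that runs off-device. The class name OcrTextCleaner and the method clean are hypothetical names chosen for illustration; only the two String calls come from the answer. Note that this only strips non-alphanumeric characters from the recognized text. It does not make the recognition itself more accurate.

```java
// Hypothetical helper wrapping the two calls from the answer above.
final class OcrTextCleaner {

    private OcrTextCleaner() {
        // Utility class, not meant to be instantiated.
    }

    // Replaces every run of characters other than ASCII letters and digits
    // with a single space, then trims leading and trailing whitespace.
    static String clean(String ocrText) {
        String cleaned = ocrText.replaceAll("[^a-zA-Z0-9]+", " ");
        return cleaned.trim();
    }

    public static void main(String[] args) {
        // One line taken from the log output in the question.
        String raw = "I W man\"! no Accounv";
        System.out.println(clean(raw)); // prints: I W man no Accounv
    }
}
```

In the Android code from the question, the same call would go right after baseApi.getUTF8Text(), for example OCRText = OcrTextCleaner.clean(OCRText);.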

