Tesseract OCR: read a specific location in a PDF

KhalidAnsari · January 8, 2019

Hi,

I need to read a specific location in a .pdf that has a digital signature on it. I have gone through the forum posts related to this and have seen the screen capture and Tesseract examples, but I don't know how to get the exact location.

Please help me out with this.

Thanks,

Khalid Ansari

FrancescoDiMuro · January 8, 2019

@KhalidAnsari
Digging a little bit on the Internet, you can read data from a PDF file without using Tesseract OCR.
You can use VBA directly (and translate it to AutoIt).

KhalidAnsari · January 8, 2019

Hi @FrancescoDiMuro

Thanks for the reply. I have converted this to .NET, but I need to read a digital signature that is in image format, so I thought I should use Tesseract and screen capture.

Any other help will be appreciated.

Thanks,

Khalid

FrancescoDiMuro · January 8, 2019

@KhalidAnsari
Then you can take a look at this UDF (or something similar available on the Forum).

KhalidAnsari · January 9, 2019

Hi @FrancescoDiMuro

Thanks for the reply. I am able to find the position with the screen capture UDF, but the output I am getting contains junk characters.

Actual text = "Login For Chat"

Output = "Ln-gin fur Chat |"

Am I missing any character recognition for English?

These are the details of my tessdata files:

image.png.531e0809c01f84be2144e6bb2a182ade.png

Thanks,

Khalid

Edited January 9, 2019 by KhalidAnsari

FrancescoDiMuro · January 9, 2019

@KhalidAnsari
How do you call the Tesseract OCR from your script?

KhalidAnsari · January 9, 2019

Hi

Thanks for quick reply.

I am calling it this way

Local $TESS_EXE="C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"

FrancescoDiMuro · January 9, 2019

@KhalidAnsari
Could you please post the entire code you are using?

KhalidAnsari · January 9, 2019

@FrancescoDiMuro

Attached is the sample file I am using: OcrScreenCapture.au3

Thanks,

Khalid

FrancescoDiMuro · January 9, 2019

@KhalidAnsari
As mentioned in the Tesseract OCR Wiki, try to use the parameter -l:

$TESS_PARAMS= $TIF_FILENAME & " output -l eng"
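
For context, here is a minimal sketch of how a parameter string like this might be handed to Tesseract from AutoIt. It is not the attached OcrScreenCapture.au3: the image path, the output base name and the use of RunWait are assumptions for illustration. Tesseract's command line takes the input image first, then the output base name (it appends .txt itself), then options such as -l eng.

```autoit
; Sketch only: $TIF_FILENAME and $OUT_BASE are hypothetical paths, not from the attached script.
Local $TESS_EXE = "C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
Local $TIF_FILENAME = @ScriptDir & "\capture.tif" ; assumed: image saved by the screen capture step
Local $OUT_BASE = @ScriptDir & "\output" ; Tesseract writes output.txt next to the script

; Quote each path so spaces (as in "Program Files (x86)") do not split the arguments.
Local $TESS_PARAMS = '"' & $TIF_FILENAME & '" "' & $OUT_BASE & '" -l eng'

; Run Tesseract hidden and wait for it to finish before reading the result.
Local $iExit = RunWait('"' & $TESS_EXE & '" ' & $TESS_PARAMS, @ScriptDir, @SW_HIDE)
If @error Or $iExit <> 0 Then
    ConsoleWrite("Tesseract failed, exit code: " & $iExit & @CRLF)
Else
    ConsoleWrite(FileRead($OUT_BASE & ".txt") & @CRLF)
EndIf
```

Checking the exit code before reading output.txt helps tell a failed run (for example a missing eng.traineddata) apart from a run that succeeded but recognised the text badly.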

Edited January 9, 2019 by FrancescoDiMuro

KhalidAnsari · January 9, 2019

@FrancescoDiMuro

Thanks for quick reply.

I am making the change and will then post my output.

Thanks,

Khalid

KhalidAnsari · January 9, 2019

Hi @FrancescoDiMuro

I get the same output after changing that line to:

$TESS_PARAMS= $TIF_FILENAME & " output -l eng"

Below is my TIFF image.

thanks

Khalid

test.tif

Edited January 9, 2019 by KhalidAnsari

FrancescoDiMuro · January 9, 2019

@KhalidAnsari
Sorry to ask, but could you provide the .pdf or a download link?

KhalidAnsari · January 9, 2019

Hi @FrancescoDiMuro

Thanks for the reply. I really appreciate your effort.

This is what I am doing:

1. Read the application title name. Based on that, I will use WinWait until the application window loads.

2. Then I need to pass data from the application.

3. Then I need to open the PDF file and again read the OCR detail for a specific location in the PDF.

Points 1 and 3 are for OCR.

I am attaching a sample PDF which doesn't have a digital signature.

Thanks,

Khalid

invoice1.pdf

Edited January 9, 2019 by KhalidAnsari

KhalidAnsari · January 12, 2019

Hi

Any suggestions or help?

Thanks

FrancescoDiMuro · January 12, 2019

@KhalidAnsari

A little research on the Forum

KhalidAnsari · January 22, 2019

Hi @FrancescoDiMuro

I have searched the forum and am now able to read the PDF and TIFF files. I am having difficulty reading this type of handwritten, scanned PDF data. The output is in the attached txt file. Please review it.

image.png.0d52b65488ecd252f36c3b1a7d5c600c.png

Result.txt

Edited January 22, 2019 by KhalidAnsari

KhalidAnsari · January 24, 2019

Any help please.

Thanks,

Khalid

KhalidAnsari · February 7, 2019

Hi @FrancescoDiMuro

Any suggestion will be appreciated

Thanks,

FrancescoDiMuro · February 7, 2019

@KhalidAnsari

Tesseract OCR is free OCR software, so it has its limits.

If you want more from your OCR software, you may need to spend some money on a more reliable (and non-free) product, which could help you with this kind of recognition.
