I have this picture (attached) and this script:
$ImageToReadPath = @MyDocumentsDir & "\GDIPlus_Image2.jpg"
$ResultTextPath = @MyDocumentsDir & "\Result"
$OutPutPath = $ResultTextPath & ".txt"
$TesseractExePath = @MyDocumentsDir & "\Tesseract.exe"
ShellExecuteWait($TesseractExePath, '"' & $ImageToReadPath & '" "' & $ResultTextPath & '"', "", "", @SW_HIDE)
If @error Then
    Exit MsgBox(0, "Error", @error)
EndIf
MsgBox(0, "Result", FileRead($OutPutPath))
FileDelete($OutPutPath)
But Tesseract doesn't recognize the correct word and gives me garbage back.
This is the image (attached), and the result was "samm".
The image was a normal JPG, generated with this code:
_ScreenCapture_Capture(@MyDocumentsDir & "\GDIPlus_Image2.jpg", 712,268,853,284)
Could anybody give me a hint about what I can do better to turn this simple image into text?
Thank you very much!
Edit: I also tried capturing the screen as a BMP with a higher color depth. Nothing changed.
_ScreenCapture_SetBMPFormat(4)
_ScreenCapture_Capture(@MyDocumentsDir & "\GDIPlus_Image.bmp", 712,279,853,295)
I'm trying to get Tesseract to work using the example script here: https://www.autoitscript.com/forum/topic/174483-tesseract-simple-example/ Downloading the script and running it with the example image just gives me a blank readout. Someone else had the same problem here: https://www.autoitscript.com/forum/topic/174476-single-dll-file-for-ocr/#comment-1263034 but didn't explain how they fixed it. Has anyone else experienced this problem and found a fix?
There have been many questions about using Tesseract of late.
Here is a very basic example which works for me, along with the exact version of the standalone Tesseract executable and the English language data used.
I found it some time ago, when I thought I needed it. I do not recall where it came from.
$ImageToReadPath = @ScriptDir & "\image.bmp"
$ResultTextPath = @ScriptDir & "\Result"
$OutPutPath = $ResultTextPath & ".txt"
$TesseractExePath = @ScriptDir & "\Tesseract.exe"
ShellExecuteWait($TesseractExePath, '"' & $ImageToReadPath & '" "' & $ResultTextPath & '"', "", "", @SW_HIDE)
If @error Then
    Exit MsgBox(0, "Error", @error)
EndIf
MsgBox(0, "Result", FileRead($OutPutPath))
FileDelete($OutPutPath)
Some Answers:
The files contained in the download support only English.
From the only documentation I got with this version: original binaries and source can be found here: http://code.google.com/p/tesseract-ocr/
I do not know where to get support for other languages.
I do not know if there is a later standalone version.
I do not know why it does not read your image accurately.
It does not have a virus in it.
You can search the forums or internet to learn how to create / cut / copy / paste, or otherwise manipulate your own images.
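One thing that may be worth trying with a capture as small as the one in the question (the region 712,268 to 853,284 is only about 16 pixels tall): enlarge the image before handing it to Tesseract, since OCR engines often struggle with very small glyphs. The following is only a sketch, not a guaranteed fix. _UpscaleImage is a hypothetical helper name, the scale factor of 3 and the "_x3" output file name are assumptions to experiment with, and it assumes an AutoIt version whose GDIPlus.au3 provides _GDIPlus_ImageResize.

```autoit
#include <GDIPlus.au3>

; Hypothetical helper (not part of any UDF): saves an enlarged copy of an image.
; $iScale = 3 is an assumed starting value; try other factors.
Func _UpscaleImage($sSrcPath, $sDstPath, $iScale = 3)
    _GDIPlus_Startup()

    Local $hImage = _GDIPlus_ImageLoadFromFile($sSrcPath)
    If @error Then
        _GDIPlus_Shutdown()
        Return SetError(1, 0, False)
    EndIf

    Local $iWidth = _GDIPlus_ImageGetWidth($hImage) * $iScale
    Local $iHeight = _GDIPlus_ImageGetHeight($hImage) * $iScale

    ; Returns a new bitmap; the default interpolation mode is high-quality bicubic.
    Local $hScaled = _GDIPlus_ImageResize($hImage, $iWidth, $iHeight)
    Local $iErr = @error

    Local $bSaved = False
    If Not $iErr Then
        ; The encoder is chosen from the file extension of $sDstPath.
        $bSaved = _GDIPlus_ImageSaveToFile($hScaled, $sDstPath)
        _GDIPlus_ImageDispose($hScaled)
    EndIf

    _GDIPlus_ImageDispose($hImage)
    _GDIPlus_Shutdown()

    If Not $bSaved Then Return SetError(2, 0, False)
    Return True
EndFunc   ;==>_UpscaleImage

; Usage: enlarge the capture, then point $ImageToReadPath at the enlarged file.
Local $sSmall = @MyDocumentsDir & "\GDIPlus_Image.bmp"
Local $sLarge = @MyDocumentsDir & "\GDIPlus_Image_x3.bmp"
If _UpscaleImage($sSmall, $sLarge) Then
    MsgBox(0, "Upscale", "Saved enlarged copy to " & $sLarge)
Else
    MsgBox(0, "Upscale", "Failed, @error = " & @error)
EndIf
```

If the enlarged copy still reads badly, the same helper makes it easy to try other scale factors before looking at the image content itself.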
I've been trying this UDF from
I'm trying to get the text from a combobox in Notepad's Font dialog, but it only saves a TIFF file and does not return the array of text.
ShellExecute("notepad.exe")
WinWaitActive("Untitled - Notepad")
$hWnd = "[CLASS:Edit; INSTANCE:1]"
Send("!O")
Send("F")
WinWaitActive("Font")
_TesseractControlFind("Font", "", "[CLASS:ComboBox; INSTANCE:5]", "Western", 1, 0, "", 1, 1, 1, 5, 2, 0, 0, 0, 0, 1)