OCR Tesseract wrong coords


I am using the Tesseract UDF to try to read some letters and numbers from screenshots I take in a project.

I managed to get it working, more or less, but I am having some problems. This is my code. I have included the two functions it needs, in case you don't have the Tesseract UDF on your computer.

#include <Array.au3>
#include <File.au3>
#include <ScreenCapture.au3>
#include <ScrollBarConstants.au3>
#include <WindowsConstants.au3>

Global $last_capture
Global $tesseract_temp_path = @TempDir

$diff = GetLetters()
If @error Then
    ConsoleWrite("Error: " & @error & @CRLF)
EndIf
ConsoleWrite($diff & @CRLF)

Func GetLetters()
    $mon = _TesseractScreenCapture(0, "", 1, 2, 1, 1, 1000, 1000, 0) ;Image number = 345213123
    ConsoleWrite($mon & @CRLF)
    $mon = CleanOCR($mon)
    Return $mon
EndFunc ;==>GetLetters

; Keep only the digits 0-9; treat a capital "O" as a misread zero
Func CleanOCR($text)
    $out = ""
    For $i = 1 To StringLen($text)
        $chr = Asc(StringMid($text, $i, 1))
        If $chr >= 48 And $chr <= 57 Then $out = $out & Chr($chr)
        If $chr = 79 Then $out = $out & '0'
    Next
    Return $out
EndFunc ;==>CleanOCR

Func _TesseractScreenCapture($get_last_capture = 0, $delimiter = "", $cleanup = 1, $scale = 2, $left_indent = 0, $top_indent = 0, $right_indent = 0, $bottom_indent = 0, $show_capture = 0)

    Local $tInfo
    Dim $aArray, $final_ocr[1], $xyPos_old = -1, $capture_scale = 3
    Local $tSCROLLINFO = DllStructCreate($tagSCROLLINFO)
    DllStructSetData($tSCROLLINFO, "cbSize", DllStructGetSize($tSCROLLINFO))
    DllStructSetData($tSCROLLINFO, "fMask", $SIF_ALL)

    If $last_capture = "" Then

        $last_capture = ObjCreate("Scripting.Dictionary")
    EndIf

    ; if last capture is requested, and one exists.
    If $get_last_capture = 1 And $last_capture.item(0) <> "" Then

        Return $last_capture.item(0)
    EndIf

    $capture_filename = _TempFile($tesseract_temp_path, "~", ".tif")
    $ocr_filename = StringLeft($capture_filename, StringLen($capture_filename) - 4)
    $ocr_filename_and_ext = $ocr_filename & ".txt"

    CaptureToTIFF("", "", "", $capture_filename, $scale, $left_indent, $top_indent, $right_indent, $bottom_indent)

    ShellExecuteWait(@ProgramFilesDir & "\Tesseract-OCR\tesseract.exe", $capture_filename & " " & $ocr_filename, "", "", @SW_HIDE)

    ; If no delimiter specified, then return a string
    If StringCompare($delimiter, "") = 0 Then

        $final_ocr = FileRead($ocr_filename_and_ext)
    Else

        _FileReadToArray($ocr_filename_and_ext, $aArray)
        _ArrayDelete($aArray, 0)

        ; Append the recognised text to a final array
        _ArrayConcatenate($final_ocr, $aArray)
    EndIf

    ; If the captures are to be displayed
    If $show_capture = 1 Then

        GUICreate("Tesseract Screen Capture. Note: image displayed is not to scale", 640, 480, 0, 0, $WS_SIZEBOX + $WS_SYSMENU) ; dialog box at the top-left corner of the screen

        $Obj1 = ObjCreate("Preview.Preview.1")
        $Obj1_ctrl = GUICtrlCreateObj($Obj1, 0, 0, 640, 480)
        $Obj1.ShowFile($capture_filename, 1)

        GUISetState() ; show the preview window

        If IsArray($final_ocr) Then

            _ArrayDisplay($aArray, "Tesseract Text Capture")
        Else

            MsgBox(0, "Tesseract Text Capture", $final_ocr)
        EndIf

        GUIDelete()
    EndIf

    FileDelete($ocr_filename & ".*")

    ; Cleanup
    If IsArray($final_ocr) And $cleanup = 1 Then

        ; Cleanup the items
        For $final_ocr_num = 1 To (UBound($final_ocr) - 1)

            ; Remove erroneous characters
            $final_ocr[$final_ocr_num] = StringReplace($final_ocr[$final_ocr_num], ".", "")
            $final_ocr[$final_ocr_num] = StringReplace($final_ocr[$final_ocr_num], "'", "")
            $final_ocr[$final_ocr_num] = StringReplace($final_ocr[$final_ocr_num], ",", "")
            $final_ocr[$final_ocr_num] = StringStripWS($final_ocr[$final_ocr_num], 3)
        Next

        ; Remove duplicate and blank items
        For $each In $final_ocr

            $found_item = _ArrayFindAll($final_ocr, $each)

            ; Remove blank items
            If IsArray($found_item) Then
                If StringCompare($final_ocr[$found_item[0]], "") = 0 Then

                    _ArrayDelete($final_ocr, $found_item[0])
                EndIf
            EndIf

            ; Remove duplicate items
            For $found_item_num = 2 To UBound($found_item)

                _ArrayDelete($final_ocr, $found_item[$found_item_num - 1])
            Next
        Next
    EndIf

    ; Store a copy of the capture
    If $last_capture.item(0) = "" Then

        $last_capture.item(0) = $final_ocr
    EndIf

    Return $final_ocr
EndFunc ;==>_TesseractScreenCapture

Func CaptureToTIFF($win_title = "", $win_text = "", $ctrl_id = "", $sOutImage = "", $scale = 1, $left_indent = 0, $top_indent = 0, $right_indent = 0, $bottom_indent = 0)
    Local $hWnd, $hwnd2, $hDC, $hBMP, $hImage1, $hGraphic, $CLSID, $tParams, $pParams, $tData, $i = 0, $hImage2, $pos[4], $tar_leftx, $tar_lefty, $tar_rightx, $tar_righty, $winsize[4]
    Local $Ext = StringUpper(StringMid($sOutImage, StringInStr($sOutImage, ".", 0, -1) + 1))
    Local $giTIFColorDepth = 24
    ; Not declared in the posted code; "no compression" is assumed here
    Local $giTIFCompression = $GDIP_EVTCOMPRESSIONNONE
    ; If capturing a control
    If StringCompare($ctrl_id, "") <> 0 Then
        $hwnd2 = ControlGetHandle($win_title, $win_text, $ctrl_id)
        $pos = ControlGetPos($win_title, $win_text, $ctrl_id)
    ; If capturing a window
    ElseIf StringCompare($win_title, "") <> 0 Then
        $hwnd2 = WinGetHandle($win_title, $win_text)
        $pos = WinGetPos($win_title, $win_text)
    ; If capturing the desktop
    Else
        $hwnd2 = ""
        $pos[0] = 0
        $pos[1] = 0
        $pos[2] = @DesktopWidth
        $pos[3] = @DesktopHeight
    EndIf

    ; Capture an image of the window / control
    If IsHWnd($hwnd2) Then
        WinActivate($win_title, $win_text)
        ; Added: compute the capture rectangle from the indent arguments
        $winsize = WinGetPos($win_title, $win_text)
        $tar_leftx = $left_indent
        $tar_lefty = $top_indent
        $tar_rightx = $winsize[2] - $right_indent
        $tar_righty = $winsize[3] - $bottom_indent
        $hBitmap2 = _ScreenCapture_CaptureWnd("", $hwnd2, $tar_leftx, $tar_lefty, $tar_rightx, $tar_righty, False)
    Else
        ; Added: compute the capture rectangle from the indent arguments
        $winsize = $pos
        $tar_leftx = $left_indent
        $tar_lefty = $top_indent
        $tar_rightx = $winsize[2] - $right_indent
        $tar_righty = $winsize[3] - $bottom_indent
        $hBitmap2 = _ScreenCapture_Capture("", $tar_leftx, $tar_lefty, $tar_rightx, $tar_righty, False)
    EndIf
    ;old version of if statement - correction to function
    ;if IsHWnd($hwnd2) Then
    ; WinActivate($win_title, $win_text)
    ; $hBitmap2 = _ScreenCapture_CaptureWnd("", $hwnd2, 0, 0, -1, -1, False)
    ;Else
    ; $hBitmap2 = _ScreenCapture_Capture("", 0, 0, -1, -1, False)
    ;EndIf

    ; GDI+ must be started before any _GDIPlus_* call
    _GDIPlus_Startup()

    ; Convert the image to a bitmap
    $hImage2 = _GDIPlus_BitmapCreateFromHBITMAP($hBitmap2)
    $hWnd = _WinAPI_GetDesktopWindow()
    $hDC = _WinAPI_GetDC($hWnd)
    ;Old version of this function call
    ;$hBMP = _WinAPI_CreateCompatibleBitmap($hDC, ($pos[2] * $scale) - ($right_indent * $scale), ($pos[3] * $scale) - ($bottom_indent * $scale))
    $hBMP = _WinAPI_CreateCompatibleBitmap($hDC, ($tar_rightx - $tar_leftx) * $scale, ($tar_righty - $tar_lefty) * $scale)
    _WinAPI_ReleaseDC($hWnd, $hDC)
    $hImage1 = _GDIPlus_BitmapCreateFromHBITMAP($hBMP)
    ;Implementing UEZ's suggestion
    $hImage1 = DllCall($ghGDIPDll, "uint", "GdipCreateBitmapFromScan0", _
            "int", ($pos[2] * $scale) - ($right_indent * $scale), _
            "int", ($pos[3] * $scale) - ($bottom_indent * $scale), _
            "int", 0, "int", $GDIP_PXF24RGB, "ptr", 0, "int*", 0)
    $hImage1 = $hImage1[6]
    $hGraphic = _GDIPlus_ImageGetGraphicsContext($hImage1)
    ;Modified from original to support corrected screen captures
    ;_GDIPLus_GraphicsDrawImageRect($hGraphic, $hImage2, 0 - ($left_indent * $scale), 0 - ($top_indent * $scale), ($pos[2] * $scale) + $left_indent, ($pos[3] * $scale) + $top_indent)
    _GDIPlus_GraphicsDrawImageRect($hGraphic, $hImage2, 0, 0, ($tar_rightx - $tar_leftx) * $scale, ($tar_righty - $tar_lefty) * $scale)
    $CLSID = _GDIPlus_EncodersGetCLSID($Ext)

    ; Set TIFF parameters
    $tParams = _GDIPlus_ParamInit(2)
    $tData = DllStructCreate("int ColorDepth;int Compression")
    DllStructSetData($tData, "ColorDepth", $giTIFColorDepth)
    DllStructSetData($tData, "Compression", $giTIFCompression)
    _GDIPlus_ParamAdd($tParams, $GDIP_EPGCOLORDEPTH, 1, $GDIP_EPTLONG, DllStructGetPtr($tData, "ColorDepth"))
    _GDIPlus_ParamAdd($tParams, $GDIP_EPGCOMPRESSION, 1, $GDIP_EPTLONG, DllStructGetPtr($tData, "Compression"))
    If IsDllStruct($tParams) Then $pParams = DllStructGetPtr($tParams)

    ; Save TIFF and cleanup
    _GDIPlus_ImageSaveToFileEx($hImage1, $sOutImage, $CLSID, $pParams)
    _GDIPlus_GraphicsDispose($hGraphic)
    _GDIPlus_ImageDispose($hImage1)
    _GDIPlus_ImageDispose($hImage2)
    _WinAPI_DeleteObject($hBMP)
    _WinAPI_DeleteObject($hBitmap2)
    _GDIPlus_Shutdown()
EndFunc ;==>CaptureToTIFF

The coords I am giving are from 1,1 to 1000,1000, which means I should be covering a good part of the screen. However, when I run it with SciTE fullscreen, it only returns a little text from an area of roughly 1,1 to 400,100.

Any idea why this happens?

Note: to run the script above you must have Tesseract installed on your computer, and the path must be this:

@ProgramFilesDir & "\Tesseract-OCR\tesseract.exe"

or you can change it ;)
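
A quick check along these lines can confirm the executable is where the script expects it before any OCR is attempted. This is only a sketch: $sTesseract is a name introduced here for illustration, and the path is the one used in the ShellExecuteWait call in the script.

```autoit
; Sketch: verify that tesseract.exe exists at the path the script uses.
; $sTesseract is a hypothetical variable name, not part of the Tesseract UDF.
Local $sTesseract = @ProgramFilesDir & "\Tesseract-OCR\tesseract.exe"

If FileExists($sTesseract) Then
    ConsoleWrite("Found Tesseract at: " & $sTesseract & @CRLF)
Else
    ; Adjust the path here (and in _TesseractScreenCapture) if it is installed elsewhere
    ConsoleWrite("Tesseract not found at: " & $sTesseract & @CRLF)
EndIf
```

If the check fails on a 64-bit system, it may be worth looking at which "Program Files" folder @ProgramFilesDir resolves to for the running script.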


Unless @DesktopHeight > 1000 you will be passing a negative coordinate.

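The last two 1000s are not coordinates: they are the indents from the right and bottom edges. The bottom edge of the capture is @DesktopHeight - 1000, which is probably a negative coord (or a very small one) on your screen.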
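
To make the arithmetic concrete, here is a minimal sketch that converts an absolute rectangle into the indent values CaptureToTIFF works with for a desktop capture. _RectToIndents is a hypothetical helper written for this post, not part of the Tesseract UDF, and it assumes the desktop case where the capture area is @DesktopWidth by @DesktopHeight.

```autoit
; Hypothetical helper (not part of the Tesseract UDF): turns an absolute
; rectangle (left, top, right, bottom) into left/top/right/bottom indents.
; Assumes a desktop capture, where the full area is @DesktopWidth x @DesktopHeight.
Func _RectToIndents($iLeft, $iTop, $iRight, $iBottom, $iWidth = @DesktopWidth, $iHeight = @DesktopHeight)
    Local $aIndents[4]
    $aIndents[0] = $iLeft              ; left indent, measured from the left edge
    $aIndents[1] = $iTop               ; top indent, measured from the top edge
    $aIndents[2] = $iWidth - $iRight   ; right indent, measured from the right edge
    $aIndents[3] = $iHeight - $iBottom ; bottom indent, measured from the bottom edge
    Return $aIndents
EndFunc   ;==>_RectToIndents

; The region intended in the question: (1,1) to (1000,1000)
Local $aIndents = _RectToIndents(1, 1, 1000, 1000)
ConsoleWrite("right indent: " & $aIndents[2] & ", bottom indent: " & $aIndents[3] & @CRLF)

; These values would then replace the two 1000s in the original call, e.g.:
; _TesseractScreenCapture(0, "", 1, 2, $aIndents[0], $aIndents[1], $aIndents[2], $aIndents[3], 0)
```

Note that if the desktop is narrower or shorter than 1000 pixels, the corresponding indent comes out negative, so the requested region would need to be clamped to the screen first.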

Hope this helps

Good luck

My desktop height is 1050+. So if that is how it works, the smallest screen area it can check is 1000 x 1000? Too bad. And from my testing it is not very accurate...

It is free but not very accurate. A Google search suggested an average accuracy of about 90%. I wouldn't give it more than 70%, maybe even less.

And it has difficulty finding text in high-resolution screenshots...

I gave up on this anyway.

