
Tesseract (Screen OCR) UDF



This UDF provides text capturing support for applications and controls using Tesseract - an OCR engine currently developed by Google. Tesseract was originally developed as proprietary software at H


  • 7 months later...

After trying my test script...

#include <GUIConstantsEx.au3>   ; $GUI_EVENT_CLOSE
#include <WindowsConstants.au3> ; $WS_GROUP
#include "Tesseract.au3"        ; the UDF from this thread

Local $pid = Run('notepad.exe')
WinWait('[Class:Notepad]')
WinActivate('[Class:Notepad]')
WinWaitActive('[Class:Notepad]')

$main = GUICreate("test", 552, 333, 248, 83)
$Button1 = GUICtrlCreateButton("Capture", 32, 261, 61, 21, $WS_GROUP)
GUISetState(@SW_SHOW)

While 1
  $msg = GUIGetMsg()

  Select
    Case $msg = $Button1
      $button1_text = _TesseractWinCapture('[Class:Notepad]', "",  "[CLASS:Edit; INSTANCE:1]")
      MsgBox(0, 0, $button1_text[0])
      ;$combobox1_edit_text = _TesseractControlCapture("ListDemo", "", "[CLASS:ComboBox; INSTANCE:1]", 1, "", 1, 1, 1, 5, 3, 2, 2, 10, 2, 1)
      ;$combobox1_text = _TesseractControlCapture("ListDemo", "", "[CLASS:ComboBox; INSTANCE:1]", 1, @CRLF, 1, 1, 1, 5, 3, 1, 1, 5, 3, 1)
      ;$listbox1_text = _TesseractControlCapture("ListDemo", "", "[CLASS:ListBox; INSTANCE:1]", 1, @CRLF, 1, 1, 1, 5, 3, 2, 2, 200, 0, 1)

    Case $msg = $GUI_EVENT_CLOSE
      MsgBox(0, "GUI Event", "You clicked CLOSE! Exiting...")
      ExitLoop
  EndSelect
WEnd

C:\Users\Marko\Desktop\poker\my_bodog_hopper\_TesseractControlCapture.au3 (22) : ==> Subscript used with non-Array variable.:

MsgBox(0, 0, $button1_text[0])

MsgBox(0, 0, $button1_text^ ERROR

; Return values for _TesseractControlCapture()
; On Success - Returns an array of text that was captured.
; On Failure - Returns an empty array.

So who is crazy here?

Here's my testing code. Did anyone manage to put together a simple script that works with this? So many people are talking here, yet there isn't one working example to look at.

Edited by marko29

OK, after more testing it seems that without a delimiter it returns just the text as a string, not an array. OP, please mention that under the return values. I know you mention it in the description of the delimiter parameter, but the return values section is the more appropriate place and would avoid confusion.

Also, the code I posted seems to work on XP but not on Win7 (32-bit).

I tried it with and without the preview on Win7 and nothing works. Maybe someone can solve this.

Edited by marko29
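
A minimal sketch of guarding against the two return types described above. It assumes the `_TesseractWinCapture()` parameter order ($win_title, $win_text, $get_last_capture, $delimiter) used by the UDF in this thread. `_OcrResultToText()` is a hypothetical helper, not part of the UDF.

```autoit
#include <Array.au3>
#include "Tesseract.au3" ; the UDF from this thread, assumed to sit next to the script

; Hypothetical helper: accept whatever the capture function returned
; (an array when a delimiter was given, a plain string otherwise)
; and always hand back a single string.
Func _OcrResultToText($vResult)
    If IsArray($vResult) Then Return _ArrayToString($vResult, @CRLF)
    Return String($vResult)
EndFunc

; No delimiter is passed here, so a string is expected back.
Local $vCapture = _TesseractWinCapture("[CLASS:Notepad]", "", 0, "")
MsgBox(0, "OCR result", _OcrResultToText($vCapture))
```

Checking with `IsArray()` before indexing avoids the "Subscript used with non-Array variable" error regardless of whether a delimiter was supplied.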

Is there a way to run the OCR on an image that already exists (by pointing it to the file path)?

Going through this UDF I found this line of code:

ShellExecuteWait(@ProgramFilesDir & "\tesseract\tesseract.exe", $capture_filename & " " & $ocr_filename)

I tried using this:

#include <ScreenCapture.au3>
_ScreenCapture_Capture(@DesktopDir & "\test images 2\WhatScreen.tif", 17, 111, 175, 125)
Sleep(1000)
ShellExecuteWait(@ProgramFilesDir & "\tesseract\tesseract.exe", @DesktopDir & "\test images 2\WhatScreen.tif" & " " & @DesktopDir & "\test images 2\WhatScreen.txt")

but no .txt file is created.

Wading through that sea of code got me confused, and I can't tell what I should replace the variables with. The help file entry for ShellExecuteWait is very confusing to me, so I was wondering if anyone could explain it better.

Edited by blackmage999
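
A hedged sketch of one way the call above might be made to work. Two likely problems are that the paths contain spaces but are not quoted, and that the tesseract command line takes an output base name and appends ".txt" itself. The install path and folder are taken from the post and are assumptions; the "test images 2" folder must already exist. Tesseract 2.x is reported later in this thread to reject compressed TIFFs, so the sketch also switches the capture to uncompressed TIFF.

```autoit
#include <ScreenCapture.au3>

; Paths are assumptions copied from the post above; adjust to your install.
Local $sTesseract = @ProgramFilesDir & "\tesseract\tesseract.exe"
Local $sImage = @DesktopDir & "\test images 2\WhatScreen.tif"
Local $sOutBase = @DesktopDir & "\test images 2\WhatScreen" ; no extension: tesseract adds ".txt"

; 1 = no compression for TIFF files written by the ScreenCapture UDF.
_ScreenCapture_SetTIFCompression(1)
_ScreenCapture_Capture($sImage, 17, 111, 175, 125)

; Both paths contain spaces, so each one is wrapped in double quotes.
Local $sParams = '"' & $sImage & '" "' & $sOutBase & '"'
ShellExecuteWait($sTesseract, $sParams, "", "open", @SW_HIDE)

If @error Then
    MsgBox(16, "OCR", "Could not start " & $sTesseract)
ElseIf FileExists($sOutBase & ".txt") Then
    MsgBox(0, "OCR", FileRead($sOutBase & ".txt"))
Else
    MsgBox(48, "OCR", "Tesseract ran but wrote no output file.")
EndIf
```

In `ShellExecuteWait`, the first argument is the program to run and the second is the whole parameter string handed to it, which is why the image path and output base are joined into one string.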

Does anyone know how I can invert the colours of the bitmap image the script captures before it is run through OCR?

The screen capture I want to OCR has yellow characters on a blue background, which Tesseract can't read unless I invert the colours (think MS Paint).

I have played around with the script and even tried opening the capture in MS Paint to invert it, but no luck.

Anyone know a good line of code in GUI that will do this?

Also, is there a way to screen capture a window that is always minimized, or do I have to activate it every time?

Thanks,

GG
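
One possible approach to the inversion, sketched under assumptions: capture to an in-memory bitmap, invert it in place with a GDI `BitBlt` using the DSTINVERT raster operation, then save it for Tesseract. The rectangle and output file name are made-up example values, and this is not how the UDF itself processes images.

```autoit
#include <ScreenCapture.au3>
#include <WinAPI.au3>

; GDI raster operation DSTINVERT: invert the destination pixels.
Local Const $DSTINVERT_ROP = 0x00550009

; Example rectangle (assumed values); right/bottom are inclusive screen coordinates.
Local $iLeft = 17, $iTop = 111, $iRight = 175, $iBottom = 125
Local $iW = $iRight - $iLeft + 1, $iH = $iBottom - $iTop + 1

; An empty file name makes _ScreenCapture_Capture return an HBITMAP handle.
Local $hBmp = _ScreenCapture_Capture("", $iLeft, $iTop, $iRight, $iBottom, False)

; Select the bitmap into a memory DC and invert it in place.
Local $hDC = _WinAPI_CreateCompatibleDC(0)
Local $hOld = _WinAPI_SelectObject($hDC, $hBmp)
_WinAPI_BitBlt($hDC, 0, 0, $iW, $iH, $hDC, 0, 0, $DSTINVERT_ROP)
_WinAPI_SelectObject($hDC, $hOld) ; deselect before saving
_WinAPI_DeleteDC($hDC)

; Save the inverted bitmap; _ScreenCapture_SaveImage frees it by default.
_ScreenCapture_SaveImage(@TempDir & "\inverted.tif", $hBmp)
```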


Hi,

Here is my two cents' worth for the library.

To use another language and version 3 of Tesseract, I made some changes to v0.6 so that it works with both.

In the initialisation section (around line 58 in v0.6 of Tesseract.au3):

;Global $tesseract_temp_path = "C:\"
Global $tesseract_temp_path = @TempDir & "\"

Global $tesseract_Program_file = @ProgramFilesDir & "\tesseract-OCR\tesseract.exe"

;Global $LanguageOption = "eng"
Global $LanguageOption = "fra"

- Use the Windows temp folder instead of C:\

- Use a different folder for calling Tesseract (V3 creates "tesseract-OCR" instead of the "tesseract" folder used by V2)

- Use another language for recognition (Tesseract still needs to have that language available; this only changes the call)

In each of the capture functions (_TesseractScreenCapture around line 166, _TesseractWinCapture around line 330 and _TesseractControlCapture around line 555):

;ShellExecuteWait(@ProgramFilesDir & "\tesseract-OCR\tesseract.exe", $capture_filename & " " & $ocr_filename)
ShellExecuteWait( $tesseract_Program_file, $capture_filename & " " & $ocr_filename & " -l " & $LanguageOption)

- Replace the hard-coded path with the global variable

- Add the language option to the call

Suggestion: allow the language to be specified as a parameter of each function, defaulting to English.


I'm finding that this formula is not working for me. My screen resolution is 1024x768.

Say the real-world coordinates of a rectangle of text are: 429 (left x), 518 (top y), 654 (right x), 532 (bottom y).

Then my indent coordinates should be: 429, 518, (1024 - 654), (768 - 532).

Right?

Every time I do this I get a large box that's not even close to the right size or coordinates. I even set the scaling to 1. Still too big. Am I doing something wrong? I even tried turning my dual displays off.

_TesseractScreenCapture uses the CaptureToTIFF function, which has a bug: the scale is multiplied against the coordinates.

I need to look some more, but it looks like the two similar functions _TesseractScreenCapture and _TesseractWinCapture both use this same function, and it isn't compatible.
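
The indent arithmetic from the post can be written out as a small sketch. `_RectToIndents()` is a hypothetical helper, not part of the UDF; it only performs the subtraction and does not work around the scaling bug described above.

```autoit
; Hypothetical helper: turn a screen rectangle (left, top, right, bottom)
; into the four indents measured inwards from each edge of the desktop.
Func _RectToIndents($iLeft, $iTop, $iRight, $iBottom, $iScreenW = @DesktopWidth, $iScreenH = @DesktopHeight)
    Local $aIndent[4] = [$iLeft, $iTop, $iScreenW - $iRight, $iScreenH - $iBottom]
    Return $aIndent
EndFunc

; Values from the post: a 1024x768 desktop and the rectangle 429,518 to 654,532.
Local $aIndents = _RectToIndents(429, 518, 654, 532, 1024, 768)
ConsoleWrite($aIndents[0] & ", " & $aIndents[1] & ", " & $aIndents[2] & ", " & $aIndents[3] & @CRLF)
; Writes: 429, 518, 370, 236
```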


I attached two files to this post in a zip file. Both were produced with exactly the same values fed into the function, apart from the scale factor.

Please note that the two screenshots should have been identical but aren't.

For whatever reason, in one of the functions the scale is multiplied against the coordinates, so the relative position moves with nothing more than a change in the scale variable.

Scale Factor 2 with Win Capture

post-51547-0-27453400-1299354033_thumb.p

Scale Factor 11 with Win Capture

post-51547-0-66795700-1299354042_thumb.p


I rewrote the CaptureToTIFF() function to take advantage of ScreenCapture's ability to preprocess the image, and to correct the problem where increasing the scale of the image resulted in the target being lost.

So far this is working perfectly.

You will see that I left the original code in comments so you can see the before and after.

; #FUNCTION# ;===============================================================================
;
; Name...........:  CaptureToTIFF()
; Description ...:  Captures an image of the screen, a window or a control, and saves it to a TIFF file.
; Syntax.........:  CaptureToTIFF($win_title = "", $win_text = "", $ctrl_id = "", $sOutImage = "", $scale = 1, $left_indent = 0, $top_indent = 0, $right_indent = 0, $bottom_indent = 0)
; Parameters ....:  $win_title      - The title of the window to capture an image of.
;                   $win_text       - Optional: The text of the window to capture an image of.
;                   $ctrl_id        - Optional: The ID of the control to capture an image of.
;                                       An image of the window will be returned if one isn't provided.
;                   $sOutImage      - The filename to store the image in.
;                   $scale          - Optional: The scaling factor of the capture.
;                   $left_indent    - A number of pixels to indent the screen capture from the
;                                       left of the window or control.
;                   $top_indent     - A number of pixels to indent the screen capture from the
;                                       top of the window or control.
;                   $right_indent   - A number of pixels to indent the screen capture from the
;                                       right of the window or control.
;                   $bottom_indent  - A number of pixels to indent the screen capture from the
;                                       bottom of the window or control.
; Return values .:  None
; Author ........:  seangriffin
; Modified.......: 
; Remarks .......:  
; Related .......: 
; Link ..........: 
; Example .......:  No
;
; ;==========================================================================================
Func CaptureToTIFF($win_title = "", $win_text = "", $ctrl_id = "", $sOutImage = "", $scale = 1, $left_indent = 0, $top_indent = 0, $right_indent = 0, $bottom_indent = 0)

    Local $hWnd, $hwnd2, $hDC, $hBMP, $hBitmap2, $hImage1, $hGraphic, $CLSID, $tParams, $pParams, $tData, $i = 0, $hImage2, $pos[4], $tar_leftx, $tar_lefty, $tar_rightx, $tar_righty, $winsize[4]
    Local $Ext = StringUpper(StringMid($sOutImage, StringInStr($sOutImage, ".", 0, -1) + 1))
    Local $giTIFColorDepth = 24
    Local $giTIFCompression = $GDIP_EVTCOMPRESSIONNONE

    ; If capturing a control
    if StringCompare($ctrl_id, "") <> 0 Then

        $hwnd2 = ControlGetHandle($win_title, $win_text, $ctrl_id)
        $pos = ControlGetPos($win_title, $win_text, $ctrl_id)
    Else
        
        ; If capturing a window
        if StringCompare($win_title, "") <> 0 Then

            $hwnd2 = WinGetHandle($win_title, $win_text)
            $pos = WinGetPos($win_title, $win_text)
        Else
            
            ; If capturing the desktop
            $hwnd2 = ""
            $pos[0] = 0
            $pos[1] = 0
            $pos[2] = @DesktopWidth
            $pos[3] = @DesktopHeight
        EndIf
    EndIf
    

    
    
    ; Capture an image of the window / control
    if IsHWnd($hwnd2) Then
    
        WinActivate($win_title, $win_text)
        ; Added: work out the capture rectangle from the indents passed to the function
        $winsize = WinGetPos ( $win_title, $win_text )
        $tar_leftx = $left_indent
        $tar_lefty = $top_indent
        $tar_rightx = $winsize[2] - $right_indent
        $tar_righty = $winsize[3] - $bottom_indent 
        $hBitmap2 = _ScreenCapture_CaptureWnd("", $hwnd2, $tar_leftx, $tar_lefty, $tar_rightx, $tar_righty, False)
    Else
        ; Added: same calculation for the whole desktop
        $winsize[0] = 0
        $winsize[1] = 0
        $winsize[2] = @DesktopWidth
        $winsize[3] = @DesktopHeight
        $tar_leftx = $left_indent
        $tar_lefty = $top_indent
        $tar_rightx = $winsize[2] - $right_indent
        $tar_righty = $winsize[3] - $bottom_indent 
        $hBitmap2 = _ScreenCapture_Capture("", $tar_leftx, $tar_lefty, $tar_rightx, $tar_righty, False)
    EndIf
    ;old version of if statement - correction to function
    ;if IsHWnd($hwnd2) Then
    ;
    ;   WinActivate($win_title, $win_text)
    ;   $hBitmap2 = _ScreenCapture_CaptureWnd("", $hwnd2, 0, 0, -1, -1, False)
    ;Else
    ;   
    ;   $hBitmap2 = _ScreenCapture_Capture("", 0, 0, -1, -1, False)
    ;EndIf

    _GDIPlus_Startup ()
    
    ; Convert the image to a bitmap
    $hImage2 = _GDIPlus_BitmapCreateFromHBITMAP ($hBitmap2)

    $hWnd = _WinAPI_GetDesktopWindow()
    $hDC = _WinAPI_GetDC($hWnd)
    ;Old version of this function call
    ;$hBMP = _WinAPI_CreateCompatibleBitmap($hDC, ($pos[2] * $scale) - ($right_indent * $scale), ($pos[3] * $scale) - ($bottom_indent * $scale))
    $hBMP = _WinAPI_CreateCompatibleBitmap($hDC, ($tar_rightx - $tar_leftx) * $scale, ($tar_righty - $tar_lefty) * $scale)

    _WinAPI_ReleaseDC($hWnd, $hDC)
    $hImage1 = _GDIPlus_BitmapCreateFromHBITMAP ($hBMP)
    $hGraphic = _GDIPlus_ImageGetGraphicsContext($hImage1)
    ;Modified from original to support corrected screen captures
    ;_GDIPLus_GraphicsDrawImageRect($hGraphic, $hImage2, 0 - ($left_indent * $scale), 0 - ($top_indent * $scale), ($pos[2] * $scale) + $left_indent, ($pos[3] * $scale) + $top_indent)
    _GDIPLus_GraphicsDrawImageRect($hGraphic, $hImage2, 0, 0, ($tar_rightx - $tar_leftx) * $scale, ($tar_righty - $tar_lefty) * $scale)
    $CLSID = _GDIPlus_EncodersGetCLSID($Ext)

    ; Set TIFF parameters
    $tParams = _GDIPlus_ParamInit(2)
    $tData = DllStructCreate("int ColorDepth;int Compression")
    DllStructSetData($tData, "ColorDepth", $giTIFColorDepth)
    DllStructSetData($tData, "Compression", $giTIFCompression)
    _GDIPlus_ParamAdd($tParams, $GDIP_EPGCOLORDEPTH, 1, $GDIP_EPTLONG, DllStructGetPtr($tData, "ColorDepth"))
    _GDIPlus_ParamAdd($tParams, $GDIP_EPGCOMPRESSION, 1, $GDIP_EPTLONG, DllStructGetPtr($tData, "Compression"))
    If IsDllStruct($tParams) Then $pParams = DllStructGetPtr($tParams)

    ; Save TIFF and cleanup
    _GDIPlus_ImageSaveToFileEx($hImage1, $sOutImage, $CLSID, $pParams)
    _GDIPlus_ImageDispose($hImage1)
    _GDIPlus_ImageDispose($hImage2)
    _GDIPlus_GraphicsDispose ($hGraphic)
    _WinAPI_DeleteObject($hBMP)
    ; Also free the HBITMAP returned by the screen capture call
    _WinAPI_DeleteObject($hBitmap2)
    _GDIPlus_Shutdown()
EndFunc

I added language selection and a window display state for the external Tesseract process (the ShellExecuteWait call).

#EndRegion Header
#Region Global Variables and Constants
Global $last_capture
;Global $tesseract_temp_path = "C:\"
Global $tesseract_temp_path = @TempDir & "\"

Global $tesseract_Program_file = @ProgramFilesDir & "\tesseract-OCR\tesseract.exe"

; @SW_MINIMIZE, @SW_MAXIMIZE, @SW_HIDE
Global $cstTesseractProcessShow = @SW_HIDE

Global $LanguageOption = "eng"
;Global $LanguageOption = "fra"

In each of the three functions (_TesseractScreenCapture, _TesseractWinCapture, _TesseractControlCapture):

...
;                   $show_capture       - Display screenshot and text captures
;                                           (for debugging purposes).
;                                           0 = do not display the screenshot taken (default)
;                                           1 = display the screenshot taken and exit
;                   $Language           - The language used for recognition by default "eng". Based on Tesseract reference
;                                           "eng" = English (default)
;                                           "fra" = French (need the package)
; Return values .:  On Success  - Returns an array of text that was captured.
;                   On Failure  - Returns an empty array.

func _TesseractWinCapture($win_title, $win_text = "", $get_last_capture = 0, $delimiter = "", $cleanup = 1, $scale = 2, $left_indent = 0, $top_indent = 0, $right_indent = 0, $bottom_indent = 0, $show_capture = 0, $Language = $LanguageOption)

;ShellExecuteWait(@ProgramFilesDir & "\tesseract-OCR\tesseract.exe", $capture_filename & " " & $ocr_filename)
    ShellExecuteWait( $tesseract_Program_file, $capture_filename & " " & $ocr_filename & " -l " & $Language, "", "open", $cstTesseractProcessShow )

It would be better to always return the result in the same kind of variable (an array), rather than an array when $delimiter is specified and a string when it is not, and perhaps add a parameter to return a string if wanted. It is confusing to expect an array and test its size to see whether something was returned or an error occurred, only to receive a string. At the very least, this should be explained clearly in the function header comment.

Many thanks for this library.

  • 3 months later...

Thanks for the code. It is very nice. I noticed something that might be useful.

I tried it on Windows 7 and the result was very poor. Then I tried it again on Windows XP and the result was excellent.

After checking the image, I think ClearType is the cause. I have seen some replies saying that no output is produced. I got the same problem on Windows 7.

Use Windows XP with ClearType off.

Edited by ktech
  • 2 months later...

I can't figure out why, but Tesseract keeps crashing with a "cannot open file " error. I'm on Win7. Has anyone had to deal with this error?

I'm using Tesseract 3 with the script changes from this thread. The script itself seems to be working fine with no errors; the error is on Tesseract's side of things. I turned UAC off too, with no luck. Has anyone got any ideas?


I read a tutorial for this UDF and I get this error:

>"C:\Programme\AutoIt3\SciTE\..\autoit3.exe" /ErrorStdOut "C:\ord\445.au3"

C:\ord\SAPIListBox.au3 (96) : ==> Variable must be of type "Object".:

$SAPIListBox.SpeechEnabled = $toggle

$SAPIListBox^ ERROR

>Exit code: 1 Time: 0.464

#include <GuiConstantsEx.au3>
#include "SAPIListBox.au3"
dim $msg, $wortbox_array[5] = ["Up","Down","Left","Right","Wait"]
$fester = GUICreate("Speech recognition example",400,300)
$wortbox = _GUICtrlSAPIListBox_Create(10, 10, 380, 280)
GUISetState()
_GUICtrlSAPIListBox_EnableSpeech($wortbox, 1)
_GUICtrlSAPIListBox_AddArray($wortbox, $wortbox_array)
While 1
  $msg = GUIGetMsg()
  $mausposition = MouseGetPos()

  ; Move the mouse 50 pixels in the direction of the recognised word
  if _GUICtrlSAPIListBox_CurSelChanged($wortbox) = True Then

    if StringCompare(_GUICtrlSAPIListBox_GetText($wortbox), "Up") = 0 Then
      MouseMove($mausposition[0],$mausposition[1]-50)
    EndIf

    if StringCompare(_GUICtrlSAPIListBox_GetText($wortbox), "Down") = 0 Then
      MouseMove($mausposition[0],$mausposition[1]+50)
    EndIf

    if StringCompare(_GUICtrlSAPIListBox_GetText($wortbox), "Left") = 0 Then
      MouseMove($mausposition[0]-50,$mausposition[1])
    EndIf

    if StringCompare(_GUICtrlSAPIListBox_GetText($wortbox), "Right") = 0 Then
      MouseMove($mausposition[0]+50,$mausposition[1])
    EndIf
  EndIf

  switch $msg
    case $GUI_Event_Close
      ExitLoop
  EndSwitch
WEnd
  • 2 months later...

This library is awkward to use for automation across multiple Windows installations, because there is no indentation in CLIENT coordinates, only in window coordinates. Since you never know how big each user's borders or title bar are, this can cause big issues.

Does anyone have an idea how to turn this into client-coordinate indentation?
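
One way this might be approached, as a sketch only: measure the offset between the window's top-left corner and its client area at run time, then convert a client-relative rectangle into the window-relative indents the UDF expects. `_ClientRectToWinIndents()` is a hypothetical helper, not part of the UDF.

```autoit
#include <WinAPI.au3>

; Hypothetical helper: convert a rectangle given in CLIENT coordinates
; (left, top, right, bottom) into left/top/right/bottom indents
; measured from the edges of the whole window.
Func _ClientRectToWinIndents($hWnd, $iLeft, $iTop, $iRight, $iBottom)
    Local $aWin = WinGetPos($hWnd) ; [x, y, width, height] of the whole window
    If @error Then Return SetError(1, 0, 0)

    ; Find where the client area's top-left corner sits on the screen.
    Local $tPoint = DllStructCreate($tagPOINT)
    DllStructSetData($tPoint, "X", 0)
    DllStructSetData($tPoint, "Y", 0)
    _WinAPI_ClientToScreen($hWnd, $tPoint)

    ; Border width and title bar height as measured on this machine.
    Local $iOffX = DllStructGetData($tPoint, "X") - $aWin[0]
    Local $iOffY = DllStructGetData($tPoint, "Y") - $aWin[1]

    Local $aIndent[4] = [$iLeft + $iOffX, $iTop + $iOffY, _
            $aWin[2] - ($iRight + $iOffX), $aWin[3] - ($iBottom + $iOffY)]
    Return $aIndent
EndFunc
```

The four returned values could then be passed as $left_indent, $top_indent, $right_indent and $bottom_indent to `_TesseractWinCapture()`, assuming the parameter order shown earlier in this thread.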

  • 4 weeks later...
  • Moderators

All,

A memory leak problem with this UDF has been identified and solved here. I have PM'd the UDF author, but you might like to amend your existing copies until he releases a new version. :)

M23



First I want to say thanks, great work.

I have a small rectangular region on my desktop with text in it (and nothing else); the background is mostly white. I want to read this out and store the text in an array or similar.

Which function do I have to use for that? There are also so many arguments, and I don't know what they do.

  • 5 weeks later...

Amazing job Sean, well done!

But I have a problem, not with the script itself, but with the OCR software. This is what I get in the error file (.txt):

read_tif_image:Error:Illegal image format:Compression

So, what should I do to debug Tesseract 2.0.1?

Even though I've managed to work around the problem by using the latest release of the OCR engine, I would like to try version 2.0.1, because I don't like the CMD window flashing up briefly. Unless the console window also appears when Tesseract 2.0.1 runs correctly! In that case, how can I hide it during the whole process? Yes, I'm a noob...

Thanks a lot in advance for any answers!

Edited by grandMOJ