
Autoit OCR without 3rd party software.


civilcalc

I downloaded this library from dgm5555's post of 12 November. I was trying the simple example shown below but can't get it working. First I ran the learnFonts() and then the learnCharsWithWord() functions. The OCRFontData.txt file is there and has some data in it. When I type "test" in a Word document (Arial 10px, which has been learned), I get the autolearn box asking me to type in first "typ" and then "e" before it returns a value. Have I done something wrong? I'm using Windows 7 and an unzipped version of AutoIt v3.3.14.2.

#include <OCR.au3>
#include <_PixelGetColor.au3>

$test = mouseOCR()

MsgBox(0,"",$test)

 


  • 2 months later...
  • 4 weeks later...
  • 1 year later...

Old thread, but I recently found myself needing a robust OCR function where the font used is just all over the place and regular OCR options (not even through AutoIt) were having a hard time.

So far this has been excellent, except that identifying the character in the "Please identify this pattern" dialog was cumbersome. Separating uppercase from lowercase is very difficult in that odd text window, and I'd continually have to refer to the text-so-far shown there, and to the source image, to see what the proper character would be. In addition, sometimes I'd see a character there that did not seem like it should have been there at all, e.g. something that looked like an 'l' when there was no 'l' in the text. (It turns out it saw an 'h' as an 'l' plus something else, because the arc touching the stem was too narrow there. I had to adjust the tolerance option.)

As such, I combined this post on drawing a rectangle on the screen:


( Note that their code has the X, Y and Width/Height parameters for GUICreate swapped - easy fix. )

With the OCR code as follows:

$hWin = _GUI_Transparent_Client($left + $blockStart - 4, $top - 4, ($blockEnd - $blockStart + 8), $grabHeight + 8, 2, 0xFFFF00)

                ; And display it
                $userResponse = InputBox("Unknown Character", "Please identify this pattern" & @CR & "(or just OK to skip learning it):-" & @CR & $data & @CR & @CR & $image, $letter, "", $boxWidth, $boxHeight, @DesktopWidth - $boxWidth, @DesktopHeight - $boxHeight,20)
                GUIDelete($hWin)

The top and bottom lines are the major changes besides the added includes and the two functions mentioned in that post, all added near the top.

This ends up drawing a bright yellow outline around the presumed character on screen.  The dialog asking for identification keeps focus.  As a result, you can just keep an eye on the outline character and type the correct one(s) very quickly without having to refer to the ~ and # display.

Note that this is just a quick solution for my use case and isn't suitable for a proper release, where I'd imagine that setting the margin, line width, and line color would be useful, as would handling changed screen content if the OCR runs on a snapshot but the screen has changed since then.

I hope this helps someone all the same and/or somebody takes the idea and adjusts the script's code appropriately :)

 

Edit: Another minor change with a huge impact on performance (short of overhauling the script entirely): in "checkForMatch", make the StringInStr call case-sensitive by passing 1 as its third (casesense) parameter.

$matchLoc = StringInStr($database, $matchString & @CR,1)
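
As a minimal sketch of what that flag changes (the two-entry $database below is made up for illustration and is not the library's real data format), StringInStr's default casesense of 0 compares case-insensitively, so a lookup can land on an entry that differs only in case. Passing 1 restricts the search to exact-case matches:

```autoit
; Sketch only: $database and $matchString are stand-ins, not real OCR data.
Local $database = "ab" & @CR & "AB" & @CR
Local $matchString = "AB"

; casesense = 0 (default): case-insensitive, so the "ab" entry at position 1 matches first
Local $looseLoc = StringInStr($database, $matchString & @CR)

; casesense = 1: case-sensitive, so only the real "AB" entry at position 4 matches
Local $strictLoc = StringInStr($database, $matchString & @CR, 1)

ConsoleWrite("default: " & $looseLoc & ", case-sensitive: " & $strictLoc & @CRLF)
```

Whether the speed-up is as large on other data sets will depend on the size of the font file, so treat the performance claim as an observation from this use case.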

 

Edited by kamiquasi

  • 3 months later...
  • 7 months later...

Nice startup, initiation and ideas @civilcalc & nice rework @LoneWolf92 !

I think some threads don't die, at least not unless someone can point to another post with a better OCR solution.

I want to see if I can get @kamiquasi's nice graphical highlight working. If you're still around, can you post the full code? I think there is something wrong with the logic of trying to highlight the section of the text after it has already been converted to text.

 

I tried a test calling the mouseOCR () function with the below code to see if I could clean up what kamiquasi recommended...        

Around line 751 I replaced this original code from:

$userResponse = InputBox("Unknown Character", "Please identify this pattern" & @CR & "(or just OK to skip learning it):-" & @CR & $data & @CR & @CR & $image, $letter, "", $boxWidth, $boxHeight, @DesktopWidth - $boxWidth, @DesktopHeight - $boxHeight)

To

 

; Create a highlighted box around the unrecognized text
$hilightX = ($left + $blockStart - 4)    
$hilightY = ($top - 4) + $topRowInLine   
$hilightW = ($blockEnd - $blockStart + 8)
$hilightH = ($bottomRowInLine - $topRowInLine)*2 + 2
$hWin = _GUI_Transparent_Client($hilightW, $hilightH, $hilightX, $hilightY, 2, 0xFFFF00) ; w,h,x,y,thickness,color

; And display the input box to ask what it might have found
$userResponse = InputBox("Unknown Character: '" & $letter & "'", _      ; Title
                         "Please identify this pattern" & @CR & _       ; Prompt
                         "(or just OK to skip learning it):-" & @CR & _
                         $data & @CR & @CR & $image, _
                         $letter, _  ;Default Value
                         "", $boxWidth, $boxHeight, _                   ; Width, Height
                         @DesktopWidth - $boxWidth, _                   ; Left
                         @DesktopHeight - $boxHeight, _                 ; Top
                         20)                                            ; Timeout
GUIDelete($hWin)

 

And it highlights the incorrect section of my screen. It isn't the section that I moused over to make the OCR selection

 

Hmmm let me know if anyone has an idea if this can work.

 

Thanks!

SOLVED:

Turns out the sample code at 

had the x,y,w,h reversed and it should be w,h,x,y where this would work:

; Create a highlighted box around the unrecognized text

$hilightX = ($left + $blockStart - 4)
$hilightY = ($top - 4) ; Good
$hilightW = ($blockEnd - $blockStart + 8)
$hilightH = ($grabHeight + 8)
$hWin = _GUI_Transparent_Client($hilightW, $hilightH, $hilightX, $hilightY, 2, 0xFFFF00) ; w,h,x,y, thick,color

; And display the input box to ask what it might have found
$userResponse = InputBox("Unknown Character: '" & $letter & "'", _        ; Title
                         "Please identify this pattern" & @CR & _          ; Prompt
                         "(or just OK to skip learning it):-" & @CR & _
                         $data & @CR & @CR & $image, _
                         $letter, _  ;Default Value
                         "", $boxWidth, _                                 ; Width
                         $boxHeight, _                                      ; Height
                         @DesktopWidth - $boxWidth, _                      ; Left
                         @DesktopHeight - $boxHeight, _                 ; Top
                         20)                                             ; Timeout
GUIDelete($hWin)

 

 

Edited by NassauSky
Box coordinates were wrong

  • 2 weeks later...

Thank you for this great OCR reader. I wrote a function that rechecks the OCR file: it steps through every entry in your OCR font file so you can confirm or correct it.

This is my code:

#include "OCR.au3" ;or add the code below directly into OCR.au3

#include <Array.au3>
#include <File.au3>

; Start code by autoscript.com user Beonn, function _CheckOCRFile
Func _CheckOCRFile($fontFile="")
    If $fontFile = "" Then $fontFile = @ScriptDir & "\OCRFontData.txt"

    Local $fileOld = FileOpen($fontFile, 0)
    Local $fileNew = FileOpen($fontFile & ".tmp", 2)

    ; Check if file opened for reading OK
    If $fileOld = -1 Then
        MsgBox(0, "Error", "Unable to open data file.")
        Exit
    EndIf
    $_startdb = 0
    For $i = 1 To _FileCountLines($fontFile)
        Local $line = FileReadLine($fileOld, $i)
        Select
            Case @error = -1
                ExitLoop
            Case Not $_startdb
                If StringInStr($line,@TAB & "-99") Then $_startdb = 1
                FileWriteLine($fileNew, $line)
            Case Not StringInStr($line,"err" & @TAB & "!") And $_startdb
                $_dbreturn = StringSplit($line, @TAB)
                If IsArray($_dbreturn) Then
                    $letter = $_dbreturn[1]
                    $saveWindowForFont = $_dbreturn[3]
                EndIf
                $_dbreturn = StringSplit($_dbreturn[2], "|")
                ;_ArrayDisplay($_dbreturn, "DB return", Default, 8)
                If IsArray($_dbreturn) Then
                    $image = ""
                    $_len = 0
                    For $j = 1 To UBound($_dbreturn) - 1
                        $_lentemp = StringLen(_NumberToBinary(Abs(Number($_dbreturn[$j]))))
                        If $_lentemp > $_len Then $_len = $_lentemp
                    Next
                    For $j = 1 To $_len
                        For $k = 1 To UBound($_dbreturn) - 1
                            If StringMid(StringReverse(_NumberToBinary(Abs(Number($_dbreturn[$k])))), $j, 1) = 1 Then
                                $image = $image & "#"
                            Else
                                $image = $image & "~"
                            EndIf
                        Next
                        $image = $image & @CRLF
                    Next
                Else
                    ContinueLoop
                EndIf
                ; Now calculate the required size of the msgbox to display the pattern
                $boxWidth = UBound($_dbreturn) * 8 + 40
                If $boxWidth < 200 Then $boxWidth = 200
                If $boxWidth > @DesktopWidth Then $boxWidth = @DesktopWidth
                $boxHeight = $_len * 13 + 120
                If $boxHeight < 500 Then $boxHeight = 500
                If $boxHeight > @DesktopHeight Then $boxHeight = @DesktopHeight
                ; And display it
                $userResponse = InputBox("Control Character", "Please identify this pattern" & @CR & "(Empty string or Cancel will delete this from Fontfile):-" & @CR & @CR & $image, $letter, "", $boxWidth, $boxHeight, @DesktopWidth - $boxWidth, @DesktopHeight - $boxHeight - 50) ;-50 for Windows Taskbar
                ; Keep the entry only if the dialog succeeded and the answer is not empty
                If @error = 0 And $userResponse <> "" Then
                    If $letter = $userResponse Then
                        ; User didn't change anything
                        FileWriteLine($fileNew, $line)
                    Else
                        ; The data was incorrect: rebuild the line with its tab separators
                        $letter = $userResponse
                        $pattern = _ArrayToString($_dbreturn, "|", 1)
                        FileWriteLine($fileNew, $letter & @TAB & $pattern & @TAB & $saveWindowForFont & @CRLF)
                    EndIf
                EndIf
            Case Else
                FileWriteLine($fileNew, $line)
        EndSelect
    Next

    FileClose($fileOld)
    FileClose($fileNew)

    ; Now replace old file with new

    FileMove ( $fontFile & ".tmp", $fontFile, 1 )
    cleanFontDataFile($fontFile)
EndFunc
; End code by autoscript.com user Beonn, function _CheckOCRFile

; Start code added by autoscript.com user Beonn, function _NumberToBinary from https://www.autoitscript.com/forum/topic/90056-decimal-to-binary-number-converter/
; =================================================================================================
; Func _NumberToBinary($iNumber)
;
; Converts a 32-bit signed # to a binary bit string. (Limitation due to AutoIT functionality)
;   NOTE: range for 32-bit signed values is -2147483648 to 2147483647!
;       Anything outside the range will return an empty string!
;
; $iNumber = # to convert, obviously
;
; Returns:
;   Success: Binary bit string
;   Failure: "" and @error set
;
; Author: Ascend4nt, with help from picaxe (Changing 'If BitAND/Else' to just one line)
;   See it @ http://www.autoitscript.com/forum/index.php?showtopic=90056
; =================================================================================================
Func _NumberToBinary($iNumber)
    Local $sBinString = ""
    ; Maximum 32-bit # range is -2147483648 to 2147483647
    If $iNumber<-2147483648 Or $iNumber>2147483647 Then Return SetError(1,0,"")

    ; Convert to a 32-bit unsigned integer. We can't work on signed #'s
    $iUnsignedNumber=BitAND($iNumber,0x7FFFFFFF)

    ; Cycle through each bit, shifting to the right until 0
    Do
        $sBinString = BitAND($iUnsignedNumber, 1) & $sBinString
        $iUnsignedNumber = BitShift($iUnsignedNumber, 1)
    Until Not $iUnsignedNumber

    ; Was it a negative #? Put the sign bit on top, and pad the bits that aren't set
    If $iNumber<0 Then Return '1' & StringRight("000000000000000000000000000000" & $sBinString,31)

    Return $sBinString
EndFunc   ;==>_NumberToBinary
; End code added by autoscript.com user Beonn, function _NumberToBinary
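
The pattern rendering in _CheckOCRFile can be hard to follow, so here is a small sketch of the same idea. It assumes, as the loop above implies, that each "|"-separated value encodes one pixel column, with the least significant bit as the top row. The pattern "5|2|7" is invented for illustration and is not taken from a real OCRFontData.txt, and the sketch tests bits directly with BitAND instead of going through _NumberToBinary:

```autoit
; Hypothetical pattern: three column values, least significant bit = top row
Local $aCols = StringSplit("5|2|7", "|")
Local $rows = 3, $image = ""

For $row = 0 To $rows - 1
    For $col = 1 To $aCols[0]
        ; Test bit $row of this column; a negative count makes BitShift shift left
        $image &= (BitAND(Number($aCols[$col]), BitShift(1, -$row)) ? "#" : "~")
    Next
    $image &= @CRLF
Next

ConsoleWrite($image)
```

Each "#" marks a set pixel and each "~" a clear one, which is the same display the "Please identify this pattern" dialog uses.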

 

Edited by Beonn

  • 2 weeks later...
  • 4 weeks later...
  • 7 months later...

Hello everyone. I'm taking a stab at capturing the screenshot based on the window handle, but I'm not 100% sure it is working correctly, as it returns nothing but 0.

If someone can take a look and advise me on what I did wrong, that would be greatly appreciated.

FYI: I modified the <_PixelGetColor.au3> file and made 2 small adjustments to the OCR.au3 file.

Func _OCR($hwnd, $left, $top, $right, $bottom, $searchColour = 0x000000, $searchColourVariation = 100, $fontFile = "", $ocrTrainChar = "", $ocrOptions = $ocrDefaultOptions)

;also adjusted $vRegion as follows
global $vRegion = _PixelGetColor_CaptureRegion($hwnd, $vDC, $left, $top, $right + 1, $bottom + 1, $hDll)

 

Func _PixelGetColor_CaptureRegion($hwnd, $iPixelGetColor_MemoryContext, $Top_X = 0, $Top_Y = 0, $Bottom_X = -1, $Bottom_Y = -1, $fCursor = False, $hDll = "gdi32.dll")
    local $IsBMP, $IsUser32, $Handle = $Hwnd
    Local $__BMP_SEARCH = 0x00CC0020 ; SRCCOPY raster operation for BitBlt
    If $hwnd = "" Then ; If no handle is passed then screencapture
        Local $Right = $Bottom_X = -1 ? -1 : $Top_X + $Bottom_X - 1
        Local $Bottom = $Bottom_Y = -1 ? -1 : $Top_Y + $Bottom_Y - 1
        Local $hBMP = _ScreenCapture_Capture("", $Top_X, $Top_Y, $Right, $Bottom, False)
        If @error Then Return SetError(1, 0, 0)
        If not $IsBMP Then Return SetError(0, 0, $hBMP)
        Local $BMP = _GDIPlus_BitmapCreateFromHBITMAP($hBMP)
        If @error Then Return SetError(1, 0, 0)
        _WinAPI_DeleteObject($hBMP)
        Return SetError(0, 0, $BMP)
    else
        If Not IsHWnd($Handle) Then $Handle = HWnd($Hwnd)
        If @error Then
            $Handle = WinGetHandle($Hwnd)
            If @error Then
                ConsoleWrite("! _HandleCapture error: Handle error!")
                Return SetError(1, 0, 0)
            EndIf
        EndIf
        Local $hDC = _WinAPI_GetDC($Handle)
        Local $hCDC = _WinAPI_CreateCompatibleDC($hDC)
        If $Bottom_X = -1 Then $Bottom_X = _WinAPI_GetWindowWidth($Handle)
        If $Bottom_Y = -1 Then $Bottom_Y = _WinAPI_GetWindowHeight($Handle)
        If $IsUser32 Then
            Local $hBMP = _WinAPI_CreateCompatibleBitmap($hDC, _WinAPI_GetWindowWidth($Handle), _WinAPI_GetWindowHeight($Handle))
            _WinAPI_SelectObject($hCDC, $hBMP)
            DllCall("User32.dll", "int", "PrintWindow", "hwnd", $Handle, "hwnd", $hCDC, "int", 0)
            Local $tempBMP = _GDIPlus_BitmapCreateFromHBITMAP($hBMP)
            _WinAPI_DeleteObject($hBMP)
            Local $BMP = _GDIPlus_BitmapCloneArea($tempBMP, $Top_X, $Top_Y, $Bottom_X, $Bottom_Y, $GDIP_PXF24RGB)
            _GDIPlus_BitmapDispose($tempBMP)
        Else
            Local $hBMP = _WinAPI_CreateCompatibleBitmap($hDC, $Bottom_X, $Bottom_Y)
            _WinAPI_SelectObject($hCDC, $hBMP)
            _WinAPI_BitBlt($hCDC, 0, 0, $Bottom_X, $Bottom_Y, $hDC, $Top_X, $Top_Y, $__BMP_SEARCH)
            Local $BMP = _GDIPlus_BitmapCreateFromHBITMAP($hBMP)
            _WinAPI_DeleteObject($hBMP)
        EndIf
        _WinAPI_ReleaseDC($Handle, $hDC)
        _WinAPI_DeleteDC($hCDC)
        Local $hBMP = _GDIPlus_BitmapCreateHBITMAPFromBitmap($BMP)
        _GDIPlus_BitmapDispose($BMP)
    endif
    DllCall($hDll, "hwnd", "SelectObject", "int", $iPixelGetColor_MemoryContext, "hwnd", $hBMP)
    Return $hBMP
EndFunc

The handle-based screen capture code is credited to its author, LTNhanSt94, from HandleImgSearch.au3.
