Real OCR in AU3 - in a few lines.


ptrex
@S0789300

"Invalid Class string", Error 1, means that you don't have the proper objects installed on your machine.

Please verify your installation.

See Post 80 for the correct download link and install again if needed.

Rgds

ptrex

Edited by ptrex

Sadly I have attempted to install from post 80 three times now. Nothing changes in my MODI directory and the install only contains 2 files: an HTML document containing the license, and the MODI help document, nothing more - total of 478 kb. I'm thinking I might have to re-install office entirely... :)


  • Moderators

It would be nice if someone could confirm that this works with the link on post #80.

Thanks :)

And your issue for confirming it would be?....

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.


  • Moderators

@SmOke_N

He's just a beginner and doesn't know where to start.

I already PM'd him an example of how to get started.

Regards,

ptrex

That's a good point, I saw 08 and thought that was the year, when in fact it's the day.


ptrex being the super nice guy he is. :)

Wrote this for me.

#include <Array.au3>

$output = _OCR("C:\_\snapgrab.bmp")

_ArrayDisplay($output)

;=================================   OCR   =======================================
; Function Name:    _OCR
; Description:  Searches a bmp file for all recognizable characters and returns them in an array
; Requires:    Microsoft Word must be installed on system & <Array.au3>
; Parameters:      $file   bmp file to search
;                  
; Syntax:        _OCR($file)
; Returns:    Array of recognized words (fixed size 500, unused elements stay empty)
;
;===============================================================================
Func _OCR($file)
    Dim $miDoc, $Doc, $str, $oWord, $sArray[500]

    $oMyError = ObjEvent("AutoIt.Error","_CoMErrFunc")
    $miDoc = ObjCreate("MODI.Document")
    $miDocView = ObjCreate("MiDocViewer.MiDocView")
    $miDoc.Create($file)
    $miDoc.Ocr(9, True, False)
    $MiDocView.Document = $miDoc
    $MiDocView.SetScale (0.75, 0.75)
    
    $file = FileOpen(@ScriptDir & "\OCR_text.txt", 1)

    
    $i = 0
    For $oWord in $miDoc.Images(0).Layout.Words
        $str = $str & $oWord.text & @CrLf
            ConsoleWrite($oWord.text & @CRLF)
            FileWriteLine($file,$oWord.text & @CRLF)
        $sArray [$i] = $oWord.text
        $i += 1
    Next
    
    Return $sArray
    
    FileClose($file)

The only problem is that I get an error here "$miDoc.Ocr(9, True, False)" if the image I am scanning has no words in it at all.

If I remove that line I get the same error here "For $oWord in $miDoc.Images(0).Layout.Words"

How can I make it so that it will continue the script if there is no text found?

edit: and I can't for the life of me figure out how to get this to write the text maintaining its structure instead of putting 1 word per line in the text file...

Edited by jebus495
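
For the "no text in the image" case asked about above, one option is to install a COM error handler so a failed MODI call no longer aborts the script. The posted script registers "_CoMErrFunc" but never defines it, so no handler appears to be in place. Below is a minimal sketch, assuming a recent AutoIt version in which the handler receives the error object as its parameter. The names _OCR_Safe and _OcrComError are made up for illustration and are not from this thread.

```autoit
; Sketch only: _OCR_Safe and _OcrComError are hypothetical names.
Global $g_bOcrComError = False
; The handler object must stay alive for as long as COM errors should be trapped.
Global $g_oOcrErrHandler = ObjEvent("AutoIt.Error", "_OcrComError")

Func _OcrComError($oError)
    ; Remember that a COM call failed instead of letting the script stop.
    $g_bOcrComError = True
    ConsoleWrite("COM error 0x" & Hex($oError.number, 8) & ": " & $oError.description & @CRLF)
EndFunc

Func _OCR_Safe($sImage)
    Local $aEmpty[1] = [""]
    Local $oDoc = ObjCreate("MODI.Document")
    If Not IsObj($oDoc) Then Return SetError(1, 0, $aEmpty) ; MODI not available

    $g_bOcrComError = False
    $oDoc.Create($sImage)
    $oDoc.Ocr(9, True, False) ; 9 = English, as in the post above
    If $g_bOcrComError Then Return SetError(2, 0, $aEmpty) ; OCR failed, e.g. no text found

    ; Collect the words, one per line, then split them into an array.
    Local $sText = ""
    For $oWord In $oDoc.Images(0).Layout.Words
        $sText &= $oWord.Text & @LF
    Next
    If $g_bOcrComError Or $sText = "" Then Return SetError(2, 0, $aEmpty)

    Return StringSplit(StringTrimRight($sText, 1), @LF, 2) ; flag 2: no count in element 0
EndFunc
```

The caller can then test @error after _OCR_Safe() and simply skip images that produced no text.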

Thanks ptrex for this. I pieced together a simple Reading tool based off code in this thread and code from others.

#include <GUIConstantsEx.au3>
#include <WindowsConstants.au3>
#include <SendMessage.au3>
#include <ScreenCapture.au3>
;#include <GDIPlus.au3>
;#include <Array.au3>
Global $objSpeak, $terminate, $gui, $btnBack, $btnForward, $btnRepeat, $int, $lastint

$win = _SetRegion()
$img = @ScriptDir & "\temp.bmp"
_TakePicture($win[0],$win[1]+16,$win[0]+$win[2],$win[1]+16+$win[3],$img)
$readthis = _OCR($img)
FileDelete($img)
_PlayBack($readthis)

Func _OCR($imagefile)
    If Not FileExists($imagefile) Then Return 0
    
    Const $miLANG_CZECH = 5
    Const $miLANG_DANISH = 6
    Const $miLANG_DUTCH = 19
    Const $miLANG_ENGLISH = 9
    Const $miLANG_FINNISH = 11
    Const $miLANG_FRENCH = 12
    Const $miLANG_GERMAN = 7
    Const $miLANG_GREEK = 8
    Const $miLANG_HUNGARIAN = 14
    Const $miLANG_ITALIAN = 16
    Const $miLANG_JAPANESE = 17
    Const $miLANG_KOREAN = 18
    Const $miLANG_NORWEGIAN = 20
    Const $miLANG_POLISH = 21
    Const $miLANG_PORTUGUESE = 22
    Const $miLANG_RUSSIAN = 25
    Const $miLANG_SPANISH = 10
    Const $miLANG_SWEDISH = 29
    Const $miLANG_TURKISH = 31
    Const $miLANG_SYSDEFAULT = 2048
    Const $miLANG_CHINESE_SIMPLIFIED = 2052
    Const $miLANG_CHINESE_TRADITIONAL = 1028
    
    $MODI = ObjCreate("MODI.Document")
    $MODI.Create($imagefile)
    $MODI.Ocr($miLANG_ENGLISH, True, False)

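    Local $line = ""
    Local $sentences[1]

    For $oWord in $MODI.Images(0).Layout.Words
        $word = $oWord.text
        $line &= $word & " "
        ; A word containing ".", ":" or "!" is treated as the end of a sentence.
        If StringInStr($word,".") Or StringInStr($word,":") Or StringInStr($word,"!") Then
            $sentences[UBound($sentences)-1]=$line
            ReDim $sentences[UBound($sentences)+1]
            $line = ""
        EndIf
    Next

    ; Keep trailing words that never reached a sentence terminator;
    ; otherwise drop the empty slot left by the last ReDim.
    If $line <> "" Then
        $sentences[UBound($sentences)-1]=$line
    ElseIf UBound($sentences) > 1 Then
        ReDim $sentences[UBound($sentences)-1]
    EndIf

    Return $sentences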
EndFunc

Func _SetRegion()
    $gui = GUICreate("OCR Region Highlighter", 320, 240, -1, -1, 0x00050000)
        GUISetBkColor(0xA6CAF0)
        WinSetTrans($gui,"",100)
        GUISetCursor(9)
    GUISetState(@SW_SHOW)

    While 1
        $nMsg = GUIGetMsg(1)
        Switch $nMsg[0]
            Case $GUI_EVENT_CLOSE
                ExitLoop
            Case $GUI_EVENT_PRIMARYDOWN
                $hWnd = _SendMessage($nMsg[1], $WM_SYSCOMMAND, 0xF012, 2,1)
        EndSwitch
    WEnd

    $winpos = WinGetPos($gui)
    GUISetCursor(1)
    GUIDelete($gui)
    Return $winpos
EndFunc

Func _TakePicture($left, $top, $right, $bottom, $temp)
    ; The third and fourth values are the right and bottom screen edges, not a width and height.
    Local $hBitmap1,$hImage1
    _GDIPlus_Startup ()
    $hBitmap1 = _ScreenCapture_Capture ("",$left, $top, $right, $bottom)
    $hImage1 = _GDIPlus_BitmapCreateFromHBITMAP ($hBitmap1)
    _GDIPlus_ImageSaveToFile ($hImage1, $temp)
    _GDIPlus_ImageDispose ($hImage1)
    _WinAPI_DeleteObject ($hBitmap1)
    _GDIPlus_Shutdown () ; release GDI+ again
EndFunc

Func _TalkOBJ($text)
    If Not $objSpeak Then $objSpeak= ObjCreate("SAPI.SpVoice")
    GUISetCursor(1,1)
    $title = WinGetTitle($gui)
    WinSetTitle($title,"",$title & " - Speaking")
    $objSpeak.Speak($text)
    WinSetTitle($title & " - Speaking","",$title)
    GUISetCursor(2)
EndFunc

Func _PlayBack($text)
    HotKeySet("{esc}","_Esc")
    $int = 0
    $lastint = UBound($text)-1
    $talk = True

    $title = "OCR Playback Controls"
    $gui = GUICreate($title, 411, 178, -1, -1)
    $edit = GUICtrlCreateEdit("", 40, 24, 337, 89, 0x0801)
    $btnBack = GUICtrlCreateButton("<", 56, 136, 97, 33, 0)
        GUICtrlSetFont(-1,14,1000)
    $btnRepeat = GUICtrlCreateButton("Repeat", 160, 136, 97, 33, 0)
        GUICtrlSetFont(-1,10,400)
    $btnForward = GUICtrlCreateButton(">", 264, 136, 97, 33, 0)
        GUICtrlSetFont(-1,14,1000)
    GUISetState(@SW_SHOW)

    While 1
        If $terminate Then Exit
        $nMsg = GUIGetMsg()
        Switch $nMsg
            Case $GUI_EVENT_CLOSE
                Exit
            Case $btnBack
                If $int >= 1 Then $int -= 1
                $talk = True
            Case $btnRepeat
                $talk = True
            Case $btnForward
                If $int <= $lastint-1 Then $int += 1
                $talk = True
        EndSwitch
        If $talk = True Then
            GuiCtrlSetData($edit,$text[$int])
            _TalkOBJ($text[$int])
            $talk=False
        EndIf
    WEnd
EndFunc

Func _Esc()
    $terminate = True
EndFunc

Func _BtnPress($int)
    Switch $int
        Case 0
            GUICtrlSetState($btnBack,$GUI_DISABLE)
        Case $lastint
            GUICtrlSetState($btnForward,$GUI_DISABLE)
        Case Else
            GUICtrlSetState($btnForward,$GUI_ENABLE)
            GUICtrlSetState($btnBack,$GUI_ENABLE)
    EndSwitch
EndFunc
Edited by spudw2k

Is this the way all OCR APIs work?

My document heading comes out 163 words deep in the array.

The top-leftmost words of the doc are farther down than that.

Can the OCR characteristics be changed to present a top-down line-oriented format?

Or, does one have to start comparing rectangle x/y coordinates for each word and manually reconstruct the document?

Edited by Spiff59
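
The manual reconstruction asked about above can be sketched as follows, assuming each word's text and the left/top of its bounding box have already been collected into a 2D array. MODI's Word object is documented to expose Rects, LineId and RegionId, but how those values are read is not shown here and is untested. _WordsToLines and the 5-pixel tolerance are hypothetical choices for illustration, not part of any library.

```autoit
#include <Array.au3>

; Hypothetical helper: rebuilds text lines from word boxes.
; $aWords holds one row per word: [n][0] = text, [n][1] = left, [n][2] = top.
; $iTol is how far (in pixels) a top edge may sit below the line start and still share the line.
Func _WordsToLines($aWords, $iTol = 5)
    Local $iCount = UBound($aWords)
    If $iCount = 0 Then Return ""

    ; Pass 1: order the words top to bottom.
    _ArraySort($aWords, 0, 0, 0, 2)

    ; Pass 2: number the lines and build a combined sort key (line first, then left).
    ; Assumes left coordinates stay below 100000.
    Local $aKeyed[$iCount][3] ; [sort key, line number, text]
    Local $iLine = 0, $iLineTop = $aWords[0][2]
    For $i = 0 To $iCount - 1
        If $aWords[$i][2] - $iLineTop > $iTol Then
            $iLine += 1
            $iLineTop = $aWords[$i][2]
        EndIf
        $aKeyed[$i][0] = $iLine * 100000 + $aWords[$i][1]
        $aKeyed[$i][1] = $iLine
        $aKeyed[$i][2] = $aWords[$i][0]
    Next

    ; Pass 3: order by line, then left to right within each line.
    _ArraySort($aKeyed, 0, 0, 0, 0)

    ; Join words with spaces and start a new output line whenever the line number changes.
    Local $sOut = ""
    For $i = 0 To $iCount - 1
        If $i > 0 Then
            If $aKeyed[$i][1] = $aKeyed[$i - 1][1] Then
                $sOut &= " "
            Else
                $sOut &= @CRLF
            EndIf
        EndIf
        $sOut &= $aKeyed[$i][2]
    Next
    Return $sOut
EndFunc
```

Sorting on coordinates rather than on the order MODI returns the words means a heading at the top of the page comes out first, whichever region the engine assigned it to.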

  • 1 month later...

Hey, I've got a slight update on my end. Everything is working now, including a dynamic screenshot using the new mini cap exe, and the OCR itself. However, there are a few hiccups that I thought someone could shed some light on:

The image I want to OCR is rather small, and I set the coordinates manually (hard-coded). It reads the text fine, apart from slight issues with 'c' and 'l' together, which sometimes come out as a 'd' or an 'a&'. The main problem is that the window I want to OCR contains a list of words that might change from time to time. I have written a script that detects when the text has changed and re-runs the OCR, so that part is fine. However, the OCR script (using MODI) sometimes doesn't output the words as separate elements in an array. That is, if my list is something like this:

Apple Orange

Grape Pear

Monkey's Uncle

It will sometimes detect:

Apple Orange Grape Pear Monkeys Uncle

in one element... and I need it to output in the list format which it normally does. It seems to be able to output each word as a separate array element when the list is longer, rather than shorter. The problem only seems to happen when the list only has 2 or 3 items in it (max is 5 and it handles it perfectly with 4 or 5 100% of the time).

Any thoughts?

Basically, is there a better way to make it recognize that different words in the list are "down" further and therefore need to be placed in a different element in the array?

Edited by S0789300

  • 3 weeks later...
  • 4 weeks later...

Hey guys

I've been using OCR in AU3 for a while now, long enough to figure out it's not 100% accurate.

I'm currently trying to improve the quality, so here's my question:

If I have the font file used to write the text I need to read (and it's not a public font), how can I use it to improve OCR quality?

Thanks

P.S. I ran into a spot of trouble: is there any way I can make the OCR recognise "Æ"? All I get is "/E" or worse, no matter the size or the background.

Edited by siriom

Hi!

Has anyone got this to work on Windows 7? I cannot get it to work on my Windows 7 and Office 2007 system.

Win-7 has a native OCR component (see Control_Panel + Program_&_features & Turn_windows_features_On_Off).

But, I don't know how to use it...

Edited by Michel Claveau

  • 3 weeks later...

This is in response to #91. I didn't see any instructions, but looking at the code, you are supposed to press the "Esc" key to activate the OCR. When I do, I get an error. Am I doing something wrong?

Line 43

$MODI.Create($imagefile)

$MODI^ERROR

Error: Variable must be of type "Object".

CAN YOU PLEASE HELP ME?
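
The message above says that $MODI is not an object when .Create() is called, which suggests that ObjCreate("MODI.Document") failed, most likely because MODI is not installed or registered on that machine. Going by the posted code, Esc only stops the playback loop; the OCR runs as soon as the region window is closed. A minimal sketch of a guard, where _CreateModi is a hypothetical helper name and not from this thread:

```autoit
; Sketch only: _CreateModi is a hypothetical helper name.
Func _CreateModi()
    Local $oDoc = ObjCreate("MODI.Document")
    If @error Or Not IsObj($oDoc) Then
        ; ObjCreate failed, so there is no object to call .Create() on.
        MsgBox(16, "OCR", "Could not create MODI.Document." & @CRLF & _
                "Check that Microsoft Office Document Imaging is installed.")
        Return SetError(1, 0, 0)
    EndIf
    Return $oDoc
EndFunc

Local $MODI = _CreateModi()
If @error Then Exit 1
```

With a check like this in place the script stops with a readable message instead of the "Variable must be of type Object" error.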
