
Real OCR in AU3 - in a few lines.


  • 5 months later...

Hello Folks,

I am facing an issue when trying to use this function.

First of all, thank you Ptrex: your code was really helpful.

Unfortunately, it crashes after a while.

Below is my code to reproduce the crash.

I only get the "classic" error message stating that AutoIt crashed and that I can report the issue to Microsoft.

I don't even get the popup I asked for that could intercept the problem and tell me more about the error code.

Any pointer would be really helpful.

I am using Windows XP Service Pack 3 and have tried with these versions of Office:

Office 2007

Office 2007 SP2

I realize this thread is quite old; I hope someone is still monitoring it.


#include <Array.au3>

Global Const $miLANG_CZECH = 5
Global Const $miLANG_DANISH = 6
Global Const $miLANG_DUTCH = 19
Global Const $miLANG_ENGLISH = 9
Global Const $miLANG_FINNISH = 11
Global Const $miLANG_FRENCH = 12
Global Const $miLANG_GERMAN = 7
Global Const $miLANG_GREEK = 8
Global Const $miLANG_HUNGARIAN = 14
Global Const $miLANG_ITALIAN = 16
Global Const $miLANG_JAPANESE = 17
Global Const $miLANG_KOREAN = 18
Global Const $miLANG_NORWEGIAN = 20
Global Const $miLANG_POLISH = 21
Global Const $miLANG_PORTUGUESE = 22
Global Const $miLANG_RUSSIAN = 25
Global Const $miLANG_SPANISH = 10
Global Const $miLANG_SWEDISH = 29
Global Const $miLANG_TURKISH = 31
Global Const $miLANG_CHINESE_SIMPLIFIED = 2052

Dim $screenshot_dir=@MyDocumentsDir & "\screenshots\"
Dim $screenshot_file="screenshot.tif"

Global $Error = False

Dim $i=0

; Loop that calls the OCR function repeatedly (with a 2-second pause in between).
; This eventually leads to a crash.
While $i < 1000
    Dim $oMyError = ObjEvent("AutoIt.Error", "OCRErrFunc")
    $sArray = OCRGet($screenshot_dir & $screenshot_file, $miLANG_ENGLISH)

    ; _ArrayDisplay($sArray, "OCR Result")

    Sleep(2000) ; the 2-second pause mentioned above
    $i += 1
WEnd

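Func OCRGet($Image, $Lang=9)
    Local $sArray[1], $oWord
    Local $miDoc = ObjCreate("MODI.Document")
    If @error Then Return SetError(1)
    ; Load the image into the document before running OCR (assumed step)
    $miDoc.Create($Image)
    $miDoc.Ocr($Lang, True, False)
    If @error Then Return SetError(2)
    ; Collect each recognized word (assumed loop; $oWord is otherwise unused)
    For $oWord In $miDoc.Images(0).Layout.Words
        _ArrayAdd($sArray, $oWord.Text)
    Next
    Return $sArray
EndFunc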

;------------------------------ This is a COM Error handler --------------------------------
Func OCRErrFunc()
  Local $HexNumber = Hex($oMyError.number, 8)
  MsgBox(0,"COM Error Test","We intercepted a COM Error !"       & @CRLF  & @CRLF & _
             "err.description is: "    & @TAB & $oMyError.description    & @CRLF & _
             "err.windescription:"     & @TAB & $oMyError.windescription & @CRLF & _
             "err.number is: "         & @TAB & $HexNumber              & @CRLF & _
             "err.lastdllerror is: "   & @TAB & $oMyError.lastdllerror   & @CRLF & _
             "err.scriptline is: "     & @TAB & $oMyError.scriptline     & @CRLF & _
             "err.source is: "         & @TAB & $oMyError.source         & @CRLF & _
             "err.helpfile is: "       & @TAB & $oMyError.helpfile       & @CRLF & _
             "err.helpcontext is: "    & @TAB & $oMyError.helpcontext)
  SetError(1)  ; to check for after this function returns
EndFunc
Edited by Eguun


I managed to get around the crash by running the OCR as an external script from my main script.

When it crashes I make my main script check for the error message and simply click the button.

It's all good now
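
For anyone who wants to try the same workaround, here is a rough sketch of the launcher side. The worker file name "ocr_worker.au3" and the 60-second timeout are my own assumptions, and instead of clicking the button on the crash dialog this version simply kills the worker when it takes too long, which is a different technique from the one described above.

```autoit
; Sketch only: "ocr_worker.au3" and the 60-second timeout are assumptions.
Local $sWorker = @ScriptDir & "\ocr_worker.au3" ; hypothetical child script that calls OCRGet

; Start the worker with the same interpreter, so a crash stays in the child process.
Local $iPid = Run('"' & @AutoItExe & '" /AutoIt3ExecuteScript "' & $sWorker & '"')
If $iPid = 0 Then
    MsgBox(16, "OCR", "Could not start the worker script.")
    Exit 1
EndIf

; ProcessWaitClose returns 0 when the timeout expires before the process exits.
If ProcessWaitClose($iPid, 60) = 0 Then
    ProcessClose($iPid) ; worker appears hung, for example behind a crash dialog
EndIf
```

The main script keeps running either way and can simply start the worker again for the next image.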



  • 5 months later...

Hi everyone!

I ran into trouble when running the OCR MODI AutoIt script. The MODI object couldn't be found even though I have MODI installed and working.

The problem was that AutoIt was compiling the script as x64, whereas MODI runs as x86. This also affected running the script straight from AutoIt SciTE (F5).


From AutoIt Scite

Open AutoIt compile (ctrl+F7) -> select tab "Aut2Exe" -> select "Compile X86 version."
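
The same setting can also be put in the script itself, so it does not depend on the compile dialog. This is only a sketch: the directive is read by the AutoIt3Wrapper that ships with the full SciTE4AutoIt3 package, and the message texts are my own.

```autoit
; Wrapper directive: run (F5) and compile the script as 32-bit.
#AutoIt3Wrapper_UseX64=n

; @AutoItX64 is 1 when the script is running under the 64-bit interpreter.
If @AutoItX64 Then
    MsgBox(48, "MODI", "Running as 64-bit; the 32-bit MODI object will not be found.")
    Exit 1
EndIf

; Confirm that the COM object can actually be created before doing any OCR.
Local $miDoc = ObjCreate("MODI.Document")
If Not IsObj($miDoc) Then
    MsgBox(16, "MODI", "MODI.Document could not be created. Is MODI installed?")
    Exit 1
EndIf
```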

I am using Win7 X64 with Office 2007.

AutoIt Scite Version 2.27 (Jun 24 2011 17:46:25).

I hope this saves someone the trouble of tracking down the error.

Since Office 2010 is making its way to users and it doesn't support MODI anymore, here is how you can use MODI with Office 2010:




  • 9 months later...

Can someone please help me out?

I copied and pasted the script, ran it, and I am getting "We intercepted a COM Error".

The line with the offending code is "$miDoc.ocr($miLan_English, True, False)".

Please help! I really need this.


Edited by AutoitMike


Thanks very much for the link to Kilhian's code.

It works, albeit dropping occasional letters.

I am able to find and study many of the _GDI functions. However, I can't seem to find documentation on how "$miDoc.Ocr($miLANG_ENGLISH, True, False)" actually works, or how I could ever have arrived at this line of code, so I don't know what "True, False" refers to.

If you could point me to this I would very much appreciate it.




Maybe this can give you a clue.

expression.OCR(LangId, OCROrientImage, OCRStraightenImage)

OCROrientImage Optional Boolean. Specifies whether the OCR engine attempts to determine the orientation of the page. Default is true.

OCRStraightenImage Optional Boolean. Specifies whether the OCR engine attempts to "de-skew" the page to correct for small angles of misalignment from the vertical. Default is true.
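
Mapped onto the call from the script, that would look roughly like the sketch below. The image path is just the one used earlier in this thread, and reading the result through Layout.Text is my own choice for keeping the example short.

```autoit
Global Const $miLANG_ENGLISH = 9

; Path reused from the earlier post; adjust to your own image.
Local $sImage = @MyDocumentsDir & "\screenshots\screenshot.tif"

Local $miDoc = ObjCreate("MODI.Document")
If Not IsObj($miDoc) Then Exit 1

$miDoc.Create($sImage)

; Ocr(LangId, OCROrientImage, OCRStraightenImage)
;   True  -> let the engine try to detect the page orientation
;   False -> do not try to de-skew the page
$miDoc.Ocr($miLANG_ENGLISH, True, False)

; Print all recognized text of the first page.
ConsoleWrite($miDoc.Images(0).Layout.Text & @CRLF)
```

So the "True, False" in the original line turns orientation detection on and de-skewing off.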




  • 1 month later...


I found the documentation on Microsoft's site by searching for "Modi.document".

I see that there is an event, OnOCRProgress, that reports the progress of the OCR through its "Progress" parameter, but it seems that this can only work in VB. A sample script is given:

Private Sub mmiDoc_OnOCRProgress(ByVal Progress As Long, Cancel As Boolean)
    ' mmiDoc was previously created as a document object.
    ' Cancel if user has clicked the Cancel button.
    If mblnCancel Then
        Cancel = True
    End If
    ' Indicate progress on the ProgressBar control
    pbrOCRProgress.Value = Progress
End Sub

The actual use is defined as: expression.OnOCRProgress(Progress, Cancel), where expression is a document object.

This is called periodically during an optical character recognition (OCR) operation. However, I can't make it work in AutoIt.

Is it possible?
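
One thing that might be worth trying: AutoIt's ObjEvent can attach a function prefix to a COM object's outgoing events, so an event named OnOCRProgress would land in a function named prefix plus "OnOCRProgress". The sketch below is untested against MODI. The "MODIEvt_" prefix is made up, the image path is the one from earlier in the thread, and I don't know whether setting Cancel from AutoIt has any effect, so the handler only reads Progress.

```autoit
Local $miDoc = ObjCreate("MODI.Document")
If Not IsObj($miDoc) Then Exit 1

; Route the object's events to functions named "MODIEvt_" & <event name>.
; "MODIEvt_" is an arbitrary prefix chosen for this sketch.
Local $oSink = ObjEvent($miDoc, "MODIEvt_")
If @error Then ConsoleWrite("Could not attach event sink, @error = " & @error & @CRLF)

$miDoc.Create(@MyDocumentsDir & "\screenshots\screenshot.tif")
$miDoc.Ocr(9, True, False) ; 9 = English

; Called during OCR if the event sink was attached successfully.
Func MODIEvt_OnOCRProgress($iProgress, $bCancel)
    ConsoleWrite("OCR progress: " & $iProgress & @CRLF)
EndFunc
```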




  • 11 months later...

Is Optical Character Recognition (OCR) installed in the Installation Options for Office 2010?

You can check this in the Programs and Features Control Panel applet by selecting Office in the list and clicking the Change button.


  • 1 year later...
Yes, it does work when selecting a screen region.
The only downside of this library is that it relies 100% on MS Office.
So if you don't have the proper version installed, it will not find the needed COM object.
But I am still using this frequently to this day.