OCR help


This method worked, with an exact match to the output "asdf478 tomar x wawaw".

;=================================   OCR   =======================================
; Function Name:    _OCR
; Description:                      Searches a bmp file for all recognizable characters and returns them in an array
; Requires:                         Microsoft Word must be installed on system & <Array.au3>
; Parameters:       $file           bmp file to search
;                   
; Syntax:         _OCR($file)
; Author(s):        ofLight
; Returns:      $Array[1] = 0 on failure, $Array on success
;
; EG:           _PixelShow_Virtual(25,25,25,25)
;               Sleep(1000)
;               $output = _OCR("C:\ofLight\Current AU3 Scripts\Render.bmp")
;               _ArrayDisplay($output)
;===============================================================================
Func _OCR($file)
    Dim $miDoc, $Doc, $str, $oWord, $sArray[500]

    $oMyError = ObjEvent("AutoIt.Error","_CoMErrFunc")
    $miDoc = ObjCreate("MODI.Document")
    $miDocView = ObjCreate("MiDocViewer.MiDocView")
    $miDoc.Create($file)
    $miDoc.Ocr(9, True, False)
    $MiDocView.Document = $miDoc
    $MiDocView.SetScale (0.75, 0.75)

    $i = 0
    For $oWord in $miDoc.Images(0).Layout.Words
        $str = $str & $oWord.text & @CrLf
            ConsoleWrite($oWord.text & @CRLF)
        $sArray [$i] = $oWord.text
        $i += 1
    Next
    
    Return $sArray
EndFunc;==========================   OCR  ========================================

;=================================   PixelShow_Virtual   =========================
; Function Name:    _PixelShow_Virtual
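; Description:      Captures the area around the mouse pointer, shows it in a window, and reports coords and pixel color
; Requires:         <ScreenCapture.au3>, _ImageGetSize()
; Parameters:       $l              Pixels to capture left of the mouse position, default is 20
;                   $t              Pixels to capture above the mouse position, default is 20
;                   $r              Pixels to capture right of the mouse position, default is 20
;                   $b              Pixels to capture below the mouse position, default is 20
;                   $stats          Default is Stats = 1 tooltip on, if stats = 0 tooltip off
;                   $FileName       Virtual rendered image name, Default is Render.bmp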
;                   $WinName        Rendered window title, default is "Render VD"
; Syntax:            _PixelShow_Virtual($l, $t, $r, $b, $Stats, $FileName, $WinName)
; Author(s):        
; Returns:
;===============================================================================
Func _PixelShow_Virtual($l=20,$t=20,$r=20,$b=20,$Stats=1,$FileName = ".\Render.bmp",$WinName = "Render VD")
    Local $FileSize = 0
    Global $GUIRenderPic
    $xy = MouseGetPos()
    Sleep(40)
    _ScreenCapture_Capture ($FileName, $xy[0]-$l, $xy[1]-$t, $xy[0]+$r, $xy[1]+$B)
    ;If FileGetSize($FileName) <> $FileSize Then
        If WinExists($WinName) Then
            GUICtrlSetImage($GUIRenderPic,$FileName)
            GUISetState(@SW_SHOW)           
        Else
            $GUIRendersize = _ImageGetSize($FileName)
            $GUIRenderDisp = GUICreate($WinName, $GUIRendersize[0], $GUIRendersize[1], (@DesktopWidth-5) - $GUIRendersize[0], 145, $WS_POPUP)
            $GUIRenderPic = GUICtrlCreatePic($FileName,0,0,$GUIRendersize[0],$GUIRendersize[1])
            GUISetState(@SW_SHOW)
        EndIf
        $FileSize = FileGetSize($FileName)
    ;EndIf
    If $Stats=1 Then 
        $Loc = WinGetPos($WinName)
        ToolTip("Color = "&PixelGetColor($Loc[0]+($Loc[2]/2),$Loc[1]+($Loc[3]/2))&@LF&"X: "&$xy[0]&"   "&"Y: "&$xy[1])
    EndIf
EndFunc;==========================   PixelShow_Virtual   =========================

There is always a butthead in the crowd, no matter how hard one tries to keep them out.......Volly


Wow really...

Old Ptrex method...

It's crazy, because I did not even try it, since usually the textract SDK outperforms it.

That's awesome, THX.

:D

Hey, is there a way to get that OCR func to give back just the word, instead of the array?

I know it was designed with big documents in mind, but I just need one-word snippets.

Edited by Oldschool

Sorry, but can you tell me exactly what you did? I can't even get MODI to read any of it...

#include <Array.au3>
$file = "C:\imageNEW.bmp"
_OCR($file)
Func _OCR($file)
    Dim $miDoc, $Doc, $str, $oWord, $sArray[500]

    $oMyError = ObjEvent("AutoIt.Error","_CoMErrFunc")
    $miDoc = ObjCreate("MODI.Document")
    $miDocView = ObjCreate("MiDocViewer.MiDocView")
    $miDoc.Create($file)
    $miDoc.Ocr(9, True, False)
    $MiDocView.Document = $miDoc
    $MiDocView.SetScale (0.75, 0.75)

    $i = 0
    For $oWord in $miDoc.Images(0).Layout.Words
        $str = $str & $oWord.text & @CrLf
            ConsoleWrite($oWord.text & @CRLF)
        $sArray [$i] = $oWord.text
        $i += 1
    Next
   
    _ArrayDisplay($sArray)
EndFunc;==========================   OCR  ========================================

This produces a COM error...

Also, _ImageGetSize is an unknown function.

Which include is that from?

Edited by Oldschool

Ya, ptrex's idea for the OCR saved me big time :D, as well as Paulia's screencap funcs.

This is a working example. You must have MS Word installed (or the DLL) on your box to use it. To try the example, just run the script and put your mouse over the text. If it doesn't think it knows what it's seeing you will get Err 80020009, which just means you need to line your capture box up better with the text you are trying to read.

#include <Array.au3>
#include <ScreenCapture.au3>

Global $oMyError

Sleep(3000)
_PixelShow_Virtual(55,15,55,15)
Sleep(1000)

$output = _OCR(".\Render.bmp")

_ArrayDisplay($output)

Func _OCR($file)
    Dim $miDoc, $Doc, $str, $oWord, $sArray[500]

    $oMyError = ObjEvent("AutoIt.Error","_CoMErrFunc")
    $miDoc = ObjCreate("MODI.Document")
    $miDocView = ObjCreate("MiDocViewer.MiDocView")
    $miDoc.Create($file)
    $miDoc.Ocr(9, True, False)
    $MiDocView.Document = $miDoc
    $MiDocView.SetScale (0.75, 0.75)

    $i = 0
    For $oWord in $miDoc.Images(0).Layout.Words
        $str = $str & $oWord.text & @CrLf
            ConsoleWrite($oWord.text & @CRLF)
        $sArray [$i] = $oWord.text
        $i += 1
    Next
    
    Return $sArray
EndFunc;==========================   OCR  ========================================

Func _PixelShow_Virtual($l=20,$t=20,$r=20,$b=20,$Stats=1,$FileName = ".\Render.bmp",$WinName = "Rendered Image 0014")
    Local $FileSize = 0
    Global $GUIRenderPic
    $xy = MouseGetPos()
    Sleep(40)
    _ScreenCapture_Capture ($FileName, $xy[0]-$l, $xy[1]-$t, $xy[0]+$r, $xy[1]+$B)
    ;If FileGetSize($FileName) <> $FileSize Then
        If WinExists($WinName) Then
            GUICtrlSetImage($GUIRenderPic,$FileName)
            GUISetState(@SW_SHOW)           
        Else
            $GUIRendersize = _ImageGetSize($FileName)
            $GUIRenderDisp = GUICreate($WinName, $GUIRendersize[0], $GUIRendersize[1], (@DesktopWidth-5) - $GUIRendersize[0], 145, $WS_POPUP)
            $GUIRenderPic = GUICtrlCreatePic($FileName,0,0,$GUIRendersize[0],$GUIRendersize[1])
            GUISetState(@SW_SHOW)
        EndIf
        $FileSize = FileGetSize($FileName)
    ;EndIf
    If $Stats=1 Then 
        $Loc = WinGetPos($WinName)
        ToolTip("Color = "&PixelGetColor($Loc[0]+($Loc[2]/2),$Loc[1]+($Loc[3]/2))&@LF&"X: "&$xy[0]&"   "&"Y: "&$xy[1])
    EndIf
EndFunc;==========================   PixelShow_Virtual   =========================

Func _ImageGetSize($sFile);SUB Function
    Local $sHeader = _FileReadAtOffsetHEX($sFile, 1, 24); Get header bytes
    Local $asIdent = StringSplit("FFD8 424D 89504E470D0A1A 4749463839 4749463837 4949 4D4D", " ")
    Local $anSize = ""
    For $i = 1 To $asIdent[0]
        If StringInStr($sHeader, $asIdent[$i]) = 1 Then
            Select
                Case $i = 2; BMP
                    $anSize = _ImageGetSizeSimple($sHeader, 19, 23, 0)
                    ExitLoop
            EndSelect
        EndIf
    Next
    If Not IsArray($anSize) Then SetError(1)
    Return ($anSize)
EndFunc 

Func _FileReadAtOffsetHEX($sFile, $nOffset, $nBytes);SUB Function
    Local $hFile = FileOpen($sFile, 0)
    Local $sTempStr = ""
    FileRead($hFile, $nOffset - 1)
    For $i = $nOffset To $nOffset + $nBytes - 1
        $sTempStr = $sTempStr & Hex(Asc(FileRead($hFile, 1)), 2)
    Next
    FileClose($hFile)
    Return ($sTempStr)
EndFunc  

Func _ImageGetSizeSimple($sHeader, $nXoff, $nYoff, $nByteOrder);SUB Function
    Local $anSize[2]
    $anSize[0] = _Dec(StringMid($sHeader, $nXoff * 2 - 1, 4), $nByteOrder)
    $anSize[1] = _Dec(StringMid($sHeader, $nYoff * 2 - 1, 4), $nByteOrder)
    Return ($anSize)
EndFunc  

Func _Dec($sHexStr, $nByteOrder);SUB Function
    If $nByteOrder Then Return (Dec($sHexStr))
    Local $sTempStr = ""
    While StringLen($sHexStr) > 0
        $sTempStr = $sTempStr & StringRight($sHexStr, 2)
        $sHexStr = StringTrimRight($sHexStr, 2)
    WEnd
    Return (Dec($sTempStr))
EndFunc

Func _CoMErrFunc();SUB Function
  $HexNumber=hex($oMyError.number,8)
  Msgbox(0,"COM Error Test","We intercepted a COM Error !"       & @CRLF  & @CRLF & _
             "err.description is: "    & @TAB & $oMyError.description    & @CRLF & _
             "err.windescription:"     & @TAB & $oMyError.windescription & @CRLF & _
             "err.number is: "         & @TAB & $HexNumber              & @CRLF & _
             "err.lastdllerror is: "   & @TAB & $oMyError.lastdllerror   & @CRLF & _
             "err.scriptline is: "     & @TAB & $oMyError.scriptline     & @CRLF & _
             "err.source is: "         & @TAB & $oMyError.source         & @CRLF & _
            "err.helpfile is: "       & @TAB & $oMyError.helpfile       & @CRLF & _
             "err.helpcontext is: "    & @TAB & $oMyError.helpcontext _
            )
  SetError(1)  ; to check for after this function returns
Endfunc


That worked, but giving it the actual .bmp without using the mouseover capture makes it COM error out.

Why do you think that happens?

When I capture the screen, the images I'm working with actually have different color backgrounds. So I can't really use this method... or at least I have not figured out yet how to process all of it in memory.

I do some post-processing on the original captures to turn them into this kind of text...

Does it work on this .bmp on your end?

imageNEW.bmp


I think I figured it out...

The way the code is now, it has to have at least 2 words visible, or it COM errors out, because there are no array entries.

Help me strip the array out of that, please...

In textract it's easy:

$img = "C:\imageNEW.bmp"
    $oOCR.Init
    $oUtput = $oOCR.ReadFile($img)
    $Text =  $oOCR.Text
    $oOCR.Term
    Return $Text

Where can I find the Modi API description?

Edited by Oldschool

I'm not sure where to get the MODI description. Sorry, it's been too long since I messed with this :D.

Mine seems to work even if there's just one word visible: it sets it as $Array[0], with no errors. Can I see a screenshot or image of what you are trying to read? I believe I did encounter some problems if the total search area was very small, but I don't remember the specifics.

EDIT: Sorry, I didn't see the previous post, I'm blind. Let me test it.

Edited by ofLight


OK, just swapping that file "imageNEW.bmp" in for "Render.bmp" does not work for me either. I believe the total search area needs to be above some minimum size (I don't remember exactly what), but it reads the letters fine once you are searching a larger area.


Yep, that's exactly it, I got it working.

The image just has to be big enough, about 50x30 pixels...

$file = "C:\imageNEW.bmp"
_OCR($file)

Func _OCR($file)
    $oMyError = ObjEvent("AutoIt.Error","_CoMErrFunc")
    $miDoc = ObjCreate("MODI.Document")
    $miDoc.Create($file)
    $miDoc.Ocr(9, True, False)
    $var = $miDoc.Images(0).Layout.Text
    MsgBox(0,"", $var)
EndFunc

How about that for efficiency...

There are still a couple of problems with it, though. For example, it thinks that felps1122 is feIpsll22, but you just can't win them all, now can you. :D
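
For anyone who wants the text handed back to the caller instead of shown in a message box, here is a minimal sketch along the same lines. The name _OCR_Text, the error codes, and the whitespace trimming are my own additions and not something from the posts above. The MODI calls are the same ones already used in this thread, and a COM error handler is assumed to be registered elsewhere.

```autoit
; Sketch only: _OCR_Text is a hypothetical name, not an existing UDF.
; Assumes MODI is installed and a COM error handler is already registered.
Func _OCR_Text($sFile)
    If Not FileExists($sFile) Then Return SetError(1, 0, "")

    Local $oDoc = ObjCreate("MODI.Document")
    If Not IsObj($oDoc) Then Return SetError(2, 0, "") ; MODI could not be created

    $oDoc.Create($sFile)
    $oDoc.Ocr(9, True, False) ; same arguments as used in the thread
    Local $sText = $oDoc.Images(0).Layout.Text

    ; Strip leading and trailing whitespace (flag 1 + 2 = 3)
    Return StringStripWS($sText, 3)
EndFunc

; Usage, with the file name from the post above
Local $sWord = _OCR_Text("C:\imageNEW.bmp")
If @error Then
    ConsoleWrite("OCR not possible, error code " & @error & @CRLF)
Else
    ConsoleWrite($sWord & @CRLF)
EndIf
```

Returning a plain string keeps the caller simple for one-word snippets. The size limit mentioned above still applies, so a too-small image would presumably still raise a COM error inside the function.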


@All

Good to see it is still alive. :P

regards,

ptrex

.....PTREX!! PTREX!!

PTREX!! :D PTREX!!

.....PTREX!! PTREX!!

Hey, have you ever noticed that some words make MODI and AutoIt crash?

!>16:15:01 AutoIT3.exe ended.rc:-1073741819

Like the word "d.lali", for example...

I think it's the same kind of thing as with image size: if it's too small, it's a no go...


Whoa, you are right. I just tried it out and it throws the COM error for me as well. I entered "d.lali" into an edit box and it crashed, but when I try just "dlali" it does not. It would be interesting to know exactly why it's doing that; unfortunately I don't know MODI.

For my personal use, I generally remove the COM error checking (I would rather get no response than have my script halt). For monitoring a chat log I haven't seen any missed messages.
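
A middle ground between the MsgBox handler and no checking at all could be a handler that only logs and sets a flag, so the script keeps running. This is a rough sketch written in the same old style as _CoMErrFunc in this thread; _ComErrQuiet and $g_bComError are hypothetical names, and newer AutoIt releases document a handler that takes the error object as a parameter instead of reading a global.

```autoit
; Sketch: a COM error handler that logs instead of showing a MsgBox.
; _ComErrQuiet and $g_bComError are hypothetical names.
Global $g_bComError = False
Global $oMyError = ObjEvent("AutoIt.Error", "_ComErrQuiet")

Func _ComErrQuiet()
    $g_bComError = True ; the caller can test and reset this flag after an OCR call
    ConsoleWrite("COM error 0x" & Hex($oMyError.number, 8) & _
            " at script line " & $oMyError.scriptline & @CRLF)
EndFunc
```

Note that an exit code like -1073741819 (0xC0000005) is a process crash, not a COM error, so a handler of this kind would presumably not catch the "d.lali" case.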


I have tried your script and I must say it works flawlessly! Thanks guys, that's what I was searching for.

Now the question is: is it possible to modify it so that it writes the results to a file, like results.ini, and clears it when the search is done?

I've tried to work with Array.au3, but since I'm a novice coder, I wasn't able to get it to do that.

Can you guys please help me? :D

Thanks in advance

You have to do something like this:

Func _OCR($file)
    $oMyError = ObjEvent("AutoIt.Error","_CoMErrFunc")
    $miDoc = ObjCreate("MODI.Document")
    $miDoc.Create($file)
    $miDoc.Ocr(9, True, False)
    $var = $miDoc.Images(0).Layout.Text
    IniWrite(@ScriptDir & "\my.ini", "SectionName", "LineID", $var)
EndFunc
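
As for the "clear it when the search is done" part of the question, a minimal sketch using the built-in INI functions might look like the following. The file, section, and key names are the placeholder ones from the function above, and the stored string is just a stand-in for the OCR text.

```autoit
; Sketch: store a result in an INI file, read it back, then clear it.
; File, section and key names are the placeholders from the post above.
Local $sIni = @ScriptDir & "\my.ini"

IniWrite($sIni, "SectionName", "LineID", "felps1122") ; stand-in for the OCR text
Local $sValue = IniRead($sIni, "SectionName", "LineID", "")
ConsoleWrite("Stored: " & $sValue & @CRLF)

; When the search is done, remove just that key...
IniDelete($sIni, "SectionName", "LineID")
; ...or delete the whole file instead:
; FileDelete($sIni)
```

One caveat: an INI value is a single line, so OCR text that spans several lines would probably need its line breaks replaced before writing.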

Yeah, I figured out how to get around this...

It happens when there are fewer than 5 characters.


Hello,

I am trying to read text out of a Java window with AutoIt and have found this thread. I am a newbie.

I get this error message ...

Line 37 (File I:Progamme-Börse\AutoIT Skripte\OCRFunc):

$miDoc.Create($file)

$mIDoc^ ERROR

Error: Variable must be of type Object.

OK

when I run the _OCR Function.

Can you please tell me why this happens and how I can get it to work!

Thank you!

Cornelius

P.S. This is the code I am running:

#include <Array.au3>

Global $oMyError
Dim $output, $file

msgbox(0,"Start","")
Sleep(2000)

$output = _OCR("I:\Dokumente und Einstellungen\Cornelius\Eigene Dateien\Rendered.bmp")
_ArrayDisplay($output)

Func _OCR($file)
    Dim $miDoc, $Doc, $str, $oWord, $sArray[500]

    $oMyError = ObjEvent("AutoIt.Error","_CoMErrFunc")
    $miDoc = ObjCreate("MODI.Document")
    $miDocView = ObjCreate("MiDocViewer.MiDocView")
    $miDoc.Create($file)
    $miDoc.Ocr(9, True, False)
    $MiDocView.Document = $miDoc
    $MiDocView.SetScale (0.75, 0.75)

    $i = 0
    For $oWord in $miDoc.Images(0).Layout.Words
        $str = $str & $oWord.text & @CrLf
        ConsoleWrite($oWord.text & @CRLF)
        $sArray [$i] = $oWord.text
        $i += 1
    Next

    Return $sArray
EndFunc;========================== OCR ========================================

Edited by cnolte

Do you have MODI Viewer installed?
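
That error generally means $miDoc is not an object at the point where .Create is called, which is what happens when ObjCreate fails. A quick way to check, as a rough sketch, is to try creating both COM classes that _OCR uses and test the results with IsObj. The two ProgIDs are the ones from the code in this thread; the rest is just illustration.

```autoit
; Sketch: check that both COM classes used by _OCR can be created.
Local $aProgIDs[2] = ["MODI.Document", "MiDocViewer.MiDocView"]
Local $oTest

For $i = 0 To UBound($aProgIDs) - 1
    $oTest = ObjCreate($aProgIDs[$i])
    If IsObj($oTest) Then
        ConsoleWrite($aProgIDs[$i] & ": OK" & @CRLF)
    Else
        ConsoleWrite($aProgIDs[$i] & ": could not be created" & @CRLF)
    EndIf
    $oTest = 0 ; release the object before the next try
Next
```

If either line reports a failure, the component is most likely not installed or not registered on that machine.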
