Penny

Best all-round compatible OCR lib around?

14 posts in this topic

I can't get Tesseract to work on Seven. Is there a good OCR engine, other than MODI (which pretty much sucks), that works on XP, Vista and Seven alike?


Do other good OCR engines even exist?



Exactly what can't you get to work?


Because I don't know how to convert the JPEG produced by _ScreenCapture_Capture into a TIFF file that Tesseract can handle. Is there an easy way to do that?



Of course there is. I know this can be tricky, as I couldn't get my head around it at first either.

You don't need to convert it to a TIFF file. Use the _TesseractScreenFind function to capture the screen area and scan it for the text you're looking for, all in one step.

Let me know if you need more help. I'll remote into my work PC and dig out my code.


Never mind. The thing is, I hated how the script you're talking about works, so I decided to write my own. I made this function to convert the pictures to TIFFs, and it works as far as I can tell.

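#include <GDIPlus.au3>

; Converts any image that GDI+ can load (JPEG, PNG, BMP, ...) into a TIFF file.
Func ConvertImageToTiff($path,$path_out)
    Local $image
    Local $CLSID

    _GDIPlus_Startup()

    $image = _GDIPlus_BitmapCreateFromFile($path)

    If Not $image Then
        ; Loading failed: release GDI+ before bailing out.
        _GDIPlus_Shutdown()
        MsgBox(16, "ConvertImageToTiff", "Could not load image: " & $path)
        Exit
    EndIf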

    #Region Tiff Parameters
    Local $TIFColorDepth = 24
    Local $TIFCompression = $GDIP_EVTCOMPRESSIONNONE
    Local $tData
    Local $tParams
    Local $pParams

    $CLSID = _GDIPlus_EncodersGetCLSID("TIFF")

    $tParams = _GDIPlus_ParamInit(2)
    $tData = DllStructCreate("int ColorDepth;int Compression")

    DllStructSetData($tData, "ColorDepth", $TIFColorDepth)
    DllStructSetData($tData, "Compression", $TIFCompression)

    _GDIPlus_ParamAdd($tParams, $GDIP_EPGCOLORDEPTH, 1, $GDIP_EPTLONG, DllStructGetPtr($tData, "ColorDepth"))
    _GDIPlus_ParamAdd($tParams, $GDIP_EPGCOMPRESSION, 1, $GDIP_EPTLONG, DllStructGetPtr($tData, "Compression"))

    If IsDllStruct($tParams) Then
        $pParams = DllStructGetPtr($tParams)
    EndIf
    #EndRegion
    ;

    _GDIPlus_ImageSaveToFileEx($image, $path_out, $CLSID, $pParams)
    _GDIPlus_ImageDispose($image)
    _GDIPlus_Shutdown()
EndFunc


Is there any way to get around Tesseract's "I don't like the spacebar" thing, i.e. its trouble with paths that contain spaces? I can't really use files from anywhere but the root of the script, and it's so annoying!

Maybe by turning folder names longer than 8 characters into the ABCDEF~1 short form? It works with @TempDir, which AutoIt reports as C:\DOCUME~1\FB\CONFIG~1\Temp.
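
Two possible workarounds are sketched below. The assumption, not confirmed anywhere in this thread, is that Tesseract copes with spaces as long as each path reaches it as a single command-line argument. QuotePath and BuildTesseractArgs are hypothetical helper names made up for this example; FileGetShortName is AutoIt's built-in function for the 8.3 short form mentioned above.

```autoit
; Hypothetical helper: wrap a path in double quotes so that a space inside it
; is not read as an argument separator.
Func QuotePath($sPath)
    Return '"' & $sPath & '"'
EndFunc

; Hypothetical helper: build the "<image> <output base>" argument string
; that is passed to tesseract.exe.
Func BuildTesseractArgs($sImage, $sOutBase)
    Return QuotePath($sImage) & " " & QuotePath($sOutBase)
EndFunc

; Workaround 1: quote both paths before handing them to ShellExecuteWait.
ConsoleWrite(BuildTesseractArgs("C:\My Scans\page 1.tif", "C:\My Scans\page 1") & @CRLF)

; Workaround 2: ask Windows for the 8.3 short form of an existing path.
; On volumes where short names are disabled, the long path may come back unchanged.
ConsoleWrite(FileGetShortName(@TempDir) & @CRLF)
```

Quoting is usually the safer of the two, since it does not depend on short names being enabled on the drive.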

Edited by Penny


This is my take on using Tesseract:

Opt("MustDeclareVars", 1)

#include-once
#include <File.au3>
#include <ScreenCapture.au3>
#include <GDIPlus.au3>

Global $Script_DebugMode = False
Global $Tesseract_Exe_XP = @ProgramFilesDir & "\tesseract\tesseract.exe"
Global $Tesseract_Exe_VISTA_SEVEN = @ProgramFilesDir & " (x86)\tesseract\tesseract.exe"
Global $LogFiles_Path = @ScriptDir & "\logs\"
Global $TempFiles_Path = @ScriptDir & "\temp\"

Func GetTesseractExePath()
    ; Probe both install locations instead of guessing from @OSVersion: a 32-bit
    ; script on 64-bit Windows already gets "Program Files (x86)" from @ProgramFilesDir.
    If FileExists($Tesseract_Exe_XP) Then Return $Tesseract_Exe_XP
    If FileExists($Tesseract_Exe_VISTA_SEVEN) Then Return $Tesseract_Exe_VISTA_SEVEN

    MsgBox(16, "OCR", "tesseract.exe was not found under " & @ProgramFilesDir)
    Exit
EndFunc

Func OCR_Rect($left,$top,$right,$bottom,$lineArray)
    Local $prefix = "ocr"

    If $Script_DebugMode Then
        $prefix &= "_debug_" & @HOUR & "-" & @MIN & "_" & @SEC & "__"
    EndIf

    Local $path_jpg = _TempFile(@TempDir,$prefix,".jpg")
    Local $path_tif = _TempFile(@TempDir,$prefix,".tif")

    Local $image = _ScreenCapture_Capture($path_jpg,$left,$top,$right,$bottom,False)

    ConvertImageToTiff($path_jpg,$path_tif)

    Return OCR_Capture_Tesseract($path_tif,$lineArray)
EndFunc

Func OCR_Capture_Tesseract($path,$lineArray=False)
    Local $i = 0
    Local $plain_text = ""
    Local $lines[90]
    Local $path_ocr = StringLeft($path, Max(StringLen($path) - 4,1))

    ; Quote both paths so that spaces in them are not treated as argument separators.
    ShellExecuteWait(GetTesseractExePath(), '"' & $path & '" "' & $path_ocr & '"')

    If $Script_DebugMode Then
        FileCopy(@TempDir & "\ocr*.*",CheckDirExists($TempFiles_Path) & "*.*")
    EndIf

    If Not $lineArray Then
        $plain_text = FileRead($path_ocr & ".txt")
    Else
        _FileReadToArray($path_ocr & ".txt", $lines)
    EndIf

    If FileExists(@ScriptDir & "\tesseract.log") Then
        FileMove(@ScriptDir & "\tesseract.log",CheckDirExists($LogFiles_Path) & "ocr.log",1)
    EndIf

    FileDelete(@TempDir & "\ocr*.*")

    If Not $lineArray Then
        Return $plain_text
    EndIf

    ReDim $lines[$lines[0]+1]
    Return $lines
EndFunc

;Deprecated, see OCR_Capture_Tesseract
Func OCR_Capture_MODI($path,$lineArray)
    Local $modi
    Local $i = 0
    Local $str
    Local $lines[90]
    Local Const $LANG_ENG = 9

    $modi = ObjCreate("MODI.Document")
    $modi.Create($path)
    $modi.Ocr($LANG_ENG, True, True)

    If @error Then
        If Not $lineArray Then
            return ""
        Else
            $lines[0] = 0
            return $lines
        EndIf
    EndIf

    Local $rId = -1
    Local $j = 0

    For $i = 0 To $modi.Images(0).Layout.Words.Count - 1
        Dim $word_rId = $modi.Images(0).Layout.Words($i).RegionId

        If $rId <> $word_rId Then
            If $j >= 0 Then
                $lines[$j] = StringLeft($lines[$j],Max(StringLen($lines[$j]) - 1,0))
            EndIf

            $j += 1
        EndIf

        $lines[$j] &= $modi.Images(0).Layout.Words($i).Text & " "
        $str &=  $modi.Images(0).Layout.Words($i).Text & " "

        $rId = $word_rId
    Next

    If Not $lineArray Then
        return StringLeft($str,Max(StringLen($str) - 1,0))
    EndIf

    $lines[$j] = StringLeft($lines[$j],Max(StringLen($lines[$j]) - 1,0))
    $lines[0] = $j

    ReDim $lines[$j + 1]

    return $lines
EndFunc

Func ConvertImageToTiff($path,$path_out)
    Local $image
    Local $CLSID

    _GDIPlus_Startup()

    $image = _GDIPlus_BitmapCreateFromFile($path)

    #Region Tiff Parameters
    Local $TIFColorDepth = 24
    Local $TIFCompression = $GDIP_EVTCOMPRESSIONNONE
    Local $tData
    Local $tParams
    Local $pParams

    $CLSID = _GDIPlus_EncodersGetCLSID("TIFF")

    $tParams = _GDIPlus_ParamInit(2)
    $tData = DllStructCreate("int ColorDepth;int Compression")

    DllStructSetData($tData, "ColorDepth", $TIFColorDepth)
    DllStructSetData($tData, "Compression", $TIFCompression)

    _GDIPlus_ParamAdd($tParams, $GDIP_EPGCOLORDEPTH, 1, $GDIP_EPTLONG, DllStructGetPtr($tData, "ColorDepth"))
    _GDIPlus_ParamAdd($tParams, $GDIP_EPGCOMPRESSION, 1, $GDIP_EPTLONG, DllStructGetPtr($tData, "Compression"))

    If IsDllStruct($tParams) Then
        $pParams = DllStructGetPtr($tParams)
    EndIf
    #EndRegion
    ;

    _GDIPlus_ImageSaveToFileEx($image, $path_out, $CLSID, $pParams)
    _GDIPlus_ImageDispose($image)
    _GDIPlus_Shutdown()
EndFunc

Func Max($n1,$n2)
    If $n1 > $n2 Then
        Return $n1
    EndIf

    Return $n2
EndFunc

Func CheckDirExists($dir)
    If Not FileExists($dir) Then
        DirCreate($dir)
    EndIf

    Return $dir
EndFunc

That should work standalone: just call OCR_Rect(left, top, right, bottom, False) and it takes a screenshot, converts it to a TIFF, and runs it through Tesseract.

Edited by Penny


Penny,

It's a shame you're struggling with this. Although the technique you're using is obviously working, I think you should still try what I recommended. For your convenience, here's my simple one-liner for Tesseract:

$iPos = _TesseractScreenFind("Found: 1", 1, 0, "", 1, 3, 480, 1009, 975, 750, 0)

If $iPos > 0 Then
    ; ... do something ...
EndIf

Basically, the first line returns the location of the string I'm searching for, i.e. "Found: 1". The numbers 480, 1009, 975, 750 are the number of pixels (I think) to move in from the left, top, right and bottom edges of the captured image, respectively. It's a bit like cropping an image by drawing a rectangle around the part you want and selecting crop.

The last parameter, 0, controls whether you want to see the image it captures in memory. Set it to 1 and you'll see it; once you've got the area you're interested in, set it back to 0.

Also, the 6th parameter is helpful in that it zooms the captured area. So if the text you're searching for is small, you can increase this value until you get the desired results. I found 3 worked well for me, but you'll have to experiment.

Let me know if you need further help.

Edited by IndyUK


Why would I get this error?

Tesseract Open Source OCR Engine

read_tif_image:Error:Illegal image format:Compression

Tessedit:Error:Read of file failed:C:\Users\Dario\AppData\Local\Temp\ocr_debug_16-00_43__myielbp.tif

Signal_exit 31 ABORT. LocCode: 3 AbortCode: 3

I'm trying the script on Seven now, and it gives me this error when running Tesseract on the file, even though I can open the file in any image editor. Any idea why this might be?


Apparently Seven ignores the GDI+ parameters I pass for the TIFF file and saves it LZW-compressed with a color depth of 24, no matter what I try to force. Should I assume this bug is AutoIt's fault, or Seven's?



Maybe there's a way around this that I'm somehow overlooking?



I was looking at GDI+ earlier for saving B/W TIFF CCITT3 images, and I suspect, based on playing with it, that if the intent is to reduce the color depth (especially to B/W), you must first create a buffer, then reduce the depth by copying the pixels into that buffer with the indexed 1 bpp pixel format, and then feed that to the encoder.

(All of which has more to do with the code you are using than with what you are doing when calling it.)

It appears that the encoders do not provide for the color depth reduction and in some ways that sort of makes sense.

Especially with B/W conversions, it is fairly probable that there are many ways to approach the process and that it could be included in a separate function in GDI+ (though it is apparently not, so it is left to the outside programmer).
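
The buffer step described above might look roughly like this in AutoIt. This is only a sketch: it assumes that cloning a bitmap into the 1 bpp indexed pixel format ($GDIP_PXF01INDEXED) is an acceptable B/W reduction for the image at hand, and SaveAsBlackWhiteTiff is a hypothetical name made up for the example. It leaves the encoder's compression at its default, so it does not by itself address the compression error reported earlier in the thread.

```autoit
#include <GDIPlus.au3>

; Hypothetical helper: load an image, clone it into a 1 bit-per-pixel indexed
; bitmap, and hand that reduced bitmap to the encoder chosen by the file extension.
Func SaveAsBlackWhiteTiff($sIn, $sOut)
    _GDIPlus_Startup()

    Local $hImage = _GDIPlus_ImageLoadFromFile($sIn)
    If Not $hImage Then
        _GDIPlus_Shutdown()
        Return SetError(1, 0, False)
    EndIf

    Local $iW = _GDIPlus_ImageGetWidth($hImage)
    Local $iH = _GDIPlus_ImageGetHeight($hImage)

    ; The colour-depth reduction happens here, before any encoder is involved.
    Local $hBW = _GDIPlus_BitmapCloneArea($hImage, 0, 0, $iW, $iH, $GDIP_PXF01INDEXED)

    Local $bOK = False
    If $hBW Then
        ; A ".tif" extension makes _GDIPlus_ImageSaveToFile pick the TIFF encoder.
        $bOK = _GDIPlus_ImageSaveToFile($hBW, $sOut)
        _GDIPlus_ImageDispose($hBW)
    EndIf

    _GDIPlus_ImageDispose($hImage)
    _GDIPlus_Shutdown()
    Return $bOK
EndFunc
```

A call such as SaveAsBlackWhiteTiff("capture.jpg", "capture.tif") would then replace the colour-depth encoder parameter, since the bitmap is already 1 bpp when it reaches the encoder.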

For further reference (respects granted to the linked parties and their efforts):

http://www.codeguru.com/forum/archive/index.php/t-194793.html

Sigmason

