Help with a GOOD OCR


Hi all,

Currently we have an application with lots of embedded strings.

I am trying to read one particular string from anywhere in the application.

I tried with http://www.autoitscript.com/forum/index.ph...50608&st=45

but was unsuccessful.

I also tried ScreenOCR and failed.

Can somebody help me with this?

It would be nice if somebody could provide info on how to do this.

Without coordinates, the OCR should scan the whole application for the particular string.

Thanks

Sandi

Ptrex, expecting a good response from you :)

Maybe try making a screenshot of the target area, then invoke an external OCR program like ABBYY FineReader on the picture?

Also: where are these strings "embedded"? In a dialog label or in a resource view display? Maybe you can post a screenshot?

Edited by nobbe

image.bmp

Hm, no need for OCR here. Just get the info from the label control itself? Try Au3Info to "spy" on it for a start.

OK,

The problem we have is that the menus are designed using Photoshop.

So the buttons in the image are recognised as a single entity.

For example, if I want to search for only "Server", Au3Info gives me the whole toolbar class.

At this point I believe OCR will be beneficial.

It will also help when we start testing localization.

Please check the attachments.

Thanks

Sandi

I don't understand. The "image.bmp" you provided looks like a regular Windows dialog to me. Or is it only a SAMPLE picture of a dialog "to be"? Then maybe use ABBYY FineReader again for OCR? (I'm confused now.)


Forget about that.

Check the screenshot that I have newly attached.

This will definitely give you a clear idea.

OK, now the "listview-like picture with the error": is this an IMAGE of something, or is it a real dialog box now?

ABBYY FineReader reads this from the picture (quality can surely be improved if it is not saved as JPEG):

CODE
Error |B JDF ]ob_33 Pnnting 6/23/20085:01:... D EFI PDF Output 1 Spooled B JDF]üb_38 Pnnting 6/24/200810:50... D EFI PDF Output 1 Spooled JDF ]ob_44 Pnnting 6/24/2008 1 1 :05. . . EFI PDF Output 1 Prmted .B JDF job 58 Pnnting 6/24/200811:32... 0 EFI PDF Output 1 Prmted El Sunset 5unset.]pg 6/24/2008 11:32... 0 EFI PDF Output 1 ^rmted B Sunset 5unset.]pg 6/24/200811:32... 0 EFI PDF Output 1 ^rmted S Sunset 5un^et.]pg 6/24/200811:32... 0 EFI PDF Output 1 Prmted S Sunset 5un^et.]pg 6/24/200811:32... 0 EFI PDF Output 1 Spooled B XFlow]ob_65 Add marks(lFL) 6/24/2008 12:07... D EFI PDF Output 1 Spooled B untitled doc-l.pdf 6/24/200812:07... D EFI PDF Output 1

Edited by nobbe
That's about as good as my FineReader gets it (from your quality). Can you give a better quality picture, or is this the only quality available?

CODE
Error '+ JDF ]üb_33 Pnnting 6/23/20085:01:... D EFI PDF Output 1

Spooled B JDF]üb_38 Pnnting 6/24/200810:50... D EFI PDF Output 1

Spooled JDF ]ob_44 Pnnting 6/24/2008 1 1 :05. . . EFI PDF Output 1

Prmted B JDF ]ob_58 Pnnting 6/24/200811:32... 0 EFI PDF Output 1

Prmted B Sunset 5unset.]pg 6/24/200811:32... 0 EFI PDF Output 1

Prmted B Sunset 5unset.]pg 6/24/200811:32... 0 EFI PDF Output 1

Prmted S Sunset 5un^et.]pg 6/24/200811:32... 0 EFI PDF Output 1

Prmted S Sunset 5unset.]pg 6/24/200811:32... 0 EFI PDF Output 1

Prmted S Sunset 5unset.]pg 6/24/200811:32... 0 EFI PDF Output 1

Spooled - XFlow]ob_65 Add marks(lFL) 6/24/2008 12:07... D EFI PDF Output 1

Spooled B untitled doc-l.pdf 6/24/200812:07... D EFI PDF Output 1

Page_l D

I just used this on the first image you posted. It worked and returned every word without a single spelling mistake.

#include <Array.au3>

; Run OCR on a bitmap and display the recognized words
$output = _OCR("C:\image.bmp")
_ArrayDisplay($output)

;=================================   OCR   =======================================
; Function Name:    _OCR
; Description:      Searches a bmp file for all recognizable words and returns them in an array
; Requires:         Microsoft Office with Document Imaging (MODI) installed & <Array.au3>
; Parameters:       $file           bmp file to search
; Syntax:           _OCR($file)
; Returns:          Array of recognized words on success (unused trailing elements stay empty)
;                   0 with @error = 1 if the MODI.Document object cannot be created
;===============================================================================
Func _OCR($file)
    Local $miDoc, $oWord, $sArray[500], $i = 0

    ; Register a COM error handler so a MODI failure does not abort the script
    Local $oMyError = ObjEvent("AutoIt.Error", "_ComErrFunc")
    $miDoc = ObjCreate("MODI.Document")
    If Not IsObj($miDoc) Then Return SetError(1, 0, 0)
    $miDoc.Create($file)
    $miDoc.Ocr(9, True, False) ; 9 = English; orient image: True; straighten image: False

    ; Collect every recognized word of the first page
    For $oWord In $miDoc.Images(0).Layout.Words
        If $i >= UBound($sArray) Then ReDim $sArray[$i + 500] ; grow the array when it is full
        ConsoleWrite($oWord.Text & @CRLF)
        $sArray[$i] = $oWord.Text
        $i += 1
    Next

    Return $sArray
EndFunc   ;==========================   OCR  ========================================

; COM error handler referenced by ObjEvent above: report the error and carry on
Func _ComErrFunc($oError)
    ConsoleWrite("COM error: " & $oError.description & @CRLF)
EndFunc

There is always a butthead in the crowd, no matter how hard one tries to keep them out.......Volly
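
Since the original question was about finding one particular string, the array returned by _OCR can be scanned for it. This is only a sketch: the helper name _FindWord and the sample data are made up for illustration, and it matches substrings case-insensitively, which may or may not be what you need.

```autoit
; Hypothetical helper: return the index of the first element of $aWords
; that contains $sTarget (case-insensitive), or -1 if no element does.
Func _FindWord(ByRef $aWords, $sTarget)
    For $i = 0 To UBound($aWords) - 1
        If StringInStr($aWords[$i], $sTarget) Then Return $i
    Next
    Return -1
EndFunc

; Sample data standing in for the array returned by _OCR()
Local $aSample[4] = ["EFI", "PDF", "Output", ""]
ConsoleWrite(_FindWord($aSample, "output") & @CRLF) ; index of "Output"
ConsoleWrite(_FindWord($aSample, "Server") & @CRLF) ; -1, not present
```

A return value of -1 means the string was not recognized anywhere in the image, which can also simply mean the OCR misread it.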

Running it with the listview sample, it gives:

CODE
Error

Spooled

Spooled

Printed

Printed

Printed

Printed

Printed

Printed

Spooled

Spooled

r

JDF

job_33

Printing

6/23/2008

5:01:...

El

EFI

PDF

Output

+J

JDF

job_38

Printing

6/24/2008

10:50.,.

El

EFI

PDF

Output

JDF

job_44

Printing

6/24/2008

11:05...

EFI

PDF

Output

LJ

JDF

job_58

Printing

6/24/

2008

11:32...

E

EFI

PDF

Output

Sunset

Sunset.jpg

6/24/2008

11:32...

E

EFI

PDF

Output

*

Sunset

Sunset.jpg

6/24/2008

11:32...

E

EFI

PDF

Output

[1

Sunset

Sunset.jpg

6/24/2008

11:32...

21

EFI

PDF

Output

r+

Sunset

Sunset.jpg

6/24/2008

11:32...

21

EFI

PDF

Output

Sunset

Sunset.jpg

6/24/2008

11:32...

21

EFI

PDF

Output

XFlow

job_os

Add

marks(1FL)

6/24/2008

12:07...

El

EFI

PDF

Output

untitled

doc-i

.pdl

6/24/2008

12:07...

El

EFI

PDF

Output

Page_i

El

If you don't want to parse all the garbage created by the color changes and false lines, search a smaller area. Define the exact coordinates within the image you want, save that region to a Temp.BMP, and read that. To help any further I would have to understand exactly what you need to read and what you want to do with that information, but ptrex's old method for OCR works as far as I can see.
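
The cropping step could be sketched with the GDIPlus UDF that ships with AutoIt. The helper name _CropImage, the file paths and the rectangle are all placeholders, so treat this as a starting point rather than a tested solution for your screenshots.

```autoit
#include <GDIPlus.au3>

; Hypothetical helper: copy a rectangle out of $sSource and save it as $sTarget.
; Returns True on success, False if the source could not be loaded or saved.
Func _CropImage($sSource, $sTarget, $iLeft, $iTop, $iWidth, $iHeight)
    _GDIPlus_Startup()
    Local $bSaved = False
    Local $hImage = _GDIPlus_ImageLoadFromFile($sSource)
    If Not @error Then
        ; Clone only the requested region into a new bitmap
        Local $hCrop = _GDIPlus_BitmapCloneArea($hImage, $iLeft, $iTop, $iWidth, $iHeight)
        If Not @error Then
            $bSaved = _GDIPlus_ImageSaveToFile($hCrop, $sTarget)
            _GDIPlus_ImageDispose($hCrop)
        EndIf
        _GDIPlus_ImageDispose($hImage)
    EndIf
    _GDIPlus_Shutdown()
    Return $bSaved
EndFunc

; Placeholder coordinates: crop a 300x40 region starting at (10, 50)
Local $sTemp = @TempDir & "\Temp.bmp"
If _CropImage("C:\image.bmp", $sTemp, 10, 50, 300, 40) Then
    ConsoleWrite("Cropped region saved to " & $sTemp & @CRLF)
    ; $sTemp can now be passed to _OCR() instead of the full image
EndIf
```

Keep the cropped region reasonably large, because a very small bitmap may not be readable by MODI.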

The script worked beautifully.

But there is one small problem.

For example, if I have a bmp image with the text "HE ME", after running the script I end up with "HEME". The space is gone.

I converted the whole array to a string and tried again, but it is still the same.

Any idea why that could be?

Thanks
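
One possible cause, assuming the OCR did return "HE" and "ME" as two separate array elements, is that the elements were joined without any separator. A minimal sketch of joining them with spaces follows; the helper name _JoinWords and the sample array are made up for illustration. If the OCR engine itself merged the two words into one element, this will not bring the space back.

```autoit
; Hypothetical helper: join the non-empty elements of $aWords with single spaces.
Func _JoinWords(ByRef $aWords)
    Local $sText = ""
    For $i = 0 To UBound($aWords) - 1
        If $aWords[$i] = "" Then ContinueLoop ; skip unused array slots
        If $sText <> "" Then $sText &= " "
        $sText &= $aWords[$i]
    Next
    Return $sText
EndFunc

; Sample data standing in for the array returned by _OCR()
Local $aSample[4] = ["HE", "ME", "", ""]
ConsoleWrite(_JoinWords($aSample) & @CRLF) ; prints "HE ME"
```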


Well, I guess it's time now for fine-tuning and testing. And of course you cannot expect a machine to read as intelligently as a human does!

Good luck

ofLight,

beautiful code, works perfectly...

Easiest OCR func I have ever found on the forum. Is it possible to have the output in a text file?

(maybe all words in a single row?)

thank you all,

m.
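
Writing the words to a text file could look roughly like this. The helper name _WordsToFile, its separator parameter and the output paths are assumptions for the sake of the example. The default separator puts one word per line, and passing a space puts all words in a single row.

```autoit
; Hypothetical helper: write the non-empty words to $sPath, joined by $sSeparator.
; Returns True on success, False (with @error = 1) if the file cannot be opened.
Func _WordsToFile(ByRef $aWords, $sPath, $sSeparator = @CRLF)
    Local $sText = ""
    For $i = 0 To UBound($aWords) - 1
        If $aWords[$i] = "" Then ContinueLoop ; skip unused array slots
        If $sText <> "" Then $sText &= $sSeparator
        $sText &= $aWords[$i]
    Next
    Local $hFile = FileOpen($sPath, 2) ; 2 = open for writing and erase previous contents
    If $hFile = -1 Then Return SetError(1, 0, False)
    FileWrite($hFile, $sText)
    FileClose($hFile)
    Return True
EndFunc

; Sample data standing in for the array returned by _OCR()
Local $aSample[4] = ["EFI", "PDF", "Output", ""]
_WordsToFile($aSample, @TempDir & "\ocr_words.txt")    ; one word per line
_WordsToFile($aSample, @TempDir & "\ocr_row.txt", " ") ; all words in a single row
```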

OK, this was with a .bmp file.

Now, is there any way to run OCR on top of any application that is running?

Something like TestComplete, which runs OCR straight on top of an application.

I am still trying to figure out something like that.


Maybe take a screenshot of the active window, then use the OCR on it?

That is basically how I often use it, and this is one way to accomplish that. Two things to keep in mind when setting this up in a loop: the BMP file size (if the file is too small, MODI will fail to read it), and _ScreenCapture_Capture, which as of the last time I used it still had a small memory leak (it works great for single captures, but you may want to use a modified version when looping for continuous OCR output).

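#include <Array.au3>
#include <ScreenCapture.au3>
; _OCR() is the function posted earlier in this thread

Sleep(1000) ; time to bring the target window to the front
$Loc = WinGetPos("[ACTIVE]") ; [x, y, width, height] of the active window
; Capture a strip of roughly 200x30 pixels at the top-left corner of the window
_ScreenCapture_Capture(@ScriptDir & "\Render.bmp", $Loc[0], $Loc[1], $Loc[0] + 200, $Loc[1] + 30)
Sleep(500) ; give the file time to be created
$output = _OCR(@ScriptDir & "\Render.bmp")
_ArrayDisplay($output)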
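
To capture the whole window rather than a fixed strip, _ScreenCapture_CaptureWnd from the same UDF can be pointed at the active window's handle. A minimal sketch, where the output path is an assumption:

```autoit
#include <ScreenCapture.au3>

Sleep(1000) ; time to bring the target application to the front
Local $hWnd = WinGetHandle("[ACTIVE]")
Local $sShot = @TempDir & "\Window.bmp" ; assumed output path
; Capture the full window; the last argument (False) leaves the mouse cursor out
_ScreenCapture_CaptureWnd($sShot, $hWnd, 0, 0, -1, -1, False)
ConsoleWrite("Saved " & $sShot & @CRLF)
; $sShot can then be passed to _OCR() as posted earlier in the thread
```

Leaving the cursor out avoids it covering text that the OCR should read.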
