Working OCR without Office


schoel

Here is an OCR script I wrote before I knew about this forum. I noticed this has been done before, but this one doesn't require Office. I don't own a copy of Office myself, so this might be useful to others in the same situation. I'm not sure it is as good as the one requiring Office, but it works fairly well for me. It does require the program gocr047 to be in the same directory as the script.

gocr047 is released under the GNU General Public License and can be found here:

gocr047.exe

You can also download it and read about it from their homepage http://jocr.sourceforge.net/

The function ReadText is called with the upper-left and lower-right coordinates of the box you want to scan for letters, just like PixelChecksum. Any text found is returned by the function. The script creates a directory Temp in the directory from which it is run, and it creates and later deletes a file called temp.pnm in that Temp directory.

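#include <Constants.au3>

; Scans the screen rectangle ($x1, $y1)-($x2, $y2) and returns the text gocr finds in it.
Func ReadText($x1, $y1, $x2, $y2)

    DirCreate("Temp")

    Local $writeString = ""
    Local $gocrOutput = ""
    Local $color, $red, $green, $blue

    ; Image size in pixels; both corners are included.
    Local $nrOfCols = $x2 - $x1 + 1
    Local $nrOfRows = $y2 - $y1 + 1

    ; Plain-text PPM header: magic number "P3", width and height, maximum channel value.
    Local $tempFile = FileOpen("Temp\temp.pnm", 2) ; 2 = open for writing, erase old contents
    If $tempFile = -1 Then Return SetError(1, 0, "")
    FileWriteLine($tempFile, "P3")
    FileWriteLine($tempFile, $nrOfCols & " " & $nrOfRows)
    FileWriteLine($tempFile, "255")

    ; One text line per pixel row, three values (red green blue) per pixel.
    For $y = $y1 To $y2
        For $x = $x1 To $x2
            $color = PixelGetColor($x, $y)
            $red = BitShift(BitAND($color, 0xFF0000), 16)
            $green = BitShift(BitAND($color, 0x00FF00), 8)
            $blue = BitAND($color, 0x0000FF)
            $writeString &= " " & $red & " " & $green & " " & $blue
        Next
        FileWriteLine($tempFile, $writeString)
        $writeString = ""
    Next

    FileClose($tempFile)

    ; Run gocr hidden and read its standard output until the stream closes.
    Local $gocr = Run("gocr047.exe -i Temp\temp.pnm", "", @SW_HIDE, $STDOUT_CHILD)

    While 1
        $gocrOutput &= StdoutRead($gocr)
        If @error Then ExitLoop
    WEnd

    FileDelete("Temp\temp.pnm")

    Return $gocrOutput
EndFunc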

@schoel

This OCR works partially: it handles clear text, but not really small text, colors...

But I think you can make it better. I'm waiting for you ^_^

Cheers, FireFox.

Edited by FireFox

Quote:
    This OCR works partially ... but not really with little text, colors

In working with other OCR attempts, I've often thought it would be of great benefit to run a preprocessor to clean up the graphic image by, at minimum, converting it to black and white. Surprisingly, I haven't been able to find any such utility. It would need to take pixels of less than 60% brightness and make them black -- above 60%, make them white.

Is anyone aware of an appropriate utility? Has anyone tried such a thing?


Quote:
    In working with other OCR attempts, I've often thought it would be of great benefit to run a preprocessor to clean up the graphic image by, at minimum, converting it to black and white. Surprisingly, I haven't been able to find any such utility. It would need to take pixels of less than 60% brightness and make them black -- above 60%, make them white.

    Is anyone aware of an appropriate utility? Has anyone tried such a thing?

Have a look at http://www.autoitscript.com/forum/index.ph...;hl=imagemagick -- ImageMagick should be able to do what you need (I did brightness and black-and-white correction with it not long ago and it worked fine).


@Sunaj

Thanks. That looks interesting -- with a lot of capabilities. Hopefully, I'll be able to find some simple utility that can just convert a BMP file pixel by pixel. That would be my preference. If not, I guess I'll have to give ImageMagick a try.


Quote:
    Hopefully, I'll be able to find some simple utility that can just convert a BMP file pixel by pixel.

This might help : http://www.autoitscript.com/forum/index.php?showtopic=91302

Never mind, I misunderstood. Just use the _ScreenCapture functions; this time they might be useful ^_^
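
A rough sketch of that suggestion: _ScreenCapture_Capture from ScreenCapture.au3 can save a screen rectangle straight to an image file. The helper name _CaptureRegion, the file name capture.bmp, and the coordinates are made up for illustration. Note that the gocr help output only lists pnm, pgm, pbm, ppm and pcx as input formats, so a BMP would presumably still need converting.

```autoit
#include <ScreenCapture.au3>

; Hypothetical helper: saves the screen rectangle ($x1, $y1)-($x2, $y2) to $sFile.
; The image type is chosen from the file extension (.bmp here).
Func _CaptureRegion($sFile, $x1, $y1, $x2, $y2)
    ; The last argument False leaves the mouse cursor out of the image.
    _ScreenCapture_Capture($sFile, $x1, $y1, $x2, $y2, False)
    Return FileExists($sFile)
EndFunc

_CaptureRegion(@ScriptDir & "\capture.bmp", 308, 496, 486, 513)
```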

Cheers, FireFox.

Edited by FireFox

If you get the RGB value of a pixel then you could apply a generic formula:

$Brightness = (0.3 * $Red) + (0.59 * $Green) + (0.11 * $Blue)

You can then set your threshold (between 0 and 255) for what is considered "white" and what is considered "black".
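
A minimal sketch of that formula in AutoIt, assuming the color value comes from PixelGetColor (0xRRGGBB). The names _Brightness and _ToBlackOrWhite are illustrative and not part of the original script, and the default threshold of 153 is simply 60% of 255, the cut-off suggested earlier in the thread:

```autoit
; Hypothetical helpers, not part of the original ReadText script.

; Weighted brightness (0..255) of an RGB color as returned by PixelGetColor.
Func _Brightness($iColor)
    Local $iRed = BitShift(BitAND($iColor, 0xFF0000), 16)
    Local $iGreen = BitShift(BitAND($iColor, 0x00FF00), 8)
    Local $iBlue = BitAND($iColor, 0x0000FF)
    Return 0.3 * $iRed + 0.59 * $iGreen + 0.11 * $iBlue
EndFunc

; Returns 0 (black) for pixels darker than the threshold, 255 (white) otherwise.
Func _ToBlackOrWhite($iColor, $iThreshold = 153)
    If _Brightness($iColor) < $iThreshold Then Return 0
    Return 255
EndFunc
```

Inside ReadText, one option would be to write the value returned by _ToBlackOrWhite three times per pixel in place of $red, $green and $blue, so gocr receives a pure black-and-white image.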

WBD

EDIT: According to Michael Jackson, it don't matter if you're black or white. Presumably he's never tried using OCR himself.

Edited by WideBoyDixon

Quote:
    EDIT: According to Michael Jackson, it don't matter if you're black or white. Presumably he's never tried using OCR himself.

^_^

And on a side note, good job on this. I will be testing this very soon... ;)


Using this script:-

;
#include <Constants.au3>
#include <Math.au3>
#include <Misc.au3>
; http://www.autoitscript.com/forum/index.php?s=&showtopic=93901&view=findpost&p=674577
HotKeySet("{ESC}", "endscript") ; press Esc to quit

Local $aSP, $aEP

While 1
    Sleep(250)
    If _IsPressed("01") Then ; left mouse button is down
        $aSP = MouseGetPos() ; mouse start position
        Do
            Sleep(10)
        Until Not _IsPressed("01")
        $aEP = MouseGetPos() ; mouse end position
        ; Pass the corners in order, so dragging up or to the left also works.
        ConsoleWrite(ReadText(_Min($aSP[0], $aEP[0]), _Min($aSP[1], $aEP[1]), _
                _Max($aSP[0], $aEP[0]), _Max($aSP[1], $aEP[1])) & @CRLF)
    EndIf
WEnd

Func endscript()
    Exit
EndFunc  ;==>endscript

;ConsoleWrite( ReadText(308, 496, 486, 513) & @CRLF)

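; Scans the screen rectangle ($x1, $y1)-($x2, $y2) and returns the text gocr finds in it.
Func ReadText($x1, $y1, $x2, $y2)

    DirCreate("Temp")

    Local $writeString = ""
    Local $gocrOutput = ""
    Local $color, $red, $green, $blue

    ; Image size in pixels; both corners are included.
    Local $nrOfCols = $x2 - $x1 + 1
    Local $nrOfRows = $y2 - $y1 + 1

    ; Plain-text PPM header: magic number "P3", width and height, maximum channel value.
    Local $tempFile = FileOpen("Temp\temp.pnm", 2) ; 2 = open for writing, erase old contents
    If $tempFile = -1 Then Return SetError(1, 0, "")
    FileWriteLine($tempFile, "P3")
    FileWriteLine($tempFile, $nrOfCols & " " & $nrOfRows)
    FileWriteLine($tempFile, "255")

    ; One text line per pixel row, three values (red green blue) per pixel.
    For $y = $y1 To $y2
        For $x = $x1 To $x2
            $color = PixelGetColor($x, $y)
            $red = BitShift(BitAND($color, 0xFF0000), 16)
            $green = BitShift(BitAND($color, 0x00FF00), 8)
            $blue = BitAND($color, 0x0000FF)
            $writeString &= " " & $red & " " & $green & " " & $blue
        Next
        FileWriteLine($tempFile, $writeString)
        $writeString = ""
    Next

    FileClose($tempFile)

    ; Run gocr hidden and read its standard output until the stream closes.
    Local $gocr = Run("gocr047.exe -i Temp\temp.pnm", "", @SW_HIDE, $STDOUT_CHILD)

    While 1
        $gocrOutput &= StdoutRead($gocr)
        If @error Then ExitLoop
    WEnd

    FileDelete("Temp\temp.pnm")

    Return $gocrOutput
EndFunc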

A box was drawn within the Dos "Command Prompt" window. The highlighted text became black on white. This is the result:-

}gocr047.exe -h
Optical Character Recognition --- gocr 0.46 20081022
Copyright <C} 2001-2008 Joerg Schulenburg
released under the GNU General Public License
using: gocr [options] pnm  ile  ame  # use - for stdin
options <see gocr manual pages for more details}:
-h, --help
-i name  - input image file <pnm,pgm,pbm,ppm,pcx,...}
-o name  - output file  <redirection of stdout}
-e name  - logging file <redirection of stderr}
-x name  - progress output to fifo <see manual}
-p name  - database path including final slash <default is ./db/}
-f fmt   - output format <IS08859  TeX HTML XML UTP8 ASCII}
-l num   - threshold grey leuel 0<160<=255 <0 = autodetect}
-d num   - dust_size <remoue small clusters, -1 = autodetect}
-s num   - spacewidth/dots <0 = autodetect}
-u num   - uerbose <see manual page}
-c string - list of chars <debugging, see manual}
-C string - char filter <ex. hexdigits: 0-9A-Px, only ASCII}
-m num   - operation modes <bitpattern, see manual}
-a num  ualue of certainty <in percent, 0..100, default=95}
examples:
gocr -m 4 text1.pbm          # do layout analyzis
gocr -m 130 -p ./database/ text1.pbm  # extend database
djpeg -pnm -gray text.jpg  gocr -   # use jpeg-file uia pipe

webpage: http://jocr.sourceforge.net/

This is what it should be. The text was copied with a right click on the highlighted text in the Command Prompt window:-

>gocr047.exe -h
 Optical Character Recognition --- gocr 0.46 20081022
 Copyright (C) 2001-2008 Joerg Schulenburg
 released under the GNU General Public License
 using: gocr [options] pnm_file_name  # use - for stdin
 options (see gocr manual pages for more details):
 -h, --help
 -i name   - input image file (pnm,pgm,pbm,ppm,pcx,...)
 -o name   - output file  (redirection of stdout)
 -e name   - logging file (redirection of stderr)
 -x name   - progress output to fifo (see manual)
 -p name   - database path including final slash (default is ./db/)
 -f fmt - output format (ISO8859_1 TeX HTML XML UTF8 ASCII)
 -l num - threshold grey level 0<160<=255 (0 = autodetect)
 -d num - dust_size (remove small clusters, -1 = autodetect)
 -s num - spacewidth/dots (0 = autodetect)
 -v num - verbose (see manual page)
 -c string - list of chars (debugging, see manual)
 -C string - char filter (ex. hexdigits: 0-9A-Fx, only ASCII)
 -m num - operation modes (bitpattern, see manual)
 -a num   value of certainty (in percent, 0..100, default=95)
 examples:
        gocr -m 4 text1.pbm                # do layout analyzis
        gocr -m 130 -p ./database/ text1.pbm  # extend database
        djpeg -pnm -gray text.jpg | gocr -  # use jpeg-file via pipe

 webpage: http://jocr.sourceforge.net/

This script does not work well when drawing the box highlights the text, making it white on black.

And sometimes, when a box is drawn around the Windows Explorer desktop icon, the "i" in "Windows" is missing.

This comment in the DOS help output, "# use jpeg-file via pipe", suggests that gocr047.exe is not limited to PNM image files.


Quote (WideBoyDixon):
    If you get the RGB value of a pixel then you could apply a generic formula:

    $Brightness = (0.3 * $Red) + (0.59 * $Green) + (0.11 * $Blue)

    You can then set your threshold (between 0 and 255) for what is considered "white" and what is considered "black".

    EDIT: According to Michael Jackson, it don't matter if you're black or white. Presumably he's never tried using OCR himself.

That's exactly what the first script does. It collects the RGB value of each pixel in the area determined by the coordinates. I was hoping that the OCR would be smart enough to do the mentioned preprocessing on its own.
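
The help output quoted earlier in the thread does list a -l option (threshold grey level, 0 = autodetect), so gocr may be able to do part of this itself. A sketch, assuming the option is accepted exactly as that help text describes; _RunGocr is a hypothetical wrapper around the Run call from ReadText, and whether a given threshold actually improves recognition has not been tested here:

```autoit
#include <Constants.au3>

; Hypothetical wrapper, not part of the original script.
; $iGreyLevel is handed to gocr's -l option (0 = autodetect, per the help output).
Func _RunGocr($sImage, $iGreyLevel = 0)
    Local $sCmd = 'gocr047.exe -l ' & $iGreyLevel & ' -i "' & $sImage & '"'
    Local $iPid = Run($sCmd, "", @SW_HIDE, $STDOUT_CHILD)
    If @error Then Return SetError(1, 0, "") ; gocr047.exe could not be started

    ; Collect standard output until the stream closes.
    Local $sOutput = ""
    While 1
        $sOutput &= StdoutRead($iPid)
        If @error Then ExitLoop
        Sleep(10)
    WEnd
    Return $sOutput
EndFunc
```

In ReadText, the Run call and the read loop could then be replaced by something like $gocrOutput = _RunGocr("Temp\temp.pnm", 160).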

Quote:
    This comment in the DOS help output, "# use jpeg-file via pipe", suggests that gocr047.exe is not limited to PNM image files.

I believe that comment is mostly meant for Linux systems (gocr was originally written for Linux), but I'm sure there is a port of djpeg for Windows as well. However, converting any format to PNM yourself is not very hard, since the PNM format is very simple.
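
To illustrate how simple the format is, here is a sketch that writes a 2D array of grey values (0 to 255) as a plain-text PGM ("P2") file, the greyscale sibling of the "P3" file the script produces. _WritePgm, its parameters, and demo.pgm are hypothetical names chosen for this example:

```autoit
; Hypothetical helper: writes $aGrey[row][col] (values 0..255) as a plain-text PGM file.
Func _WritePgm($sPath, ByRef $aGrey)
    Local $iRows = UBound($aGrey, 1), $iCols = UBound($aGrey, 2)
    Local $hFile = FileOpen($sPath, 2) ; 2 = open for writing, erase old contents
    If $hFile = -1 Then Return SetError(1, 0, False)

    FileWriteLine($hFile, "P2")                  ; magic number for plain greyscale
    FileWriteLine($hFile, $iCols & " " & $iRows) ; width, then height
    FileWriteLine($hFile, "255")                 ; maximum grey value

    ; One text line per pixel row, one value per pixel.
    Local $sLine
    For $r = 0 To $iRows - 1
        $sLine = ""
        For $c = 0 To $iCols - 1
            $sLine &= " " & $aGrey[$r][$c]
        Next
        FileWriteLine($hFile, $sLine)
    Next

    FileClose($hFile)
    Return True
EndFunc

; Example: a 3-pixel-wide, 2-pixel-high image with a black left column.
Local $aDemo[2][3] = [[0, 255, 255], [0, 255, 255]]
_WritePgm(@ScriptDir & "\demo.pgm", $aDemo)
```

A converter for another format would only need to fill such an array (or write the values directly) after decoding the source image; the header and pixel layout stay the same.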

Quote:
    I was hoping that the OCR would be smart enough to do the mentioned preprocessing on its own.

Just a thought, but would Tesseract be a better OCR engine for this script? I don't know OCR well enough to know if it's comparable, but Google seems to think it has merit.
