[solved] Best approach to graphics specialized operation


jchd

I'm getting nowhere trying to at least choose the right set of functions to code a routine graphics task. Searching only [disap]pointed me to fancy drawings I don't need or to slow approaches.

Some background (pun intended): I need to perform routine OCR on .PNG graphics I download. I've no choice of file format, so it is PNG, but I managed to ask the server for RGB-greyscale PNG. By this I mean that the graphics are 3x8 bit deep, but with R=G=B so it's a greyscale image, no simpler choice either. The pictures are graphics representations of grayscale text with one image = one line of text, which I need to OCR.

To do so, I've built (manually, which was very tedious work) a greyscale representation of every input character I'm going to encounter. I've encoded the greyscale as a negative, rotated 90° clockwise, so that for any given matching pixel, the input + the char mask will equal 255 (for any R or G or B component, but testing one of them is enough since they are equal). Needless to say, I don't want to display anything.
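A minimal sketch of that complement test, assuming one image column has already been read into a 12-element array of grey values. The helper name `_ColumnMatches` and the sample column are hypothetical; only the mask values come from the font map for 'a' shown further down.

```autoit
; Hypothetical helper: True when every pixel of one image column is the exact
; complement of the corresponding mask value (pixel + mask = 255).
Func _ColumnMatches(Const ByRef $aColumn, Const ByRef $aMask)
    If UBound($aColumn) <> UBound($aMask) Then Return False
    For $i = 0 To UBound($aMask) - 1
        If $aColumn[$i] + $aMask[$i] <> 255 Then Return False
    Next
    Return True
EndFunc

; First scan row of 'a' from the font map, padded with zeros to 12 points.
Local $aMask[12] = [0, 0, 70, 240, 189, 19, 208, 55, 0, 0, 0, 0]
; Made-up image column for illustration: each value is 255 minus the mask value.
Local $aColumn[12] = [255, 255, 185, 15, 66, 236, 47, 200, 255, 255, 255, 255]

ConsoleWrite("Column matches mask: " & _ColumnMatches($aColumn, $aMask) & @CRLF)
```

Because the test returns on the first mismatch, most non-matching masks are rejected after one or two comparisons.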

Now, which set of functions should I use to load a .PNG and pick pixels efficiently to compare rows? I created my mask fontmap as follows:

Local $Arial[80][9][12] = [ _
    [[ 'a', 5 ], _
        [0,0,70,240,189,19,208,55], _
        [0,0,210,29,80,110,57,204], _
        [0,0,206,12,68,114,0,250], _
        [0,0,48,97,21,110,29,206], _
        [0,0,229,197,242,191,223,70] _
    ], _
    [[ 'b', 5 ], _
        [0,0,242,184,255,255,208,242,255,255], _
        [0,0,21,116,2,2,85,23], _
        [0,0,153,29,0,0,0,150], _
        [0,0,140,112,4,2,106,146], _
        [0,0,25,155,246,248,163,29] _
    ], _
...

where I currently allow for 80 chars, each with 1 description entry (the character, then its # of "rows") followed by up to 8 vertical "rows" (going by the [80][9][12] declaration above) of up to 12 points each (the rest is 0). My scan rows are turned 90°, as can be seen by looking at the lowercase 'b' example. The leftmost 2 zeros are the descending part of characters.

Given that every character uses a unique combination of grey shades (even the first vertical alone is almost unambiguous in the charset), finding the character is easy by scanning the array. That part is no problem at all.
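The array scan could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names `_MatchCharAt` and `_CharFits` are hypothetical, the image is assumed to be already available as `$aImage[x][y]` in the same rotated orientation as the masks, and characters are assumed to follow each other with no spacing column. The demo image is synthesized from the two masks above rather than loaded from a real PNG.

```autoit
; Hypothetical: index of the font entry matching the image at column $iX, or -1.
Func _MatchCharAt(Const ByRef $aImage, $iX, Const ByRef $aFont)
    Local $iCols
    For $c = 0 To UBound($aFont, 1) - 1
        $iCols = $aFont[$c][0][1]              ; number of scan rows for this char
        If $iCols < 1 Or $iX + $iCols > UBound($aImage, 1) Then ContinueLoop
        If _CharFits($aImage, $iX, $aFont, $c, $iCols) Then Return $c
    Next
    Return -1
EndFunc

; Hypothetical: True when all scan rows of char $c satisfy pixel + mask = 255.
Func _CharFits(Const ByRef $aImage, $iX, Const ByRef $aFont, $c, $iCols)
    For $r = 1 To $iCols
        For $y = 0 To UBound($aFont, 3) - 1
            If $aImage[$iX + $r - 1][$y] + $aFont[$c][$r][$y] <> 255 Then Return False
        Next
    Next
    Return True
EndFunc

; Two-entry font map using the 'a' and 'b' masks from the post.
Local $aFont[2][9][12] = [ _
    [['a', 5], [0,0,70,240,189,19,208,55], [0,0,210,29,80,110,57,204], _
        [0,0,206,12,68,114,0,250], [0,0,48,97,21,110,29,206], [0,0,229,197,242,191,223,70]], _
    [['b', 5], [0,0,242,184,255,255,208,242,255,255], [0,0,21,116,2,2,85,23], _
        [0,0,153,29,0,0,0,150], [0,0,140,112,4,2,106,146], [0,0,25,155,246,248,163,29]]]

; Synthesize a 10-column test image for "ba": each pixel is 255 minus the mask.
Local $aImage[10][12], $aOrder[2] = [1, 0], $iX = 0
For $i = 0 To 1
    For $r = 1 To 5
        For $y = 0 To 11
            $aImage[$iX][$y] = 255 - $aFont[$aOrder[$i]][$r][$y]
        Next
        $iX += 1
    Next
Next

; Decode the synthetic line column by column.
Local $sText = "", $iChar
$iX = 0
While $iX < UBound($aImage, 1)
    $iChar = _MatchCharAt($aImage, $iX, $aFont)
    If $iChar < 0 Then ExitLoop
    $sText &= $aFont[$iChar][0][0]
    $iX += $aFont[$iChar][0][1]
WEnd
ConsoleWrite($sText & @CRLF)
```

Since the first scan row is almost unambiguous, ordering the font map by character frequency should let the scan return early for most characters.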

My main question is: how do I get my .PNG loaded (without a GUI showing it) and how do I access its pixels most efficiently? My images (text lines) can be 110 to 1220 pixels wide but are always 12 pixels high. A full line can hold many characters (variable pitch) and I need to process the incoming images as fast as possible (well, let's say reasonably fast). I'm afraid that choosing the wrong function(s) to perform pixel operations could make the process very slow.

You see, my needs are rather basic and low-level, but what I found here and on MSDN is more oriented toward higher-level graphics. A recent post by Malkey demonstrating how to load an image and convert it into an array takes 150ms on average to load a 1220x12 PNG and convert it to an array. Scanning it will probably take at least twice that time, which makes me ask whether a significantly faster way is available.

Edited by jchd

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


Are you saying that loading it to an array is slow or that comparing the arrays is slow? Could you show some images and example code (if you've gotten that far)?


I tried OCR a while back but used the tesseract UDF by seangriffin.

It was very simple and just does it from the screen, so the image format doesn't matter.

I found I could get it to process a single character or a whole line, part of a line, a page, etc.

Perhaps this isn't what you need, as it requires third-party components (a dll etc...), but maybe the code might inspire you.

Edited by JohnOne

AutoIt Absolute Beginners    Require a serial    Pause Script    Video Tutorials by Morthawt   ipify 

Monkey's are, like, natures humans.


Thanks both of you.

Dinner time was beneficial and I now build a 1220x12 array from a 12x1220 image (yes, I'm transposing it on the fly to minimize later index juggling) in 64ms on average (4.37µs per pixel). I can still unroll the inner height loop into a series of 12 reads, and also try working with 12 DllStructs in parallel instead of one to further minimize pixel access time. Then I'll merge the character recognition into the pixel-grabbing loop to avoid manipulating the same data twice, so there is some room for improvement.

(Just tried: unrolling the loop timed to 48ms on my slow PC)

JohnOne: thanks, but I really need to process those images on the fly and there's no "recognition" per se, just compare and match or not as I've the exact font that gets used and there is no ambiguity about the position of characters (nor style, color, ...). To be honest I know this greyscale font was intended to fool OCR :(

Sample image (needs x8 or x10 zoom to see most, not all, halftone greys clearly)

I would have bet that working with bitmaps was a bit simpler under Windows (I never do such things, I'm a quiche eater). No wonder this OS is a snail and a cycle and memory hog -- or rather a software ploy to make you buy always new hardware!

post-44800-12709452582217_thumb.png


Sorry for getting at it again.

While I have some success using the following canvas:

I now need to make it practical, i.e. faster than it is. That means I need to rotate the incoming image 90° clockwise so it becomes a 12xNNN image (not NNNx12) and invert RGB; then I can grab rows of 3x12=36 bytes at a time via a DllStructCreate("byte[36]") and compare chunks of 36 bytes as binary variants.

I've spent an indecent amount of time on MSDN trying to understand how to use low-level graphics functions such as GetDC and BitBlt, but it seems that MS is trying to promote class methods (OO) and "demote" the flat API. I've found some references to BitBlt in code samples using the search, but that's always in the context of displaying images, not manipulating the pixels inside them like I want to do.

In short, my new question is: how can I read a 24-bit RGB PNG (no ICM), rotate it 90° clockwise and get a pointer to the 24-bit RGB data buffer, using the most efficient low-level Windows functions?

Large edit:

Problem fully solved after finding this post about relevant GDI+ operations.

The OCR now works at an average speed of 90µs/character (and 0% failure rate) after optimizing font mask char order, quite good!
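The linked post is not reproduced here, so the following is only a sketch of one way the GDI+ route can be taken with the GDIPlus.au3 UDF, not necessarily what that post does. It assumes a recent AutoIt release whose GDIPlus.au3 provides `_GDIPlus_ImageRotateFlip`, and the file name `line.png` is hypothetical. After the rotation the bitmap is 12 pixels wide, so at 24 bits per pixel each scan row is 36 bytes and corresponds to one column of the original text line.

```autoit
#include <GDIPlus.au3>

Local $sFile = @ScriptDir & "\line.png"   ; hypothetical input file

_GDIPlus_Startup()
Local $hImage = _GDIPlus_ImageLoadFromFile($sFile)
If @error Then
    _GDIPlus_Shutdown()
    Exit 1
EndIf

; 1 = Rotate90FlipNone in the GDI+ RotateFlipType enumeration (90° clockwise).
_GDIPlus_ImageRotateFlip($hImage, 1)
Local $iW = _GDIPlus_ImageGetWidth($hImage)    ; 12 for these text lines
Local $iH = _GDIPlus_ImageGetHeight($hImage)   ; original width, 110 to 1220

; Lock the whole bitmap for reading as 24-bit RGB, no display involved.
Local $tData = _GDIPlus_BitmapLockBits($hImage, 0, 0, $iW, $iH, $GDIP_ILMREAD, $GDIP_PXF24RGB)
Local $pScan0 = DllStructGetData($tData, "Scan0")
Local $iStride = DllStructGetData($tData, "Stride")

; Overlay a byte array on each scan row and read it as one binary variant.
Local $aRows[$iH], $tRow
For $y = 0 To $iH - 1
    $tRow = DllStructCreate("byte[" & $iW * 3 & "]", $pScan0 + $y * $iStride)
    $aRows[$y] = DllStructGetData($tRow, 1)
Next

_GDIPlus_BitmapUnlockBits($hImage, $tData)
_GDIPlus_ImageDispose($hImage)
_GDIPlus_Shutdown()

ConsoleWrite("Read " & $iH & " scan rows of " & BinaryLen($aRows[0]) & " bytes each" & @CRLF)
```

The bytes inside each pixel are stored in B, G, R order, which does not matter here because R=G=B. The inversion step is left out; storing the masks as positives instead of negatives would make it unnecessary.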

Edited by jchd

