qwertylol

Optical character recognition GUI


qwertylol

To generate OCR definitions easily, there needs to be a GUI program. This is what it should do:

Select part of the screen with a click-and-drag function.

Offer a control to enlarge that screen capture to per-pixel level, so you can mark the areas you want in the definition and tick off the areas that should not have a definition generated.

This means the OCR search function can also search for graphics. When you select an "a", the white part inside the "a" is not what we consider to be the "a". The user draws a square around the "a" and ticks off the small squares that should be left out of the definition.

Then it generates the definition!

For this to work, which existing OCR project should be used?

It must be able to generate a definition for any graphic, and the definition file has to remain valid for a later, improved OCR search program.
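A minimal sketch of the "tick off the squares you don't want" idea, assuming the selected region is already available as a grid of pixels. The name `build_definition`, the one-character-per-pixel grid, and the `(x, y, color)` output layout are all hypothetical choices for illustration, not an existing OCR format.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]


def build_definition(region: List[str], excluded: Set[Point]) -> List[Tuple[int, int, str]]:
    """Return (x, y, color) for every pixel of the selection that was not ticked off.

    region   -- selected rectangle, one string per row, one character per pixel
                (the character stands in for the pixel's color; an assumption)
    excluded -- (x, y) cells the user ticked off in the enlarged view
    """
    definition = []
    for y, row in enumerate(region):
        for x, color in enumerate(row):
            if (x, y) in excluded:
                continue  # ticked off: leave it out of the definition
            definition.append((x, y, color))
    return definition


# Two-row selection where 'K' is a dark pixel and 'W' a white one.
region = [
    "KKW",
    "KWK",
]
ticked_off = {(2, 0), (1, 1)}  # the white cells the user does not want
definition = build_definition(region, ticked_off)
```

Because the definition only lists the cells that were kept, the same routine works for a letter or for any small graphic.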

qwertylol

With this approach, any graphic or character can be defined, and the definition stays valid for any OCR scan method. This will save everybody time.

qwertylol

This is what I mean.

The red part tells the definition generator what not to generate a definition for.

example.bmp

qwertylol

This is another problem:

This is an arrow to be recognized, but the arrow doesn't always point that way. How can an OCR scan search for an arrow like that when it may point in a different direction, or show up larger this time?

shape_recognition.bmp
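One possible way to handle the direction and size problem, sketched under strong assumptions: the shape is given as a set of (x, y) pixels, it only turns in quarter turns, and it is scaled evenly. Both shapes are resampled onto a small fixed grid and compared in all four orientations. The names `normalize`, `rotate90`, and `same_shape`, the 8x8 grid, and the 0.9 threshold are made up for this example.

```python
def normalize(points, size=8):
    """Resample a non-empty pixel set onto a size x size grid (nearest neighbour)."""
    pts = set(points)
    min_x = min(x for x, _ in pts)
    min_y = min(y for _, y in pts)
    max_x = max(x for x, _ in pts)
    max_y = max(y for _, y in pts)
    # Use one square span so the aspect ratio is preserved.
    span = max(max_x - min_x, max_y - min_y) + 1
    grid = set()
    for j in range(size):
        for i in range(size):
            source = (min_x + i * span // size, min_y + j * span // size)
            if source in pts:
                grid.add((i, j))
    return frozenset(grid)


def rotate90(points):
    """Quarter turn: (x, y) -> (-y, x). normalize() re-anchors the result."""
    return [(-y, x) for x, y in points]


def similarity(a, b):
    """Overlap of two grids, from 0.0 (nothing shared) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def same_shape(shape, candidate, size=8, threshold=0.9):
    """True if candidate looks like shape in any of the four quarter-turn orientations."""
    target = normalize(candidate, size)
    turned = list(shape)
    for _ in range(4):
        if similarity(normalize(turned, size), target) >= threshold:
            return True
        turned = rotate90(turned)
    return False


# A 5x5 arrow pointing right.
arrow = [(2, 0), (3, 1), (0, 2), (1, 2), (2, 2), (3, 2), (4, 2), (3, 3), (2, 4)]
```

Arbitrary angles and uneven scaling are not covered by this; that would need a different technique than the grid comparison shown here.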

Insolence

Well, you're making it way too complicated. You don't need to manually select each character or enlarge/manipulate any graphics.

Just lay out a table of the font: a-zA-Z0-9 and some punctuation. If there isn't spacing between the characters, you're going to have a hard time defining each one.
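The font-table idea could be sketched like this, assuming the table has been captured as rows of pixels and that at least one fully blank column separates neighbouring characters. `split_glyphs` and the `"."` background marker are hypothetical names for the example.

```python
def split_glyphs(rows, background="."):
    """Return (start, end) column ranges of glyphs, split at fully blank columns.

    rows -- equal-length strings, one per pixel row; `background` marks empty pixels
    """
    width = len(rows[0]) if rows else 0
    spans = []
    start = None
    for x in range(width):
        blank = all(row[x] == background for row in rows)
        if not blank and start is None:
            start = x                  # first inked column of a glyph
        elif blank and start is not None:
            spans.append((start, x))   # glyph ended at the blank column
            start = None
    if start is not None:
        spans.append((start, width))   # glyph touching the right edge
    return spans


# Three tiny "characters" separated by blank columns.
table = [
    "#..##.#",
    "#..#..#",
]
spans = split_glyphs(table)
# Pair each range with the character typed at that position in the table.
labelled = dict(zip("abc", spans))
```

If two characters touch, they come out as one range, which is the "hard time defining each" case mentioned above.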



qwertylol

There is a fundamental problem:

When the screenshot in question is stored in a variable, how do you call PixelGetColor on it?

Is there no way to search a screenshot stored in a variable?
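A language-neutral sketch of the idea, written in Python rather than AutoIt: once the screenshot's pixels are held in memory as a 2D array, reading a pixel is just an index lookup, and searching is a loop over that array. How the pixels get into the variable (a capture library, a saved bitmap) is left out. `get_pixel`, `find_color`, and the 0xRRGGBB integer layout are assumptions for the example.

```python
def get_pixel(image, x, y):
    """Color at (x, y) of an in-memory image stored as a list of rows."""
    return image[y][x]


def find_color(image, color):
    """Return the first (x, y) whose pixel equals color, scanning row by row, or None."""
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value == color:
                return (x, y)
    return None


# A 2x2 "screenshot": white everywhere except one red pixel.
WHITE, RED = 0xFFFFFF, 0xFF0000
screenshot = [
    [WHITE, WHITE],
    [WHITE, RED],
]
```

Scanning an in-memory array like this avoids going back to the screen for every pixel, which is usually the slow part.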

KageKhan

I have already written an OCR that scans an image for a letter and writes a file of its positions. You can then verify it using another script I wrote. Some of this requires a bit of manual work... Sorry, not entirely automated. When you run the script, the GUI should explain it a bit.

You NEED TO MAKE A FOLDER CALLED "Letters" IN THE SCRIPT'S DIRECTORY. Don't have the picture of the letter zoomed in when you scan it.

The scan generates a character file with the letter's positions. The first line is a count, and the rest of the lines alternate an x value, then a y value (in the example below, a count of 4 is followed by four x/y pairs). So to demonstrate:

4

0 ;; just a comment: this is an x value

0 ;; this is a y value (comments like these cannot be in the character file, though)

1

1

2

3

3

4
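The layout above could be read back with something like the following Python sketch. It is not the code from OCRtool.au3; `parse_character_file` is a hypothetical name. It accepts the first line as either the number of positions or the number of remaining lines, since the example shows 4 followed by eight values.

```python
def parse_character_file(text):
    """Parse a count line followed by alternating x and y lines into (x, y) pairs."""
    values = [int(line) for line in text.splitlines() if line.strip()]
    if not values:
        raise ValueError("empty character file")
    count, coords = values[0], values[1:]
    if len(coords) % 2 != 0:
        raise ValueError("x value without a matching y value")
    points = list(zip(coords[0::2], coords[1::2]))
    # Assumption: the count is either the number of (x, y) pairs, as in the
    # example, or the number of lines that follow, as in the description.
    if count not in (len(points), len(coords)):
        raise ValueError("count line does not match the data")
    return points


# The example from the post, without the comments.
example = "4\n0\n0\n1\n1\n2\n3\n3\n4\n"
positions = parse_character_file(example)
```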

So that just records the positions in a file. Then there is another script (IT HAS TO BE IN THE SAME DIRECTORY AS THE PREVIOUS SCRIPT TO HAVE ACCESS TO THE CHARACTER DATA FILES) that can paint the letter in MS Paint if you have it open. You just press the draw function in MS Paint and type in the letter you want drawn (it has to be scanned and in the Letters folder first).

Once you have it drawn (or you can open the picture from the original scan), you can find the letter too. IT DOESN'T MENTION THIS IN THE GUI: YOU MUST PRESS CONTROL + C TO SET THE COLOR OF THE LETTER YOU ARE SEARCHING FOR! You can also set a value to step over some positions; anything over 6 gives false recognitions. Then just press Control + F to find the letter.

I know, it's a bit complicated, but it works. The only REAL problem is the time it takes to scan. If anyone can look at my code and speed up the recognition process, I'd be VERY grateful. I can't make a proper bot when it takes so long to scan for words lol.

OCRtool.au3

PaintALetter.au3
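On the speed question, here is a hedged sketch of one common approach, not a rewrite of the attached scripts: keep the image in memory, test only the recorded letter positions at each candidate location, and give up on a location at the first mismatch. `find_letter` is a hypothetical name, and reading "step" as "check every step-th recorded position" is an assumption about what the setting does.

```python
def find_letter(image, points, color, step=1):
    """Return the top-left (x, y) where every sampled letter position has `color`.

    image  -- list of rows of pixel values held in memory
    points -- non-negative (x, y) offsets of the letter, as stored in a character file
    step   -- check every step-th position (assumed meaning; larger is faster, less exact)
    """
    if not points or not image:
        return None
    max_dx = max(x for x, _ in points)
    max_dy = max(y for _, y in points)
    height, width = len(image), len(image[0])
    sample = points[::step]
    for top in range(height - max_dy):
        for left in range(width - max_dx):
            # all() stops at the first mismatching pixel, so most
            # locations are rejected after a single comparison.
            if all(image[top + dy][left + dx] == color for dx, dy in sample):
                return (left, top)
    return None


# 0 is background, 1 is the letter color; the letter sits at (1, 1).
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
letter = [(0, 0), (1, 0), (1, 1)]
```

Note that this only confirms the listed positions have the letter color; it does not check that the surrounding pixels are different, so a solid block of that color would also match.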

qwertylol

A reply is welcome :)

I am reading your stuff.

qwertylol

KageKhan, how do you take a screenshot, load it into a variable, and then scan it?

I can't seem to find anyone who knows how to do it.
