
Theory crafting: making my OWN OCR. How do I start out?



OCR = Optical Character Recognition

 

Hello all, I am theory crafting here for now, and all help is appreciated.

First of all, I would like to remind everyone that I am not a skilled programmer, and I am a slow and poor learner. So if you explain something to me, please make sure I can understand it. I would be really thankful for that.

 

I have a program that lets me browse a digital online brochure / stock list. It always worked perfectly: I converted the PixelChecksum() results to database entries and got what I wanted,

but things have changed for the worse.

They have changed the look of the digital brochure:

1. They have made the fonts more transparent.

2. They have made the font much, much smaller. Letters are often 7 pixels high and 5 pixels wide.

For example, the word "Afflicted" looks like this. Keep in mind that this is enlarged to 1600%; the original image is only 84x9 pixels, which is very small.

f52d58329449836.jpg

3. They have added a pesky background, which messes up my AutoIt PixelChecksum results beyond usability.
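Why a background breaks the checksum approach can be shown in a few lines. This is only a sketch in Python, not AutoIt: it assumes the checksum is Adler-32 over the raw RGB bytes (which is what I believe PixelChecksum uses by default, but treat that as an assumption), and the glyph data is made up.

```python
import zlib

def region_checksum(pixels):
    """Adler-32 over the raw RGB bytes of a region (list of (r, g, b) tuples)."""
    raw = bytes(channel for pixel in pixels for channel in pixel)
    return zlib.adler32(raw)

# Hypothetical 5x7 glyph region: dark text pixels on a flat white background.
WHITE, TEXT = (255, 255, 255), (40, 40, 40)
glyph = [TEXT if i % 3 == 0 else WHITE for i in range(35)]

# The same glyph, but one background pixel is a single grey level darker.
noisy = list(glyph)
noisy[1] = (254, 254, 254)

same = region_checksum(glyph) == region_checksum(list(glyph))
changed = region_checksum(glyph) != region_checksum(noisy)
print(same, changed)  # True True
```

A checksum only answers "identical or not", so any background pixel that varies makes every lookup miss. That is why the pixels need to be filtered before anything is compared.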

So my whole project / program has ground to a halt. I have already spent more than 1000 hours on it, and now I have to change my program.

I have done some decent analysis of the screenshots / digital brochure and came to the conclusion that there is always enough RGB difference between my font and the background.

So it seems I need to filter it properly and train it (I do not mind spending lots of time on it).
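The filtering step could look roughly like the sketch below. It is Python rather than AutoIt, and the font colour, the sample pixels and the `max_distance` threshold are all invented values for illustration. The idea is simply to mark every pixel whose colour is close enough to the font colour and drop everything else.

```python
def color_distance(a, b):
    """Squared Euclidean distance between two (r, g, b) tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def binarize(image, font_rgb, max_distance):
    """Return rows of 1/0: 1 where a pixel is close to the font colour."""
    limit = max_distance ** 2
    return [[1 if color_distance(px, font_rgb) <= limit else 0 for px in row]
            for row in image]

# Hypothetical values: near-black font on a noisy light background.
FONT = (30, 30, 30)
image = [
    [(200, 190, 180), (35, 28, 33), (210, 205, 199)],
    [(45, 40, 38), (190, 200, 185), (25, 31, 29)],
]
mask = binarize(image, FONT, max_distance=40)
print(mask)  # [[0, 1, 0], [1, 0, 1]]
```

Once the region is reduced to a 0/1 mask, the background no longer matters, and exact comparisons (or checksums) become usable again.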

 

My goal is to be able to read text as quickly as possible, therefore I am looking into writing my own personal lightweight OCR.

I would like it to be able to read a page of 50 items, with an average of, let's say, 30 to 35 words per line,

and my goal is to achieve this in under 500 ms, but I would not weep if it took 1000 ms.

Speed is very important to me.

What would be the best approach?

Making / creating a separate .dll or .exe for the OCR?

And if I make a separate .exe or .dll, how does it transfer the information instantly to my other .exe (or is that impossible)?

It would be best for me if my autoit.exe did not waste time on trying to OCR. My autoit.exe already:

--------------------------------------------------------------------

handles onscreen actions (mousemovement),

other calculations ,

Writing and retrieving to database,

and interaction with a different program

-------------------------------------------------------------------

For example, my theory is as follows:

 

 

I make an extra ocr.exe or ocr.dll which either prepares the data up front on its own or produces it when asked. (Being able to switch between the two on the fly during the process would be extremely nice, because I could trade speed against saving resources as needed.)

Question 1: I am not sure yet how the data can be transferred between the two.

1. Does it need to be written to a text file?

2. Can it be kept in physical memory?

3. Does the OCR keep the results in some array and hand them out on request from my autoit.exe?
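One common way to get option 3 without any text file is to keep the OCR running as a child process and talk to it over its standard input and output pipes. The sketch below shows the pattern in Python. The worker here is a stand-in written inline, and the request names and the `text-for:` reply format are invented for illustration.

```python
import subprocess
import sys

# Hypothetical worker standing in for ocr.exe: one reply line per request line.
WORKER_SOURCE = r"""
import sys
for line in sys.stdin:
    request = line.strip()
    if request == "quit":
        break
    # A real worker would capture and OCR the requested region here.
    print("text-for:" + request, flush=True)
"""

def start_worker():
    """Launch the worker once and keep its pipes open."""
    return subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )

def ask(worker, request):
    """Send one request line and wait for the one-line answer."""
    worker.stdin.write(request + "\n")
    worker.stdin.flush()
    return worker.stdout.readline().strip()

worker = start_worker()
reply = ask(worker, "row-1")

# Tell the worker to stop, then clean up the pipes.
worker.stdin.write("quit\n")
worker.stdin.flush()
worker.stdin.close()
worker.wait()
worker.stdout.close()

print(reply)  # text-for:row-1
```

The data only ever lives in memory, and the worker stays loaded between requests, so there is no start-up cost per page. In AutoIt the same pattern should be possible with Run() using the $STDIN_CHILD and $STDOUT_CHILD flags together with StdinWrite() and StdoutRead().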

Thank you for reading. In a few hours I will post my theory on how to program the basics for speed.

Edited by butterfly

I would start by searching the forum for OCR ... you will see many hits.  Also, people may want more information about your project because you could easily be talking about Captcha or something similar.  You might also be violating copyright laws for the online catalog.   I think if you explained exactly what you are trying to do (what website, what content, etc) you might get some more responses.


I am reading an online catalog, and reading it by OCR is allowed. I am not interested in Captcha or something similar. Where would there even be a Captcha that requires reading over 500 fixed-font words within 1 second?

I am not breaking any laws or terms, and I would also like to stay far away from any infractions on this forum.

Just for starters, I would like a good answer on whether I should run this as an include, a separate .exe, or a separate .dll.

Thank you very much.


  • Moderators

As you stated yourself in the OP, you have no idea how to even begin this project. So why would you not take the suggestion given to you by Jfish and read some of the many OCR threads on this forum first to see how others have done it, rather than asking someone to give you a crash course from Day 1?


If it's an online catalog, have you tried reading it with the IE functions?


I'm sure your intent is honorable. Understand however we do have a few concerns.

I am reading an online catalog ...

Whose catalog? Name please.

 

I am not breaking any laws or terms,

Link to TOS please. No offense, but we get slapped around from time to time when certain sites find out we automate stuff or do things they do not like. (for example downloading YouTube videos. Google was not thrilled with us for that.)

I am not interested in Captcha or something similar

I get your position and respect it. Please be aware that Captcha breaking can be done with the same methods you are asking about, and that is a touchy subject here in the forum.

----------------

We are happy to try to help but we do have to be careful we do not break any rules.


If speed is important, why not write this in C?

 

I have never done any programming outside AutoIt, ha ha <3

Anyway, I am very disappointed in this forum's attitude towards certain subjects, and in the unhelpfulness of not giving a single direction or clue.

I have already been searching on this topic for dozens of hours with no luck, and the only response I am given here is the bad cop approach.

I do not want to criticize anyone here, but I do not have anyone around me whom I can ask a single thing.

....... I guess it helps me mold into something.

the good news is:

This following topic was very useful to me

[embedded topic link]

I had encountered it before, but I overlooked the good parts of it.

The start of it was very confusing, but the 2nd page helped me out a great deal.

I believe I can finish it for myself in AutoIt in 2 or 3 days, and maybe I can post it here (although this first draft version will be too slow because of many issues).

Thank you anyway


  • Developers

I have never done any programming outside AutoIt, ha ha <3

Anyway, I am very disappointed in this forum's attitude towards certain subjects, and in the unhelpfulness of not giving a single direction or clue.

I have already been searching on this topic for dozens of hours with no luck, and the only response I am given here is the bad cop approach.

I do not want to criticize anyone here, but I do not have anyone around me whom I can ask a single thing.

....... I guess it helps me mold into something.

You know this mainly has to do with your expectations and not really with the answers given... right?  ;)
