Posted (edited)

For my own business I am looking for a custom-made OCR routine.

It has to be OCR because there is no other way of getting the data (no data interception is possible).

It must be 1. reliable and 2. blazing fast.

Each search covers around 400x1000 pixels.

Preferably it finishes in under 100 ms.

The good parts are:

1. The text I want to recognize is always in the same font.

2. The text is always the same size.

3. It appears every time at the same height (Y coordinate), so the location to search can be predetermined.

4. (Important!) The background and the text always have distinctly different brightness.

The very bad parts are:

1. The font is rendered transparently on a background with shade variation, so OCR becomes very unreliable (the background color shines through the lighter parts of the text).

2. Often there is no pixel of space between letters, which makes it too hard for OCR.

3. It can only be read by OCR; I cannot grab the data any other way.

So here I am going to describe the steps I want to take, and I really hope I can get some input and tips.

Any help or tips are greatly appreciated.

24a956325445352.jpg

This picture is only for reference and does not show the real project.

The background in this picture is more heavily edited than the real thing, but it is only here for reference.

Part 1 shows the text.

Part 2 shows a bit of the background variation.

Part 6 should be the result I read; I can manually convert the checksums that come out of it.

### Important disclaimer ###

I need to sort out exactly how the color handling works and which approach is the quickest.

What I know that I can use to my advantage:

The background is always brighter in RGB values than the text. I can determine that and use it as a threshold value. If another scanning method is preferred for speed, I will look into that right away.

So, for example, I would like to do this:

1. First I need to check which form of searching is the quickest and most reliable.

2. Then I need to sort out which forms of conversion I can use:

 a. A quick conversion to greyscale which should ?

 b. Everything under or above a certain brightness becomes either false or true, and from that I make some kind of checksum.

3. Search location optimization: because most letters (exceptions are I, R, T and L) are 5 wide and 7 high, I can for example do this:

if pixel brightness >= XXX then
    $answer = 0
else
    $answer = 1
endif

(PS: I have already checked in Photoshop that the brightness of the background never causes problems with the foreground text.)
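The threshold test above can be sketched in a few lines. This is only an illustration in Python, not the real project code: the helper names `luminance` and `binarize`, the sample pixels, and the cut-off of 128 are all assumptions, and the greyscale weights are the common Rec. 601 ones rather than anything measured on the actual screen.

```python
# Hypothetical helper names; pixels are (R, G, B) tuples with 0-255 channels.

def luminance(rgb):
    """Integer greyscale value using the common Rec. 601 weights."""
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b) // 1000

def binarize(rows, threshold):
    """Map each pixel to 1 (text, darker than threshold) or 0 (background)."""
    return [[0 if luminance(px) >= threshold else 1 for px in row]
            for row in rows]

# Tiny made-up 2x3 patch: dark text pixels on a bright, uneven background.
patch = [
    [(30, 30, 40), (200, 210, 190), (240, 240, 240)],
    [(25, 35, 30), (20, 20, 20), (180, 220, 200)],
]
THRESHOLD = 128  # assumed value; the real cut-off has to be measured
bits = binarize(patch, THRESHOLD)
print(bits)  # [[1, 0, 0], [1, 1, 0]]
```

Integer arithmetic is used on purpose, since a fixed search area is scanned on every pass and floating-point conversion per pixel would only add cost.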

110000
110000
110000
110000
110000
111111    <-- this would be the letter L, for example, and the end result

and I would compute some kind of checksum based on that information.
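One possible reading of "some kind of checksum" is to pack the 0/1 grid into a single integer and use it as a dictionary key. The sketch below assumes that reading; `glyph_key`, `recognize` and `KNOWN_GLYPHS` are hypothetical names, and the table only holds the L pattern from above.

```python
def glyph_key(rows):
    """Pack rows of '0'/'1' characters into one integer, top-left bit first."""
    return int("".join(rows), 2)

# The L pattern from the post, as six 6-bit rows.
LETTER_L = ["110000", "110000", "110000", "110000", "110000", "111111"]

# Lookup table built once up front; a real table would hold every glyph.
KNOWN_GLYPHS = {glyph_key(LETTER_L): "L"}

def recognize(rows):
    """Return the matching character, or None if the pattern is unknown."""
    return KNOWN_GLYPHS.get(glyph_key(rows))

print(recognize(LETTER_L))  # L
```

Because leading zero bits do not change the integer, this key is only unambiguous when every glyph in the table has the same width and height. Glyphs of different widths would need the size stored alongside the key.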

I know for a fact that the background never becomes darker than certain values, and never comes near or surpasses the darkness of my text, but because of

 

Any help on this matter is very much appreciated.

Tesseract OCR gave me unusable results, because the font size is too small and the letters overlap.
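Since the font and size never change, one way around touching letters might be to skip gap detection entirely and match known glyph bitmaps column by column. The sketch below is only an assumption of how that could look: the 3-row glyphs in `TEMPLATES` are made up and far smaller than the real 5x7 font, and `read_strip` is a hypothetical name.

```python
# Made-up 3-row glyphs, much smaller than the real 5x7 font.
TEMPLATES = {
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
    "I": ["1", "1", "1"],
}

def read_strip(strip):
    """Greedy left-to-right match of fixed glyph templates against a bit strip."""
    width = len(strip[0])
    # Try wide glyphs first so a narrow one cannot claim part of a wider one.
    order = sorted(TEMPLATES.items(), key=lambda kv: -len(kv[1][0]))
    text, x = [], 0
    while x < width:
        for char, glyph in order:
            w = len(glyph[0])
            if all(row[x:x + w] == g for row, g in zip(strip, glyph)):
                text.append(char)
                x += w
                break
        else:
            x += 1  # no glyph starts here: skip one background column
    return "".join(text)

# "LT" with no gap column, then a blank column, then "I".
strip = [
    "100111" "0" "1",
    "100010" "0" "1",
    "111010" "0" "1",
]
print(read_strip(strip))  # LTI
```

The match here is exact, so it relies on the brightness threshold giving clean bits. If the transparency leaves a few wrong pixels, a tolerance (for example, allowing a small number of mismatched bits per glyph) would be needed, at the cost of more ambiguity between similar letters.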

Edited by butterfly
  • Moderators
Posted

butterfly, while your post is a riveting novel to read, I really couldn't find what you have done on your own? You say the Tesseract OCR doesn't work for you; what have you tried on your own to expand upon it? If this is something (as it seems from your post) where you just need it done for you, perhaps Freelancer.com would be your best bet.

"Profanity is the last vestige of the feeble mind. For the man who cannot express himself forcibly through intellect must do so through shock and awe" - Spencer W. Kimball
