How to detect misspelled words?


I'm making a pickit that uses an OCR tool. It works 95%+ of the time, but it tends to miss letters.

For example, it should read:

+8 Intelligence

3% life steal

but (due to a background image) it returns:

+8 Intell gen e

3% li e stea

How would I write a script to detect the misspelled words? What commands should I look up or use?

Example: I want every item with Intelligence to be picked up, and if there's a word/line within 3 letters of it, the item should be picked up as well.
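
One way to make "within 3 letters" precise is edit (Levenshtein) distance: the number of single-character insertions, deletions, or substitutions needed to turn one string into another. Below is a minimal Python sketch of that idea, not AutoIt code; the names `edit_distance` and `line_matches` and the default threshold of 3 are assumptions made for illustration. It slides a window over the OCR line and accepts the line if any window is close enough to the wanted text.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def line_matches(ocr_line: str, wanted: str, max_errors: int = 3) -> bool:
    """True if some window of ocr_line is within max_errors edits of wanted."""
    line = ocr_line.lower()
    target = wanted.lower()
    n = len(target)
    # Windows near the end of the line are shorter, which also covers
    # a letter dropped at the very end (e.g. "stea" for "steal").
    return any(edit_distance(line[i:i + n], target) <= max_errors
               for i in range(len(line)))


print(line_matches("+8 Intell gen e", "Intelligence"))
print(line_matches("3% li e stea", "life steal"))
print(line_matches("+5 Strength", "Intelligence"))
```

A threshold of 3 is loose for short targets (almost any 3-letter string is within 3 edits of another), so for short words it may make sense to scale the allowed errors with the length of the target.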

That looks difficult!

This is how I would approach it (after not a lot of thought, I should warn you).

STEP 1

Replace all spaces with '?' and split the line into an array of words, each of which ends with '?'.

STEP 2

Loop through the list of words, building up a new word by adding the next word, but ignoring words that contain non-alpha characters. If the next word read starts with '?', add one letter at a time instead of the whole word, because one of the question marks could correctly be a space. At every step, check whether the word so far is in the dictionary.

E.g.

?Intell or Intell

?Intell?gen or Intell?gen

?Intell?gen?e or Intell?gen?e = match found

Obviously some extra logic will be needed, but that's the basic approach I would try. You just need a dictionary, or you could use an online spell-checking method. Whichever method you use, I expect there will still be errors, because false fits will be found.
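
The steps above can be sketched roughly as follows. This is a simplified Python illustration rather than AutoIt code, and it rests on several assumptions: the tiny `DICTIONARY` is a hypothetical stand-in for a real word list, each gap between fragments is treated as exactly one unknown letter, and one extra unknown letter is allowed at the start and end of the word. Instead of literally inserting '?', the gaps become single-character wildcards in a regular expression. The name `recover_words` is made up for this example.

```python
import re

# Hypothetical word list; a real script would load a proper dictionary.
DICTIONARY = {"intelligence", "life", "steal", "strength"}


def recover_words(line, dictionary):
    """Merge OCR fragments until the merged pattern fits a dictionary word."""
    # Ignore fragments that contain non-alpha characters, such as "+8" or "3%".
    fragments = [f.lower() for f in line.split() if f.isalpha()]
    found = []
    i = 0
    while i < len(fragments):
        matched = False
        # Grow the candidate one fragment at a time: Intell, Intell?gen, ...
        for j in range(i + 1, len(fragments) + 1):
            body = ".".join(re.escape(f) for f in fragments[i:j])
            pattern = re.compile(".?" + body + ".?")  # optional edge letters
            hits = sorted(w for w in dictionary if pattern.fullmatch(w))
            if hits:
                found.append(hits[0])  # arbitrary pick if several words fit
                i = j
                matched = True
                break
        if not matched:
            i += 1  # no word starts with this fragment, move on
    return found


print(recover_words("+8 Intell gen e", DICTIONARY))
print(recover_words("3% li e stea", DICTIONARY))
```

Because the loop stops at the first fit, a short fragment that happens to fit some dictionary word will be accepted before a longer merge is tried, which is one source of the false fits mentioned above.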


Hi.

I don't think it's worth coding your own spell checker. Spell checking isn't trivial. So:

95% accuracy with *DROPPED* letters is a really bad result for OCR. Check whether you can tune your OCR through the program's settings (font type, training modes, etc.), and whether the scanner is giving proper contrast and resolution.

1.) Get an OCR (and/or scanner) that works better.

2.) Once that is done, use the *existing* spell checker of your word-processing software (OpenOffice, Word, ...).

Regards, Rudi.
