Simple stemming algorithm (convert English words to their base form)


Chann

I don't know of any, but the underlying data would be something like the word root and suffix tables found here (Copyright ©):

http://www.learnthat.org/pages/view/roots.html

I think the irregular application and prefix-joining rules, together with the mix of Greek, Latin and other stem sources, would make any purely programmed solution (one without such tables) impossible for English, unlike, say, Turkish or Latin.

But of course I would love to see something proving me wrong.

Of course there are verb lists, and something could be done (or most likely already has been done) with the regular verbs listed on sites such as:

http://www.eslgold.com/grammar/verb_list.html
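
To make the table-driven idea concrete, here is a minimal sketch in AutoIt. It is not a known or complete algorithm: the function name _SimpleStem and both tables are hypothetical, and the table contents are a few made-up sample entries, not data taken from the sites linked above. Irregular forms are looked up first, then a short list of suffix rules is tried in order.

```autoit
; Hypothetical sketch of table-driven stemming. Both tables are tiny
; made-up samples; a real solution would load far larger ones.
Func _SimpleStem($sWord)
    ; Irregular forms cannot be derived by rule, so they are looked up first.
    Local $aIrregular[4][2] = [["went", "go"], ["ran", "run"], ["children", "child"], ["mice", "mouse"]]
    ; Suffix rules as [suffix, replacement], tried in this order.
    Local $aRules[5][2] = [["ies", "y"], ["ing", ""], ["ed", ""], ["es", ""], ["s", ""]]

    $sWord = StringLower($sWord)
    For $i = 0 To UBound($aIrregular) - 1
        If $sWord = $aIrregular[$i][0] Then Return $aIrregular[$i][1]
    Next

    For $i = 0 To UBound($aRules) - 1
        Local $iLen = StringLen($aRules[$i][0])
        ; Keep at least three letters so short words such as "sing" or "is" are left alone.
        If StringLen($sWord) - $iLen >= 3 And StringRight($sWord, $iLen) = $aRules[$i][0] Then
            Return StringTrimRight($sWord, $iLen) & $aRules[$i][1]
        EndIf
    Next

    ; No table entry and no rule matched: return the word unchanged.
    Return $sWord
EndFunc

ConsoleWrite(_SimpleStem("went") & @CRLF)
ConsoleWrite(_SimpleStem("parties") & @CRLF)
ConsoleWrite(_SimpleStem("walked") & @CRLF)
```

With these sample tables, "went" comes from the irregular table, while "parties" and "walked" are handled by the suffix rules. The same rules also turn "running" into "runn" and "makes" into "mak", which is exactly the kind of irregularity that needs larger tables or extra rules to handle.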

an old English teacher.

Edited by Jury

Using TheFreeDictionary.com, this worked for the few words I tried. There might be a better way than the StringTrim functions, but I just did it quickly.

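#include <Inet.au3>
; Ask for a word, then fetch its page from TheFreeDictionary as a string.
$word = InputBox("Root Word", "Enter word")
$string = _INetGetSource("http://www.thefreedictionary.com/" & $word, True)
; The page is expected to carry the base form in a script tag as word='...'.
$marker = '<script type="text/javascript">word='
$pos = StringInStr($string, $marker)
If $pos = 0 Then Exit MsgBox(0, "Root Word", "No root word found for " & $word)
; Drop everything up to and including the opening quote, then cut at the closing quote.
$string = StringTrimLeft($string, $pos + StringLen($marker))
$string = StringLeft($string, StringInStr($string, "'") - 1)
MsgBox(0, "Root Word", $string)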