Has anybody implemented the Soundex algorithm for AutoIt? I'm interested in the Levenshtein distance algorithm, too.
Thanks,
Eusebio.
Soundex and Levenshtein distance algorithms
Started by Eusebio, Jul 20 2006 07:50 AM
2 replies to this topic
#1
Posted 20 July 2006 - 07:50 AM
#2
Posted 20 July 2006 - 07:57 AM
Soundex hasn't been implemented. We AutoIt users would rather use the Windows alternative of speaking text (it's in v3 Scripts n Scraps). So Levenshtein distance hasn't been implemented either.
Edit: Of course, I don't know anything about any private projects that might have done it.
Edited by Manadar, 20 July 2006 - 07:58 AM.
#3
Posted 22 August 2006 - 10:15 AM
I've written some functions, in case anybody is interested (any comments are welcome).
AutoIt
;....................................................................................................
; This function returns an integer number which indicates the Levenshtein distance between the two
; argument strings, or -1 if one of the argument strings is longer than the limit of 255 characters
; (255 should be more than enough for name or dictionary comparison).
;
; The Levenshtein distance is defined as the minimal number of characters you have to replace,
; insert or delete to transform sString1 into sString2.
;
; The greater the Levenshtein distance, the more different the strings are.
; The Levenshtein distance is named after the Russian scientist Vladimir Levenshtein,
; who devised the algorithm in 1965.
; In its simplest form the function will take only the two strings as parameters and will calculate
; just the number of insert, replace and delete operations needed to transform sString1 into sString2.
;
; If you can't spell or pronounce Levenshtein, the metric is also sometimes called 'edit distance'.
; The Levenshtein distance algorithm has been used in:
; - Spell checking
; - Speech recognition
; - DNA analysis
; - Plagiarism detection
;
; Reference: http://www.merriampark.com/ld.htm
;
; I added some character 'cleaning' procedures prior to the specific Levenshtein algorithm.
;
; Eusebio Pérez Hurtado
;....................................................................................................
Func _Levenshtein ($sString1, $sString2)
    $iStrLen1 = StringLen($sString1)
    $iStrLen2 = StringLen($sString2)
    If $iStrLen1=0 Then
        Return ($iStrLen2)
    EndIf
    If $iStrLen2=0 Then
        Return ($iStrLen1)
    EndIf
    If ($iStrLen1>255) Then Return (-1) ; see Note at end of function.
    If ($iStrLen2>255) Then Return (-1) ; see Note at end of function.

    ;....................................................................................................
    ; Cleanup procedures, not quite necessary, but useful.
    $sString1 = StringUpper($sString1)
    $sString1 = _StringClean($sString1,"ÄÅÃÂÁÀ","A")
    $sString1 = _StringClean($sString1,"ËÊÉÈ" ,"E")
    $sString1 = _StringClean($sString1,"ÏÎÍÌ" ,"I")
    $sString1 = _StringClean($sString1,"ÒÓÔÕÖ" ,"O")
    $sString1 = _StringClean($sString1,"ÜÛÚÙ" ,"U")
    $sString1 = _StringClean($sString1,"Ç","C")
    $sString1 = _StringClean($sString1,"Ñ","N")

    $sString2 = StringUpper($sString2)
    $sString2 = _StringClean($sString2,"ÄÅÃÂÁÀ","A")
    $sString2 = _StringClean($sString2,"ËÊÉÈ" ,"E")
    $sString2 = _StringClean($sString2,"ÏÎÍÌ" ,"I")
    $sString2 = _StringClean($sString2,"ÒÓÔÕÖ" ,"O")
    $sString2 = _StringClean($sString2,"ÜÛÚÙ" ,"U")
    $sString2 = _StringClean($sString2,"Ç","C")
    $sString2 = _StringClean($sString2,"Ñ","N")

    $sString1 = _StringClean($sString1,"ABCDEFGHIJKLMNOPQRSTUVWXYZ","",2) ; Note: digits are removed here as well
    $sString2 = _StringClean($sString2,"ABCDEFGHIJKLMNOPQRSTUVWXYZ","",2) ; Note: digits are removed here as well
    ;....................................................................................................
    ; The Levenshtein algorithm
    $iStrLen1 = StringLen($sString1)
    $iStrLen2 = StringLen($sString2)
    Dim $aArray[$iStrLen1+1][$iStrLen2+1]
    For $iRow=0 To $iStrLen1
        $aArray[$iRow][0] = $iRow
    Next
    For $iCol=0 To $iStrLen2
        $aArray[0][$iCol] = $iCol
    Next
    For $iRow=1 To $iStrLen1
        For $iCol=1 To $iStrLen2
            $iCost = StringMid($sString1,$iRow,1) <> StringMid($sString2,$iCol,1)
            $iRowPrev = $iRow-1
            $iColPrev = $iCol-1
            $aArray[$iRow][$iCol] = _Min3(1+$aArray[$iRowPrev][$iCol],1+$aArray[$iRow][$iColPrev],$iCost+$aArray[$iRowPrev][$iColPrev])
        Next
    Next
    $iDistance = $aArray[$iStrLen1][$iStrLen2]
    Return ($iDistance)
EndFunc

Func _Min3 ($n1,$n2,$n3)
    ; Returns the Minimum of 3 numbers
    ; Eusebio Perez Hurtado
    $min3=$n1
    If $n2<$min3 Then $min3=$n2
    If $n3<$min3 Then $min3=$n3
    Return ($min3)
EndFunc

;....................................................................................................
; Soundex manipulation based on
; U.S. Patents 1261167 (1918), 1435663 (1922)
; by Margaret K. Odell and Robert C. Russell
; as used by the National Archives and Records Administration (NARA)
; (published by Don Knuth [Knuth]).
;
; Russell's method is usable for names from England, America, western Europe countries,
; but does not apply well to many Slavic and Yiddish surnames
; and is not independent of several ethnic considerations.
;
; With soundex, the "sound" of names - the phonetic sound to be exact, is coded.
; This is of great help since it avoids most problems of misspellings or alternate spellings.
; For example Scherman, Schurman, Sherman and Shireman and Shurman are indexed together as "S655".
; Surname soundex indexing is not alphabetical, but is listed by the letter-and-number code.
; If several surnames have the same code, their cards are arranged alphabetically by given name.
; Example: S655 Arthur, S655 Betsy, S655 Charles.
;
; Russell Soundex Name-Matching
; The Russell Soundex Code algorithm is designed primarily for use with English names and is
; a phonetically based name matching method. The algorithm converts a name to a four-character code,
; which can be used to identify equivalent names, and is structured as follows [Knuth]:
; 1. Retain the first letter of the name, and drop all occurrences of a,e,h,i,o,u,w,y in other positions.
; 2. Assign the following numbers to the remaining letters after the first:
;    b,f,p,v == 1
;    c,g,j,k,q,s,x,z == 2
;    d,t == 3
;    l == 4
;    m,n == 5
;    r == 6
; 3. If two or more letters with the same code were adjacent in the original name (before step 1) (!!!),
;    omit all but the first.
; 4. Convert to the form 'letter, digit, digit, digit' by adding trailing zeros
;    (if there are less than three digits), or by dropping rightmost digits if there are more than three.
;
; For example, the names Euler, Gauss, Hilbert, Knuth and Lloyd are given the respective codes
; E460, G200, H416, K530, L300.
; However, the algorithm also gives the same codes for
; Ellery, Ghosh, Heilbronn, Kant and Ladd [Knuth] which are not related in reality.
;
; [Knuth]: D.E. Knuth, The Art Of Computer Programming, Vol. 3, Sorting and Searching, Addison Wesley, pp 391-392.
;
; Eusebio Perez Hurtado
;....................................................................................................
Func _Soundex ($sString)
    If StringLen($sString)=0 Then
        Return ("")
    EndIf
    $sString = StringUpper($sString)

    ; Retain the first letter of the name.
    $sCharFirst = StringMid($sString,1,1)

    ; special pre-processing for German language
    $sString = StringReplace($sString,"SCH","S") ; German special "sch"
    $sString = StringReplace($sString,"ß","S") ; German special sharp-s "ß"

    ; Drop all occurrences of A,E,H,I,
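Since the _Soundex listing in this post breaks off partway through, here is a minimal, self-contained sketch of the four steps given in its header comment: keep the first letter, drop a, e, h, i, o, u, w, y, map the remaining letters to the digits 1 to 6, collapse letters that were adjacent with the same code, then pad or truncate to one letter plus three digits. It is not the author's function. The name _SoundexSketch and the 26-character lookup string are assumptions made for illustration, non-letters are simply discarded before coding, and the German pre-processing ("SCH", "ß") is left out.

```autoit
; Hypothetical sketch, not the _Soundex function from the post.
Func _SoundexSketch($sName)
    ; Upper-case the name and keep only the letters A-Z (assumption: everything else is ignored).
    Local $sClean = StringRegExpReplace(StringUpper($sName), "[^A-Z]", "")
    If $sClean = "" Then Return ""

    ; One digit per letter A..Z; "0" marks the letters that step 1 drops (A,E,H,I,O,U,W,Y).
    Local Const $sCodes = "01230120022455012623010202"

    ; Step 1: retain the first letter; remember its code for the adjacency rule.
    Local $sResult = StringLeft($sClean, 1)
    Local $sPrev = StringMid($sCodes, Asc($sResult) - 64, 1)
    Local $sCode

    For $i = 2 To StringLen($sClean)
        ; Step 2: look up the digit for the current letter.
        $sCode = StringMid($sCodes, Asc(StringMid($sClean, $i, 1)) - 64, 1)
        ; Step 3: skip dropped letters and letters whose code equals that of the
        ; immediately preceding letter in the original name.
        If $sCode <> "0" And $sCode <> $sPrev Then $sResult &= $sCode
        $sPrev = $sCode
    Next

    ; Step 4: pad with zeros, then cut to letter + three digits.
    Return StringLeft($sResult & "000", 4)
EndFunc

ConsoleWrite(_SoundexSketch("Euler") & @CRLF)   ; E460
ConsoleWrite(_SoundexSketch("Gauss") & @CRLF)   ; G200
ConsoleWrite(_SoundexSketch("Lloyd") & @CRLF)   ; L300
```

The three calls reproduce codes listed in the header comment (E460, G200, L300). Because the previous code is updated for every letter, including the dropped ones, only letters that were truly adjacent in the original name are collapsed, which is how step 3 is worded in the post.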
Edited by Eusebio, 22 August 2006 - 10:32 AM.