Find best match of a string


Let's say I have some raw data, extracted by pixels or OCR.

Applo

Baoon

Chee5e

Hann

and I know for sure that the words are Apple, Bacon, Cheese and Ham.

Is there a function that can compare the known word Apple with "Applo Baoon Chee5e Hann" and find the best match?

Thanks in advance ;-)

civilcalc,

This should work as long as there are not too many "nn" for "m" type errors where the corresponding letters get out of sync: :)

Global $aTestArray[5] = [4, "Applo", "Baoon", "Chee5e", "Hann"]
Global $aBaseArray[5] = [4, "Cheese", "Ham", "Apple", "Bacon"]
; Loop through our known words
For $j = 1 To $aBaseArray[0]
; What are we trying to match?
$sBaseText = $aBaseArray[$j]
; Clear the values
$iBestMatch = 0
$nBestMatch = 0
; Now loop through the unknown words
For $i = 1 To $aTestArray[0]
  ; Use the shortest length to avoid index errors
  $iLen = StringLen($sBaseText)
  If StringLen($aTestArray[$i]) < $iLen Then
   $iLen = StringLen($aTestArray[$i])
  EndIf
  ; Clear the counter
  $nMatch = 0
  ; Now compare each letter (StringMid positions start at 1, not 0)
  For $k = 1 To $iLen
   If StringMid($sBaseText, $k, 1) = StringMid($aTestArray[$i], $k, 1) Then
    ; And increase the counter if they match
    $nMatch += 100 / $iLen
   EndIf
  Next
  ; if this is the best match so far then reset the Best values
  If $nMatch > $nBestMatch Then
   $nBestMatch = $nMatch
   $iBestMatch = $i
  EndIf
Next
; And display the result
MsgBox(0, "Best Match", $sBaseText & " matches " & $aTestArray[$iBestMatch])
Next
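
To make the "out of sync" caveat concrete, here is a minimal sketch of the same position-by-position comparison wrapped in a function. `_PositionalScore` is a hypothetical name, and "Hammer"/"Hannner" is an invented pair used only for illustration; neither comes from the posts in this thread.

```autoit
; Hypothetical helper: percentage of positions at which two words agree,
; measured over the shorter word, as in the loop above.
Func _PositionalScore($sKnown, $sRaw)
    Local $iLen = StringLen($sKnown)
    If StringLen($sRaw) < $iLen Then $iLen = StringLen($sRaw)
    If $iLen = 0 Then Return 0
    Local $iHits = 0
    For $k = 1 To $iLen ; StringMid positions are 1-based
        ; "=" compares strings case-insensitively in AutoIt
        If StringMid($sKnown, $k, 1) = StringMid($sRaw, $k, 1) Then $iHits += 1
    Next
    Return 100 * $iHits / $iLen
EndFunc   ;==>_PositionalScore

; Substitution only: the letters stay aligned, 5 of 6 positions agree
ConsoleWrite(_PositionalScore("Cheese", "Chee5e") & @CRLF)
; "nn" for "m": every letter after the extra character shifts, 2 of 6 positions agree
ConsoleWrite(_PositionalScore("Hammer", "Hannner") & @CRLF)
```

A single inserted character costs every position after it, which is why this approach is best suited to OCR errors that swap one character for another.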

I seem to remember something in the Examples section which did this but I cannot find it for the moment (if indeed it exists! :)). I will keep searching during the day. :mellow:

M23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

My UDFs:

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ---------- Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 


$words = "Applo Baoon Chee5e Hann"
$Bestword = _Bestword("Apple", $words)
$Bestword = _Bestword("Bacon", $words)
$Bestword = _Bestword("Cheese", $words)
$Bestword = _Bestword("Ham", $words)
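Func _Bestword($word, $words)
 Local $w = StringSplit($word, "") ; split the known word into single characters
 Local $s = StringSplit($words, " ") ; split the raw text into candidate words
 Local $p = 0 ; best score so far
 Local $px = 0 ; score of the current candidate
 Local $pi = 0 ; index of the best candidate (0 = none found)
 For $i = 1 To $s[0]
  ; Count how many letters of $word occur anywhere in the candidate
  For $j = 1 To $w[0]
   If StringInStr($s[$i], $w[$j]) Then $px += 1
  Next
  If $px > $p Then
   $p = $px
   $pi = $i
  EndIf
  $px = 0
 Next
 ; No candidate shares a single letter with $word
 If $pi = 0 Then Return ""
 Local $Bestword = $s[$pi]
 MsgBox(262144, "", "Bestword for '" & $word & "' in" & @LF & $words & @LF & "is '" & $Bestword & "'" & @LF, 0)
 Return $Bestword
EndFunc   ;==>_Bestword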


Here is another method.

#include <String.au3>
#include <Array.au3>
#include <Math.au3>
 
; Ref:-
; http://www.autoitscript.com/forum/topic/113591-compare-strings/page__view__findpost__p__795728
 
Local $sString = "Applo dBaoon Chee5e Hann"
Local $numeros = StringSplit($sString, " ", 2)
 
Global $aBaseArray[5] = [4, "Cheese", "Ham", "Apple", "Bacon"]
 
For $i = 1 To $aBaseArray[0]
    Local $bestMatchIdx = 0, $iDist, $bestMatch = _EditDistance($numeros[0], $aBaseArray[$i])
 
    For $k = 0 To UBound($numeros) - 1
        $iDist = _EditDistance($numeros[$k], $aBaseArray[$i])
        If $iDist < $bestMatch Then
            $bestMatch = $iDist
            $bestMatchIdx = $k
        EndIf
    Next
 
    MsgBox(0, "Results", StringFormat("Best match for '%s' is '%s' with %i different, non-matching character[s].\n", $aBaseArray[$i], $numeros[$bestMatchIdx], $bestMatch))
Next
 
 
 
Func _EditDistance($s1, $s2)
    Local $m[StringLen($s1) + 1][StringLen($s2) + 1], $i, $j
    $m[0][0] = 0; boundary conditions
    For $j = 1 To StringLen($s2)
        $m[0][$j] = $m[0][$j - 1] + 1; boundary conditions
    Next
    For $i = 1 To StringLen($s1)
        $m[$i][0] = $m[$i - 1][0] + 1; boundary conditions
    Next
    For $j = 1 To StringLen($s2); outer loop
        For $i = 1 To StringLen($s1) ; inner loop
            If (StringMid($s1, $i, 1) = StringMid($s2, $j, 1)) Then
                $diag = 0;
            Else
                $diag = 1
            EndIf
            $m[$i][$j] = _Min($m[$i - 1][$j] + 1, _ ; insertion
                    (_Min($m[$i][$j - 1] + 1, _ ; deletion
                    $m[$i - 1][$j - 1] + $diag))) ; substitution
        Next
    Next
    Return $m[StringLen($s1)][StringLen($s2)] ; $m ;
EndFunc   ;==>_EditDistance
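
If a percentage is easier to compare than a raw count, the edit distance can be scaled by the longer word's length. The sketch below is one possible way to do that, not part of the post above: `_EditDistanceRows` and `_SimilarityPercent` are hypothetical names, and the distance is recomputed with two rows instead of a full matrix only so that the example runs on its own.

```autoit
#include <Math.au3>

; Hypothetical helper: Levenshtein distance, keeping only two rows of the table.
Func _EditDistanceRows($s1, $s2)
    Local $n = StringLen($s1), $m = StringLen($s2)
    Local $aPrev[$m + 1], $aCur[$m + 1]
    For $j = 0 To $m
        $aPrev[$j] = $j ; distance from an empty prefix of $s1
    Next
    For $i = 1 To $n
        $aCur[0] = $i ; distance to an empty prefix of $s2
        For $j = 1 To $m
            Local $iCost = 1
            If StringMid($s1, $i, 1) = StringMid($s2, $j, 1) Then $iCost = 0
            ; Cheapest of: step from the row above, step from the left, diagonal step
            $aCur[$j] = _Min($aPrev[$j] + 1, _Min($aCur[$j - 1] + 1, $aPrev[$j - 1] + $iCost))
        Next
        $aPrev = $aCur ; arrays are copied on assignment
    Next
    Return $aPrev[$m]
EndFunc   ;==>_EditDistanceRows

; Hypothetical helper: 100 means identical, 0 means every character must change.
Func _SimilarityPercent($s1, $s2)
    Local $iMax = _Max(StringLen($s1), StringLen($s2))
    If $iMax = 0 Then Return 100
    Return 100 * (1 - _EditDistanceRows($s1, $s2) / $iMax)
EndFunc   ;==>_SimilarityPercent

ConsoleWrite(_EditDistanceRows("Ham", "Hann") & @CRLF) ; 2: m -> n, then one extra n
ConsoleWrite(_EditDistanceRows("Cheese", "Chee5e") & @CRLF) ; 1: s -> 5
ConsoleWrite(_SimilarityPercent("Ham", "Hann") & @CRLF) ; 50
```

Unlike a position-by-position comparison, the "nn" for "m" error costs only two edits here, because an insertion does not penalise the letters that follow it.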

At least the author knew where it was - I had just found it here: :mellow:

M23
