Comparing strings

Wicked_Caty

I've just written a small script that compares two strings and returns their similarity as a percentage. I know of StringCompare, but I want to get a percentage, and I also want to get familiar with AutoIt.

Compiling doesn't cause any problems, but actually running it does. In line 20 it has a problem with the index and says "Subscript used on non-accessible variable". What's causing that problem, and how can I solve it? Thanks! And sorry for my ugly style.

 

Similarity.au3

VIP

Try this:
 

#include <MsgBoxConstants.au3>

Global $sTitle = "Similarity"
Global $tTextA = InputBox($sTitle, "Data 1: ")
Global $tTextB = InputBox($sTitle, "Data 2: ")
; Identical input (the "=" comparison is case-insensitive) is 100% similar.
If $tTextA = $tTextB Then Exit MsgBox($MB_TOPMOST, $sTitle, "Similarity: 100%")
; Replace double spaces with single ones so they do not produce empty tokens.
$tTextA = StringReplace($tTextA, "  ", " ")
; Split on spaces if there are any; an empty delimiter splits into single characters.
Local $cSplit = StringInStr($tTextA, " ") ? " " : ""
Local $aSplitA = StringSplit($tTextA, $cSplit)
; Element [0] holds the token count, so the tokens sit at indices 1 to UBound - 1.
Local $iMatches = 0, $iCount = UBound($aSplitA) - 1
For $i = 1 To $iCount
    If StringInStr($tTextB, $aSplitA[$i]) Then $iMatches += 1
    ConsoleWrite($aSplitA[$i] & " : " & $tTextB & @CRLF)
Next
ConsoleWrite($iCount & @CRLF)
; Percentage of tokens from the first text that occur somewhere in the second.
Local $fPercent = ($iMatches * 100) / $iCount
MsgBox($MB_TOPMOST, $sTitle, "Similarity: " & $fPercent & "%")

Exit

 


Regards,
 

kylomas

Wicked_Caty,

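$a and $b are strings, NOT arrays, so they cannot be indexed with [ ]; that is what the "Subscript used on non-accessible variable" error is telling you. Try it using StringMid to read one character at a time, like this...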

#include <MsgBoxConstants.au3>

$title = "Similarity"
;~ $a = InputBox($title, "Data 1")
;~ $b = InputBox($title, "Data 2")
Local $a = 'abcd'
Local $b = 'abce'

$n = 0
$la = StringLen($a)
$lb = StringLen($b)
$lc = ($la + $lb) / 2

; Compare over the longer of the two lengths
If $la > $lb Then
    $l = $la
ElseIf $lb > $la Then
    $l = $lb
Else
    $l = $lc
EndIf

; String positions are 1-based
For $i = 1 To $l
    ConsoleWrite($i & @CRLF)
    If (StringMid($a, $i, 1) = StringMid($b, $i, 1)) Then
        $n = $n + 1
    EndIf
Next

$p = $n / $l * 100

MsgBox($MB_TOPMOST, $title, $a & @CRLF & $b & @CRLF & $title & ":" & @TAB & $p)

Exit

Note: I only tested strings of equal length, and the input is hard-coded as strings rather than read from the user.

Edit: also, your loop should start at position 1, because string positions are 1-based.
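
For reference, here is a minimal sketch of two ways to get at the individual characters of a string. The variable names are only for illustration and are not taken from the original Similarity.au3.

```autoit
Local $sText = "abce"

; Option 1: read one character at a time with StringMid; positions start at 1.
For $i = 1 To StringLen($sText)
    ConsoleWrite($i & ": " & StringMid($sText, $i, 1) & @CRLF)
Next

; Option 2: turn the string into a real array. An empty delimiter makes
; StringSplit split at every character; element [0] holds the character count.
Local $aChars = StringSplit($sText, "")
For $i = 1 To $aChars[0]
    ConsoleWrite($i & ": " & $aChars[$i] & @CRLF)
Next
```

Either way the loop runs from 1, not 0. With StringSplit, index 0 is the count rather than the first character.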

JohnOne

The problem is that you're trying to use a variable that is not there.

You can solve it by ensuring you do not try to access a nonexistent variable.

kylomas

Wicked_Caty,

Streamlined the code a bit. If the string lengths are equal, just use either one as the length value; there is no need to add them and divide by two. I used a ternary expression to replace all of your length-determination code...

#include <MsgBoxConstants.au3>

$title = "Similarity"
;~ $a = InputBox($title, "Data 1")
;~ $b = InputBox($title, "Data 2")
Local $a = 'abcdf'
Local $b = 'abce'

Local $len_to_check = (StringLen($a) > StringLen($b)) ? StringLen($a) : StringLen($b), $p = 0, $n = 0

For $i = 1 To $len_to_check
    If (StringMid($a, $i, 1) = StringMid($b, $i, 1)) Then
        $n = $n + 1
    EndIf
Next

$p = $n / $len_to_check * 100

MsgBox($MB_TOPMOST, $title, $a & @CRLF & $b & @CRLF & $title & ":" & @TAB & $p)

Exit
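
If it helps, the same position-by-position comparison can be wrapped in a function. This is only a sketch: `_PositionalSimilarity` is a made-up name, not part of any UDF, and treating two empty strings as 100% similar is an assumption.

```autoit
; Hypothetical helper: percentage of positions at which both strings hold
; the same character (case-insensitive, because "=" is used).
Func _PositionalSimilarity($sA, $sB)
    Local $iLen = (StringLen($sA) > StringLen($sB)) ? StringLen($sA) : StringLen($sB)
    If $iLen = 0 Then Return 100 ; assumption: two empty strings count as identical
    Local $iSame = 0
    For $i = 1 To $iLen
        ; Past the end of the shorter string StringMid returns "", which never
        ; equals a real character, so the extra positions count as mismatches.
        If StringMid($sA, $i, 1) = StringMid($sB, $i, 1) Then $iSame += 1
    Next
    Return $iSame / $iLen * 100
EndFunc

ConsoleWrite(_PositionalSimilarity("abcd", "abce") & @CRLF)  ; 3 of 4 positions match: 75
ConsoleWrite(_PositionalSimilarity("abcd", "xabcd") & @CRLF) ; no position matches: 0
```

The second call shows the limit of this approach: one inserted character shifts every later position, so strings that look almost identical can score 0.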

Good Luck,

kylomas


jchd

jguinch,

I believe you refer to my unifuzz SQLite extension; start reading here, then have luck finding the actual post where the extension resides later in this same thread. The new search feature is terrible. For the matter of string comparison outside of SQLite, this post may give something to work with.

jguinch

Here is an implementation of the Levenshtein algorithm, ported from the VBScript code at https://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#VBScript :

$s1 = "My String 1"
$s2 = "My String 2 !"

Local $Levenshtein = Levenshtein($s1, $s2)

; Normalise the edit distance by the length of the longer string
Local $iMaxLen = StringLen(StringLen($s1) > StringLen($s2) ? $s1 : $s2)
Local $percent = Round(($iMaxLen - $Levenshtein) * 100 / $iMaxLen, 2)

ConsoleWrite($percent & "%" & @CRLF)

; Returns the Levenshtein (edit) distance between $a and $b.
; Note: "=" compares characters case-insensitively; use "==" for a case-sensitive distance.
Func Levenshtein($a, $b)
    Local $i, $j, $cost, $min1, $min2, $min3
    Local $iLenA = StringLen($a), $iLenB = StringLen($b)

    If $iLenA = 0 Then Return $iLenB
    If $iLenB = 0 Then Return $iLenA

    ; $d[$i][$j] = distance between the first $i characters of $a and the first $j of $b
    Local $d[$iLenA + 1][$iLenB + 1]

    For $i = 0 To $iLenA
        $d[$i][0] = $i
    Next

    For $j = 0 To $iLenB
        $d[0][$j] = $j
    Next

    For $i = 1 To $iLenA
        For $j = 1 To $iLenB
            $cost = (StringMid($a, $i, 1) = StringMid($b, $j, 1) ? 0 : 1)

            $min1 = $d[$i - 1][$j] + 1         ; deletion
            $min2 = $d[$i][$j - 1] + 1         ; insertion
            $min3 = $d[$i - 1][$j - 1] + $cost ; substitution (or match)

            If $min1 <= $min2 And $min1 <= $min3 Then
                $d[$i][$j] = $min1
            ElseIf $min2 <= $min1 And $min2 <= $min3 Then
                $d[$i][$j] = $min2
            Else
                $d[$i][$j] = $min3
            EndIf
        Next
    Next

    Return $d[$iLenA][$iLenB]
EndFunc
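
For long strings the full matrix is not strictly needed, since each row depends only on the one before it. The following is a sketch of that two-row variant together with a percentage wrapper that guards against two empty strings. Both function names (`_LevenshteinTwoRow`, `_SimilarityPercent`) are hypothetical, and the 100% result for two empty strings is an assumption.

```autoit
; Hypothetical helper: Levenshtein distance using two rolling rows instead of a full matrix.
Func _LevenshteinTwoRow($sA, $sB)
    Local $iLenA = StringLen($sA), $iLenB = StringLen($sB)
    If $iLenA = 0 Then Return $iLenB
    If $iLenB = 0 Then Return $iLenA

    Local $aPrev[$iLenB + 1], $aCurr[$iLenB + 1]
    For $j = 0 To $iLenB
        $aPrev[$j] = $j ; distance from the empty prefix of $sA
    Next

    Local $iCost, $iBest
    For $i = 1 To $iLenA
        $aCurr[0] = $i
        For $j = 1 To $iLenB
            $iCost = (StringMid($sA, $i, 1) = StringMid($sB, $j, 1)) ? 0 : 1
            $iBest = $aPrev[$j] + 1 ; deletion
            If $aCurr[$j - 1] + 1 < $iBest Then $iBest = $aCurr[$j - 1] + 1 ; insertion
            If $aPrev[$j - 1] + $iCost < $iBest Then $iBest = $aPrev[$j - 1] + $iCost ; substitution or match
            $aCurr[$j] = $iBest
        Next
        $aPrev = $aCurr ; array assignment copies, so $aCurr can be overwritten next round
    Next
    Return $aPrev[$iLenB]
EndFunc

; Hypothetical helper: distance normalised by the longer length, as a percentage.
Func _SimilarityPercent($sA, $sB)
    Local $iMaxLen = (StringLen($sA) > StringLen($sB)) ? StringLen($sA) : StringLen($sB)
    If $iMaxLen = 0 Then Return 100 ; assumption: two empty strings count as identical
    Return Round(($iMaxLen - _LevenshteinTwoRow($sA, $sB)) * 100 / $iMaxLen, 2)
EndFunc

ConsoleWrite(_SimilarityPercent("My String 1", "My String 2 !") & "%" & @CRLF)
```

For the two sample strings the distance is 3 (one substitution and two insertions) and the longer string has 13 characters, so the result works out to (13 - 3) * 100 / 13, about 76.92%.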

 
