# Find differences in Arrays, String; _Diff


Note: I am in no way the author of this function; I just modified it from another.

The original function (_GetIntersection) by BugFix is located here: Compare Arrays, Strings; _GetIntersection

Benchmarks:

_Diff vs _ArrayDiff vs _Array1PullCommon

Note: _Array1PullCommon was edited to return only an array that had unique values from array 2; the GUI statements were also stripped.

```
Small vs Big
506  : _Diff
791  : _ArrayDiff
1775 : _Array1PullCommon

Big vs Small
3141 : _Diff
1922 : _ArrayDiff
1685 : _Array1PullCommon

Big vs Big
6797 : _Diff
*    : _ArrayDiff
3296 : _Array1PullCommon
```

Remarks: Big refers to an array with 20,000 elements; Small refers to an array with 20. Each benchmark was run 10 times and the results were averaged.

\* _ArrayDiff took way too long on Big vs Big and I got bored. I waited over a minute per run.
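For anyone who wants to repeat this kind of measurement, here is a minimal sketch of a run-10-times-and-average harness. It is not the script that produced the numbers above: `_Workload` is a hypothetical placeholder for whichever function is being timed, and the array contents are made up. `TimerDiff` reports milliseconds; the unit of the figures above is not stated in this thread.

```autoit
; Sketch of a timing harness. _Workload() is a hypothetical stand-in for the
; function under test (for example, a call to _Diff); it is not from the thread.
Local Const $iRuns = 10

; Assumed test data: "Big" = 20,000 elements, "Small" = 20 elements.
Local $aBig[20000], $aSmall[20]
For $i = 0 To UBound($aBig) - 1
    $aBig[$i] = $i
Next
For $i = 0 To UBound($aSmall) - 1
    $aSmall[$i] = $i * 2
Next

Local $fTotal = 0, $hTimer
For $i = 1 To $iRuns
    $hTimer = TimerInit()
    _Workload($aSmall, $aBig)
    $fTotal += TimerDiff($hTimer) ; elapsed milliseconds for this run
Next
ConsoleWrite("Average per run: " & Round($fTotal / $iRuns, 1) & " ms" & @CRLF)

Func _Workload(ByRef $a1, ByRef $a2)
    ; Placeholder body: replace with the call being measured.
    Return UBound($a1) + UBound($a2)
EndFunc   ;==>_Workload
```

Swapping the argument order in the `_Workload` call gives the "Big vs Small" case.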

```
#include <Array.au3>

Dim $Array1[19] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
Dim $Array2[5] = [1, 2, 3, 4, 5]
$Difference = _Diff($Array1, $Array2, 1)
_ArrayDisplay($Difference) ; Shows 6,7,8,9,10,9,8,7,6: the count up and back down without 1,2,3,4,5

$String1 = "1,2,3,4,5,6,7,8,9,10,9,8,7,6,5,4,3,2,1"
$String2 = "6,7,8,9,10"
$Difference = _Diff($String1, $String2, 0, ",")
_ArrayDisplay($Difference) ; Shows 1,2,3,4,5: each remaining value once, without 6,7,8,9,10

;=================================================
; Function Name:   _Diff($Set1, $Set2 [, $GetAll=0 [, $Delim=Default]])
; Description:     Find values in $Set1 that do not occur in $Set2
; Parameter(s):    $Set1    set 1 (1D-array or delimited string)
;                  $Set2    set 2 (1D-array or delimited string)
;      optional:   $GetAll  0 - only one occurrence of every difference is shown (Default)
;                           1 - all differences are shown, allowing duplicates
;      optional:   $Delim   Delimiter for strings (Default: the separator character set by Opt("GUIDataSeparatorChar"))
; Return Value(s): Success  1D-array of values from $Set1 that do not occur in $Set2,
;                           or 0 if there are no such values
;                  Failure  -1, @error = 1: a set given as an array is not a 1D-array
; Note:            Comparison is case-sensitive and type-sensitive, i.e. number 9 is different from string '9'!
; Author(s):       BugFix (bugfix@autoit.de) Modified by ParoXsitiC for Faster _Diff (Formerly _GetIntersection)
;=================================================
Func _Diff(ByRef $Set1, ByRef $Set2, $GetAll = 0, $Delim = Default)
    Local $o1 = ObjCreate("System.Collections.ArrayList")
    Local $o2 = ObjCreate("System.Collections.ArrayList")
    Local $oUnion = ObjCreate("System.Collections.ArrayList")
    Local $oDiff1 = ObjCreate("System.Collections.ArrayList")
    Local $oDiff2 = ObjCreate("System.Collections.ArrayList") ; not used below
    Local $tmp, $i
    If $GetAll <> 1 Then $GetAll = 0
    If $Delim = Default Then $Delim = Opt("GUIDataSeparatorChar")
    ; Load $Set1 into $o1 (single value, delimited string, or 1D-array)
    If Not IsArray($Set1) Then
        If Not StringInStr($Set1, $Delim) Then
            $o1.Add($Set1)
        Else
            $tmp = StringSplit($Set1, $Delim, 1)
            For $i = 1 To UBound($tmp) - 1
                $o1.Add($tmp[$i])
            Next
        EndIf
    Else
        If UBound($Set1, 0) > 1 Then Return SetError(1, 0, -1)
        For $i = 0 To UBound($Set1) - 1
            $o1.Add($Set1[$i])
        Next
    EndIf
    ; Load $Set2 into $o2 the same way
    If Not IsArray($Set2) Then
        If Not StringInStr($Set2, $Delim) Then
            $o2.Add($Set2)
        Else
            $tmp = StringSplit($Set2, $Delim, 1)
            For $i = 1 To UBound($tmp) - 1
                $o2.Add($tmp[$i])
            Next
        EndIf
    Else
        If UBound($Set2, 0) > 1 Then Return SetError(1, 0, -1)
        For $i = 0 To UBound($Set2) - 1
            $o2.Add($Set2[$i])
        Next
    EndIf
    ; Collect the values that occur in both sets
    For $tmp In $o1
        If $o2.Contains($tmp) And Not $oUnion.Contains($tmp) Then $oUnion.Add($tmp)
    Next
    ; Keep the values of set 1 that are not common to both sets
    For $tmp In $o1
        If $GetAll Then
            If Not $oUnion.Contains($tmp) Then $oDiff1.Add($tmp)
        Else
            If Not $oUnion.Contains($tmp) And Not $oDiff1.Contains($tmp) Then $oDiff1.Add($tmp)
        EndIf
    Next

    If $oDiff1.Count <= 0 Then Return 0

    Local $aOut[$oDiff1.Count]
    $i = 0
    For $tmp In $oDiff1
        $aOut[$i] = $tmp
        $i += 1
    Next
    Return $aOut
EndFunc   ;==>_Diff
```
Edited by ParoXsitiC


If you are just chopping up BugFix's routine to return only the elements in set1 that do not exist in set2, in order to speed it up, then I think you were too conservative with your scalpel. Doesn't this return the same result in considerably less time?

```
#include <Array.au3>

Dim $Array1[19] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
Dim $Array2[5] = [1, 2, 3, 4, 5]
$Difference = _Diff($Array1, $Array2, 1)
_ArrayDisplay($Difference) ; Shows 6,7,8,9,10,9,8,7,6: the count up and back down without 1,2,3,4,5

$String1 = "1,2,3,4,5,6,7,8,9,10,9,8,7,6,5,4,3,2,1"
$String2 = "6,7,8,9,10"
$Difference = _Diff($String1, $String2, 0, ",")
_ArrayDisplay($Difference) ; Shows 1,2,3,4,5: each remaining value once, without 6,7,8,9,10

;=================================================
; Function Name:   _Diff($Set1, $Set2 [, $GetAll=0 [, $Delim=Default]])
; Description:     Find values in $Set1 that do not occur in $Set2
; Parameter(s):    $Set1    set 1 (1D-array or delimited string)
;                  $Set2    set 2 (1D-array or delimited string)
;      optional:   $GetAll  0 - only one occurrence of every difference is shown (Default)
;                           1 - all differences are shown, allowing duplicates
;      optional:   $Delim   Delimiter for strings (Default: the separator character set by Opt("GUIDataSeparatorChar"))
; Return Value(s): Success  1D-array of values from $Set1 that do not occur in $Set2,
;                           or 0 if there are no such values
;                  Failure  -1, @error = 1: a set given as an array is not a 1D-array
; Note:            Comparison is case-sensitive and type-sensitive, i.e. number 9 is different from string '9'!
; Author(s):       BugFix (bugfix@autoit.de) Modified by ParoXsitiC for Faster _Diff (Formerly _GetIntersection)
;=================================================
Func _Diff(ByRef $Set1, ByRef $Set2, $GetAll = 0, $Delim = Default)
    Local $o1 = ObjCreate("System.Collections.ArrayList")
    Local $o2 = ObjCreate("System.Collections.ArrayList")
    Local $oDiff1 = ObjCreate("System.Collections.ArrayList")
    Local $tmp, $i
    If $GetAll <> 1 Then $GetAll = 0
    If $Delim = Default Then $Delim = Opt("GUIDataSeparatorChar")
    ; Load $Set1 into $o1 (single value, delimited string, or 1D-array)
    If Not IsArray($Set1) Then
        If Not StringInStr($Set1, $Delim) Then
            $o1.Add($Set1)
        Else
            $tmp = StringSplit($Set1, $Delim, 1)
            For $i = 1 To UBound($tmp) - 1
                $o1.Add($tmp[$i])
            Next
        EndIf
    Else
        If UBound($Set1, 0) > 1 Then Return SetError(1, 0, -1)
        For $i = 0 To UBound($Set1) - 1
            $o1.Add($Set1[$i])
        Next
    EndIf

    ; Load $Set2 into $o2 the same way
    If Not IsArray($Set2) Then
        If Not StringInStr($Set2, $Delim) Then
            $o2.Add($Set2)
        Else
            $tmp = StringSplit($Set2, $Delim, 1)
            For $i = 1 To UBound($tmp) - 1
                $o2.Add($tmp[$i])
            Next
        EndIf
    Else
        If UBound($Set2, 0) > 1 Then Return SetError(1, 0, -1)
        For $i = 0 To UBound($Set2) - 1
            $o2.Add($Set2[$i])
        Next
    EndIf

    ; One pass: keep values missing from set 2, skipping repeats unless $GetAll is set
    For $tmp In $o1
        If Not $o2.Contains($tmp) And ($GetAll Or Not $oDiff1.Contains($tmp)) Then $oDiff1.Add($tmp)
    Next

    If $oDiff1.Count <= 0 Then Return 0

    Local $aOut[$oDiff1.Count]
    $i = 0
    For $tmp In $oDiff1
        $aOut[$i] = $tmp
        $i += 1
    Next
    Return $aOut
EndFunc   ;==>_Diff
```
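The single-pass condition in that loop can also be illustrated without COM objects. The sketch below is only an illustration of the same filter on plain arrays, not a replacement for the posted function: `_DiffSketch` and `_InArray` are hypothetical names, it accepts arrays only, and it compares with `==`, which converts both sides to strings. That means number 9 and string "9" match here, which, going by the note in the function header, is not how the ArrayList version behaves.

```autoit
; Illustrative only: plain-array version of the one-pass filter.
; _DiffSketch and _InArray are hypothetical helpers, not part of the posted code.
Local $aA[8] = [1, 2, 3, 4, 5, 6, 5, 6]
Local $aB[4] = [1, 2, 3, 4]
Local $aUnique = _DiffSketch($aA, $aB)   ; each difference once: 5, 6
Local $aAll = _DiffSketch($aA, $aB, True) ; duplicates kept: 5, 6, 5, 6

Func _DiffSketch(Const ByRef $a1, Const ByRef $a2, $bGetAll = False)
    If UBound($a1) = 0 Then Return 0
    Local $aOut[UBound($a1)], $iCount = 0
    For $i = 0 To UBound($a1) - 1
        ; Skip values that occur in the second array
        If _InArray($a2, $a1[$i], UBound($a2)) Then ContinueLoop
        ; Unless all occurrences are wanted, skip values already collected
        If Not $bGetAll And _InArray($aOut, $a1[$i], $iCount) Then ContinueLoop
        $aOut[$iCount] = $a1[$i]
        $iCount += 1
    Next
    If $iCount = 0 Then Return 0 ; same "no differences" convention as _Diff
    ReDim $aOut[$iCount]
    Return $aOut
EndFunc   ;==>_DiffSketch

Func _InArray(Const ByRef $a, $vValue, $iLimit)
    ; Linear search over the first $iLimit elements (string comparison, case-sensitive)
    For $i = 0 To $iLimit - 1
        If $a[$i] == $vValue Then Return True
    Next
    Return False
EndFunc   ;==>_InArray
```

Because the result can be either an array or 0, callers would check it with `IsArray` before indexing.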


> If you are just chopping up BugFix's routine to return only the elements in set1 that do not exist in set2, in order to speed it up, then I think you were too conservative with your scalpel. Doesn't this return the same result in considerably less time?

Actually, my understanding was that _ArrayDiff is used a lot, so I wanted this function to take similar parameters: values that exist in the second array are stripped from the first, and a 1D array is returned. The original function did more than that and returned a 2D array. That makes it better for more widespread use, but if you just want the difference, rather than two differences and a union, this function may help you out.

Thanks for your contributions, they actually did help quite a bit. Here are the updated benchmarks; _Diff2 is your version.

```
;~ Small vs Big
;~ 462  : _Diff
;~ 727  : _ArrayDiff
;~ 1730 : _Array1PullCommon
;~ 488  : _Diff2

;~ Big vs Big
;~ 6897 : _Diff
;~ 0    : _ArrayDiff
;~ 3280 : _Array1PullCommon
;~ 4254 : _Diff2

;~ Big vs Small
;~ 3159 : _Diff
;~ 1895 : _ArrayDiff
;~ 1693 : _Array1PullCommon
;~ 2552 : _Diff2
```
