wraithdu

_ArrayUnique - Proposed change to standard UDF


wraithdu

I wanted to use this function today, so I looked at how it was written to see whether it could be optimized a bit (I was reasonably sure it could). Most UDF functions are written for general use, and narrowing the parameters can lead to performance gains. That mattered here because I knew I would be dealing with fairly large arrays (10K+ elements).

Well, I really didn't like the way it was done. It built a temp array with _ArrayAdd (poor performance) and converted the input array to strings, losing the original data types on return. At first I rewrote it using arrays, but without the penalty of _ArrayAdd or losing the original data. I got a modest performance gain... actually a lot less than I was expecting. Then I had an idea and rewrote it using a Scripting.Dictionary object. Holy crap. It is ridiculously fast now.
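
For anyone who has not used a dictionary for this, here is a minimal sketch of the idea on its own. It is only an illustration, not the proposed UDF: the sample array and variable names are made up, and it assumes a Windows system where the Scripting.Dictionary COM object is available.

```autoit
; Illustration only: keep each distinct value once by using it as a dictionary key.
Local $aIn[6] = ["apple", "pear", "apple", 3, 3, "pear"] ; hypothetical sample data
Local $oD = ObjCreate("Scripting.Dictionary")
If Not IsObj($oD) Then Exit 1 ; COM object not available

For $i = 0 To UBound($aIn) - 1
    ; Exists() is a hashed lookup, so nothing already collected has to be rescanned
    If Not $oD.Exists($aIn[$i]) Then $oD.Add($aIn[$i], 0)
Next

; Keys() hands the distinct values back as an array
Local $aOut = $oD.Keys()
For $i = 0 To UBound($aOut) - 1
    ; VarGetType() shows whether the element came back as a string or a number
    ConsoleWrite(VarGetType($aOut[$i]) & ": " & $aOut[$i] & @CRLF)
Next
```

The sketch uses Exists()/Add() because that is the most explicit way to show the "seen before?" test; any other way of creating the key gives the same set of keys.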

I wanted to post here to get some opinions and a few more eyes on it before submitting it as a change to the library function. The only major difference (aside from retaining the original data types) is that it no longer returns the count in $arr[0]. I think the count is superfluous, and it requires additional processing if you use the array for something else after unique-ing (for example, I wanted to shuffle it). That is what UBound() is for anyway.

_ArrayUnique

; #FUNCTION# ====================================================================================================================
; Name...........: _ArrayUnique
; Description ...: Returns the unique elements of a 1D array, or of one column of a 2D array.
; Syntax.........: _ArrayUnique($aArray[, $iDimension = 1[, $iIdx = 0[, $iCase = 0[, $iFlags = 1]]]])
; Parameters ....: $aArray    - Input array (1D or 2D only)
;                 $iDimension  - [optional] The dimension of the array to process (only valid for 2D arrays)
;                 $iIdx     - [optional] Index at which to start scanning the input array
;                 $iCase       - [optional] Flag to indicate if string comparisons should be case sensitive
;                                | 0 - case insensitive
;                                | 1 - case sensitive
;                 $iFlags     - [optional] Set of flags, added together
;                                | 1 - Return the array count in element [0]
; Return values .: Success    - Returns a 1-dimensional array containing only the unique elements of the input array / dimension
;                 Failure     - Returns 0 and sets @error:
;                                | 1 - Input is not an array
;                                | 2 - Arrays greater than 2 dimensions are not supported
;                                | 3 - $iDimension is out of range
;                                | 4 - $iIdx is out of range
; Author ........: SmOke_N
; Modified.......: litlmike, Erik Pilsits
; Remarks .......:
; Related .......: _ArrayMax, _ArrayMin
; Link ..........:
; Example .......: Yes
; ===============================================================================================================================
Func __ArrayUnique(Const ByRef $aArray, $iDimension = 1, $iIdx = 0, $iCase = 0, $iFlags = 1)
    ; Check to see if it is valid array
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)
    Local $iDims = UBound($aArray, 0)
    If $iDims > 2 Then Return SetError(2, 0, 0)
    ;
    ; checks the given dimension is valid
    If ($iDimension <= 0) Or (($iDims = 1) And ($iDimension > 1)) Or (($iDims = 2) And ($iDimension > UBound($aArray, 2))) Then Return SetError(3, 0, 0)
    ; make $iDimension an array index, note this is ignored for 1D arrays
    $iDimension -= 1
    ;
    ; check $iIdx
    If ($iIdx < 0) Or ($iIdx >= UBound($aArray)) Then Return SetError(4, 0, 0)
    ;
    ; create dictionary
    Local $oD = ObjCreate("Scripting.Dictionary")
    ; compare mode for strings
    ; 0 = binary, which is case sensitive
    ; 1 = text, which is case insensitive
    ; this expression forces either 1 or 0
    $oD.CompareMode = Number(Not $iCase)
    ;
    Local $vElem
    ; walk the input array
    For $i = $iIdx To UBound($aArray) - 1
        If $iDims = 1 Then
            ; 1D array
            $vElem = $aArray[$i]
        Else
            ; 2D array
            $vElem = $aArray[$i][$iDimension]
        EndIf
        ; add key to dictionary
        ; NOTE: accessing the value (.Item property) of a key that doesn't exist creates the key :)
        ; keys are guaranteed to be unique
        $oD.Item($vElem)
    Next
    ;
    ; return the array of unique keys
    If BitAND($iFlags, 1) = 1 Then
        Local $aTemp = $oD.Keys()
        _ArrayInsert($aTemp, 0, $oD.Count)
        Return $aTemp
    Else
        Return $oD.Keys()
    EndIf
EndFunc   ;==>__ArrayUnique

Example

#include <Array.au3>

; i had to cap this at 10000 because the original function is so slow
; my version can do 500000 in about ~6.5 seconds
$z = 10000
Local $a[$z]
For $i = 0 To $z-1
    $a[$i] = Random(0, Int($z/2), 1)
Next
;
; don't forget this array has the count returned in $b[0]
; so it will have an extra member
$t = TimerInit()
$b = _ArrayUnique($a)
ConsoleWrite(TimerDiff($t) / 1000 & @CRLF)
_ArrayDisplay($b)
;
$t = TimerInit()
$b = __ArrayUnique($a)
ConsoleWrite(TimerDiff($t) / 1000 & @CRLF)
_ArrayDisplay($b)
;
; no counter in $b[0] here
$t = TimerInit()
$b = __ArrayUnique($a, 1, 0, 0, 0)
ConsoleWrite(TimerDiff($t) / 1000 & @CRLF)
_ArrayDisplay($b)
Edited by wraithdu

UEZ

I had a similar idea last year, using code that is faster than the Scripting.Dictionary object method for 1D arrays (thanks to Yashied).

Br,

UEZ

Edited by UEZ

Please don't send me any personal message and ask for support! I will not reply!

Selection of finest graphical examples at Codepen.io

The own fart smells best!
Her 'sikim hıyar' diyene bir avuç tuz alıp koşma!
¯\_(ツ)_/¯  ٩(●̮̮̃•̃)۶ ٩(-̮̮̃-̃)۶ૐ

UEZ

Here is a benchmark comparing your version and mine:

#include <Array.au3>

Global $aNames[10] = ["Antonia", "Anton", "Caesar", "Dora", "Emil", "Friedrich", "Gustav", "Heinrich", "Ida", "Julius"]

Global $a[1000000]
For $I = 0 To Ubound($a) - 1
    $r = Random(0, 9, 1)
    $a[$I] = $aNames[$r]
Next

ConsoleWrite("Benchmark 1D array:" & @LF)
; no counter in $b[0] here
$t = TimerInit()
$b = ArrayUnique($a)
ConsoleWrite("UEZ: " & Round(TimerDiff($t) / 1000, 4) & " seconds" & @CRLF)
_ArraySort($b)
_ArrayDisplay($b)

; no counter in $b[0] here
$t = TimerInit()
$b = __ArrayUnique($a)
ConsoleWrite("wraithdu: " & Round(TimerDiff($t) / 1000, 4) & " seconds" & @CRLF)
_ArraySort($b)
_ArrayDisplay($b)

Exit
ConsoleWrite(@LF & "Benchmark 2D array:" & @LF)
Global $aNames[10][2] = [["Antonia", ""], ["Anton", ""], ["Caesar", 300], ["Dora", 24], ["Emil", 33], ["Friedrich", 57], ["Gustav", 53], ["Heinrich", 34], ["Ida", 13], ["Julius", 77]]
Global $a[1000000][2]
For $I = 0 To Ubound($a) - 1
    $r = Random(0, 9, 1)
    $a[$I][0] = $aNames[$r][0]
    $a[$I][1] = $aNames[$r][1]
Next

; no counter in $b[0] here
$t = TimerInit()
$b = ArrayUnique($a)
ConsoleWrite("UEZ: " & Round(TimerDiff($t) / 1000, 4) & " seconds" & @CRLF)
_ArraySort($b)
_ArrayDisplay($b)

;
; no counter in $b[0] here
$t = TimerInit()
$b = __ArrayUnique($a, 2)
ConsoleWrite("wraithdu: " & Round(TimerDiff($t) / 1000, 4) & " seconds" & @CRLF)
_ArraySort($b)
_ArrayDisplay($b)


; #FUNCTION# ====================================================================================================================
; Name...........: _ArrayUnique
; Description ...: Returns the unique elements of a 1D array, or of one column of a 2D array.
; Syntax.........: _ArrayUnique($aArray[, $iDimension = 1[, $iIdx = 0[, $iCase = 0]]])
; Parameters ....: $aArray    - Input array (1D or 2D only)
;                 $iDimension  - [optional] The dimension of the array to process (only valid for 2D arrays)
;                 $iIdx     - [optional] Index at which to start scanning the input array
;                 $iCase       - [optional] Flag to indicate if string comparisons should be case sensitive
;                                | 0 - case insensitive
;                                | 1 - case sensitive
; Return values .: Success    - Returns a 1-dimensional array containing only the unique elements of the input array / dimension
;                 Failure     - Returns 0 and sets @error:
;                                | 1 - Input is not an array
;                                | 2 - Arrays greater than 2 dimensions are not supported
;                                | 3 - $iDimension is out of range
;                                | 4 - $iIdx is out of range
; Author ........: SmOke_N
; Modified.......: litlmike, Erik Pilsits
; Remarks .......:
; Related .......: _ArrayMax, _ArrayMin
; Link ..........:
; Example .......: Yes
; ===============================================================================================================================
Func __ArrayUnique($aArray, $iDimension = 1, $iIdx = 0, $iCase = 0)
    ; Check to see if it is valid array
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)
    Local $iDims = UBound($aArray, 0)
    If $iDims > 2 Then Return SetError(2, 0, 0)
    ;
    ; checks the given dimension is valid
    If ($iDimension <= 0) Or (($iDims = 1) And ($iDimension > 1)) Or (($iDims = 2) And ($iDimension > UBound($aArray, 2))) Then Return SetError(3, 0, 0)
    ; make $iDimension an array index, note this is ignored for 1D arrays
    $iDimension -= 1
    ;
    ; check $iIdx
    If ($iIdx < 0) Or ($iIdx >= UBound($aArray)) Then Return SetError(4, 0, 0)
    ;
    ; create dictionary
    Local $oD = ObjCreate("Scripting.Dictionary")
    ; compare mode for strings
    ; 0 = binary, which is case sensitive
    ; 1 = text, which is case insensitive
    ; this expression forces either 1 or 0
    $oD.CompareMode = Number(Not $iCase)
    ;
    Local $vElem
    ; walk the input array
    For $i = $iIdx To UBound($aArray) - 1
        If $iDims = 1 Then
            ; 1D array
            $vElem = $aArray[$i]
        Else
            ; 2D array
            $vElem = $aArray[$i][$iDimension]
        EndIf
        ; add key to dictionary
        ; NOTE: accessing the value (.Item property) of a key that doesn't exist creates the key :)
        ; keys are guaranteed to be unique
        $oD.Item($vElem)
    Next
    ;
    ; return the array of unique keys
    Return $oD.Keys()
EndFunc   ;==>__ArrayUnique

; #FUNCTION# ============================================================================
; Name.............:    ArrayUnique
; Description ...:  Returns the Unique Elements of a 1-dimensional or 2-dimensional array.
; Syntax...........:    ArrayUnique($aArray[, $iBase = 0[, $oBase = 0]])
; Parameters ...:   $aArray - The Array to use
;                           $iBase  - [optional] Is the input Array 0-base or 1-base index.  0-base by default
;                           $oBase  - [optional] Is the output Array 0-base or 1-base index.  0-base by default
; Return values:    Success - Returns a 1-dimensional or 2-dimensional array containing only the unique elements
;                           Failure - Returns 0 and Sets @Error:
;                           0 - No error.
;                           1 - Returns 0 if parameter is not an array.
;                           2 - Array has more than 2 dimensions
;                           3 - Array is already unique
;                           4 - when source array is selected as one base but UBound(array) - 1 <> array[0] / array[0][0]
;                           5 - Scripting.Dictionary cannot be created for 1D array unique code
; Author .........:     UEZ 2010 for 2D-array, Yashied for 1D-array (modified by UEZ)
; Version ........:     0.96 Build 2010-11-20 Beta
; =======================================================================================
Func ArrayUnique($aArray, $iBase = 0, $oBase = 0)
    If Not IsArray($aArray) Then Return SetError(1, 0, 0) ;not an array
    If UBound($aArray, 0) > 2 Then Return SetError(2, 0, 0) ;array is greater than a 2D array
    If UBound($aArray) = $iBase + 1 Then Return SetError(3, 0, $aArray) ;array is already unique because of only 1 element
    Local $dim = UBound($aArray, 2), $i
    If $dim Then ;2D array
        If $iBase And UBound($aArray) - 1 <> $aArray[0][0] Then Return SetError(4, 0, 0)
        Local $oD = ObjCreate('Scripting.Dictionary')
        If @error Then Return SetError(5, 0, 0)
        Local $i, $j, $k = $oBase, $l, $s, $aTmp, $flag, $sSep = Chr(01)
        Local $aUnique[UBound($aArray)][$dim]
        If Not $oBase Then $flag = 2
        For $i =  $iBase To UBound($aArray) - 1
            For $j = 0 To $dim - 1
                $s &= $aArray[$i][$j] & $sSep
            Next
            If Not $oD.Exists($s) And StringLen($s) > 3 Then
                $oD.Add($s, $i)
                $aTmp = StringSplit(StringTrimRight($s, 1), $sSep, 2)
                For $l = 0 To $dim - 1
                    $aUnique[$k][$l] = $aTmp[$l]
                Next
                $k += 1
            EndIf
            $s = ""
        Next
        $oD.RemoveAll
        $oD = ""
        If $k > 0 Then
            If $oBase Then $aUnique[0][0] = $k - 1
            ReDim $aUnique[$k][$dim]
        Else
            ReDim $aUnique[1][$dim]
        EndIf
    Else ;1D array
        If $iBase And UBound($aArray) - 1 <> $aArray[0] Then Return SetError(4, 0, 0)
        Local $sData = '', $sSep = ChrW(160), $flag
        For $i = $iBase To UBound($aArray) - 1
            If Not IsDeclared($aArray[$i] & '$') Then
                Assign($aArray[$i] & '$', 0, 1)
                $sData &= $aArray[$i] & $sSep
            EndIf
        Next
        If Not $oBase Then $flag = 2
        Local $aUnique = StringSplit(StringTrimRight($sData, 1), $sSep, $flag)
    EndIf
    Return SetError(0, 0, $aUnique)
EndFunc   ;==>ArrayUnique

1D array:

UEZ: 2.3242 seconds

wraithdu: 6.3211 seconds

What about a 2D array? As far as I can see, your version only considers the rows in a 2D array, or am I wrong?

Br,

UEZ

Edited by UEZ

guinness

Thanks wraithdu. I rarely use the Array UDF, opting instead for custom functions I have in my arsenal.

Benchmark 1D array:

UEZ: 3.2961 seconds

wraithdu: 16.2358 seconds


AZJIO

UEZ

Try it with strings that contain "[":

Global $aNames[10] = ["Antonia", "Ant[on", "Cae[sar", "Dor[a", "Emil", "Frie[drich", "Gus[tav", "Heinrich", "Ida", "Julius"]
Global $a[1000]

#include <Array.au3>
Dim $arr1[5] = [1,2,3,4,2]
$a=_ArrayUnique2($arr1)
_ArrayDisplay($a, 'Array')
Dim $arr1[5] = [4,2,3,4,2]
$a=_ArrayUnique2($arr1, 1)
_ArrayDisplay($a, 'Array')
$a=_ArrayUnique2('er|df|er')
_ArrayDisplay($a, 'Array')
$a=_ArrayUnique2('er,df,er', ',')
_ArrayDisplay($a, 'Array')

; ===============================================================================================================================
; Description ...: Finds and removes duplicates in the data
; Syntax.........: _ArrayUnique2($data[, $flag=-1])
; Parameter 1 ...: $data - the data, an array or a delimited string
; Parameter 2 ...: $flag
;      If an array, $flag is the array index from which to start searching
;      If a string, $flag is the delimiter, "|" by default
; Returns .......: Success - array without duplicates
;      Failure - 0 and @error=1
; Author ........: AZJIO
; Remarks .......: The data must not contain the "[" character; such items are dropped from the array even
;     if they are not duplicates. Other special characters and letters do not cause errors
; ===============================================================================================================================
Func _ArrayUnique2($data, $flag=-1)
Local $k, $i, $tmp
Assign('/', 1, 1) ;to exclude empty strings and avoid clashes with local variables
If IsArray($data) Then
  If $flag=-1 Then $flag=0
  $tmp=UBound($data) -1
  If $flag>$tmp Then Return SetError(1, 0, 0)
  $k=0
  For $i = $flag To $tmp
   Assign($data[$i]&'/', Eval($data[$i]&'/')+1, 1)
   If Eval($data[$i]&'/') = 1 Then
    $data[$k]=$data[$i]
    $k+=1
   EndIf
  Next
  If $k = 0 Then Return SetError(1, 0, 0)
  ReDim $data[$k]
  Return $data
Else
  If $flag=-1 Then $flag='|'
  $data=StringSplit($data, $flag)
  If Not @error Then
   $k=0
   For $i = 1 To $data[0]
    Assign($data[$i]&'/', Eval($data[$i]&'/')+1, 1)
    If Eval($data[$i]&'/') = 1 Then
     $data[$k]=$data[$i]
     $k+=1
    EndIf
   Next
   If $k = 0 Then Return SetError(1, 0, 0)
   ReDim $data[$k]
   Return $data
  Else
   Return SetError(1, 0, 0)
  EndIf
EndIf
EndFunc

wraithdu

Here a benchmark with your and my version:

So the problem I see with this, as with the current UDF version, is that data type is now lost in the return array, since you are converting everything (implicitly) to a string. Now, I can't 100% defend my implementation either, since it only works with strings and numbers, not pointers, HWND's, or binary types.

However your Assign method does give me an idea to modify my first try at a rewrite.

What about a 2D array? Your version considers only the rows in a 2D array as far as I can see or?

I was simply copying the existing functionality for the 2D arrays. I don't care for it per se, but I didn't have a reason or suggestion to modify it to operate differently.

Edited by wraithdu

wraithdu

I had a similar idea last year, using code that is faster than the Scripting.Dictionary object method for 1D arrays (thanks to Yashied).

Is there a reason you're using Assign for 1D arrays and not 2D?

wraithdu

Assign is interesting... and very fast. Here's an adaptation of my first idea, which keeps the original data types.

Func __ArrayUnique3($aArray, $iDimension = 1, $iIdx = 0, $iCase = 0)
    ; Check to see if it is valid array
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)
    Local $iDims = UBound($aArray, 0)
    If $iDims > 2 Then Return SetError(2, 0, 0)
    ;
    ; checks the given dimension is valid
    If ($iDimension <= 0) Or (($iDims = 2) And ($iDimension > UBound($aArray, 2))) Then Return SetError(3, 0, 0)
    ; make $iDimension an array index, note this is ignored for 1D arrays
    $iDimension -= 1
    ; create return array and element counter
    Local $aReturn[UBound($aArray)], $iUnique = 0, $vElem
    ; walk the input array
    For $i = $iIdx To UBound($aArray) - 1
        If $iDims = 1 Then
            ; 1D array
            $vElem = $aArray[$i]
        ElseIf $iDims = 2 Then
            ; 2D array
            $vElem = $aArray[$i][$iDimension]
        EndIf
        ; use Assign/IsDeclared as a "seen" set: add the value if its name is not declared yet
        If Not IsDeclared($vElem & "$") Then
            Assign($vElem & "$", 0, 1)
            $aReturn[$iUnique] = $vElem
            $iUnique += 1
        EndIf
    Next
    ;
    ; redim the output array
    ReDim $aReturn[$iUnique]
    Return $aReturn
EndFunc   ;==>__ArrayUnique3

The question here is if a string representation of any data type is guaranteed to be unique. The answer unfortunately is no. Both numeric 0 and string "0" are the same when compared as strings, thus the Assign call will not see the difference. Whereas the scripting dictionary does make that distinction. Another wrinkle is case sensitivity. Assign / IsDeclared seem to be case insensitive.

This method is great though when your entire array is guaranteed to be the same data type (which, let's face it, should really always be the case) and you don't care about case sensitivity. It also works for some of the other AutoIt data types that have conversions to strings, such as Ptr, Hwnd, and Binary.
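
To make the two caveats above concrete, here is a small sketch. It is only an illustration under a few assumptions: the function name and the "u_" name prefix are made up (the prefix is used instead of the "$" suffix simply to keep the generated names plain identifiers), and the dictionary part assumes the Scripting.Dictionary COM object is available.

```autoit
_CollisionDemo() ; hypothetical helper, for illustration only

Func _CollisionDemo()
    Local $vNum = 0, $sStr = "0"

    ; Assign() builds the variable name from text, so numeric 0 and string "0" both become "u_0"
    Assign("u_" & $vNum, 0, 1) ; flag 1 = force creation in local scope
    ConsoleWrite('string "0" already seen: ' & (IsDeclared("u_" & $sStr) <> 0) & @CRLF)

    ; AutoIt variable names are case-insensitive, so "abc" and "ABC" map to the same name
    Assign("u_abc", 0, 1)
    ConsoleWrite('"ABC" already seen: ' & (IsDeclared("u_ABC") <> 0) & @CRLF)

    ; For comparison: count the keys when 0 and "0" are both offered to a dictionary
    Local $oD = ObjCreate("Scripting.Dictionary")
    If IsObj($oD) Then
        $oD.Add($vNum, 0)
        If Not $oD.Exists($sStr) Then $oD.Add($sStr, 0)
        ConsoleWrite("dictionary keys: " & $oD.Count & @CRLF)
    EndIf
EndFunc   ;==>_CollisionDemo
```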

Edited by wraithdu

money

Work around

local $sVar
$sVar = Hex(StringToBinary($aArray[$i]))
If Not IsDeclared($sVar & '$') Then
     Assign($sVar & '$', 0, 1)

Func _ArrayUniqueFast(ByRef Const $aArray, ByRef $aUnique, $bCaseSensitive = True)
    ;author: Yashied taken from http://www.autoitscript.com/forum/topic/122192-arraysort-and-eliminate-duplicates/page__p__848191#entry848191
    ; fixes invalid characters/case sensitivity issues
    Local $sData = '', $sSep = ChrW(160), $sVar
    For $i = 0 To UBound($aArray) - 1
        If $bCaseSensitive Then
            ; hex-encode the raw bytes, so "a" and "A" get different names
            $sVar = Hex(StringToBinary($aArray[$i]))
        Else
            ; fold case first, so "a" and "A" share one name
            $sVar = Hex(StringToBinary(StringLower($aArray[$i])))
        EndIf
        If Not IsDeclared($sVar) Then
            Assign($sVar, 0, 1)
            $sData &= $aArray[$i] & $sSep
        EndIf
    Next
    $aUnique = StringSplit(StringTrimRight($sData, 1), $sSep)
EndFunc   ;==>_ArrayUniqueFast

Edit: Add case sensitivity flag
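
A quick way to see why the hex encoding helps is to print the generated names. This is only a sketch with made-up sample strings; it uses the same Hex(StringToBinary(...)) call as the code above.

```autoit
; Illustration only: show the variable names the workaround would generate.
Local $aSamples[3] = ["Ant[on", "anton", "Anton"] ; hypothetical sample data
For $i = 0 To UBound($aSamples) - 1
    ; StringToBinary() gives the raw bytes; Hex() renders them using only 0-9 and A-F,
    ; so characters such as "[" never reach Assign(), and upper/lower case letters
    ; end up as different byte values instead of differing only by case
    ConsoleWrite($aSamples[$i] & " -> " & Hex(StringToBinary($aSamples[$i])) & @CRLF)
Next
```

The trade-off is an extra conversion per element, so it is worth timing against the plain Assign() version on real data.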

Edited by money

UEZ

@wraithdu: you are right - I forgot to mention the limitations of my (Yashied's) version (long time ago when I played with ArrayUnique())!

Br,

UEZ

Edited by UEZ

UEZ

UEZ

Try to use "["

Global $aNames[10] = ["Antonia", "Ant[on", "Cae[sar", "Dor[a", "Emil", "Frie[drich", "Gus[tav", "Heinrich", "Ida", "Julius"]
Global $a[1000]

Yes, it will not find any duplicates when I use the strings above.

Thanks for the hint!

Br,

UEZ


UEZ

Is there a reason you're using Assign for 1D arrays and not 2D?

No. My first version also used the Scripting.Dictionary object method, and I extended it to a 2D array version. Later Yashied showed an even faster version, and I adopted it only for the 1D case.

When I find some time I will add the Assign version also for 2D arrays.

Br,

UEZ


Spiff59

The question here is if a string representation of any data type is guaranteed to be unique. The answer unfortunately is no. Both numeric 0 and string "0" are the same when compared as strings, thus the Assign call will not see the difference.

One could test the data type and tack a variable suffix onto the end of the Assign() variable. So a numeric 12 would assign a variable named "12$i", a string "12" would generate "12$s", h for handle, etc. Without testing, I can't say how much the extra processing would detract from the speed benefit of the Assign() method.

Edit: This discriminates between numeric and string, leaves the output unmodified, and seems to enjoy the speed of Yashied's Assign() method:

#include <Array.au3>
Global $array[13] = ["44", "Messerschmidt", 22, "Dornier", 33, "Heinkel", "Focke-Wulf", "Junkers", "22", "Arado", 33, "Henschel", "44"]
$array = __ArrayUnique4($array)
_ArrayDisplay($array)
Global $array[8][2] = [["Paul", 22],["Mike", 33],["Dave", 44],["Bill", 22],["Fred", 66],["Carl", 77],["Luke", 33],["John", 22]]
$array = __ArrayUnique4($array, 2)
_ArrayDisplay($array)
 
Func __ArrayUnique4($aArray, $iTargetDim = 1, $iBase = 0, $iCase = 0)
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)
    Local $iDims = UBound($aArray, 0)
    If $iDims > 2 Then Return SetError(2, 0, 0)
    If ($iTargetDim < 1) Or ($iTargetDim > $iDims) Then Return SetError(3, 0, 0)
    Local $iDim1 = UBound($aArray, 1), $iUnique = 0, $vElem
    If $iDims = 2 Then
        Local $iDim2 = UBound($aArray, 2), $aReturn[$iDim1][$iDim2], $j
        $iTargetDim -= 1
        For $i = $iBase To $iDim1 - 1
            $vElem = $aArray[$i][$iTargetDim] & "$" & IsNumber($aArray[$i][$iTargetDim])
            If Not IsDeclared($vElem) Then
                Assign($vElem, 0, 1)
                For $j = 0 To $iDim2 - 1
                    $aReturn[$iUnique][$j] = $aArray[$i][$j]
                Next
                $iUnique += 1
            EndIf
        Next
        ReDim $aReturn[$iUnique][$iDim2]
    Else
        Local $aReturn[$iDim1]
        For $i = $iBase To $iDim1 - 1
            $vElem = $aArray[$i] & "$" & IsNumber($aArray[$i])
            If Not IsDeclared($vElem) Then
                Assign($vElem, 0, 1)
                $aReturn[$iUnique] = $aArray[$i]
                $iUnique += 1
            EndIf
        Next
        ReDim $aReturn[$iUnique]
    EndIf
    Return $aReturn
EndFunc   ;==>__ArrayUnique4

It could probably still stand something like Money's mod to handle special chars and case.

Edited by Spiff59

Spiff59

The Assign/IsDeclared method does continue to have a bit of a foul smell to me, as I'm sure it's something any instructor or manager I've ever had would shoot down as non-standard trickery that has no place in production.

To comment directly on the first post. Your compound edit for error condition 3 seems odd... "And $iDimension > UBound($aArray, 2)"?

Wouldn't "If ($iDimension < 1) Or ($iDimension > $iDims) Then Return SetError(3, 0, 0)" be sufficient?

Is no Scripting.Dictionary/object clean-up necessary? Like in the 2 lines in the examples in UEZ's thread from last year?

$oD.RemoveAll
$oD = ""

And, I think whether you include the array count in the first element or not makes little difference to anyone, except that in changing it to not do so makes your version a script-breaker, which I would think you would want to avoid.

Edit: remove garbage inside the codetags...

Edited by Spiff59

wraithdu

To comment directly on the first post. Your compound edit for error condition 3 seems odd... "And $iDimension > UBound($aArray, 2)"?

Wouldn't "If ($iDimension < 1) Or ($iDimension > $iDims) Then Return SetError(3, 0, 0)" be sufficient?

$iDimension is I guess a poor choice of variable name (again, copied from the original). It really refers to the index in the 2nd dimension of a 2D array, from which you want to pull data. So while $iDims (the number of array dimensions, 1D, 2D, 3D...) cannot be greater than 2, $iDimension can be anything <= the number of 'columns' in the 2nd dimension.
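
As a small illustration of the difference (the array here is made up), UBound() reports both numbers, and they are not the same thing:

```autoit
; Illustration only: a hypothetical 4-row, 3-column array.
Local $aTable[4][3]
ConsoleWrite("dimensions: " & UBound($aTable, 0) & @CRLF) ; what $iDims holds
ConsoleWrite("rows: " & UBound($aTable, 1) & @CRLF)
ConsoleWrite("columns: " & UBound($aTable, 2) & @CRLF)    ; upper limit for $iDimension
```

So for this array a $iDimension of 3 is valid even though the array only has 2 dimensions, which is why the check compares against UBound($aArray, 2) rather than $iDims.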

Is no Scripting.Dictionary/object clean-up necessary? Like in the 2 lines in the examples in UEZ's thread from last year?

Nope. Objects are automatically cleaned up when they go out of scope.

And, I think whether you include the array count in the first element or not makes little difference to anyone, except that in changing it to not do so makes your version a script-breaker, which I would think you would want to avoid.

While I would usually agree, I really hate this counter return. I wrote this update originally because I wanted to unique the array then shuffle it. Having to get rid of the counter before shuffling is just annoying. Plus, if I pass an array to a function that returns a modified version of my array, I expect no extraneous data. StringSplit is a little different (and I'm glad you have the option of the counter return) since you submit a string and get an array back.

If this goes so far as to be submitted, we can revisit this as an option I suppose... but there would be an expensive ReDim somewhere to get rid of or add that [0] array element. Maybe, to support backwards compatibility, have the counter returned as the default option, but have the code add it with a call to _ArrayInsert (so sensible people like me aren't penalized).

Edited by wraithdu

Spiff59

$iDimension is...

Oops, got my own wires crossed on the parm edit.

Looks like you'd have to revert to UEZ's use of the Exists method in order to keep a counter and return the array count in element 0:

Func __ArrayUnique($aArray, $iDimension = 1, $iIdx = 0, $iCase = 0)
    ; Check to see if it is valid array
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)
    Local $iDims = UBound($aArray, 0)
    If $iDims > 2 Then Return SetError(2, 0, 0)
    ;
    ; checks the given dimension is valid
    If ($iDimension <= 0) Or (($iDims = 1) And ($iDimension > 1)) Or (($iDims = 2) And ($iDimension > UBound($aArray, 2))) Then Return SetError(3, 0, 0)
    ; make $iDimension an array index, note this is ignored for 1D arrays
    $iDimension -= 1
    ;
    ; check $iIdx
    If ($iIdx < 0) Or ($iIdx >= UBound($aArray)) Then Return SetError(4, 0, 0)
    ;
    ; create dictionary
    Local $oD = ObjCreate("Scripting.Dictionary")
    ; compare mode for strings
    ; 0 = binary, which is case sensitive
    ; 1 = text, which is case insensitive
    ; this expression forces either 1 or 0
    $oD.CompareMode = Number(Not $iCase)
    ;
    Local $vElem, $iUnique = 0
    ; reserve the first key as a placeholder for the count
    $oD.Item(Chr(0))
    ; walk the input array
    For $i = $iIdx To UBound($aArray) - 1
        If $iDims = 1 Then
            ; 1D array
            $vElem = $aArray[$i]
        Else
            ; 2D array
            $vElem = $aArray[$i][$iDimension]
        EndIf
        If Not $oD.Exists($vElem) Then
            ; referencing a missing key adds it to the dictionary
            $oD.Item($vElem)
            $iUnique += 1
        EndIf
    Next
    ; rename the placeholder key to the count
    $oD.Key(Chr(0)) = $iUnique
    ;
    ; return the array of unique keys
    Return $oD.Keys()
EndFunc   ;==>__ArrayUnique

Edit: I do have to say that the Assign() trick IS blazingly fast, and it looks like the kinks could be worked out of it (post #14 does handle data types). But would the devs consider it misuse of the language? And I still think that edit looks fishy lol

Edited by Spiff59

wraithdu

Looks like you'd have to revert to UEZ's use of the Exists method in order to keep a counter and return the array count in element 0:

Nope, just assign the return array from the dictionary ($oD.Keys) to a temp variable, do a UBound() to get the size, insert it as element [0], and return the temp var.
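A minimal sketch of that idea, assuming a hypothetical helper named _PrependCount (it is not part of Array.au3). It copies the keys into a new array one element longer instead of calling _ArrayInsert, so it does not depend on any particular version of the UDF library:

```autoit
; Hypothetical helper, not part of the standard UDFs.
; Takes the array of unique keys (e.g. the result of $oD.Keys()) and
; returns a copy with the element count stored in element [0].
Func _PrependCount($aKeys)
    Local $iCount = UBound($aKeys)
    Local $aOut[$iCount + 1]
    $aOut[0] = $iCount
    ; shift every key up by one position
    For $i = 0 To $iCount - 1
        $aOut[$i + 1] = $aKeys[$i]
    Next
    Return $aOut
EndFunc   ;==>_PrependCount

; Usage: a literal array stands in for the dictionary's key array
Local $aUnique[3] = ["a", "b", "c"]
Local $aWithCount = _PrependCount($aUnique)
ConsoleWrite($aWithCount[0] & @CRLF)
```

The copy costs one extra pass over the unique keys, which is the penalty that only callers who want the counter would pay.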

I agree about the Assign method... it is fast but definitely an abuse of the language. The dictionary method is not all that much slower, considering the prodigious array size you have to use to see the difference (2s vs 6s @ 1mil elements in UEZ's test). The only shortcoming of the dictionary method that I see is that it only works for strings and numbers. There are surely some ugly workarounds for Ptr/Hwnd and Binary types involving VarGetType() and Number() / String(), but there's a performance penalty there as well, and I don't like the potential data mangling.

Edit:

What's this all about?

$oD.Key(Chr(0)) = $iUnique

Can that really guarantee placement at the top of the return array?

Edited by wraithdu

Spiff59

Can that really guarantee placement at the top of the return array?

Well, it certainly appears to be FIFO, so I think it ought to work fine. Whether a single nul character is unique enough to avoid conflicts with actual data passed in an array could be debated.

wraithdu

That would be LIFO, actually. But regardless, it's irrelevant. After looking at it again, you're storing the count as a value, and my function returns .Keys(). There's obviously no way to guarantee that the count is a unique key, so we're left with storing the array in a temp var and inserting the count before returning.
