Jewtus

_ArrayUnique on multiple columns

19 posts in this topic

I'm wondering if there is a UDF or a function that can get the distinct values from a 2D array using multiple columns. I'm trying setup an excel output that can be trimmed and the distinct records recorded on the excel spreadsheet.

I was messing around with _arrayunique but trying to get the distinct records for multiple columns, but I'm not really getting anywhere. Any suggestions?

This is an example of the code:

If $TrimOpt='Op1' then $aWriteVal=_ArrayUnique($AllPro,0)
If $TrimOpt='Op2' then $aWriteVal=_ArrayUnique($AllPro,1)
If $TrimOpt='Op3' then $aWriteVal=_ArrayUnique($AllPro,2)
If $TrimOpt='Op4' then $aWriteVal=_ArrayUnique($AllPro,3)
If $TrimOpt='Op5' then $aWriteVal=_ArrayUnique($AllPro,4)
If $TrimOpt='Op6' then $aWriteVal=_ArrayUnique($AllPro,5)
If $TrimOpt='Op7' then $aWriteVal=_ArrayUnique($AllPro,6)
$oExcel = ObjCreate("Excel.Application")
$oExcel.Visible = True
$oWorkbook=$oExcel.WorkBooks.Open ($filePath)
$worksheet=$oExcel.ActiveWorkbook.Worksheets("AllProcesses")
   For $g=0 to UBound($aWriteVal)-1
    $worksheet.Range("A"&$g+2).Value=$aWriteVal[$g][0]
    If UBound($aWriteVal,2) >0 then $worksheet.Range("B"&$g+2).Value=$aWriteVal[$g][1]
    If UBound($aWriteVal,2) >1 then $worksheet.Range("C"&$g+2).Value=$aWriteVal[$g][2]
    If UBound($aWriteVal,2) >2 then $worksheet.Range("D"&$g+2).Value=$aWriteVal[$g][3]
    If UBound($aWriteVal,2) >3 then $worksheet.Range("E"&$g+2).Value=$aWriteVal[$g][4]
    If UBound($aWriteVal,2) >4 then $worksheet.Range("F"&$g+2).Value=$aWriteVal[$g][5]
    If UBound($aWriteVal,2) >5 then $worksheet.Range("G"&$g+2).Value=$aWriteVal[$g][6]
    Next

Share this post


Link to post
Share on other sites



hi  Jewtus

not sure about your goal, but this simple draft function could be of help.
If you pass an 2D array loaded with whatever values, it returns that array with only unique values for each column.
(no error checking is performed if a correct array is passed to function)

#include <Array.au3>

Local $aArray[30][10]
For $i = 0 To 29
    For $x = 0 To 9
        $aArray[$i][$x] = Random(0, 50, 1)
    Next
Next
_ArrayDisplay($aArray, "Input array")

Local $aUniques = _Array2DUnique($aArray) ; returns only unique values in each column
_ArrayDisplay($aUniques, "Output array")

; this function accepts an 1D or 2D array
; and returns only unique values in each column
Func _Array2DUnique(ByRef $aSource)
    If UBound($aSource, 0) = 2 Then
        Local $aOutput[1][UBound($aSource, 2)], $aTemp
        For $iColumn = 0 To UBound($aSource, 2) - 1
            $aTemp = _ArrayUnique($aSource, $iColumn, 0, 0, 0)
            If UBound($aOutput) < UBound($aTemp) Then ReDim $aOutput[UBound($aTemp)][UBound($aSource, 2)]
            For $iRow = 0 To UBound($aTemp) - 1
                $aOutput[$iRow][$iColumn] = $aTemp[$iRow]
            Next
        Next
    Else
        Local $aOutput = _ArrayUnique($aSource)
    EndIf
    Return $aOutput
EndFunc   ;==>_Array2DUnique

small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Share this post


Link to post
Share on other sites

This is kind of what I was looking for, but I'm trying to keep each line tied together. The array is a result set from SQL so I need to make sure each line stays together but I only grab unique versions of each line.

Example:

Row1   Row2   Row3   Row4

Hello    Cruel   World   Now

Hello    Cruel   World   This

Hello    Cruel   World   Is

Hello    Cruel   World   What

Hello    Cruel   World   I

Hello    Cruel   World   Need

So if I ignore row 4 the unique record is Hello|Cruel|World 

 

The only way I have been able to make this work is perform a different SQL query dropping one column for each query, I'll just have to build 7 queries.

Share this post


Link to post
Share on other sites

So, you would remove all duplicates by checking data on only some columns and then keeping only one of those duplicated rows,
well, maybe the following function can do
just pass your array as first parameter and a string with the numbers of the columns that you want to check as comma separated values as second parameter (columns numbers are 0 based)

#include <Array.au3>

Local $aArray[6][4] = [ _
        ['Hello', 'Cruel', 'World', 'Now'], _
        ['Hello', 'Cruel', 'World', 'This'], _
        ['Hello', 'Cruel', 'World', 'Is'], _
        ['Hello', 'Cruel', 'World', 'What'], _
        ['Hello', 'Cruel', 'World', 'I'], _
        ['Hello', 'Cruel', 'World', 'Need']]

_ArrayDisplay($aArray, "Input array")
; returns only unique rows based on columns 0, 1 and 2
Local $aUniques = _ArrayUnique2D_Ex($aArray, "0,1,2")
_ArrayDisplay($aUniques, "Output array")


Func _ArrayUnique2D_Ex(ByRef $aSource, $sColumns = "*")
    ; check wanted columns
    If $sColumns = "*" Then
        Local $aColumns[UBound($aSource, 2)]
        For $i = 0 To UBound($aColumns) - 1
            $aColumns[$i] = $i
        Next
    Else
        Local $aColumns = StringSplit($sColumns, ",", 2)
    EndIf
    ; chain fields to check
    Local $aChainFileds[UBound($aSource, 1)][2]
    For $iRow = 0 To UBound($aSource, 1) - 1
        $aChainFileds[$iRow][1] = 0
        For $iField = 0 To UBound($aColumns) - 1
            $aChainFileds[$iRow][0] &= $aSource[$iRow][$aColumns[$iField]]
        Next
    Next
    ; uniqe from chain
    $aTemp = _ArrayUnique($aChainFileds, 0, 0, 0, 1) ; remove duplicate records (if any)
    Local $aUniques[UBound($aTemp)][UBound($aSource, 2)]
    $aUniques[0][0] = 0 ; pointer to next free row to fill
    If UBound($aChainFileds) <> $aTemp[0] Then ; there are some duplicate
        Local $aDuplicates[UBound($aChainFileds, 1) - $aTemp[0] + 1][UBound($aSource, 2)] ; will hold only duplicate
        $aDuplicates[0][0] = 0 ; pointer to next free row to fill

        For $iRow = 0 To UBound($aChainFileds, 1) - 1
            If Not $aChainFileds[$iRow][1] Then ; this record still not checked
                $aTemp = _ArrayFindAll($aChainFileds, $aChainFileds[$iRow][0]) ; find duplicates (if any)
                For $i = 0 To UBound($aTemp) - 1
                    $aChainFileds[$aTemp[$i]][1] = UBound($aTemp) ; mark this record as a duplicate
                Next
                $aUniques[0][0] += 1
                For $iField = 0 To UBound($aSource, 2) - 1
                    $aUniques[$aUniques[0][0]][$iField] = $aSource[$aTemp[0]][$iField]
                Next
                If UBound($aTemp) > 1 Then ; there are duplicates of this record
                    For $i = 1 To UBound($aTemp) - 1
                        $aDuplicates[0][0] += 1
                        For $iField = 0 To UBound($aSource, 2) - 1
                            $aDuplicates[$aDuplicates[0][0]][$iField] = $aSource[$aTemp[$i]][$iField]
                        Next
                    Next
                EndIf
            EndIf
        Next
        ; _ArrayDisplay($aUniques, "Those are unique elements")
        ; _ArrayDisplay($aDuplicates, "These are duplicates discarded")
    Else
        ; there are not duplicates in source array
        ; return passed array unchanged
        Return $aSource
    EndIf
    _ArrayDelete($aUniques, 0) ; remove the count row
    Return $aUniques
EndFunc   ;==>_ArrayUnique2D_Ex

small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Share this post


Link to post
Share on other sites

#5 ·  Posted (edited)

What you are explaining is also "only return the items that exist in every column"?

#include <Array.au3>

Local $aArr[6][4] = [ _
        ['Hello', 'Cruel', 'World', 'Now'], _
        ['Hello', 'Cruel', 'World', 'This'], _
        ['Hello', 'Cruel', 'World', 'Is'], _
        ['Hello', 'Cruel', 'World', 'What'], _
        ['Hello', 'Cruel', 'World', 'I'], _
        ['Hello', 'Cruel', 'World', 'Need']]

_ArrayDisplay($aArr, "Input array")
_ArrayDisplay(_ArrayCommon($aArr), "CommonValue")
_ArrayDisplay(_ArrayMultiValue($aArr), "MultiValue")

Func _ArrayCommon($aArray)

Local $aOut[1][ubound($aArray , 2)]

    for $k = 0 to ubound($aArray , 2) - 1
        $aMatch =  _ArrayFindAll($aArray , $aArray[0][$k] , 0 , 0 , 0 , 0 , $k)
        If ubound($aMatch) = UBound($aArray) Then $aOut[0][$k] = $aArray[0][$k]
    next

return $aOut

EndFunc

Func _ArrayMultiValue($aArray)

Local $aOut[ubound($aArray)][ubound($aArray , 2)]
Local $aSkip[0]

    for $i = 0 to ubound($aArray) - 1
        for $k = 0 to ubound($aArray , 2) - 1
            _ArrayFindAll($aSkip , $k)
                If @Error = 0 Then ContinueLoop
                $aMatch =  _ArrayFindAll($aArray , $aArray[$i][$k] , 0 , 0 , 0 , 0 , $k)
                    If ubound($aMatch) <> UBound($aArray) Then
                        $aOut[$i][$k] = $aArray[$i][$k]
                    Else
                        _ArrayAdd($aSkip , $k)
                    
                    EndIf
        next
    next

return $aOut

EndFunc

Edit:  Also, did the opposite and returned columns with multiple values just to learn...cleaned it up so it didnt re-search columns already determined to be all matches.

Edited by boththose

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Share this post


Link to post
Share on other sites

#6 ·  Posted (edited)

What you are explaining is also "only return the items that exist in every column"?

#include <Array.au3>

Local $aArr[6][4] = [ _
        ['Hello', 'Cruel', 'World', 'Now'], _
        ['Hello', 'Cruel', 'World', 'This'], _
        ['Hello', 'Cruel', 'World', 'Is'], _
        ['Hello', 'Cruel', 'World', 'What'], _
        ['Hello', 'Cruel', 'World', 'I'], _
        ['Hello', 'Cruel', 'World', 'Need']]

_ArrayDisplay($aArr, "Input array")
_ArrayDisplay(_ArrayCommon($aArr), "CommonValue")
_ArrayDisplay(_ArrayMultiValue($aArr), "MultiValue")

Func _ArrayCommon($aArray)

Local $aOut[1][ubound($aArray , 2)]

    for $k = 0 to ubound($aArray , 2) - 1
        $aMatch =  _ArrayFindAll($aArray , $aArray[0][$k] , 0 , 0 , 0 , 0 , $k)
        If ubound($aMatch) = UBound($aArray) Then $aOut[0][$k] = $aArray[0][$k]
    next

return $aOut

EndFunc

Func _ArrayMultiValue($aArray)

Local $aOut[ubound($aArray)][ubound($aArray , 2)]
Local $aSkip[0]

    for $i = 0 to ubound($aArray) - 1
        for $k = 0 to ubound($aArray , 2) - 1
            _ArrayFindAll($aSkip , $k)
                If @Error = 0 Then ContinueLoop
                $aMatch =  _ArrayFindAll($aArray , $aArray[$i][$k] , 0 , 0 , 0 , 0 , $k)
                    If ubound($aMatch) <> UBound($aArray) Then
                        $aOut[$i][$k] = $aArray[$i][$k]
                    Else
                        _ArrayAdd($aSkip , $k)
                    
                    EndIf
        next
    next

return $aOut

EndFunc

Edit:  Also, did the opposite and returned columns with multiple values just to learn...cleaned it up so it didnt re-search columns already determined to be all matches.

 

your _ArrayCommon function, check if a column has the same value in all the rows and in that case it returns that value

while _ArrayMultiValue returns the whole column if one or more values are different

but I think that this is not what OP need

for example if you have in input an array like this:

Local $aArray[6][4] = [ _
        ['Hello', 'Cruel', 'World', 'Now'], _
        ['Hello', 'Nice', 'World', 'This'], _
        ['Hello', 'Cruel', 'World', 'Is'], _
        ['Hello', 'Nice', 'World', 'What'], _
        ['Hello', 'Cruel', 'World', 'I'], _
        ['Hello', 'Wonder', 'World', 'Need']]

and you want to search for duplicated data only by columns 0 and 1 for example

the function _ArrayUnique2d should return something like this:

Hello    Cruel   World    Now
Hello    Nice    World    This
Hello    Wonder  World    Need

that is: when you find more rows with the same values in column 0 and 1, then return only one occurrence of the duplicated rows (remove the duplicates and keep only one)

this is a little more complicated to achieve than what your functions are intended to do.

edit:

changed array data

Edited by Chimp

small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Share this post


Link to post
Share on other sites

for example if you have in input an array like this:

Local $aArray[6][4] = [ _
        ['Hello', 'Cruel', 'World', 'Now'], _
        ['Hello', 'Nice', 'World', 'This'], _
        ['Hello', 'Cruel', 'World', 'Is'], _
        ['Hello', 'Nice', 'World', 'What'], _
        ['Hello', 'Cruel', 'World', 'I'], _
        ['Hello', 'Wonder', 'World', 'Need']]

and you want to search for duplicated data only by columns 0 and 1 for example

the function _ArrayUnique2d should return something like this:

Hello    Cruel   World    Now
Hello    Nice    World    This
Hello    Wonder  World    Need

that is: when you find more rows with the same values in column 0 and 1, then return only one occurrence of the duplicated rows (remove the duplicates and keep only one)

 

In this case for return#1 , why would row#1 be returned rather than row#3 or row#5 ?  should the non concerned columns be totally omitted ?

Share this post


Link to post
Share on other sites

In this case for return#1 , why would row#1 be returned rather than row#3 or row#5 ?  should the non concerned columns be totally omitted ?

 

The first row of many repeated is the one that is kept, not other constraints are specified by op (however, suggestions for improvement are welcome)

anyway, in my function Is also available an array with all the discarded rows ($aDuplicates), even if not returned, it can be used if needed.


small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Share this post


Link to post
Share on other sites

Ok, Let me clear this up a little because I think my first post was a little unclear.

This is what I'm trying to do.. I have a SQL query that returns a ton of duplicate data in the first few columns because of the join (its a parent child table so its a table joined to itself). I'd like to be able to use the same result query and trim it based on the column I select. So for example, If I wanted all of the records, I could use the base result. If I only wanted the first 2 columns worth of data, all of the other columns would be deleted, then the distinct records left from the first two columns would show.

I tried to using Chimps sample, but I didn't quite understand why when I ran my array against it, it would consistently only show the first row. I did however try to use Boththose _ArrayCommon function, and its much closer to what I'm trying to do. I did some a minor mod, that seems to work... sort of. My actual result array is 7 columns wide and can have duplicates in potentially any column. 

The following code works up to field 6, which is where I would like to see two results:

#include <Array.au3>

Local $aArr[7][7] = [ _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result2'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result3'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result4'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result5'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6.1','Result6'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6.1','Result7']]



_ArrayDisplay(_ArrayCommon($aArr,6), "CommonValue")


Func _ArrayCommon($aArray,$colNum='')
    If $colNum <> '' Then
        For $z=UBound($aArray,2) to $colNum step -1
            _ArrayColDelete($aArray,$z)
        Next
    EndIf
    Local $aOut[1][ubound($aArray , 2)]
        for $k = 0 to ubound($aArray , 2) - 1
            $aMatch =  _ArrayFindAll($aArray , $aArray[0][$k] , 0 , 0 , 0 , 0 , $k)
            If ubound($aMatch) = UBound($aArray) Then $aOut[0][$k] = $aArray[0][$k]
        next
    return $aOut
EndFunc

The result I'd like to see is:

['Field1','Field2','Field3','Field4','Field5','Field6']

['Field1','Field2','Field3','Field4','Field5','Field6.1']

Share this post


Link to post
Share on other sites

Why not just do another SQL query (or a couple of them) with a SELECT DISTINCT?


Roses are FF0000, violets are 0000FF... All my base are belong to you.

Share this post


Link to post
Share on other sites

#include <Array.au3>

Local $aArr[7][7] = [ _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result2'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result3'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result4'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result5'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6.1','Result6'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6.1','Result7']]

_ArrayDisplay($aArr, "Input array")
_ArrayDisplay(_ArrayCommon($aArr), "CommonValue")


Func _ArrayCommon($aArray)

Local $aOut[ubound($aArray)][ubound($aArray , 2)]
Local $o = 0

for $i = 0 to ubound($aArray) - 1
    for $k = 0 to ubound($aArray , 2) - 1
        $aMatch =  _ArrayFindAll($aArray , $aArray[$i][$k] , 0 , 0 , 0 , 0 , $k)
        If ubound($aMatch) > 1 Then
          If $o = 0 Then
              $aOut[$o][$k] = $aArray[$i][$k]
              $flag = 1
              EndIf
          If $o > 0 AND $aArray[$i][$k] <> $aOut[$o - 1][$k] Then
                for $p = 0 to ubound($aArray , 2) - 1
                    $aLineMatch =  _ArrayFindAll($aArray , $aArray[$i][$p] , $i , 0 , 0 , 0 , $p)
              If ubound($aLineMatch) > 1 Then $aOut[$o][$p] = $aArray[$i][$p]
              next
              $flag = 1
          EndIf
        EndIf
    next
            If $flag = 1 Then
                $o += 1
                $flag = 0
            EndIf
Next

return $aOut

EndFunc


,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Share this post


Link to post
Share on other sites

#12 ·  Posted (edited)

 

Why not just do another SQL query (or a couple of them) with a SELECT DISTINCT?

That, or use a where or having clause to restrict the resultset, while requesting explicitly the columns needed (instead of select * from...).

SQL in general is powerful enough for most precise queries and working out useful queries is always simpler than writing tons of application code to extract useful data from a "grab all" query. This also helps to keep your application simpler and easier to code/debug/maintain.

Edited by jchd

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

#13 ·  Posted (edited)

Ok, Let me clear this up a little because I think my first post was a little unclear.

This is what I'm trying to do.. I have a SQL query that returns a ton of duplicate data in the first few columns because of the join (its a parent child table so its a table joined to itself). I'd like to be able to use the same result query and trim it based on the column I select. So for example, If I wanted all of the records, I could use the base result. If I only wanted the first 2 columns worth of data, all of the other columns would be deleted, then the distinct records left from the first two columns would show.

I tried to using Chimps sample, but I didn't quite understand why when I ran my array against it, it would consistently only show the first row. I did however try to use Boththose _ArrayCommon function, and its much closer to what I'm trying to do. I did some a minor mod, that seems to work... sort of. My actual result array is 7 columns wide and can have duplicates in potentially any column. 

The following code works up to field 6, which is where I would like to see two results:

#include <Array.au3>

Local $aArr[7][7] = [ _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result2'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result3'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result4'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result5'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6.1','Result6'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6.1','Result7']]



_ArrayDisplay(_ArrayCommon($aArr,6), "CommonValue")


Func _ArrayCommon($aArray,$colNum='')
    If $colNum <> '' Then
        For $z=UBound($aArray,2) to $colNum step -1
            _ArrayColDelete($aArray,$z)
        Next
    EndIf
    Local $aOut[1][ubound($aArray , 2)]
        for $k = 0 to ubound($aArray , 2) - 1
            $aMatch =  _ArrayFindAll($aArray , $aArray[0][$k] , 0 , 0 , 0 , 0 , $k)
            If ubound($aMatch) = UBound($aArray) Then $aOut[0][$k] = $aArray[0][$k]
        next
    return $aOut
EndFunc

The result I'd like to see is:

['Field1','Field2','Field3','Field4','Field5','Field6']

['Field1','Field2','Field3','Field4','Field5','Field6.1']

 

it seems to me that my function returns exactly what you ask,

the only difference from your wanted output is that it returns all the columns of the input array instead of only the ones on which is performed the "Unique" check, but it can easly modified

can you try this and say if I understand well the goal?

#include <Array.au3>

Local $aArray[7][7] = [ _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result2'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result3'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result4'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result5'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6.1', 'Result6'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6.1', 'Result7']]

_ArrayDisplay($aArray, "Input array")
; returns only unique rows based on columns 0, 1 and 2
Local $aUniques = _ArrayUnique2D_Ex($aArray, "0,1,2,3,4,5")
_ArrayDisplay($aUniques, "Output array")


Func _ArrayUnique2D_Ex(ByRef $aSource, $sColumns = "*")
    ; check wanted columns
    If $sColumns = "*" Then
        Local $aColumns[UBound($aSource, 2)]
        For $i = 0 To UBound($aColumns) - 1
            $aColumns[$i] = $i
        Next
    Else
        Local $aColumns = StringSplit($sColumns, ",", 2)
    EndIf
    ; chain fields to check
    Local $aChainFileds[UBound($aSource, 1)][2]
    For $iRow = 0 To UBound($aSource, 1) - 1
        $aChainFileds[$iRow][1] = 0
        For $iField = 0 To UBound($aColumns) - 1
            $aChainFileds[$iRow][0] &= $aSource[$iRow][$aColumns[$iField]]
        Next
    Next
    ; uniqe from chain
    $aTemp = _ArrayUnique($aChainFileds, 0, 0, 0, 1) ; remove duplicate records (if any)
    Local $aUniques[UBound($aTemp)][UBound($aSource, 2)]
    $aUniques[0][0] = 0 ; pointer to next free row to fill
    If UBound($aChainFileds) <> $aTemp[0] Then ; there are some duplicate
        Local $aDuplicates[UBound($aChainFileds, 1) - $aTemp[0] + 1][UBound($aSource, 2)] ; will hold only duplicate
        $aDuplicates[0][0] = 0 ; pointer to next free row to fill

        For $iRow = 0 To UBound($aChainFileds, 1) - 1
            If Not $aChainFileds[$iRow][1] Then ; this record still not checked
                $aTemp = _ArrayFindAll($aChainFileds, $aChainFileds[$iRow][0]) ; find duplicates (if any)
                For $i = 0 To UBound($aTemp) - 1
                    $aChainFileds[$aTemp[$i]][1] = UBound($aTemp) ; mark this record as a duplicate
                Next
                $aUniques[0][0] += 1
                For $iField = 0 To UBound($aSource, 2) - 1
                    $aUniques[$aUniques[0][0]][$iField] = $aSource[$aTemp[0]][$iField]
                Next
                If UBound($aTemp) > 1 Then ; there are duplicates of this record
                    For $i = 1 To UBound($aTemp) - 1
                        $aDuplicates[0][0] += 1
                        For $iField = 0 To UBound($aSource, 2) - 1
                            $aDuplicates[$aDuplicates[0][0]][$iField] = $aSource[$aTemp[$i]][$iField]
                        Next
                    Next
                EndIf
            EndIf
        Next
        ; _ArrayDisplay($aUniques, "Those are unique elements")
        ; _ArrayDisplay($aDuplicates, "These are duplicates discarded")
    Else
        ; there are not duplicates in source array
        ; return passed array unchanged
        Return $aSource
    EndIf
    _ArrayDelete($aUniques, 0) ; remove the count row
    Return $aUniques
EndFunc   ;==>_ArrayUnique2D_Ex

edit:

p.s.

of course I agree with SadBunny and jchd form post #10 and #12. For sure with just an SQL query you can have only the data that you need,

anyway this function is just for "the fun in programming"

Edited by Chimp

small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Share this post


Link to post
Share on other sites

a small change to my _ArrayUnique2D_Ex() function that allows, by adding a parameter,

to choose if the returned array must contain all of the columns of the input array or only those columns relating to the check of "Uniqueness"
that is (for example):

$aUniques = _ArrayUnique2D_Ex($aArray, "0,1,2,3,4,5", False)
$aArray :       The input array
"0,1,2,3,4,5":  [optional] The columns you want for "uniqueness"
False/True :    [optional] True (default) returns same columns as in input (reurn all columns?)
                           False returns only the columns as in previous param

Here the modified version:

#include <Array.au3>

Local $aArray[7][7] = [ _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result2'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result3'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result4'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result5'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6.1', 'Result6'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6.1', 'Result7']]

_ArrayDisplay($aArray, "Input array")
; returns only unique rows based on columns 0, 1, 2, 3, 4, 5 and returned array contains only those columns
Local $aUniques = _ArrayUnique2D_Ex($aArray, "0,1,2,3,4,5", False)
_ArrayDisplay($aUniques, "Output array")


Func _ArrayUnique2D_Ex(ByRef $aSource, $sColumns = "*", $iReturnAllCols = True)
    ; check wanted columns
    If $sColumns = "*" Then
        Local $aColumns[UBound($aSource, 2)]
        For $i = 0 To UBound($aColumns) - 1
            $aColumns[$i] = $i
        Next
    Else
        Local $aColumns = StringSplit($sColumns, ",", 2) ; NO count in element 0
    EndIf

    ; chain fields to check
    Local $aChainFileds[UBound($aSource, 1)][2]
    For $iRow = 0 To UBound($aSource, 1) - 1
        $aChainFileds[$iRow][1] = 0
        For $iField = 0 To UBound($aColumns) - 1
            $aChainFileds[$iRow][0] &= $aSource[$iRow][$aColumns[$iField]]
        Next
    Next
    ; uniqe from chain
    $aTemp = _ArrayUnique($aChainFileds, 0, 0, 0, 1) ; remove duplicate records (if any)
    If $iReturnAllCols Then
        Local $aUniques[UBound($aTemp)][UBound($aSource, 2)] ; Return all columns
    Else
        Local $aUniques[UBound($aTemp)][UBound($aColumns)] ; Return only checked columns
    EndIf
    $aUniques[0][0] = 0 ; pointer to next free row to fill
    If UBound($aChainFileds) <> $aTemp[0] Then ; there are some duplicate
        Local $aDuplicates[UBound($aChainFileds, 1) - $aTemp[0] + 1][UBound($aSource, 2)] ; will hold only duplicate
        $aDuplicates[0][0] = 0 ; pointer to next free row to fill

        For $iRow = 0 To UBound($aChainFileds, 1) - 1
            If Not $aChainFileds[$iRow][1] Then ; this record still not checked
                $aTemp = _ArrayFindAll($aChainFileds, $aChainFileds[$iRow][0]) ; find duplicates (if any)
                For $i = 0 To UBound($aTemp) - 1
                    $aChainFileds[$aTemp[$i]][1] = UBound($aTemp) ; mark this record as a duplicate
                Next
                $aUniques[0][0] += 1
                If $iReturnAllCols Then
                    For $iField = 0 To UBound($aSource, 2) - 1
                        $aUniques[$aUniques[0][0]][$iField] = $aSource[$aTemp[0]][$iField]
                    Next
                Else
                    For $iField = 0 To UBound($aColumns) - 1
                        $aUniques[$aUniques[0][0]][$iField] = $aSource[$aTemp[0]][$aColumns[$iField]]
                    Next
                EndIf
                If UBound($aTemp) > 1 Then ; there are duplicates of this record
                    For $i = 1 To UBound($aTemp) - 1
                        $aDuplicates[0][0] += 1
                        For $iField = 0 To UBound($aSource, 2) - 1
                            $aDuplicates[$aDuplicates[0][0]][$iField] = $aSource[$aTemp[$i]][$iField]
                        Next
                    Next
                EndIf
            EndIf
        Next
        ; _ArrayDisplay($aUniques, "Those are unique elements")
        ; _ArrayDisplay($aDuplicates, "These are duplicates discarded")
    Else
        ; there are not duplicates in source array
        ; return passed array unchanged
        Return $aSource
    EndIf
    _ArrayDelete($aUniques, 0) ; remove the count row
    Return $aUniques
EndFunc   ;==>_ArrayUnique2D_Ex

@boththose

It seems to me that your function has a bug and a limit:
1) the bug: If you pass the following array (note the "FieldX" in middle of column3) your function does not catch it

Local $aArr[7][7] = [ _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result2'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result3'], _
    ['Field1','Field2','FieldX','Field4','Field5','Field6','Result4'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6','Result5'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6.1','Result6'], _
    ['Field1','Field2','Field3','Field4','Field5','Field6.1','Result7']]

2) the limit: if you want to check only few columns, say 3 and 5, you can't.


small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Share this post


Link to post
Share on other sites

#15 ·  Posted (edited)

@SadBunny

That is the way I have it right now, but I'm trying to avoid spamming queries as well as making it so if I change the schema, I don't need to update 7 queries to reflect the changes

@Chimp

I think the issue I had with your script was that when it returned the array (EX when I put '0,1,2') it returned all 7 columns when I need it to return just the 3 columns. I did take another look and I tweaked your script to make it do what I wanted:

#include <Array.au3>


Local $aArray[7][7] = [ _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result2'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result3'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result4'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6', 'Result5'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6.1', 'Result6'], _
        ['Field1', 'Field2', 'Field3', 'Field4', 'Field5', 'Field6.1', 'Result7']]


_ArrayDisplay($aArray, "Input array")
; returns only unique rows based on columns 0, 1 and 2
Local $aUniques = _ArrayUnique2D_Ex($aArray, "0,1,2,3,4,5,6")
_ArrayDisplay($aUniques, "Output array")




Func _ArrayUnique2D_Ex(ByRef $aSource, $sColumns = "*")
    ; check wanted columns
    If $sColumns = "*" Then
        Local $aColumns[UBound($aSource, 2)]
        For $i = 0 To UBound($aColumns) - 1
            $aColumns[$i] = $i
        Next
    Else
        Local $aColumns = StringSplit($sColumns, ",", 2)
    EndIf
    ; chain fields to check
    Local $aChainFileds[UBound($aSource, 1)][2]
    For $iRow = 0 To UBound($aSource, 1) - 1
        $aChainFileds[$iRow][1] = 0
        For $iField = 0 To UBound($aColumns) - 1
            $aChainFileds[$iRow][0] &= $aSource[$iRow][$aColumns[$iField]]
        Next
    Next
    ; uniqe from chain
    $aTemp = _ArrayUnique($aChainFileds, 0, 0, 0, 1) ; remove duplicate records (if any)
    Local $aUniques[UBound($aTemp)][UBound($aSource, 2)]
    $aUniques[0][0] = 0 ; pointer to next free row to fill
    If UBound($aChainFileds) <> $aTemp[0] Then ; there are some duplicate
        Local $aDuplicates[UBound($aChainFileds, 1) - $aTemp[0] + 1][UBound($aSource, 2)] ; will hold only duplicate
        $aDuplicates[0][0] = 0 ; pointer to next free row to fill


        For $iRow = 0 To UBound($aChainFileds, 1) - 1
            If Not $aChainFileds[$iRow][1] Then ; this record still not checked
                $aTemp = _ArrayFindAll($aChainFileds, $aChainFileds[$iRow][0]) ; find duplicates (if any)
                For $i = 0 To UBound($aTemp) - 1
                    $aChainFileds[$aTemp[$i]][1] = UBound($aTemp) ; mark this record as a duplicate
                Next
                $aUniques[0][0] += 1
                For $iField = 0 To UBound($aSource, 2) - 1
                    $aUniques[$aUniques[0][0]][$iField] = $aSource[$aTemp[0]][$iField]
                Next
                If UBound($aTemp) > 1 Then ; there are duplicates of this record
                    For $i = 1 To UBound($aTemp) - 1
                        $aDuplicates[0][0] += 1
                        For $iField = 0 To UBound($aSource, 2) - 1
                            $aDuplicates[$aDuplicates[0][0]][$iField] = $aSource[$aTemp[$i]][$iField]
                        Next
                    Next
                EndIf
            EndIf
        Next
        ; _ArrayDisplay($aUniques, "Those are unique elements")
        ; _ArrayDisplay($aDuplicates, "These are duplicates discarded")
    Else
        ; there are not duplicates in source array
        ; return passed array unchanged
        Return $aSource
    EndIf
    _ArrayDelete($aUniques, 0) ; remove the count row
$trimOutput=StringSplit($sColumns,",")
$colNum=$trimOutput[UBound($trimOutput)-1]
If $colNum <> '' Then
For $z=UBound($aUniques,2) to $colNum+1 step -1
_ArrayColDelete($aUniques,$z)
Next
    EndIf
    Return $aUniques
EndFunc   ;==>_ArrayUnique2D_Ex
Edited by Jewtus

Share this post


Link to post
Share on other sites

If you ever need to update the schema, you're essentially forced to modify the filtering code. But that's up to you.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

I posted the same time you posted....

see my post #14 above for a little improvement that allows you to check any column and return only those columns if you don't want all the other


small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Share this post


Link to post
Share on other sites

@Chimp

Ya I saw that and changed the accepted solve. Thanks!

@jchd

Why would you say that I have to modify the filtering code? If I was to update the schema, I'd also update the initial query, which would make the overall array update, and that is what I'm looking for the uniqueness on... the original array result

Share this post


Link to post
Share on other sites

Based off the explanations so far, I would think that Fieldx in that instance would not be desired, since it is just as unique as the items in the Results column.  But it works fine when there is another, since I am only finding commonalities, not uniques.  And I never saw #2 as a requirement.

Glad you made it out without Sql though, I dont even get to take guesses in those threads.


,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now