rony2006

Number of duplicates from array

17 posts in this topic

Hello, I have a 2D array and I want to count the duplicate values in one of its columns.

array[0][0] = how many elements I have; this number is different every time.

array[0][11], array[1][11], array[2][11], array[1+n][11] contain the data that I need to analyze: I want to see which values appear and how many times each value appears.

How can I do this, please?

 




Provide a script that can be executed that includes a sample array.

I'm too lazy to populate an array with that many dimensions, and I'm generally going to state that for the rest of the community also.

If you don't want to do that, then I'll generically answer your question:

Create a new 2D array. Loop through your large array and, for each value, loop through the new array to check whether that value already exists. If it does not, add a new row where $a[n][0] is the value and $a[n][1] is 1. If it does exist, increment the counter in $a[n][1].
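The generic answer above can be sketched roughly as follows. This is only a minimal sketch: the sample array, the column index $iCol and the variable names are made up for illustration, and the data is assumed to be 1-based with the element count in [0][0], as described in the first post.

```autoit
#include <Array.au3>

; Hypothetical sample data: [0][0] holds the element count and the
; values to analyze are assumed to sit in column 1.
Local $aData = [[5, ""], ["a", "x"], ["b", "y"], ["c", "x"], ["d", "z"], ["e", "x"]]
Local $iCol = 1

; Result array: [n][0] = value, [n][1] = number of occurrences.
Local $aCounts[$aData[0][0]][2]
Local $iUsed = 0, $bFound

For $i = 1 To $aData[0][0]
    $bFound = False
    ; Look for the value among the ones collected so far.
    ; == compares as strings and is case sensitive.
    For $n = 0 To $iUsed - 1
        If $aCounts[$n][0] == $aData[$i][$iCol] Then
            $aCounts[$n][1] += 1
            $bFound = True
            ExitLoop
        EndIf
    Next
    ; First time this value is seen: add it with a count of 1.
    If Not $bFound Then
        $aCounts[$iUsed][0] = $aData[$i][$iCol]
        $aCounts[$iUsed][1] = 1
        $iUsed += 1
    EndIf
Next

; Trim the unused rows and show the value/count pairs.
ReDim $aCounts[$iUsed][2]
_ArrayDisplay($aCounts, "Occurrences")
```

The inner loop makes this quadratic in the worst case, which is usually fine for a few hundred rows.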



#3 ·  Posted (edited)

This doesn't tell you exactly what is duplicated, but it tells you how many dupes exist. The method might give you some ideas. You could sort the flattened array and loop through it, testing for changes as you go. _ArrayFlatten can be ripped out of the ArrayWorkshop UDF and combined with the standard Array UDF functions (it's the only function which has no dependencies). See my signature.

#include 'ArrayWorkshop.au3'

Local $aArray = [[1,2,2],[3,2,0],[7,4,2],[8,3,5]]

Local $aTest = $aArray
_ArrayFlatten($aTest)

Local $iItems = UBound($aTest)
_ArrayUniqueXD($aTest)

Local $iDupes = $iItems - UBound($aTest)
MsgBox(0, "Duplicated Items", $iDupes)

 

Edited by czardas


#4 ·  Posted (edited)

#include <Array.au3>

Local $aArray = [[1,2,2],[3,2,0],[7,4,2],[2,3,5]]
Local $aFinal[0]
$sSearch = 3

For $i = 0 to ubound($aArray , 2) - 1
$aFound = _ArrayFindAll($aArray , $sSearch , 0 , 0 , 0 , 0 , $i)
    for $k = 0 to ubound($aFound) - 1
        $aFound[$k] = $aFound[$k] & "|" & $i
    next
_ArrayConcatenate($aFinal , $aFound)
Next

msgbox(0, 'Number of ' & $sSearch & "'s" , ubound($aFinal))

_ArrayDisplay($aFinal)

I kept an $aFinal in case you needed the locations for use later

edit - this might be that other thing:

edit2 - to make it more similar to the OP's request

#include <Array.au3>

Local $aArray = [[1,2,2],[3,2,0],[7,4,2],[2,3,5]]
Local $aFinal[0]
$sSearch = 2

For $i = 0 to ubound($aArray , 2) - 1
$aFound = _ArrayFindAll($aArray , $sSearch , 0 , 0 , 0 , 0 , $i)
    for $k = 0 to ubound($aFound) - 1
        $aFound[$k] = $aFound[$k] & "|" & $i
    next
_ArrayConcatenate($aFinal , $aFound)
Next

msgbox(0, 'Number of ' & $sSearch & "'s" , ubound($aFinal))

_ArrayDisplay($aFinal)  ; -----act on stuff below based off this array


for $i = 0 to ubound($aFinal) -  1
    If $i = 0 then
        $aArray[stringsplit($aFinal[$i] , "|" , 2)[0]][stringsplit($aFinal[$i] , "|" , 2)[1]] &= " - Count = " & ubound($aFinal)
    Else
        $aArray[stringsplit($aFinal[$i] , "|" , 2)[0]][stringsplit($aFinal[$i] , "|" , 2)[1]] = ""
    EndIf
Next

_ArrayDisplay($aArray)

 

Edited by iamtheky


#5 ·  Posted (edited)

The fun and fast way using Scripting.Dictionary  :)

#include <Array.au3>

local $a = [[7],[1,2,2],[3,2,0],[3,2,2],[3,2,5],[7,4,2],[8,3,5],[9,3,2]]
; _ArrayDisplay($a)

Local $sd = ObjCreate("Scripting.Dictionary"), $col = 2 ; col to search in

; get the number of occurrences for each unique element in col $col
For $i = 1 to $a[0][0]
   $sd.Item($a[$i][$col]) = ($sd.Exists($a[$i][$col])=0) ? 1 : $sd.Item($a[$i][$col])+1
Next
$asd = _SDtoArray($sd)
_ArrayDisplay($asd, "all elements and occurrences")

; get the duplicates in col $col and their number of occurrences
; (loop over a snapshot of the keys, so removing entries cannot disturb the enumeration)
For $i In $sd.Keys
   If $sd.Item($i) = 1 Then $sd.Remove($i)
Next
$asd = _SDtoArray($sd)
_ArrayDisplay($asd, "duplicates and occurences")

; get only the duplicates (0-based 1D array)
$asd = $sd.Keys
_ArrayDisplay($asd, "duplicates only")

 
Func _SDtoArray($_sd)
    Local $count = $_sd.Count
    Local $ret[$count+1][2] 
    $ret[0][0] = $count
    For $i = 1 To $count
        $ret[$i][0] = $_sd.Keys[$i-1]
        $ret[$i][1] = $_sd.Items[$i-1]
    Next
    Return $ret
EndFunc

 

Edited by mikell


#6 ·  Posted (edited)

And another method.

#include <Array.au3>

Local $Array = [[1, 2, 2],[3, 2, 0],["Ab", "ABC", 2],[2, 3, "AB"],["DAB", "ABCD", "A C"]]

$a = _ArrayGetDuplicates($Array)

_ArrayDisplay($a, "Duplicates in Array", "", 0, Default, "Data|Repeated")

; $aArray          - A 2D array to search for duplicate data in each element.
; $iStart_Row      - [optional] Index of array row to start matching
; $iCaseSensitive  - [optional] 0 Not case sensitive (default)
;                               1 Case sensitive match only.
; Returns a 2D array: In column 1 - The data in $aArray that is duplicated only.
;                     In column 2 - The number of times that data (in Col1 same row) is duplicated.
;
Func _ArrayGetDuplicates($aArray, $iStart_Row = 0, $iCaseSensitive = 0)
    Local $sStr = "||" & _ArrayToString($aArray, "||", -1, -1, "||") & "||" ; $sStr used to get the number of duplicates.
    If $iStart_Row > 0 Then $sStr = StringRegExpReplace($sStr, "^((?:\|\|[^\|]*){" & (UBound($aArray, 2) * $iStart_Row) & "})", "")
    ConsoleWrite($sStr & @LF)
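    Local $aRetArray[UBound($aArray, 1) * UBound($aArray, 2)][2] ; one row per cell is a safe upper bound; the row count alone can be too small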
    $iCount = 0 ; Return array index
    For $i = $iStart_Row To UBound($aArray, 1) - 1
        For $j = 0 To UBound($aArray, 2) - 1
            $sStr = StringReplace($sStr, "|" & $aArray[$i][$j] & "|", "", 0, $iCaseSensitive)
            $iNumOfElements = @extended
            If $iNumOfElements > 1 Then
                $aRetArray[$iCount][0] = $aArray[$i][$j]
                $aRetArray[$iCount][1] = $iNumOfElements
                $iCount += 1
            EndIf
        Next
    Next
    ReDim $aRetArray[$iCount][2]
    Return $aRetArray
EndFunc   ;==>_ArrayGetDuplicates

 

Edited by Malkey
Changed "0" to "$iStart_Row" in For-Next loop. More efficient. Does not alter the result either way.


#7 ·  Posted (edited)

12 hours ago, rony2006 said:

array[0][11], array[1][11], array[2][11], array[1+n][11],  contains the data that I need to analyze and to see how many times a value appears and what is that value.

So I think this is the solution you are looking for:

#include <Array.au3>

Dim $aArray[101][12]
For $i=1 to 100
    $aArray[$i][11] = Random(1,30,1)
Next
$aArray[0][0]=100

_ArrayDisplay($aArray)

_ArrayDisplay(_countUniqueElements($aArray,11), '_countUniqueElements for Col 11')

Func _countUniqueElements($aSource,$iCol=0)
    ;returns the count of each unique element in a Array
    ;autor: autobert
    Local $aUnique
    Local $b2D=UBound($aSource,2)
    if @error Then  ;1D
        $aUnique= _ArrayUnique($aSource)
    Else        ;2D
        $aUnique= _ArrayUnique($aSource,$iCol)
    EndIf
    _ArrayDelete($aUnique, 0)
    Dim $aUnique2D[UBound($aUnique)][2]
    For $x = 1 to UBound($aUnique) - 1
        $aUnique2D[$x][0] = $aUnique[$x]
        If $b2D Then
            $aCount=_ArrayFindAll($aSource,$aUnique[$x],0,0,0,0,$iCol)
        Else
            $aCount=_ArrayFindAll($aSource,$aUnique[$x])
        EndIf
        $aUnique2D[$x][1] = UBound($aCount)
    Next
    _ArraySort($aUnique2D,0,1)
    $aUnique2D[0][0]=$x-1
    Return $aUnique2D
EndFunc   ;==>_countUniqueElements

 

Edited by AutoBert
1D + 2D array support


@AutoBert Thank you for the last script. It is working OK.

I still have a small problem: I don't know how to get the row with the most duplicates.

After applying your script I get:

[screenshot: tre444.png]

For the table above I need a variable that holds "Alege sau scrie" and "23".

For example, in this case: var[0] = "Alege sau scrie" and var[1] = "23"


I found

_ArrayMaxIndex

Returns the index where the highest value occurs in a 1D or 2D array

But I don't know why I get an error message when using:

 

MsgBox($MB_SYSTEMMODAL, 'Max Index String value', _ArrayMaxIndex(_countUniqueElements($lt1,5), 0, 1, 999,2))

 


I think this:

MsgBox($MB_SYSTEMMODAL, 'Max Index String value', _ArrayMaxIndex(_countUniqueElements($lt1,5),1,1,-1,1))

should work without error, but as you need both columns, do it this way:

#include <Array.au3>

Dim $aArray[101][12]
For $i=1 to 100
    $aArray[$i][11] = Random(1,30,1)
Next
$aArray[0][0]=100

;_ArrayDisplay($aArray)

$aUnique2D=_countUniqueElements($aArray,11)
$iMaxIndex=_ArrayMaxIndex($aUnique2D,1,1,-1,1)
ConsoleWrite('[ArrayMaxMethod] First Most duplicate is '&$aUnique2D[$iMaxIndex][0]&' = '&$aUnique2D[$iMaxIndex][1]&@CRLF)
ConsoleWrite('                       Most duplicate is '&$aUnique2D[1][0]&' = '&$aUnique2D[1][1]&@CRLF)
_ArrayDisplay($aUnique2D, '_countUniqueElements for Col 11')


Func _countUniqueElements($aSource,$iCol=0)
    ;returns the count of each unique element in a Array
    ;autor: autobert
    Local $aUnique
    Local $b2D=UBound($aSource,2)
    if @error Then  ;1D
        $aUnique= _ArrayUnique($aSource)
    Else        ;2D
        $aUnique= _ArrayUnique($aSource,$iCol)
    EndIf
    _ArrayDelete($aUnique, 0)
    Dim $aUnique2D[UBound($aUnique)][2]
    For $x = 1 to UBound($aUnique) - 1
        $aUnique2D[$x][0] = $aUnique[$x]
        If $b2D Then
            $aCount=_ArrayFindAll($aSource,$aUnique[$x],0,0,0,0,$iCol)
        Else
            $aCount=_ArrayFindAll($aSource,$aUnique[$x])
        EndIf
        $aUnique2D[$x][1] = UBound($aCount)
    Next
    _ArraySort($aUnique2D,1,1,0,1)
    $aUnique2D[0][0]=$x-1
    Return $aUnique2D
EndFunc   ;==>_countUniqueElements

Line 13 outputs the first most-duplicated element and its duplicate count to the console, using _ArrayMaxIndex. I have also sorted the resulting array so that the most-duplicated element is always at index 1, so line 14 will show the same result.


#11 ·  Posted (edited)

I didn't fully understand the question earlier. Perhaps this is the kind of result you want.

#include <Array.au3>

Local $aArray = [[1,3,2],[7,2,3],[4,0,4],[1,3,4],[7,2,3],[4,3,5],[7,2,4]]

Local $iSearchCol = 2, $aCol
$aCol = _ArrayExtract($aArray, -1, -1, $iSearchCol, $iSearchCol)
_ArraySort($aCol)

Local $sNext, $sPrevious, $sFind, $iMax = 1, $iCurr = 1
For $i = 0 To UBound($aCol) - 1
    $sNext = $aCol[$i]
    If $sNext = $sPrevious Then
        $iCurr += 1
    Else
        If $iCurr > $iMax Then
            $sFind = $sPrevious
            $iMax = $iCurr
        EndIf
        $sPrevious = $sNext
        $iCurr = 1
    EndIf
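Next
; The loop only evaluates a run when the value changes, so check the final run too.
If $iCurr > $iMax Then
    $sFind = $sPrevious
    $iMax = $iCurr
EndIf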

If $iMax = 1 Then MsgBox(0, "Error", "There are no duplicates." & @LF & "What now?")

Local $iBound2 = UBound($aArray, 2), $aResults[$iMax][$iBound2]
$iCurr = 0
For $i = 0 To UBound($aArray) - 1
    If $aArray[$i][$iSearchCol] = $sFind Then
        For $c = 0 To $iBound2 -1
            $aResults[$iCurr][$c] = $aArray[$i][$c]
        Next
        $iCurr += 1
        If $iCurr = $iMax Then ExitLoop ; stop once all $iMax matching rows are copied
    EndIf
Next

_ArrayDisplay($aResults)

Notice that the number 4 appears 3 times in the final column in both the original array and in the results.

Edited by czardas
MsgBox(0, "Error", "There are no duplicates." & @LF & "What now?")


In case the context makes it valuable to have the data stored in a database, say with SQLite or some other engine, inside a table created like this:

create table T (col1, col2, ..., col11, COL12 char);

then the wanted result is essentially a one-liner:

Local $aRows, $iRows, $iCols
_SQLite_GetTable2d(-1, "select COL12, count(*) C from T group by COL12 order by C desc;", $aRows, $iRows, $iCols)

The database approach is quickly beneficial when several different queries are needed over a significant amount of data, even more when the queries are complex.
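For anyone who wants to try the query, here is a minimal self-contained sketch. It assumes the standard SQLite.au3 UDF is available and that sqlite3.dll can be loaded by _SQLite_Startup(); the in-memory database, the single-column table and the sample values are made up for illustration only.

```autoit
#include <Array.au3>
#include <SQLite.au3>

; Assumption: sqlite3.dll can be found/loaded by _SQLite_Startup().
_SQLite_Startup()
If @error Then Exit MsgBox(16, "SQLite", "sqlite3.dll could not be loaded.")

_SQLite_Open() ; no file name = in-memory database

; Hypothetical table with only the column of interest and a few sample rows.
_SQLite_Exec(-1, "CREATE TABLE T (COL12 CHAR);")
_SQLite_Exec(-1, "INSERT INTO T VALUES ('x'); INSERT INTO T VALUES ('y'); " & _
        "INSERT INTO T VALUES ('x'); INSERT INTO T VALUES ('z'); INSERT INTO T VALUES ('x');")

; One row per distinct value, most frequent first.
Local $aRows, $iRows, $iCols
If _SQLite_GetTable2d(-1, "SELECT COL12, count(*) C FROM T GROUP BY COL12 ORDER BY C DESC;", _
        $aRows, $iRows, $iCols) = $SQLITE_OK Then
    ; Row 0 holds the column names, row 1 the most frequent value and its count.
    _ArrayDisplay($aRows, "Value counts")
EndIf

_SQLite_Close()
_SQLite_Shutdown()
```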

17 hours ago, AutoBert said:

Line 13 outputs the first most-duplicated element and its duplicate count to the console, using _ArrayMaxIndex. I have also sorted the resulting array so that the most-duplicated element is always at index 1, so line 14 will show the same result.

 

@AutoBert Thanks, but I get the following error when I run the script:

>"C:\Program Files\AutoIt3\SciTE\..\AutoIt3.exe" "C:\Program Files\AutoIt3\SciTE\AutoIt3Wrapper\AutoIt3Wrapper.au3" /run /prod /ErrorStdOut /in "C:\Users\Marian\Desktop\v7\v6\v6\New folder\test\forum.au3" /UserParams    
+>20:54:13 Starting AutoIt3Wrapper v.16.306.1237.0 SciTE v.3.6.2.0   Keyboard:00000409  OS:WIN_7/Service Pack 1  CPU:X64 OS:X86  Environment(Language:0409)  CodePage:0  utf8.auto.check:4    # detect ascii high characters and if none found set default encoding to UTF8 and do not add BOM
+>         SciTEDir => C:\Program Files\AutoIt3\SciTE   UserDir => C:\Users\Marian\AppData\Local\AutoIt v3\SciTE\AutoIt3Wrapper   SCITE_USERHOME => C:\Users\Marian\AppData\Local\AutoIt v3\SciTE 
>Running AU3Check (3.3.12.0)  from:C:\Program Files\AutoIt3  input:C:\Users\Marian\Desktop\v7\v6\v6\New folder\test\forum.au3
+>20:54:13 AU3Check ended.rc:0
>Running:(3.3.12.0):C:\Program Files\AutoIt3\autoit3.exe "C:\Users\Marian\Desktop\v7\v6\v6\New folder\test\forum.au3"    
--> Press Ctrl+Alt+Break to Restart or Ctrl+Break to Stop
"C:\Users\Marian\Desktop\v7\v6\v6\New folder\test\forum.au3" (13) : ==> Variable subscript badly formatted.:
ConsoleWrite('[ArrayMaxMethod] First Most duplicate is '&$aUnique2D[$iMaxIndex][0]&' = '&$aUnique2D[$iMaxIndex][1]&@CRLF)
ConsoleWrite('[ArrayMaxMethod] First Most duplicate is '&$aUnique2D[^ ERROR
->20:54:15 AutoIt3.exe ended.rc:1
+>20:54:15 AutoIt3Wrapper Finished.
>Exit code: 1    Time: 2.822
 

11 hours ago, jchd said:

The database approach is quickly beneficial when several different queries are needed over a significant amount of data, even more when the queries are complex.

Hi, I already take the data from SQL Server, do some math operations on it, and show it in a ListView.

If I don't find any other solution to my request, I will take the data from the ListView, put it in an SQL database and then run a query, but this is a little too complicated and maybe I will find a better way.


Then why not retrieve the correct information directly from the DB engine in the first place, possibly in another distinct array? That would greatly simplify your software!

Also, can't your math operations be done by the DB engine itself?



#16 ·  Posted (edited)

16 hours ago, rony2006 said:

--> Press Ctrl+Alt+Break to Restart or Ctrl+Break to Stop
"C:\Users\Marian\Desktop\v7\v6\v6\New folder\test\forum.au3" (13) : ==> Variable subscript badly formatted.:
ConsoleWrite('[ArrayMaxMethod] First Most duplicate is '&$aUnique2D[$iMaxIndex][0]&' = '&$aUnique2D[$iMaxIndex][1]&@CRLF)
ConsoleWrite('[ArrayMaxMethod] First Most duplicate is '&$aUnique2D[^ ERROR
->20:54:15 AutoIt3.exe ended.rc:1
+>20:54:15 AutoIt3Wrapper Finished.

Did you run my demo script from #10 as posted, or did you try it in your own script? _ArrayMaxIndex must have hit an error and returned -1 as the index. But:

On 15.5.2016 at 2:03 AM, AutoBert said:

I have also sorted the resulting array in the way that the first most duplicate is always the index 1, so line 14 will show same result.

so you can simply use:

$aUnique2D=_countUniqueElements($aArray,11)
ConsoleWrite('Most duplicate is '&$aUnique2D[1][0]&' = '&$aUnique2D[1][1]&@CRLF)
_ArrayDisplay($aUnique2D, '_countUniqueElements for Col 11')

Note: $aArray must be a 1-based array with at least one data row.
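If you do want to keep the _ArrayMaxIndex variant, a guard along these lines avoids the "Variable subscript badly formatted" crash when the function fails. It is only a sketch: the sample array is made up, and passing the last row index explicitly instead of -1 is an assumption about what may differ between Array.au3 versions, not a confirmed diagnosis of the error above.

```autoit
#include <Array.au3>

; Hypothetical result array in the same layout as $aUnique2D above:
; row 0 holds the number of entries, column 0 the value, column 1 its count.
Local $aUnique2D = [[3, ""], ["x", 4], ["y", 9], ["z", 2]]

; Numeric comparison (1), rows 1 to the last row, counts in column 1.
Local $iLast = UBound($aUnique2D) - 1
Local $iMaxIndex = _ArrayMaxIndex($aUnique2D, 1, 1, $iLast, 1)
Local $iError = @error ; save it before another call resets it

If $iError Or $iMaxIndex < 1 Then
    ; Never use a failed (-1) result as an array subscript.
    ConsoleWrite('_ArrayMaxIndex failed, @error = ' & $iError & @CRLF)
Else
    ConsoleWrite('Most duplicate is ' & $aUnique2D[$iMaxIndex][0] & ' = ' & $aUnique2D[$iMaxIndex][1] & @CRLF)
EndIf
```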

Edited by AutoBert


It is working ok now.

Thank you!
