Sign in to follow this  
Followers 0
amdogelover

Searching for duplicates in a 2D array, adding the results

7 posts in this topic

I'm trying to write a script that reads a .txt file and searches for recurring names in the file to find any duplicates.

;Prompt user for folder to search
Local $folderSearch = FileSelectFolder("", @ScriptDir)

;Search for the file in the folder (always one)
Local $fileToRead = _FileListToArray($folderSearch, "*.txt", $FLTA_FILES, True)

I know have a two column array with strings in the first column, and numbers in the second. It's usually 300-400 rows long and consists of data from testing. To get better averaged results I run the same test four or more times, or only once depending on the test.

What I would like to do is search the first column, check if the first string occurs more than once, if it does, sum the results in a new file, and continue to the last row. I tried using ArraySearch and ArrayFindAll but I couldn't get it to work properly.

I wrote a loop iterating through the array but the logic is off and the values it returns don't add up, inflated numbers from bad loops. Also it doesn't include the tests that only run once. Can somebody show me the right way to sum these results?

Here is what I came up with, it obviously doesn't work... (And please excuse my bad coding)

Local $data = 0
    Local $indexCounter = 0

    _FileReadToArray($fileToRead[1], $data, 1, ":")
    
    Local $newData[UBound($data)][2]

    For $counterRow = 1 To UBound($data, 1) - 1
        Local $temp = $counterRow
        Local $tempData = $data[$counterRow][0]
        For $counterDup = $counterRow+1 To UBound($data, 1) - 1
            If $tempData == $data[$counterDup][0] Then
                $newData[$indexCounter][0] = $data[$counterRow][0]
                $newData[$indexCounter][1] = $data[$counterDup][1]
                $newData[$indexCounter][1] += $data[$temp][1]
                $indexCounter += 1
            EndIf
        Next
    Next

 

Share this post


Link to post
Share on other sites



can you please post a small 2d array that reproduces the data set, and the expected output?  You will no doubt speed up our guessing process.  


,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Share this post


Link to post
Share on other sites

#3 ·  Posted (edited)

21 minutes ago, iamtheky said:

can you please post a small 2d array that reproduces the data set, and the expected output?  You will no doubt speed up our guessing process.  

Sure, snippet of the file attached.

As you can see, I'm trying to sum tests run more than once to one line, example:

FIRST Max 10k: 379,00
FIRST Min 10k: 266,30
FIRST Avg 10k: 319,43
FIRST Max 10k: 377,50
FIRST Min 10k: 269,60
FIRST Avg 10k: 323,23

I'm trying to find "FIRST MAX 10k" and sum the average to one line, in this case it would be

FIRST Max 10k: 378,25
FIRST Min 10k: 267,95
FIRST Avg 10k: 321,33

The average of the tests in the file, written to a new file!

EDIT: The end goal, after the  calculating of the average, is to replace all whitespace with "_" in the string, and save the results to a .csv file with a "," separating the string from the data.

results.txt

Edited by amdogelover

Share this post


Link to post
Share on other sites

#4 ·  Posted (edited)

ok my guess wasnt too far off:  This snippet grabs all the matching things in [0] and sums their values stored in [1], can we start from this code

edit: which now returns the average of the values in [1] for items returned matching [0]

#include <Array.au3>

Local $aArray[4][2] = [["Thing1", "10"], ["Thing2", "10"], ["Thing1", "20"], ["Thing2", "40"]]
Local $outArray[0][2]


For $k = ubound($aArray) - 1 to 0 step -1

$aItems = _ArrayFindAll($aArray , $aArray[$k][0])

$x = 0

    for $i = 0 to UBound($aItems) - 1
        $x += $aArray[$aItems[$i]][1]
    Next

_ArrayAdd($outArray , $aArray[$k][0] & "|" & ($x / ubound($aItems)))

_ArrayInsert($aItems , 0 , ubound($aItems))
_ArrayDelete($aArray , $aItems)
$k -= ubound($aItems) - 1

next

_ArraySort($outArray)
_ArrayDisplay($outArray)

 

 

Edited by iamtheky
1 person likes this

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Share this post


Link to post
Share on other sites

#5 ·  Posted (edited)

A try...   :)

#include <File.au3>

Local $ar
_FileReadToArray("results.txt", $ar, $FRTA_ENTIRESPLIT, ": ")
_ArrayDisplay($ar)

$u = _ArrayUnique($ar)
_ArrayColInsert($u, 1)
;_ArrayDisplay($u)

For $i = 1 to $u[0][0]
   Local $tmp = "", $n = 0
   For $j = 0 to UBound($ar)-1
         $ar[$j][1] = StringReplace($ar[$j][1], ",", ".")
         If $u[$i][0] = $ar[$j][0] Then 
              $n += 1
              $tmp += $ar[$j][1]
         EndIf
   Next
   $u[$i][1] = StringReplace($tmp/$n, ".", ",")
Next
_ArrayDisplay($u)

Local $txt
  For $i = 1 to UBound($u)-1
     $txt &= StringReplace($u[$i][0], " ", "_") & ";" & $u[$i][1] & @crlf
  Next
FileWrite("my_csv.csv", $txt)

Edit
I didn't see the last requirement - the csv thing
BTW using a comma as delimiter is not a good idea, since the values are european decimals using a comma instead of a dot

Edited by mikell
1 person likes this

Share this post


Link to post
Share on other sites

.... If you like to play with SQL give also a try to the following script ...

#include <File.au3>
#include 'ArraySql.au3' ; <-- Get this from the following link:
; https://www.autoitscript.com/forum/topic/166536-manage-arrays-by-means-of-sql/?do=findComment&comment=1234441

Local $ar
_FileReadToArray("results.txt", $ar, $FRTA_ENTIRESPLIT, ": ")

_ArrayToDbTable($ar, "DBarray") ; Transform the array into a Memory SQLite DB called "DBarray"

_SQLite_Exec($g__hMemDb, "UPDATE DBarray SET Column1 = replace(Column1, ',', '.' );") ; change "," with "."

; Perform the SQL Query
$aResult = _ArrayFromSQL("SELECT column0, AVG(Column1) FROM DBarray GROUP BY Column0 ORDER BY column0;")

If Not @error Then
    _ArrayDisplay($aResult, "result of query")
Else
    MsgBox(0, "error", $g__sSQLiteError)
EndIf

 

1 person likes this

small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Share this post


Link to post
Share on other sites

#7 ·  Posted (edited)

3 hours ago, iamtheky said:

ok my guess wasnt too far off:  This snippet grabs all the matching things in [0] and sums their values stored in [1], can we start from this code

edit: which now returns the average of the values in [1] for items returned matching [0]

#include <Array.au3>

Local $aArray[4][2] = [["Thing1", "10"], ["Thing2", "10"], ["Thing1", "20"], ["Thing2", "40"]]
Local $outArray[0][2]


For $k = ubound($aArray) - 1 to 0 step -1

$aItems = _ArrayFindAll($aArray , $aArray[$k][0])

$x = 0

    for $i = 0 to UBound($aItems) - 1
        $x += $aArray[$aItems[$i]][1]
    Next

_ArrayAdd($outArray , $aArray[$k][0] & "|" & ($x / ubound($aItems)))

_ArrayInsert($aItems , 0 , ubound($aItems))
_ArrayDelete($aArray , $aItems)
$k -= ubound($aItems) - 1

next

_ArraySort($outArray)
_ArrayDisplay($outArray)

 

 

Thank you for your help, I ended up using mikells code instead but your code works as well.

3 hours ago, mikell said:

A try...   :)

#include <File.au3>

Local $ar
_FileReadToArray("results.txt", $ar, $FRTA_ENTIRESPLIT, ": ")
_ArrayDisplay($ar)

$u = _ArrayUnique($ar)
_ArrayColInsert($u, 1)
;_ArrayDisplay($u)

For $i = 1 to $u[0][0]
   Local $tmp = "", $n = 0
   For $j = 0 to UBound($ar)-1
         $ar[$j][1] = StringReplace($ar[$j][1], ",", ".")
         If $u[$i][0] = $ar[$j][0] Then 
              $n += 1
              $tmp += $ar[$j][1]
         EndIf
   Next
   $u[$i][1] = StringReplace($tmp/$n, ".", ",")
Next
_ArrayDisplay($u)

Local $txt
  For $i = 1 to UBound($u)-1
     $txt &= StringReplace($u[$i][0], " ", "_") & ";" & $u[$i][1] & @crlf
  Next
FileWrite("my_csv.csv", $txt)

Edit
I didn't see the last requirement - the csv thing
BTW using a comma as delimiter is not a good idea, since the values are european decimals using a comma instead of a dot

Thank you mikell, I went with your solution as it was more complete. I ended up tweaking it a little but all in all, very much appreciated.

EDIT:

#include <File.au3>
#include 'ArraySql.au3' ; <-- Get this from the following link:
; https://www.autoitscript.com/forum/topic/166536-manage-arrays-by-means-of-sql/?do=findComment&comment=1234441

Local $ar
_FileReadToArray("results.txt", $ar, $FRTA_ENTIRESPLIT, ": ")

_ArrayToDbTable($ar, "DBarray") ; Transform the array into a Memory SQLite DB called "DBarray"

_SQLite_Exec($g__hMemDb, "UPDATE DBarray SET Column1 = replace(Column1, ',', '.' );") ; change "," with "."

; Perform the SQL Query
$aResult = _ArrayFromSQL("SELECT column0, AVG(Column1) FROM DBarray GROUP BY Column0 ORDER BY column0;")

If Not @error Then
    _ArrayDisplay($aResult, "result of query")
Else
    MsgBox(0, "error", $g__sSQLiteError)
EndIf

 

Thank you a lot for the tip, I was actually thinking of doing this, I'll have to check that ArraySql out first, I have some reading-up to do :D 

Edited by amdogelover

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0