roeselpi

2D Array Duplicate Search Replace without knowing the values beforehand

hello,

i have a 2D array (example follows) where the same value can be found several times in the second column. in general i want to delete every duplicate value and leave that cell empty before saving or writing the result to a file.

#include <Array.au3>

Global $array[10][2] = [["name 0", "Peter"], ["name 1", "Paul"], ["name 2", "Mary"], ["name 3", "Mary"], ["name 4", "Charles"], _
        ["name 5", "Elizabeth"], ["name 6", "Victoria"], ["name 7", "Mary"], ["name 8", "Tom"], ["name 9", "Paul"]]

_ArrayDisplay($array)

$array[3][1] = deletemary()
$array[7][1] = deletemary()
$array[9][1] = deletepaul()

_ArrayDisplay($array)

Func deletemary()
    $byebyemary = "byebye mary"
EndFunc

Func deletepaul()
    $byebyepaul = "byebye paul"
EndFunc

here the duplicate values get replaced with a "0" and not with my text, but that does not bother me much. much more troublesome is that the real array has over 100 values and many of them are duplicates. so i was wondering how i would go about finding and replacing the duplicates?
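On the "0": an AutoIt function that ends without a Return statement hands back 0 to the caller, and assigning a value to a variable inside the function does not return it. A minimal sketch of the difference, using the hypothetical function names _NoReturn and _ByeBye (neither is from the original script):

```autoit
; A Func without an explicit Return hands back 0 to the caller.
Func _NoReturn()
    Local $sText = "byebye mary" ; assigned, but never returned
EndFunc

; Return passes the value back, so it can be stored in the array.
Func _ByeBye($sName)
    Return "byebye " & $sName
EndFunc

ConsoleWrite(_NoReturn() & @CRLF)     ; prints 0
ConsoleWrite(_ByeBye("mary") & @CRLF) ; prints byebye mary
```

With a Return in place, a line such as $array[3][1] = _ByeBye("mary") would store the text instead of 0.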

1) i would need to identify the duplicates without knowing what value they might be. the only constant in all values that might be duplicates would be the first two chars. for example: A1Paul, A1Peter, A1Tom, B2Elizabeth, B2Mary, B2Victoria, C3Charles. if A1Peter appeared twice, i would need to know how to identify these duplicates in an automated process.

manually i can do this easily with the _ArrayDisplay() function because it is sortable in ascending or descending order. so what i do now is make a list on paper and then edit the textfile afterwards. that takes like an hour every time i want to test something.

2) i would need to know how i can replace the second, third, fourth, [ ... ] of these duplicates with a "0" or anything else for that matter. one value must however remain (like an original)

please point me in the right direction, as many replace values in array topics were not adaptable (for my current status of knowledge) because they were mainly 1D arrays or had other not understandable segments.

is my task even possible at all? especially if you do not know what you are looking for and would have to compare over 100 values with each other. sounds like it would take a lot of time comparing like 100 values squared.

thanks for advice


You could read the value in the first "row", and then go through the array searching for the same value as the first.

If any of them match, delete it. Repeat for every row, searching only below the row you started from. Done.
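The approach described above can be sketched as two nested loops. This is only an illustration on a shortened copy of the sample data (the variable name $aData is made up here), and it blanks later duplicates rather than deleting rows, since the question asked for empty cells. For 100 values the inner comparison runs at most 100 * 99 / 2 = 4950 times, which takes a negligible amount of time.

```autoit
#include <Array.au3>

Local $aData[5][2] = [["name 0", "Peter"], ["name 1", "Paul"], ["name 2", "Mary"], _
        ["name 3", "Mary"], ["name 4", "Paul"]]

For $i = 0 To UBound($aData) - 2
    ; a cell that was already blanked is a duplicate, nothing to compare
    If $aData[$i][1] = "" Then ContinueLoop
    ; only look at the rows below the current one
    For $j = $i + 1 To UBound($aData) - 1
        ; == is the case-sensitive string comparison in AutoIt
        If $aData[$j][1] == $aData[$i][1] Then $aData[$j][1] = ""
    Next
Next

; rows 3 and 4 now have an empty second column, the first occurrences remain
ConsoleWrite(_ArrayToString($aData) & @CRLF)
```

Using = instead of == would make the comparison case-insensitive, so "Mary" and "mary" would count as the same value.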



Would that fit your needs?

#include <Array.au3>

Local $array = [["name 0", "Peter"], ["name 1", "Paul"], ["name 2", "Mary"], ["name 3", "Mary"], ["name 4", "Charles"], _
        ["name 5", "Elizabeth"], ["name 6", "Victoria"], ["name 7", "Mary"], ["name 8", "Tom"], ["name 9", "Paul"]]

Local $aTmp = _ArrayUnique($array, 1, 0, 0, 0, 0)
Local $aOut[UBound($aTmp)][UBound($array, 2)]
For $i = 0 To UBound($aOut) - 1
    $aOut[$i][0] = "name " & $i
    $aOut[$i][1] = $aTmp[$i]
Next
$aTmp = 0

_ArrayDisplay($aOut)
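For readers puzzling over the arguments to _ArrayUnique in the script above, here is a small annotated sketch. It assumes a recent AutoIt release (3.3.12 or later), where the function takes a column index and a base; the array $aNames and its lowercase "mary" entry are made up for the illustration.

```autoit
#include <Array.au3>

Local $aNames[5][2] = [["name 0", "Peter"], ["name 1", "Paul"], ["name 2", "Mary"], _
        ["name 3", "mary"], ["name 4", "Paul"]]

; _ArrayUnique($aArray, $iColumn, $iBase, $iCase, $iCount, ...)
;   $iColumn = 1 : look at the second column
;   $iBase   = 0 : the data starts at row 0 (there is no count row in $aNames)
;   $iCase   = 0 : case-insensitive, so "Mary" and "mary" count as one value
;   $iCount  = 0 : do not store the number of results in element [0]
Local $aUnique = _ArrayUnique($aNames, 1, 0, 0, 0)

; the result is a 1D array holding each distinct value once
ConsoleWrite(UBound($aUnique) & " unique values: " & _ArrayToString($aUnique, ", ") & @CRLF)
```

Because the result is 1D, the first column has to be rebuilt afterwards, which is what the loop over $aOut in the script does.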

 



perfect. took a while to understand but perfect. i had tried with _ArrayUnique but did not get it done somehow. thanks a lot

Share this post


Link to post
Share on other sites

Another flavour, if you want to keep the corresponding element in the 1st column

#Include <Array.au3>

Local $array = [["name 0", "Peter"], ["name 1", "Paul"], ["name 2", "Mary"], ["name 3", "Mary"], ["name 4", "Charles"], _
        ["name 5", "Elizabeth"], ["name 6", "Victoria"], ["name 7", "Mary"], ["name 8", "Tom"], ["name 9", "Paul"]]

Local $sd = ObjCreate("Scripting.Dictionary")
For $i = 0 to UBound($array)-1
    $sd.add($array[$i][1], $array[$i][0])
Next
Local $aOut[$sd.count][2]
For $i = 0 to $sd.count-1
    $aOut[$i][0] = $sd.Items[$i]
    $aOut[$i][1] = $sd.Keys[$i]
Next
_ArrayDisplay($aOut)

 


*see below

For the edge case where you want the last entry for each.

Edited by iamtheky

11 hours ago, roeselpi said:

perfect. took a while to understand but perfect.

If that's OK then you can get rid of the first column, since it's just a redundant decoration of the row number, i.e. the index thru the array. That will simplify code some more.

If you want to save the simplified array while adding the "name #" prefix, you can do that while going thru the array for output.

Also I still don't understand what you meant with "A1Peter...": the A1, B2, C3 prefixes appear in your post but have no counterpart in the code sample you supplied. That's why I ignored that sentence.



hi,

@jchd i am a stickler for numbers of exact lengths, and at the moment i can not seem to get the count to be 3 digits long. at the moment it looks a bit 'stupid'

Quote

name 1

name 2

name 10

name 73

name 105

it would look nicer if they all had an exact same length. so name 1 would be name 001 etc.

that means the first column is not quite decoration but rather a counter for me, so that i can see at one glance, by looking at the last entry of the file to be created, how many names are listed.

what you called "prefix" is not a prefix, but i was trying to express that if that would make a comparison easier then i could say that the first two elements of the names can be the same throughout all the names in the file. manually checking i always looked at the third character because that is where the first duplicate can occur. anyway that was just a side information and not of relevance here or in my example really. so ignoring that was fine.

i am still pondering on two things now: getting the counting done and saving the new output, because for some reason it will not save the $aOut at the moment. but i am still testing currently

 

-----

 

@mikell for some reason your version does not work at all. i get an error message:

Quote

"C:\AutoIt3\testfiles\array-delete-duplicate-example.au3" (17) : ==> The requested action with this object has failed.:
$sd.add($array[$i][1], $array[$i][0])
$sd^ ERROR

do not quite know why.  just for testing purposes i tried it.
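A likely cause of the error above: the Add method of Scripting.Dictionary fails when the key already exists, and the sample data contains "Mary" and "Paul" more than once. A hedged sketch of the same idea with an Exists check, on a shortened data set (the names $aIn, $oDict, $aKeys and $aItems are chosen for this illustration only):

```autoit
#include <Array.au3>

Local $aIn[4][2] = [["name 0", "Peter"], ["name 1", "Mary"], ["name 2", "Mary"], ["name 3", "Tom"]]

Local $oDict = ObjCreate("Scripting.Dictionary")
For $i = 0 To UBound($aIn) - 1
    ; .Add() raises a COM error when the key is already present,
    ; so only add a name the first time it is seen
    If Not $oDict.Exists($aIn[$i][1]) Then $oDict.Add($aIn[$i][1], $aIn[$i][0])
Next

; copy keys and items into plain arrays before indexing them
Local $aKeys = $oDict.Keys()
Local $aItems = $oDict.Items()

Local $aOut[UBound($aKeys)][2]
For $i = 0 To UBound($aKeys) - 1
    $aOut[$i][0] = $aItems[$i] ; first-column value of the first occurrence
    $aOut[$i][1] = $aKeys[$i]  ; the unique name
Next

ConsoleWrite(_ArrayToString($aOut) & @CRLF)
```

This keeps the first-column entry that belonged to the first occurrence of each name. The dictionary compares keys case-sensitively by default, so "Mary" and "mary" would be stored as two separate keys.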

 

-----

 

@iamtheky i also like that version. but for my needs i do not think it is quite the right solution. i think the one from jchd is the one i will have to try to adapt to what i need it for. but that does not mean that i am not grateful. of course i am grateful. i always look at all the examples and try them out to see what they do and how they do it, you never know when you might need something like that down the line.

 

 


StringFormat will make your numbers all pretty.
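A short sketch of what that looks like for the counter values quoted above. The array $aCounts is only there to feed the loop:

```autoit
Local $aCounts[5] = [1, 2, 10, 73, 105]

For $i = 0 To UBound($aCounts) - 1
    ; %03d = decimal integer, at least 3 characters wide, padded with leading zeros
    ConsoleWrite("name " & StringFormat("%03d", $aCounts[$i]) & @CRLF)
Next
; prints name 001, name 002, name 010, name 073, name 105 (one per line)
```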

went ahead and fixed my other example, and made it clunkier:

#include <Array.au3>

$flag = 0

Local $array = [["name 0", "Peter"], ["name 1", "Paul"], ["name 2", "Mary"], ["name 3", "Mary"], ["name 4", "Charles"], _
        ["name 5", "Elizabeth"], ["name 6", "Victoria"], ["name 7", "Mary"], ["name 8", "Tom"], ["name 9", "Paul"]]


For $i = ubound($array) - 1 to 0 step -1

    $array[$i][0] = "name " & StringFormat("%03i", StringTrimLeft($array[$i][0], 5))

    $aFnd = _ArrayFindAll($array , $array[$i][1] , 0 , $i - 1 , 0 ,0 , 1)

    If $flag = 1 Then ExitLoop
    If @error Then ContinueLoop

    _ArrayAdd($aFnd , ubound($aFnd) , 0)
    _ArrayDelete($array , $aFnd)
    $i -= ubound($aFnd) - 2
    $flag = $i - (ubound($aFnd) - 2) < 1 ? Assign("i" , 1) : 0

Next

_ArrayDisplay($array)

 

Edited by iamtheky


i can not get it to work the way i want. it is quite frustrating. i have been trying about 100 different possible solutions, but it just will not do it the way i would like it to be.

at first it looked very successful, but my lack of understanding keeps making it very hard for me to adapt code to get where i want it to be. to make it a bit clearer, i have made a simple example that generates an ini file with random entries (had a lot of help with that from "BrewManNH" in this thread: replace data in infile).

the example generates 150 random entries of 3 characters each. here you can easily see what i meant: the first two characters can be identical, and the third character is the one that causes the duplicate entry. after deleting the duplicates you are left with about 101-115 unique entries. but on the first generation the key numbers are all right, and after the unique check/delete the numbers no longer all have the same number of digits, and that is annoying.

#include <Array.au3>

Global $inifile = "nothing.ini"
Global $standard = "chopchop"
Global $keynames = "key"

If FileExists($inifile) = 1 Then
    MsgBox(0, "YES", "File does exist!")
ElseIf FileExists($inifile) = 0 Then
    MsgBox(0, "NO", "File does not exist")

    For $Loop = 0 To 150
        $num = StringRight("00" & $Loop + 1, 3)
        IniWrite($inifile, $standard, $keynames & $num, $Loop) ;-- why is it "$num, $Loop" and not "$num & $Loop"?
    Next

    Global $ini = IniReadSection($inifile, "chopchop")

    For $Loop = 1 To $ini[0][0]
     $ini[$Loop][1] = _Randomizer(1)
    Next

    For $Loop = 1 To UBound($ini) - 1
        IniWrite($inifile, $standard, $ini[$Loop][0], $ini[$Loop][1])
    Next

    MsgBox(0,"SUCCESS", "ini-file was saved.")

    Global $ini = IniReadSection($inifile, "chopchop")
    _ArrayDisplay($ini, "see duplicates?")

    FileOpen($inifile, 2) ;-- why do i have to open the file while above i did not need to do that before writing the ini?

    Global $aTmp = _ArrayUnique($ini, 1, 0, 0, 0, 0)
    Global $aOut[UBound($aTmp)][UBound($ini, 2)]
    For $i = 0 To UBound($aOut) - 1
        $nums = StringRight("00" & $i + 1, 3)
        ;$aOut[$i][0] = $keynames & $nums & $i ---> total garbage will not do anything that is supposed to do.
        $aOut[$i][0] = $keynames & $i
        $aOut[$i][1] = $aTmp[$i]
    Next

    For $i = 1 To  UBound($aOut) - 1
        IniWrite($inifile, $standard, $aOut[$i][0], $aOut[$i][1])
    Next

    $aTmp = 0

    Global $ini = IniReadSection($inifile, "chopchop")
    _ArrayDisplay($ini, "still duplicates?")

EndIf

i spotted the duplicates later down the line but with a unique check, i can eliminate the duplicates and that saves like 20-30 minutes during testing whilst doing it manually.

i tested the other version as well from iamtheky  and there no unique single key will be left. all duplicates will be removed but one of them should remain because one is the first and original, so that was not the right thing to go with.

i just do not understand where i am going wrong at present.
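On the question left in the script's comment ("why is it $num, $Loop and not $num & $Loop"): IniWrite takes four separate arguments, namely file, section, key and value. The comma separates the key from the value, while & would glue both into one string and leave the value argument missing. A minimal sketch using a hypothetical scratch file in @TempDir:

```autoit
Local $sIni = @TempDir & "\iniwrite-demo.ini" ; hypothetical scratch file

For $i = 0 To 2
    ; third argument = key name (zero-padded), fourth argument = value
    Local $sKey = "key" & StringFormat("%03d", $i + 1)
    IniWrite($sIni, "chopchop", $sKey, $i)
Next

; read one entry back: key002 was written with the value 1
Local $sValue = IniRead($sIni, "chopchop", "key002", "not found")
ConsoleWrite($sValue & @CRLF) ; prints 1

FileDelete($sIni) ; clean up the scratch file
```

The same pattern explains the commented-out line in the script: $keynames & $nums already is the padded key name, and appending & $i adds the counter a second time.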

 

 


roeselpi,

make sure you understand StringFormat so you get a clear idea of what it does. to remove the prefix, use Number()

 

#include <Array.au3>

Local $array = [["name 0", "Peter"], ["name 1", "Paul"], ["name 2", "Mary"], ["name 3", "Mary"], ["name 4", "Charles"], _
        ["name 5", "Elizabeth"], ["name 6", "Victoria"], ["name 7", "Mary"], ["name 8", "Tom"], ["name 9", "Paul"]]

Local $sd = ObjCreate("Scripting.Dictionary")
For $i = 0 To UBound($array) - 1
    If $sd.Exists($array[$i][1]) Then ContinueLoop
    $sd.Item($array[$i][1]) = $array[$i][1]
Next
Local $a = $sd.Items
_ArrayColInsert($a, 0)
_ArrayDisplay($a)

For $i = 0 To UBound($a) - 1
    $a[$i][0] = StringFormat("%05d", $i)
Next
_ArrayDisplay($a)

Deye

52 minutes ago, roeselpi said:
    For $Loop = 1 To $ini[0][0]
     $ini[$Loop][1] = _Randomizer(1)
    Next

Your script is not runnable, the _Randomizer func is missing


got it. thanks. i just changed, like you suggested:

$aOut[$i][0] = $keynames & $i

to:

$aOut[$i][0] = $keynames & StringFormat("%03d", $i)

then i have the values all numbered nice and neatly. i would never have looked at StringFormat if you had not mentioned it and shown me an example.

does not solve the mystery why i have to FileOpen() first before writing the ini, whilst above the ini was created and then rewritten without having to use the FileOpen() function. but as long as it works, i am happy. thanks


oh yes mikell, you are right. i forgot the function. just for the sake of completeness:

#include <Array.au3>

Global $inifile = "nothing.ini"
Global $standard = "chopchop"
Global $keynames = "key"

If FileExists($inifile) = 1 Then
    MsgBox(0, "YES", "File does exist!")
ElseIf FileExists($inifile) = 0 Then
    MsgBox(0, "NO", "File does not exist")

    For $Loop = 0 To 150
        $num = StringRight("00" & $Loop + 1, 3)
        IniWrite($inifile, $standard, $keynames & $num, $Loop) ;-- why is it "$num, $Loop" and not "$num & $Loop"?
    Next

    Global $ini = IniReadSection($inifile, "chopchop")

    For $Loop = 1 To $ini[0][0]
     $ini[$Loop][1] = _Randomizer(1)
    Next

    For $Loop = 1 To UBound($ini) - 1
        IniWrite($inifile, $standard, $ini[$Loop][0], $ini[$Loop][1])
    Next

    MsgBox(0,"SUCCESS", "ini-file was saved.")

    Global $ini = IniReadSection($inifile, "chopchop")
    _ArrayDisplay($ini, "see duplicates?")

    FileOpen($inifile, 2) ;-- why do i have to open the file while above i did not need to do that before writing the ini?

    Global $aTmp = _ArrayUnique($ini, 1, 0, 0, 0, 0)
    Global $aOut[UBound($aTmp)][UBound($ini, 2)]
    For $i = 0 To UBound($aOut) - 1
        $nums = StringRight("00" & $i + 1, 3)
        ;$aOut[$i][0] = $keynames & $nums & $i ---> total garbage will not do anything that is supposed to do.
        $aOut[$i][0] = $keynames & StringFormat("%03d", $i)
        $aOut[$i][1] = $aTmp[$i]
    Next

    For $i = 1 To  UBound($aOut) - 1
        IniWrite($inifile, $standard, $aOut[$i][0], $aOut[$i][1])
    Next

    $aTmp = 0

    Global $ini = IniReadSection($inifile, "chopchop")
    _ArrayDisplay($ini, "still duplicates?")

EndIf

Func _Randomizer($length)
    $String = ""
    $aChars = StringSplit("abcdefg", "")
    $sString = ""
    $bChars = StringSplit("ABCDEFG", "")
    $bString = ""
    $cChars = StringSplit("1234567", "")
    $cString = ""
    $i=0
    Do
        If $length<=0 then ExitLoop
        $String &=  $achars[Random(1,$achars[0])]&$cchars[Random(1,$cchars[0])]&$bchars[Random(1,$bchars[0])]
        $i += 1
    Until $i = $length
    Return $String
EndFunc

 

8 minutes ago, roeselpi said:

why i have to FileOpen() first before writing the ini

Because using the flag 2 means 'overwrite'
Just run a test using your previous script, with the FileOpen line commented or not, and then count the lines in the final ini file in both cases  ;)


yes i see that but:

line 14 the ini is created like so:

IniWrite($inifile, $standard, $keynames & $num, $Loop)

result:
[chopchop]
key001=0
key002=1
key003=2
key004=3
key005=4
key006=5
key007=6
key008=7
key009=8
key010=9
key011=10
etc

then further down: 

line 24 the ini is overwritten with the new data like so:

IniWrite($inifile, $standard, $ini[$Loop][0], $ini[$Loop][1])

result:

[chopchop]
key001=f5C
key002=f1D
key003=f2A
key004=b3A
key005=e6D
key006=a4C
key007=c5B
key008=c2D
key009=f2D
key010=a2A
etc.

no FileOpen() needed there for the overwrite

why will that not work a third time? because only after that i must use the FileOpen() function

somehow i was expecting that line 24 and line 44 (the third IniWrite()) would both work without a FileOpen() because it worked the first time. i do not quite understand why it works one time and not the other. it does not make sense to me somehow.


IniWrite overwrites the values of the keys which already exist, while FileOpen(... , 2) overwrites the whole file

The previous script first creates a .ini file, 150 lines. OK. After deletion of duplicates, you get about 110 values
Using only IniWrite, you will overwrite the first 110 lines. There will still be 150 lines in the final ini, including possible duplicates in the last 40 lines which are kept unchanged
Using FileOpen, you delete all the old lines and then re-create 110 new ones. No possible duplicates  :)

BTW don't forget FileClose() after the job is done
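The leftover-lines effect described above can be reproduced on a small scale. The sketch below uses a hypothetical scratch file and three keys instead of 150. It also shows IniDelete as one possible alternative for clearing the section; that call is not what the thread used, it is swapped in here only for illustration.

```autoit
Local $sIni = @TempDir & "\leftover-demo.ini" ; hypothetical scratch file
FileDelete($sIni) ; start from a clean state

; first pass: three keys, the third one holds a duplicate value
IniWrite($sIni, "chopchop", "key001", "a1A")
IniWrite($sIni, "chopchop", "key002", "b2B")
IniWrite($sIni, "chopchop", "key003", "a1A")

; second pass after de-duplication: only two keys are written again
IniWrite($sIni, "chopchop", "key001", "a1A")
IniWrite($sIni, "chopchop", "key002", "b2B")

Local $aSection = IniReadSection($sIni, "chopchop")
Local $iBefore = $aSection[0][0]
ConsoleWrite($iBefore & " keys" & @CRLF) ; prints 3 keys: key003 was never touched

; clearing the section first removes the stale key
IniDelete($sIni, "chopchop")
IniWrite($sIni, "chopchop", "key001", "a1A")
IniWrite($sIni, "chopchop", "key002", "b2B")

$aSection = IniReadSection($sIni, "chopchop")
Local $iAfter = $aSection[0][0]
ConsoleWrite($iAfter & " keys" & @CRLF) ; prints 2 keys

FileDelete($sIni) ; clean up the scratch file
```

IniDelete with only a section name removes that section and leaves any other sections in the file alone, whereas FileOpen with flag 2 empties the whole file.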


ah, now i understand. okay that fills another hole in my understanding. thanks for that. i dare not ask about the FileClose() because in another topic i was told that i did not close the opened file because i closed the file with FileClose($filename) and it must be a "handle", which is something i do not quite understand yet, but okay. bit by bit i am learning. everything one step at a time.
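On the "handle": FileOpen returns a value that identifies the opened file, and FileClose expects that value rather than the file name. A minimal sketch with a hypothetical scratch file; the variable names $sIni and $hFile are chosen for this illustration:

```autoit
#include <FileConstants.au3>

Local $sIni = @TempDir & "\handle-demo.ini" ; hypothetical scratch file

; FileOpen returns a handle for the opened file, or -1 if it could not be opened
Local $hFile = FileOpen($sIni, $FO_OVERWRITE) ; $FO_OVERWRITE = 2, erases previous contents
If $hFile = -1 Then
    ConsoleWrite("could not open " & $sIni & @CRLF)
    Exit 1
EndIf

; functions that accept a handle write to the file that is already open
FileWrite($hFile, "[chopchop]" & @CRLF & "key001=a1A" & @CRLF)

; pass the handle returned by FileOpen, not the file name
FileClose($hFile)

Local $sValue = IniRead($sIni, "chopchop", "key001", "not found")
ConsoleWrite($sValue & @CRLF)

FileDelete($sIni) ; clean up the scratch file
```

In the script from this thread that would mean storing the result of FileOpen($inifile, 2) in a variable and passing that variable to FileClose once the IniWrite loop has finished.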
