cypher175

Finding Duplicate Names in a List


I have a text file with one name per line, around 1700 lines in total. Many of the names appear more than once, and the duplicates are scattered throughout the list.

I was trying to write a script to detect the duplicate names, but it's not working the way I want it to.

What I would like it to do is:

Read the name list with _FileReadToArray (List.txt), then compare each line against every other line, one at a time. Names that don't match any other line should be written with FileWrite to New-List.txt. Names that do repeat should be written to Repeat-Names.txt, but only once each. That's where I have the problem: I can't figure out a way to write each repeated name only once, so they get written multiple times. I'm having a mental block trying to make this all work. Could someone please help me with this?

This is what I have so far:

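#NoTrayIcon
#include <File.au3>
#include <Array.au3>
#include <String.au3>
#include <Misc.au3>
_Singleton(@ScriptName)
;#####################################

$File1 = "C:\list.txt"

Dim $Lines1
_FileReadToArray($File1, $Lines1) ; element 0 of $Lines1 holds the line count
_ArrayDisplay($Lines1)

For $Name1 In $Lines1

    $Num = 0
    For $Name2 In $Lines1

        ; Count how many lines match the current name (it always matches itself once)
        If $Name1 = $Name2 Then
            $Num += 1
        EndIf

        ; This runs on every match after the first, for every occurrence of the name,
        ; which is why repeated names get written more than once
        If $Name1 = $Name2 And $Num > 1 Then
            ToolTip($Num & " Times" & @CRLF & $Name1 & " = " & $Name2)
            FileWrite("C:\Repeat-Names.txt", $Name1 & @CRLF)
        EndIf

    Next

Next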

#2 ·  Posted (edited)

Well, you could maintain a second array that contains the names you want to write into the repeated names list. What you do is:

1) Check in your file array if the name is repeated

2) Check in your second array if the repeated name is already there

2a) If not: add the repeated name to your second array.

2b) If already there: go to the next name in your file array.

3) After processing the whole file array, write the second array containing the repeated names into the text file.
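
The steps above could be sketched roughly like this. It is only a minimal sketch, not tested against the original list: the file paths and all variable names ($aRepeats, $iRepeats and so on) are made up for illustration, and the second array is simply sized for the worst case instead of being resized as it grows.

```autoit
#include <File.au3>

; Hypothetical paths; adjust to wherever the list actually lives.
Local $sListFile = @ScriptDir & "\list.txt"
Local $sRepeatFile = @ScriptDir & "\Repeat-Names.txt"

Local $aLines
If Not _FileReadToArray($sListFile, $aLines) Then Exit MsgBox(16, "Error", "Could not read " & $sListFile)

Local $iCount = $aLines[0] ; element 0 holds the number of lines
Local $aRepeats[$iCount + 1] ; second array, sized for the worst case
Local $iRepeats = 0 ; how many slots of $aRepeats are in use
Local $bRepeated, $bKnown

For $i = 1 To $iCount
    If $aLines[$i] = "" Then ContinueLoop ; ignore blank lines

    ; Step 1: is this name repeated further down the file array?
    $bRepeated = False
    For $j = $i + 1 To $iCount
        If $aLines[$i] = $aLines[$j] Then
            $bRepeated = True
            ExitLoop
        EndIf
    Next
    If Not $bRepeated Then ContinueLoop

    ; Step 2: is the repeated name already in the second array?
    $bKnown = False
    For $k = 0 To $iRepeats - 1
        If $aRepeats[$k] = $aLines[$i] Then
            $bKnown = True
            ExitLoop
        EndIf
    Next

    ; Step 2a: not there yet, so add it. Step 2b: otherwise just move on.
    If Not $bKnown Then
        $aRepeats[$iRepeats] = $aLines[$i]
        $iRepeats += 1
    EndIf
Next

; Step 3: write each collected name exactly once.
Local $hOut = FileOpen($sRepeatFile, 2) ; 2 = overwrite any existing file
For $k = 0 To $iRepeats - 1
    FileWriteLine($hOut, $aRepeats[$k])
Next
FileClose($hOut)
```

Note that `=` compares strings without regard to case, so "tom" and "Tom" count as the same name here; `==` would make the comparison case-sensitive.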

Edited by omikron48


Use _ArrayUnique to reduce both arrays (list and new-list) to their unique values.

The common lines (repeat-names) are the array intersection of these two arrays.

You could try using this:

http://www.autoitscript.com/forum/index.php?showtopic=97163
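
As a rough illustration of the _ArrayUnique idea, here is one possible sketch. It does not use the UDF linked above; instead of an intersection it takes the unique names and counts how often each one occurs with _ArrayFindAll, which is a swapped-in technique. It assumes the default behaviour of _ArrayUnique, where element 0 of the result holds the count, and the file names are hypothetical.

```autoit
#include <Array.au3>

; Hypothetical paths; adjust as needed.
Local $sListFile = @ScriptDir & "\list.txt"
Local $sRepeatFile = @ScriptDir & "\Repeat-Names.txt"

; Read the file, drop carriage returns, trim leading/trailing whitespace.
Local $sText = StringStripWS(StringStripCR(FileRead($sListFile)), 3)
If $sText = "" Then Exit ; nothing to do (missing or empty file)

; Flag 2 = zero-based result without a count in element 0.
Local $aNames = StringSplit($sText, @LF, 2)

; Assumption: with default arguments, element 0 of $aUnique is the count.
Local $aUnique = _ArrayUnique($aNames)

Local $sRepeats = "", $aHits
For $i = 1 To $aUnique[0]
    If $aUnique[$i] = "" Then ContinueLoop ; ignore blank lines
    ; _ArrayFindAll returns an array with the index of every exact match.
    $aHits = _ArrayFindAll($aNames, $aUnique[$i])
    If UBound($aHits) > 1 Then $sRepeats &= $aUnique[$i] & @CRLF
Next

; Each repeated name ends up in the output exactly once.
If $sRepeats <> "" Then
    Local $hOut = FileOpen($sRepeatFile, 2) ; 2 = overwrite
    FileWrite($hOut, $sRepeats)
    FileClose($hOut)
EndIf
```

For a list of around 1700 lines the repeated searches should be quick enough, although this does more work than a single pass would.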


Hi,

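Local $sFileIn, $sFileNoDupOut, $sFileDupOut, $hFO, $sFR, $aNames, $sTmp = '', $sDup = ''

$sFileIn = @ScriptDir & "\list.txt"

$sFileNoDupOut = @ScriptDir & "\NoDuplicateNames.txt"
$sFileDupOut = @ScriptDir & "\DuplicateNames.txt"

; Read the whole list into one string
$hFO = FileOpen($sFileIn, 0)
$sFR = FileRead($hFO)
FileClose($hFO)

; Trim the text, drop carriage returns, then split on line feeds (flag 2 = no count in element 0)
$aNames = StringSplit(StringStripCR(StringStripWS($sFR, 3)), @LF, 2)
For $i = 0 To UBound($aNames) -1
    ; Search for @LF & name & @CR so that only whole lines match ("Tom" must not match "Tommy")
    If Not StringInStr(@LF & $sTmp, @LF & $aNames[$i] & @CR) Then
        $sTmp &= $aNames[$i] & @CRLF ; first time this name is seen
    Else
        ; Seen before: record it as a duplicate, but only once
        If Not StringInStr(@LF & $sDup, @LF & $aNames[$i] & @CR) Then $sDup &= $aNames[$i] & @CRLF
    EndIf
Next

If StringStripWS($sTmp, 2) <> '' Then FileWrite($sFileNoDupOut, $sTmp)
If StringStripWS($sDup, 2) <> '' Then FileWrite($sFileDupOut, $sDup)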

The list.txt I used contained:

Tom
Mark
Sandy
John
Ashley
Peter
Mary
Suzi
Tom
Mark
Sandy
Mark
John
Ashley
Mary
Suzi
Tom
Mark
Sandy
John
Ashley
Mary
Suzi
Delta
Tom
Mark
Sandy
Phillip
John
Ashley
Mary
Suzi
Tom
Mark
Angel
Sandy
John
Ashley
Jenny
Mary
Suzi
Tom
Mark
Sandy
Sony
John
Ashley
Mary
Suzi

Cheers


Not very knowledgeable in it, but some guys I work with say "hash" arrays are the way to go when it comes to things like this. Mind you, I can't see it being a very fast option for AutoIt, but it could provide what you want/need if "_ArrayUnique" doesn't.


Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.


#6 ·  Posted (edited)

Except hash arrays are not supported in AutoIt and you're stuck with using two or more arrays and a bit of dirty work. : )

Edit: I should note that some people have tried making an implementation of hash arrays and linked lists, but those still use two arrays internally.

Edited by Manadar


I'm trying to find the dll/plugin I wrote for this and I can't!! Didn't someone do some extensive associative array work?
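
One option that may be worth a look, offered only as a sketch: on Windows the Scripting.Dictionary COM object can be created with ObjCreate and used like a hash table. It is a Windows scripting component, not an AutoIt language feature, so this assumes the object is available on the machine. The file names and variable names below are hypothetical.

```autoit
; Hypothetical paths; adjust as needed.
Local $sListFile = @ScriptDir & "\list.txt"
Local $sSingleFile = @ScriptDir & "\New-List.txt"
Local $sRepeatFile = @ScriptDir & "\Repeat-Names.txt"

Local $sText = StringStripWS(StringStripCR(FileRead($sListFile)), 3)
If $sText = "" Then Exit ; missing or empty file

; Assumption: the Scripting.Dictionary COM object exists on this system.
Local $oCounts = ObjCreate("Scripting.Dictionary")
If Not IsObj($oCounts) Then Exit MsgBox(16, "Error", "Scripting.Dictionary is not available")
$oCounts.CompareMode = 1 ; 1 = text compare, so "tom" and "Tom" are the same key

; Pass 1: count how often each name occurs.
Local $aNames = StringSplit($sText, @LF, 2) ; flag 2 = no count in element 0
For $i = 0 To UBound($aNames) - 1
    If $aNames[$i] = "" Then ContinueLoop ; ignore blank lines
    If $oCounts.Exists($aNames[$i]) Then
        $oCounts.Item($aNames[$i]) = $oCounts.Item($aNames[$i]) + 1
    Else
        $oCounts.Add($aNames[$i], 1)
    EndIf
Next

; Pass 2: every key appears once, so each name is written once.
Local $sSingles = "", $sRepeats = ""
Local $aKeys = $oCounts.Keys()
For $i = 0 To UBound($aKeys) - 1
    If $oCounts.Item($aKeys[$i]) > 1 Then
        $sRepeats &= $aKeys[$i] & @CRLF
    Else
        $sSingles &= $aKeys[$i] & @CRLF
    EndIf
Next

_WriteText($sSingleFile, $sSingles)
_WriteText($sRepeatFile, $sRepeats)

; Helper (hypothetical name): overwrite $sPath with $sData.
Func _WriteText($sPath, $sData)
    Local $hOut = FileOpen($sPath, 2) ; 2 = overwrite
    FileWrite($hOut, $sData)
    FileClose($hOut)
EndFunc
```

This reads the list once and looks each name up by key, so it avoids comparing every line against every other line.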
