e2e4au

Removing Doubles

3 posts in this topic

I have a plain text file that contains one word per line. The file holds a huge number of words, and some words appear more than once. I have tried several times to write a script that will remove all the duplicates, without success (very frustrating :evil:). Could someone help, PLEASE?

Sample text file to work with.

arcane
has
its
experts
and
its
devotees
And
every
group
of
these
appear
to
have
one
place
where
they
congregate
to
talk
discuss
exchange
their
information
and
impart
the
newest
gossip
Shipping
movements
in
the
eastern
Mediterranean
hardly
form
a
subject
on
which
doctorates
are
earned
but
in
that
area
they
do
form
a
subject
of
great
interest
arcane
has
its
experts
and
its
devotees
And
every
group
of
these
appear
to
have
one
place
where
they
congregate
to
talk
discuss
exchange
their
information
and
impart
the
newest
gossip
Shipping
movements
in
the
eastern
Mediterranean
hardly
form


#2 ·  Posted (edited)

I would sort the file first (with the DOS sort command) and then process the lines sequentially, skipping any line that is the same as the previous one.
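
A rough sketch of that sort-then-skip idea, in case it helps. It assumes the input file is called "test.txt"; "sorted.txt" and "unique.txt" are made-up names for the intermediate and output files. The DOS sort ignores case, so this sketch also compares lines without regard to case ("And" and "and" count as the same word), and it drops blank lines. The original word order is lost, since the output is sorted.

```autoit
; Assumptions: "test.txt" is the input; "sorted.txt" and "unique.txt" are hypothetical names
; Let the DOS sort command write a sorted copy of the file
RunWait(@ComSpec & ' /c sort "test.txt" /O "sorted.txt"', "", @SW_HIDE)

Local $hIn = FileOpen("sorted.txt", 0)  ; 0 = read mode
Local $hOut = FileOpen("unique.txt", 2) ; 2 = write mode, erases previous contents
If $hIn = -1 Or $hOut = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

Local $sPrev = "", $sLine
While 1
    $sLine = FileReadLine($hIn)
    If @error = -1 Then ExitLoop             ; end of file
    If $sLine = "" Then ContinueLoop         ; ignore blank lines
    If $sLine = $sPrev Then ContinueLoop     ; same as previous line (case-insensitive): skip
    FileWriteLine($hOut, $sLine)
    $sPrev = $sLine
WEnd

FileClose($hIn)
FileClose($hOut)
```

Because identical words end up next to each other after sorting, only the previous line has to be remembered, which keeps the script simple even for a very large file.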

HTH

:)

Edit: spelling etc.

Edited by trids


#include <Array.au3>

Dim $a_text
$file = FileOpen("test.txt", 0)

; Check if file opened for reading OK
If $file = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

; Read in lines of text until the EOF is reached,
; keeping only the first occurrence of each word
While 1
    $found = 0
    $line = FileReadLine($file)
    If @error = -1 Then ExitLoop
    ; Scan the words kept so far; == is a case-sensitive comparison
    If IsArray($a_text) Then
        For $i = 0 To UBound($a_text) - 1
            If ($a_text[$i] == $line) Then
                $found = 1
                ExitLoop
            EndIf
        Next
    EndIf
    ; New word: grow the array by one element and store it at the end
    If (Not $found) Then
        If IsArray($a_text) Then
            ReDim $a_text[UBound($a_text) + 1]
        Else
            Dim $a_text[1]
        EndIf
        $a_text[UBound($a_text) - 1] = $line
    EndIf
WEnd

FileClose($file)
_ArrayDisplay($a_text, "Unique")
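
The script above rescans the whole array for every line it reads and only displays the result, so it may get slow on a very large word list. A possible variation, offered as a sketch rather than a tested replacement, is to keep the words already seen in a Scripting.Dictionary COM object and write each new word straight to an output file. "unique.txt" is a made-up output name, and skipping blank lines is an assumption on my part; the dictionary's default key comparison is case-sensitive, which matches the == used above.

```autoit
; Assumptions: "test.txt" is the input as above; "unique.txt" is a hypothetical output name
Local $oSeen = ObjCreate("Scripting.Dictionary") ; keys compare case-sensitively by default
If Not IsObj($oSeen) Then
    MsgBox(0, "Error", "Unable to create Scripting.Dictionary.")
    Exit
EndIf

Local $hIn = FileOpen("test.txt", 0)    ; 0 = read mode
Local $hOut = FileOpen("unique.txt", 2) ; 2 = write mode, erases previous contents
If $hIn = -1 Or $hOut = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

Local $sLine
While 1
    $sLine = FileReadLine($hIn)
    If @error = -1 Then ExitLoop         ; end of file
    If $sLine = "" Then ContinueLoop     ; ignore blank lines
    ; Only the first occurrence of a word is recorded and written out
    If Not $oSeen.Exists($sLine) Then
        $oSeen.Add($sLine, 1)
        FileWriteLine($hOut, $sLine)
    EndIf
WEnd

FileClose($hIn)
FileClose($hOut)
MsgBox(0, "Done", $oSeen.Count & " unique words written to unique.txt")
```

Unlike the sort-based approach, this keeps the words in the order they first appear in the file.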

