[help] Delete repeated data in a txt file

dd8 · August 23, 2007

The first text file reads like this:

aa

bb

cc

aa

bb

I want to turn it into a second text file, which reads like this:

aa

bb

cc

Which way is the fastest, I wonder?

Thanks for the help.

sandman · August 23, 2007

How about something like this:

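$file = FileRead("file.txt")

$count = __StringFindOccurances($file, "aa")
If $count > 1 Then
    For $i = 1 To $count - 1 Step 1
        ; Cut out the second occurrence each pass, so the first one is always kept
        $pos = StringInStr($file, "aa", 1, 2)
        If $pos = 0 Then ExitLoop ; Nothing left to remove
        $file = StringLeft($file, $pos - 1) & StringMid($file, $pos + StringLen("aa"))
    Next
EndIf

Func __StringFindOccurances($sStr1, $sStr2) ; This is from the Web-Based AutoIt script, no idea who originally made it
    Local $i
    For $i = 1 To StringLen($sStr1)
        If Not StringInStr($sStr1, $sStr2, 1, $i) Then ExitLoop
    Next
    Return $i - 1 ; $i stops at the first occurrence that is missing, so the count is one less
EndFunc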

Sorry if that's a bit confusing. It's basically checking if there's more than one occurrence of "aa", and if there is, taking all but one out.

Edited August 23, 2007 by sandman

bluebearr · August 23, 2007

Hey, this was fun. This took a 10,000-line file of two-character lines and removed all duplicates in less than 1.5 seconds:

$file = FileRead("first.txt")
$dummy = "1111" ; This is a unique value, not found in the file
$idx = 1    ; Position in the file
$iPosB = 1  ; Start of a line
$iPosE = StringInStr($file, @CRLF, 0, $idx) ; End of a line
Const $iCaseSensitive = 1

while StringInStr($file, $dummy)
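    $dummy = String(Random(1000, 10000, 1)) ; Keep it a string so StringReplace searches for it instead of reading it as a start position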
WEnd

While $iPosE <> 0   ; Not at end of $file
    $strCurrLine = StringMid($file, $iPosB, $iPosE - $iPosB + 2)    ; Get the current line, with @CRLF
    If $strCurrLine <> @CRLF Then   ; Skip blank lines
        ; Mark first value by replacing with dummy
        $file = StringReplace($file, $strCurrLine, $dummy, 1, $iCaseSensitive)
        ; Replace all remaining matches with empty string
        $file = StringReplace($file, $strCurrLine, "", 0, $iCaseSensitive)
        ; Replace dummy with true value
        $file = StringReplace($file, $dummy, $strCurrLine, 1, $iCaseSensitive)
    EndIf
    $idx += 1               ; Advance to next line
    $iPosB = $iPosE + 2     ; Advance past @CRLF
    $iPosE = StringInStr($file, @CRLF, 0, $idx) ; Find next @CRLF
WEnd

FileDelete("second.txt")
FileWrite("second.txt", $file)

I did this operation this way because I assume that using StringReplace to replace all occurrences is faster than looping through arrays. One thing to be careful of is that this script could mistakenly replace substrings if some exist. For example, if the following is in the file:

aa
bb
aaa

this script will change the "aaa" to "a", because the line "aa" plus its @CRLF also matches the tail of "aaa". In this case you will want to test for @CRLF & <current line> & @CRLF to make sure you are only matching whole lines.
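
Here is a minimal sketch of a whole-line test, assuming the file is small enough to read in one go. The helper name _DedupeLines and the $sSeen bookkeeping string are made up for illustration and are not part of the script above. It is a swapped-in approach: it splits the text into lines instead of using StringReplace, and it strips the carriage returns first so each line can be wrapped in @LF rather than @CRLF. No speed claim is made for it.

```autoit
; Hypothetical helper: keep the first copy of each whole line, drop later copies.
Func _DedupeLines($sText)
    Local $aLines = StringSplit(StringStripCR($sText), @LF) ; $aLines[0] holds the line count
    Local $sSeen = @LF  ; Lines kept so far, each one wrapped in @LF delimiters
    Local $sOut = ""    ; Result text, rebuilt with @CRLF line endings
    For $i = 1 To $aLines[0]
        If $aLines[$i] == "" Then ContinueLoop ; Skip blank lines
        ; Searching for @LF & line & @LF means "aa" cannot match inside "aaa"
        If StringInStr($sSeen, @LF & $aLines[$i] & @LF, 1) Then ContinueLoop
        $sSeen &= $aLines[$i] & @LF
        $sOut &= $aLines[$i] & @CRLF
    Next
    Return $sOut
EndFunc

FileDelete("second.txt")
FileWrite("second.txt", _DedupeLines(FileRead("first.txt")))
```

With the three-line example above, "aa", "bb" and "aaa" are all kept, because "aaa" is only compared as a whole line. The comparison is case sensitive, matching the $iCaseSensitive setting in the original script.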

Edited August 23, 2007 by bluebearr

dd8 · August 23, 2007

bluebearr is right. Thanks to both of you, my friends.
