[help] Delete duplicate data in a text file


dd8

The first text file reads like this:

aa

bb

cc

aa

bb

bb

I want to turn it into a second text file, which reads like this:

aa

bb

cc

Which way is the fastest, I wonder?

Thanks for the help.

sandman

How about something like this:

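$file = FileRead("file.txt")

; Count the occurrences once, then remove all but the first one
$count = __StringFindOccurances($file, "aa")
If $count > 1 Then
    ; A negative occurrence count makes StringReplace work from the right,
    ; so the first "aa" is the one left in place
    $file = StringReplace($file, "aa", "", 1 - $count, 1)
EndIf

Func __StringFindOccurances($sStr1, $sStr2) ; This is from the Web-Based AutoIt script, no idea who originally made it
    ; Look for the 1st, 2nd, 3rd ... occurrence until one is missing
    For $i = 1 To StringLen($sStr1)
        If Not StringInStr($sStr1, $sStr2, 1, $i) Then ExitLoop
    Next
    Return $i - 1 ; $i is one past the last occurrence that was found
EndFunc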
Sorry if that's a bit confusing. It's basically checking if there's more than one occurrence of "aa", and if there is, taking all but one out.
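
As an alternative to the counting loop in __StringFindOccurances, here is a rough sketch that leans on StringReplace, which stores the number of replacements it made in @extended. The helper name _CountOccurrences is made up for this example and is not a built-in. Note that it counts substring matches, so "aa" inside "aaa" is counted too.

```autoit
; Sketch only. _CountOccurrences is a made-up helper name, not a built-in.
Func _CountOccurrences($sText, $sSub)
    ; Replace every case-sensitive match. The returned string is thrown away
    ; because only the replacement count in @extended is wanted.
    StringReplace($sText, $sSub, "", 0, 1)
    Return @extended
EndFunc

; Example call on a small in-memory string instead of a file
ConsoleWrite(_CountOccurrences("aa" & @CRLF & "bb" & @CRLF & "aa", "aa") & @CRLF)
```

This does a single pass over the string instead of one StringInStr call per occurrence, which may matter on larger files.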

Edited by sandman


bluebearr

Hey, this was fun. This took a 10,000-line file of two-character lines and removed all duplicates in less than 1.5 seconds:

$file = FileRead("first.txt")
$dummy = "1111" ; This is a unique value, not found in the file
$idx = 1    ; Position in the file
$iPosB = 1  ; Start of a line
$iPosE = StringInStr($file, @CRLF, 0, $idx) ; End of a line
Const $iCaseSensitive = 1

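; Pick a different marker if the current one already appears in the file
While StringInStr($file, $dummy)
    ; Keep $dummy a string: StringReplace treats a numeric search value as a start position
    $dummy = String(Random(1000, 10000, 1))
WEnd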

While $iPosE <> 0   ; Not at end of $file
    $strCurrLine = StringMid($file, $iPosB, $iPosE - $iPosB + 2)    ; Get the current line, with @CRLF
    If $strCurrLine <> @CRLF Then   ; Skip blank lines
        ; Mark first value by replacing with dummy
        $file = StringReplace($file, $strCurrLine, $dummy, 1, $iCaseSensitive)
        ; Replace all remaining matches with empty string
        $file = StringReplace($file, $strCurrLine, "", 0, $iCaseSensitive)
        ; Replace dummy with true value
        $file = StringReplace($file, $dummy, $strCurrLine, 1, $iCaseSensitive)
    EndIf
    $idx += 1               ; Advance to next line
    $iPosB = $iPosE + 2     ; Advance past @CRLF
    $iPosE = StringInStr($file, @CRLF, 0, $idx) ; Find next @CRLF
WEnd

FileDelete("second.txt")
FileWrite("second.txt", $file)

I did this operation this way because I assume that using StringReplace to replace all occurrences is faster than looping through arrays. One thing to be careful of is that this script could mistakenly replace substrings if some exist. For example, if the following is in the file:

aa
bb
aaa

this script will change the "aaa" to "a", because "aa" & @CRLF also matches the tail of "aaa" & @CRLF. In this case you will want to test for @CRLF & <current line> & @CRLF to make sure you are only testing whole lines.
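
For what it's worth, here is a minimal sketch of one way to compare whole lines only. It is a different technique from the StringReplace loop above, not a patch to it: it splits the text into lines and uses a Scripting.Dictionary COM object as a "seen" set. Assumptions on my part: the helper name _DedupLines is hypothetical, the Scripting.Dictionary object is available (it normally is on Windows), blank lines are dropped rather than kept, and the file names are the same first.txt and second.txt used above. I have not timed it against the StringReplace version.

```autoit
; Sketch only: whole-line de-duplication that keeps the first occurrence.
; _DedupLines is a hypothetical helper name, not part of AutoIt or its UDFs.
Func _DedupLines($sText)
    ; Strip carriage returns, then split on line feeds.
    ; StringSplit stores the number of pieces in element [0].
    Local $aLines = StringSplit(StringStripCR($sText), @LF)

    ; Dictionary used as a set of lines already written out
    Local $oSeen = ObjCreate("Scripting.Dictionary")
    $oSeen.CompareMode = 0 ; 0 = binary comparison, so keys are case-sensitive

    Local $sOut = ""
    For $i = 1 To $aLines[0]
        If StringLen($aLines[$i]) = 0 Then ContinueLoop ; drop blank lines
        If Not $oSeen.Exists($aLines[$i]) Then
            $oSeen.Add($aLines[$i], True)  ; remember this line
            $sOut &= $aLines[$i] & @CRLF   ; keep its first occurrence
        EndIf
    Next
    Return $sOut
EndFunc

Local $sResult = _DedupLines(FileRead("first.txt"))
FileDelete("second.txt")
FileWrite("second.txt", $sResult)
```

Because each dictionary key is a complete line, a file containing aa, bb and aaa keeps all three lines: "aaa" is never treated as a match for "aa".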

Edited by bluebearr

