[help] Delete repeated data in a txt file


dd8

How about something like this:

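$file = FileRead("file.txt")

If __StringFindOccurances($file, "aa") > 1 Then
    ; Cut out the second occurrence repeatedly until only the first one is left
    While StringInStr($file, "aa", 1, 2)
        $pos = StringInStr($file, "aa", 1, 2)
        $file = StringLeft($file, $pos - 1) & StringMid($file, $pos + StringLen("aa"))
    WEnd
EndIf

Func __StringFindOccurances($sStr1, $sStr2) ; This is from the Web-Based AutoIt script, no idea who originally made it
    For $i = 1 To StringLen($sStr1)
        ; Stop at the first occurrence number that cannot be found
        If Not StringInStr($sStr1, $sStr2, 1, $i) Then ExitLoop
    Next
    Return $i - 1 ; $i is one past the last occurrence that was found
EndFunc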
Sorry if that's a bit confusing. It basically checks whether "aa" occurs more than once and, if it does, removes every occurrence except one.

Edited by sandman



Hey, this was fun. This took a 10,000-line file of two-character lines and removed all duplicates in less than 1.5 seconds:

$file = FileRead("first.txt")
$dummy = "1111" ; This is a unique value, not found in the file
$idx = 1    ; Position in the file
$iPosB = 1  ; Start of a line
$iPosE = StringInStr($file, @CRLF, 0, $idx) ; End of a line
Const $iCaseSensitive = 1

; Keep picking a random placeholder until it does not occur anywhere in the file
While StringInStr($file, $dummy)
    $dummy = Random(1000, 10000, 1)
WEnd

While $iPosE <> 0   ; Not at end of $file
    $strCurrLine = StringMid($file, $iPosB, $iPosE - $iPosB + 2)    ; Get the current line, with @CRLF
    If $strCurrLine <> @CRLF Then   ; Skip blank lines
        ; Mark first value by replacing with dummy
        $file = StringReplace($file, $strCurrLine, $dummy, 1, $iCaseSensitive)
        ; Replace all remaining matches with empty string
        $file = StringReplace($file, $strCurrLine, "", 0, $iCaseSensitive)
        ; Replace dummy with true value
        $file = StringReplace($file, $dummy, $strCurrLine, 1, $iCaseSensitive)
    EndIf
    $idx += 1               ; Advance to next line
    $iPosB = $iPosE + 2     ; Advance past @CRLF
    $iPosE = StringInStr($file, @CRLF, 0, $idx) ; Find next @CRLF
WEnd

FileDelete("second.txt")
FileWrite("second.txt", $file)

I did it this way because I assume that using StringReplace to replace all occurrences is faster than looping through arrays. One thing to be careful of: this script can mistakenly replace a line's text where it appears at the end of a longer line. For example, if the file contains:

aa
bb
aaa

this script will change the "aaa" to "a". In this case you will want to test for @CRLF & <current line> & @CRLF to make sure you are only testing whole lines.
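
To make the difference concrete, here is a minimal sketch of the whole-line test on the three-line example above. The sample data and the variable names ($line, $padded, $looseCount, $strictCount) are made up for illustration and are not part of the script above. It relies on StringReplace setting @extended to the number of replacements it made.

```autoit
; Sample data from the example above (assumed to be @CRLF-terminated)
$file = "aa" & @CRLF & "bb" & @CRLF & "aaa" & @CRLF
$line = "aa"

; Substring test: "aa" & @CRLF also matches the tail of "aaa" & @CRLF
$loose = StringReplace($file, $line & @CRLF, "", 0, 1)
$looseCount = @extended ; number of replacements made by the call above

; Whole-line test: pad the text with a leading @CRLF so the first line is
; delimited on both sides, then search for @CRLF & line & @CRLF
$padded = @CRLF & $file
$strict = StringReplace($padded, @CRLF & $line & @CRLF, @CRLF, 0, 1)
$strictCount = @extended

ConsoleWrite("Substring matches: " & $looseCount & @CRLF)
ConsoleWrite("Whole-line matches: " & $strictCount & @CRLF)
```

With this data the substring test should hit twice (the real "aa" line and the end of "aaa"), while the whole-line test should hit only once. One caveat: two identical lines directly after each other share a single @CRLF between them, so a single pass with the padded pattern may miss one of them. The sketch does not deal with that case.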

Edited by bluebearr