jvanegmond

Duplicate Line Deleter

16 posts in this topic

#1 ·  Posted (edited)

Just something I threw together in a few minutes.

I was surprised by its speed. It just sorts the array and then checks for duplicate entries. :D
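For readers who cannot reach the source, here is a minimal sketch of the sort-then-compare idea. It is not the tool's actual code. The file names are assumptions, and comparing with `<>` (case-insensitive for strings in AutoIt) is a choice made for this sketch, so lines that differ only in case count as duplicates. Note that sorting changes the order of the lines.

```autoit
#include <Array.au3>
#include <File.au3>

Local $sIn = @ScriptDir & "\infile.txt"    ; assumed input path
Local $sOut = @ScriptDir & "\outfile.txt"  ; assumed output path
Local $aLines

; With the default flags, $aLines[0] holds the line count and the
; lines themselves start at index 1.
If Not _FileReadToArray($sIn, $aLines) Then Exit 1

_ArraySort($aLines, 0, 1) ; ascending, starting at index 1 to skip the count

Local $hOut = FileOpen($sOut, 2) ; 2 = write, erasing previous contents
If $hOut = -1 Then Exit 2

; After sorting, equal lines sit next to each other, so comparing each
; line with the previous one is enough to spot a duplicate.
For $i = 1 To $aLines[0]
    If $i = 1 Or $aLines[$i] <> $aLines[$i - 1] Then
        FileWriteLine($hOut, $aLines[$i])
    EndIf
Next

FileClose($hOut)
```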


Download here: http://www.manadar.com/repository/autoit/d...plindeleter.exe

Source code: http://www.manadar.com/repository/autoit/d...plindeleter.au3

Icon file: http://www.manadar.com/repository/autoit/d...plindeleter.ico

Enjoy!

Edited by Manadar




Very nice!! Thanks for sharing :D Works very well...


It works great, but one thing: it only deletes duplicate lines that have the same indentation...

I tried copy+pasting the same phrases, and tabbing them differently:

this is a duplicate line
 
     this is a duplicate line
 
         this is a duplicate line
    
             this is a duplicate line
 
                 this is a duplicate line

After running the tool:

this is a duplicate line
             this is a duplicate line
         this is a duplicate line
     this is a duplicate line
 this is a duplicate line

Apart from that, it works great! :D

In my opinion the example you have given isn't correct. Those lines aren't the same (after all, they have a different number of tabs). For you it doesn't matter, but there will be cases where it will. Maybe Manadar can add an option to choose whether to ignore extra whitespace.

My little company: Evotec (PL version: Evotec)


I'm happy it's useful for you too, gessler. :D I ran it on a 120 kB text file, and I had a 9 kB file left.

@jackit, what Madboy said is right.

The string "abc" is not the same as "    abc" (with leading whitespace).

Clearing spaces and tabs in front of lines is something completely different, and something I did not make this tool for.

However, if you have serious need of such a tool, you can simply edit my source code. That is why the source code is provided.


I see, it works as advertised :D

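For your example to work, it wouldn't be too difficult to wrap both sides of the compare in StringStripWS() with the flag set to 7 (1 + 2 + 4: strip leading whitespace, trailing whitespace, and repeated spaces between words; flag 6 alone would leave the leading whitespace in place). As a matter of fact, a checkbox to ignore whitespace differences might be a nice addition, much the way some search-and-replace dialogs work.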

George
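A minimal sketch of that whitespace-insensitive compare, assuming you want leading, trailing and repeated inner whitespace ignored. The helper name `_NormalizeLine` is made up for this example and is not part of Manadar's tool. Flag 7 is 1 (leading) + 2 (trailing) + 4 (repeated spaces between words); flag 6 on its own would leave leading whitespace untouched.

```autoit
; Hypothetical helper: normalise a line before comparing it.
; 7 = 1 (strip leading) + 2 (strip trailing) + 4 (collapse repeated spaces)
Func _NormalizeLine($sLine)
    Return StringStripWS($sLine, 7)
EndFunc

Local $sA = "this is a duplicate line"
Local $sB = @TAB & "   this is a  duplicate line  "

; == is a case-sensitive string comparison in AutoIt
If _NormalizeLine($sA) == _NormalizeLine($sB) Then
    ConsoleWrite("Treated as duplicates" & @CRLF)
Else
    ConsoleWrite("Treated as different lines" & @CRLF)
EndIf
```

If you still want the original indentation in the output, compare the normalised strings but write the untouched line.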

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


Sorry for reviving an old post, but the source isn't available at the link...

I've searched for this script for years. Can anyone post it,

or help me build a new one?

Thank you!

m.


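Hi,

it's pretty easy to write:

; delete_duplicate_lines_from_text.au3
; Removes consecutive duplicate lines only.

Local $f = @ScriptDir & "\infile.txt"      ; or wherever
Local $f_out = @ScriptDir & "\outfile.txt"

Local $fh_in = FileOpen($f, 0)             ; 0 = read
; Check if file opened for reading OK
If $fh_in = -1 Then
    MsgBox(0, "Error", "Unable to open input file.")
    Exit 1
EndIf

Local $fh_out = FileOpen($f_out, 2)        ; 2 = write, erasing previous contents
If $fh_out = -1 Then
    FileClose($fh_in)
    MsgBox(0, "Error", "Unable to open output file.")
    Exit 1
EndIf

Local $line, $oldline = ""
Local $first = True                        ; so an empty first line is still written

While 1
    $line = FileReadLine($fh_in)
    If @error Then ExitLoop                ; end of file or read error

    ; == is case-sensitive; <> would treat "ABC" and "abc" as equal
    If $first Or Not ($line == $oldline) Then
        FileWriteLine($fh_out, $line)
    EndIf

    $oldline = $line
    $first = False
WEnd

FileClose($fh_in)
FileClose($fh_out)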


Please guys, stop asking and start thinking...

http://www.manadar.com/repository/icons/au...plindeleter.au3

...he has such a great site which is easy to browse, so if the link is wrong, why not just go to the main site and browse/search from there??!!

"searched for this script for years" -> lol, and Manadar made it in a couple of minutes xD

btw, thanks for sharing Manadar.


You can fool some of the people all of the time, and all of the people some of the time, but you can not fool all of the people all of the time. Abraham Lincoln - http://www.ae911truth.org/ - http://www.freedocumentaries.org/


I was unaware that I had accidentally left all the AutoIt files and folders in there. I thought I had lost them. >.< xD lolol

Thanks for browsing my file manager. More people should do it.

No problem. :)


That's what I meant in that PM I sent you a while back :)


#12 ·  Posted (edited)

Just a quick observation: you are ReDim'ing the array by +1 every time. I don't know exactly how it's implemented in AutoIt, but generally that kind of thing is done by creating a new array, copying the old one, and adding the new value in the new slot. Doing that every time is probably needlessly expensive. Making the new array the same size as the old array, keeping track of the last element used, and then scaling it down once at the end if necessary might make your already quick function even faster. But who knows, maybe they implemented arrays more like array lists and you're really not losing much, since in that case you'd simply be reducing an array into a linked list... maybe I should go take a gander at the AutoIt source code.

EDIT:

From a cursory glance at the source I'm gonna say that NOT calling redim will speed up the function. It looks like the array type is implemented exactly as a standard array and thus in order to expand it by one a new one must be created and the old one must be copied.
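A rough sketch of the "allocate once, shrink once" idea, assuming the input is an already sorted, 0-based array. The function name `_DedupeSorted` is invented for this example and does not come from Manadar's source.

```autoit
; Hypothetical helper: removes adjacent duplicates from a sorted,
; 0-based array without calling ReDim inside the loop.
Func _DedupeSorted(Const ByRef $aSorted)
    Local $iCount = UBound($aSorted)
    If $iCount = 0 Then Return $aSorted

    Local $aOut[$iCount]   ; allocated once; worst case is "no duplicates"
    Local $iLast = 0       ; index of the last element written to $aOut
    $aOut[0] = $aSorted[0]

    For $i = 1 To $iCount - 1
        ; case-sensitive compare against the last value kept
        If Not ($aSorted[$i] == $aOut[$iLast]) Then
            $iLast += 1
            $aOut[$iLast] = $aSorted[$i]
        EndIf
    Next

    ReDim $aOut[$iLast + 1] ; shrink once, at the end
    Return $aOut
EndFunc

Local $aIn[5] = ["abc", "abc", "def", "hik", "hik"]
Local $aResult = _DedupeSorted($aIn)
ConsoleWrite(UBound($aResult) & @CRLF) ; three unique values remain
```

Whether this is measurably faster than ReDim on every hit depends on the file size, so it is worth timing both on real data.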

Edited by Wus


The whole process could be made simpler just by using _FileReadToArray() and _ArrayFindAll().


George
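One way those two functions could be combined, as a sketch only. The file names are assumptions, and the "keep a line only at its first occurrence" logic is my reading of the suggestion rather than something spelled out in the thread. Each _ArrayFindAll() call scans the whole array, so this is simple but slow on large files.

```autoit
#include <Array.au3>
#include <File.au3>

Local $aLines
; With the default flags, $aLines[0] holds the count and lines start at index 1
If Not _FileReadToArray(@ScriptDir & "\infile.txt", $aLines) Then Exit 1

Local $hOut = FileOpen(@ScriptDir & "\outfile.txt", 2) ; 2 = write, erasing previous contents
If $hOut = -1 Then Exit 2

Local $aHits
For $i = 1 To $aLines[0]
    ; Indices of every line equal to line $i: search from index 1 to the
    ; end (0), case-sensitive (1)
    $aHits = _ArrayFindAll($aLines, $aLines[$i], 1, 0, 1)

    ; Write the line only where it occurs for the first time
    If IsArray($aHits) And $aHits[0] = $i Then
        FileWriteLine($hOut, $aLines[$i])
    EndIf
Next

FileClose($hOut)
```

Unlike the sorting approach, this keeps the lines in their original order.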


The script posted earlier in this thread only removes consecutive duplicates, because it compares each line only with the one before it. It won't remove duplicates like this:

abc

def

abc

hik

abc
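One way to catch non-consecutive duplicates like these while keeping the original line order is to remember every line already written. The sketch below uses the Windows Scripting.Dictionary COM object for that. The file names are assumptions, and this is only one possible approach, not code from the original tool.

```autoit
Local $sIn = @ScriptDir & "\infile.txt"    ; assumed input path
Local $sOut = @ScriptDir & "\outfile.txt"  ; assumed output path

Local $hIn = FileOpen($sIn, 0)             ; 0 = read
If $hIn = -1 Then Exit 1

Local $hOut = FileOpen($sOut, 2)           ; 2 = write, erasing previous contents
If $hOut = -1 Then
    FileClose($hIn)
    Exit 2
EndIf

; Scripting.Dictionary is a Windows COM object, not part of AutoIt itself.
; Its default compare mode is binary, so keys are case-sensitive.
Local $oSeen = ObjCreate("Scripting.Dictionary")
If Not IsObj($oSeen) Then Exit 3

Local $sLine
While 1
    $sLine = FileReadLine($hIn)
    If @error Then ExitLoop                ; end of file or read error

    ; Write a line only the first time it is seen
    If Not $oSeen.Exists($sLine) Then
        $oSeen.Add($sLine, 1)
        FileWriteLine($hOut, $sLine)
    EndIf
WEnd

FileClose($hIn)
FileClose($hOut)
```

For the abc / def / abc / hik / abc example, this should leave abc, def and hik, in that order.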


Serial port communications UDF: includes functions for binary transmission and reception. Printing UDF: useful for graphs, forms, labels, reports etc. Add User Call Tips to SciTE for functions in UDFs not included with AutoIt and for your own scripts. Functions with parameters in OnEvent mode and for Hot Keys: one function replaces GuiSetOnEvent, GuiCtrlSetOnEvent and HotKeySet. UDF IsConnected2 for notification of status of connected state of many urls or IPs, without slowing the script.
