
best approach?


gcue

hello world

I wanted to see what the best approach is to eliminate duplicate downloads that leave a trail of download(1), download(2), download (3), etc. I would like to retain only the original download, without the enumerated copies. I was thinking of getting an array of all the files listed, then going through each one to see if there are duplicates, collecting the dupes into an array, and deleting all files that have (x).

Any thoughts?

Thanks


I would consider collecting the file hashes and checking whether they are unique. This may not be the quickest method, particularly if the files are very large, but it could be a simple way to eliminate duplicates, particularly if the naming scheme varies from the (x) indexing. It would also ensure that unique files (perhaps a downloaded file that was later modified) are retained instead of deleted just because their names follow a pattern.

Just something else to consider, I suppose.
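For illustration, the hash idea could be sketched roughly as below. This assumes the standard Crypt.au3 UDF (`_Crypt_HashFile` with `$CALG_SHA1`) and a `Scripting.Dictionary` COM object to remember hashes already seen; `_ListContentDupes` is a hypothetical helper name, not something from this thread. It only reports matches and deletes nothing.

```autoit
#include <Crypt.au3>
#include <File.au3>

; Hypothetical helper: returns a @CRLF-separated list of "file = earlier file"
; pairs for files in $sDir whose content hash matches a file listed before them.
Func _ListContentDupes($sDir)
    Local $aFiles = _FileListToArray($sDir, "*", $FLTA_FILES)
    If @error Then Return "" ; folder missing or empty
    Local $oSeen = ObjCreate("Scripting.Dictionary") ; hash -> first file name seen
    Local $sDupes = "", $dHash, $sHash
    _Crypt_Startup()
    For $i = 1 To $aFiles[0]
        $dHash = _Crypt_HashFile($sDir & "\" & $aFiles[$i], $CALG_SHA1)
        If @error Then ContinueLoop ; unreadable file: skip it
        $sHash = String($dHash)
        If $oSeen.Exists($sHash) Then
            $sDupes &= $aFiles[$i] & " = " & $oSeen.Item($sHash) & @CRLF
        Else
            $oSeen.Add($sHash, $aFiles[$i])
        EndIf
    Next
    _Crypt_Shutdown()
    Return $sDupes
EndFunc

MsgBox(0, "content dupes", _ListContentDupes(@ScriptDir))
```

Note that the listing order decides which file of a matching pair is treated as "first", and that is not necessarily the original download, so the output is best read as groups of identical files rather than a delete list.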


Assuming that the files are true duplicates (and can safely be deleted) this could work

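#include <File.au3>

; List files named "download*" in the script folder ($a[0] holds the count)
$a = _FileListToArray(@ScriptDir, "download*", $FLTA_FILES)
If @error Then Exit MsgBox(0, "dupes", "No matching files found")
$dupes = ""
For $i = 1 To $a[0]
    ; Strip a "(n)" suffix sitting before the extension or at the end of the name
    ; (as written, the optional group also lets \h* remove other spaces in the name)
    $tmp = StringRegExpReplace($a[$i], '\h*(\(\d+\)\h*(?=\.\w+$|$))?', "")
    ; A dupe is a file whose stripped name differs and already exists; uncomment FileDelete to remove it
    If FileExists(@ScriptDir & "\" & $tmp) And Not ($tmp == $a[$i]) Then $dupes &= $a[$i] & @CRLF ;FileDelete(@ScriptDir & "\" & $a[$i])
Next
MsgBox(0, "dupes", $dupes)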

 


9 hours ago, mikell said:

Assuming that the files are true duplicates (and can safely be deleted) this could work


If string matching alone is not enough, also consider a secondary check such as FileGetSize().
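A minimal sketch of that secondary check, under a few assumptions: files live in `@ScriptDir`, the original is the same name without the "(n)" part, and the pattern used here is a variant of the quoted one in which the "(n)" group is mandatory. Equal size is only a hint that two files are identical, not proof, so this reports candidates and deletes nothing.

```autoit
#include <File.au3>

; Sketch: treat "name (n).ext" as a dupe only if "name.ext" exists AND has the same size.
Local $aFiles = _FileListToArray(@ScriptDir, "download*", $FLTA_FILES)
If @error Then Exit MsgBox(0, "dupes", "No matching files found")
Local $sDupes = "", $sOrig
For $i = 1 To $aFiles[0]
    ; Assumed pattern: "(n)" must be present, right before the extension or at the end
    $sOrig = StringRegExpReplace($aFiles[$i], '\h*\(\d+\)\h*(?=\.\w+$|$)', "")
    If $sOrig == $aFiles[$i] Then ContinueLoop ; no "(n)" suffix, so not a candidate
    If Not FileExists(@ScriptDir & "\" & $sOrig) Then ContinueLoop ; no original to keep
    ; Compare sizes in bytes as a cheap second check
    If FileGetSize(@ScriptDir & "\" & $sOrig) = FileGetSize(@ScriptDir & "\" & $aFiles[$i]) Then
        $sDupes &= $aFiles[$i] & @CRLF
    EndIf
Next
MsgBox(0, "dupes (same name stem and size)", $sDupes)
```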

 


17 hours ago, mikell said:

Assuming that the files are true duplicates (and can safely be deleted) this could work


 

Works great! Except it doesn't work when there's a space between the file name and the (x) (e.g. download file (1).pdf).

thank you very much :)
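A likely cause, for anyone hitting the same thing: in the quoted pattern the whole "(n)" group is optional (the trailing `?`), so `\h*` can match on its own and strip any space in the name. "download file (1).pdf" then becomes "downloadfile.pdf", which does not exist. A possible fix, offered as a sketch rather than the author's intended pattern, is to make the group mandatory. The snippet below only transforms sample strings and touches no files.

```autoit
; Sketch: same idea, but the "(n)" part is required instead of optional,
; so \h* can no longer match (and delete) a space by itself.
Local $sPattern = '\h*\(\d+\)\h*(?=\.\w+$|$)'
Local $aNames[3] = ["download(1).pdf", "download (2).pdf", "download file (1).pdf"]
Local $sOut = ""
For $i = 0 To UBound($aNames) - 1
    $sOut &= $aNames[$i] & " -> " & StringRegExpReplace($aNames[$i], $sPattern, "") & @CRLF
Next
MsgBox(0, "stripped names", $sOut)
```

With this pattern the three sample names map to download.pdf, download.pdf and download file.pdf respectively.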


On 12/9/2019 at 9:24 AM, gcue said:

have a trail of download(1) download(2) download (3) etc.  I would like to just retain the

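#include <File.au3>

; Flag every file in the script folder whose name contains "("
$a = _FileListToArray(@ScriptDir, "*", $FLTA_FILES)
If @error Then Exit MsgBox(0, "", "No files found")
$dupes = ""
For $i = 1 To $a[0]
    ; StringInStr returns the position of "(" or 0 if absent, so it works directly as a condition
    If StringInStr($a[$i], "(") Then $dupes &= $a[$i] & @CRLF
Next
MsgBox(0, "", $dupes)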

 

