gcue

best approach?


hello world

I wanted to see what's the best approach to eliminate duplicate downloads that have a trail of download(1), download(2), download (3), etc. I would like to retain just the original download, without the trailing enumerated files. I was thinking of getting an array of all the files listed, then going through each one to see whether there are duplicates, collecting the dupes in an array, and deleting all files that have (x).

Any thoughts?

Thanks


I would consider collecting the file hashes and checking whether they are unique. This may not be the quickest method, particularly if the files are very large, but it could be a simple way to eliminate duplicates, particularly if the naming scheme varies from the (x) indexing. It would also ensure that a unique file (perhaps a downloaded file that was later modified) is retained instead of being deleted just because its name follows a pattern.

Just something else to consider, I suppose.
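
A minimal sketch of the hash idea, assuming the standard Crypt.au3 and File.au3 UDFs are available. The helper name `_ListHashMatches` is hypothetical, SHA-1 is just one possible algorithm choice, and the function only reports matches; it deletes nothing.

```autoit
#include <Crypt.au3>
#include <File.au3>

; Hypothetical helper: returns one line per file whose content hash equals
; that of a file listed earlier in the same folder.
Func _ListHashMatches($sDir)
    Local $aFiles = _FileListToArray($sDir, "*", $FLTA_FILES)
    If @error Then Return "" ; folder missing or no files

    Local $oSeen = ObjCreate("Scripting.Dictionary") ; hash -> first file name with that hash
    Local $sMatches = "", $dHash, $sHash
    _Crypt_Startup()
    For $i = 1 To $aFiles[0]
        $dHash = _Crypt_HashFile($sDir & "\" & $aFiles[$i], $CALG_SHA1)
        If @error Then ContinueLoop ; skip files that cannot be hashed
        $sHash = String($dHash)
        If $oSeen.Exists($sHash) Then
            $sMatches &= $aFiles[$i] & " = " & $oSeen.Item($sHash) & @CRLF
        Else
            $oSeen.Add($sHash, $aFiles[$i])
        EndIf
    Next
    _Crypt_Shutdown()
    Return $sMatches
EndFunc

ConsoleWrite(_ListHashMatches(@ScriptDir))
```

Each output line pairs two file names rather than labelling one of them as the copy, because the listing order does not guarantee that the original is seen first. For large folders, comparing FileGetSize() before hashing would likely avoid most of the hashing work.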



Assuming that the files are true duplicates (and can safely be deleted) this could work

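#include <File.au3>

; List the files in the script folder whose names start with "download"
$a = _FileListToArray(@ScriptDir, "download*", $FLTA_FILES)
$dupes = ""
For $i = 1 To $a[0]
    ; Candidate original name: the file name with its "(n)" counter removed
    $tmp = StringRegExpReplace($a[$i], '\h*(\(\d+\)\h*(?=\.\w+$|$))?', "")
    ; A numbered copy counts as a dupe if the original exists; uncomment FileDelete to remove it
    If FileExists($tmp) And Not ($tmp == $a[$i]) Then $dupes &= $a[$i] & @CRLF ;FileDelete($a[$i])
Next
MsgBox(0, "dupes", $dupes)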

 

Edited by mikell

9 hours ago, mikell said:

Assuming that the files are true duplicates (and can safely be deleted) this could work

#include <File.au3>

$a = _FileListToArray (@scriptdir, "download*", $FLTA_FILES)
$dupes = ""
For $i = 1 to $a[0]
    $tmp = StringRegExpReplace($a[$i], '\h*(\(\d+\)\h*(?=\.\w+$|$))?', "")
    If FileExists($tmp) and not ($tmp == $a[$i]) Then $dupes &= $a[$i] & @crlf ;FileDelete($a[$i])
Next
Msgbox(0,"dupes", $dupes)

If comparing names alone is not enough, also consider a secondary check such as FileGetSize().
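
A rough sketch of how the size check could sit alongside the name test. The helper name `_ListSizeMatchedCopies` is hypothetical, and the pattern is a variant of the quoted one in which the "(n)" counter must be present. Equal sizes only make a true duplicate more likely; they do not prove it.

```autoit
#include <File.au3>

; Hypothetical helper: lists numbered copies in $sDir whose un-numbered
; original exists and has the same size.
Func _ListSizeMatchedCopies($sDir)
    Local $aFiles = _FileListToArray($sDir, "download*", $FLTA_FILES)
    If @error Then Return "" ; folder missing or no matching files

    Local $sDupes = "", $sOrig, $sCopyPath, $sOrigPath
    For $i = 1 To $aFiles[0]
        ; Candidate original name: drop a "(n)" counter sitting just before the extension
        $sOrig = StringRegExpReplace($aFiles[$i], '\h*\(\d+\)\h*(?=\.\w+$|$)', "")
        If $sOrig == $aFiles[$i] Then ContinueLoop ; no counter, so not a numbered copy
        $sCopyPath = $sDir & "\" & $aFiles[$i]
        $sOrigPath = $sDir & "\" & $sOrig
        ; Secondary check: report only when the original exists and the sizes match
        If FileExists($sOrigPath) And FileGetSize($sCopyPath) = FileGetSize($sOrigPath) Then
            $sDupes &= $aFiles[$i] & @CRLF ; FileDelete($sCopyPath) once the list looks right
        EndIf
    Next
    Return $sDupes
EndFunc

ConsoleWrite(_ListSizeMatchedCopies(@ScriptDir))
```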

 

17 hours ago, mikell said:

Assuming that the files are true duplicates (and can safely be deleted) this could work

#include <File.au3>

$a = _FileListToArray (@scriptdir, "download*", $FLTA_FILES)
$dupes = ""
For $i = 1 to $a[0]
    $tmp = StringRegExpReplace($a[$i], '\h*(\(\d+\)\h*(?=\.\w+$|$))?', "")
    If FileExists($tmp) and not ($tmp == $a[$i]) Then $dupes &= $a[$i] & @crlf ;FileDelete($a[$i])
Next
Msgbox(0,"dupes", $dupes)

 

Works great, except that it doesn't work when there's a space between the file name and the (x) (e.g. download file (1).pdf).

thank you very much :)
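
One likely cause, offered as a sketch rather than a confirmed fix: in the quoted pattern the "(n)" group is optional, so the leading `\h*` can match on its own and every space in the name is removed. "download file (1).pdf" then becomes "downloadfile.pdf", which does not exist, so nothing is reported. Making the counter mandatory avoids this. The helper name `_StripCopyCounter` is hypothetical.

```autoit
; Hypothetical helper: returns $sName without its "(n)" copy counter.
; Same pattern as in the quoted post, but the counter is no longer optional.
Func _StripCopyCounter($sName)
    Return StringRegExpReplace($sName, '\h*\(\d+\)\h*(?=\.\w+$|$)', "")
EndFunc

ConsoleWrite(_StripCopyCounter("download file (1).pdf") & @CRLF) ; download file.pdf
ConsoleWrite(_StripCopyCounter("download(2).pdf") & @CRLF)       ; download.pdf
ConsoleWrite(_StripCopyCounter("download file.pdf") & @CRLF)     ; download file.pdf (unchanged)
```

In the loop from the quoted post, this would replace the StringRegExpReplace() line; the rest of the logic can stay as it is.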

On 12/9/2019 at 9:24 AM, gcue said:

have a trail of download(1) download(2) download (3) etc.  I would like to just retain the

#include <File.au3>

$a = _FileListToArray(@ScriptDir, "*")
$dupes = ""
For $i = 1 To $a[0]
    ; Flag every name containing "("; this does not check that an original exists
    If StringInStr($a[$i], "(") > 0 Then $dupes &= $a[$i] & @CRLF
Next
MsgBox(0, "", $dupes)

 

Edited by jugador
