gcue Posted December 9, 2019

Hello world, I wanted to see what the best approach is to eliminate duplicate downloads that have a trail of download(1), download(2), download (3), etc. I would like to retain only the original download, without the trailing enumerated files. I was thinking of getting an array of all the files listed, then going through each one to see if there are duplicates, returning the dupes to an array, and deleting all files that have (x). Any thoughts? Thanks

spudw2k Posted December 9, 2019

I would consider collecting the file hashes and checking whether they are unique. This may not be the quickest method, particularly if the files are very large, but it could be a simple way to eliminate duplicates, particularly if the naming scheme varies from the (x) indexing. It would also ensure that unique files (perhaps a downloaded file that was modified) are retained instead of being deleted just because the name follows a pattern. Just something else to consider, I suppose.
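
A rough sketch of the hash idea, in case it helps. It assumes the files sit in @ScriptDir and uses MD5 via _Crypt_HashFile from Crypt.au3; the variable names and the Scripting.Dictionary lookup are my own choices, not something from the post above. It only reports matches and deletes nothing.

```autoit
#include <Crypt.au3>
#include <File.au3>

; List every file in the script folder (assumption: that is where the downloads are).
Local $aFiles = _FileListToArray(@ScriptDir, "*", $FLTA_FILES)
If Not IsArray($aFiles) Then
    MsgBox(0, "dupes", "No files found")
    Exit
EndIf

_Crypt_Startup()
; Maps a hash string to the first file name seen with that hash.
Local $oSeen = ObjCreate("Scripting.Dictionary")
Local $sDupes = ""

For $i = 1 To $aFiles[0]
    Local $dHash = _Crypt_HashFile(@ScriptDir & "\" & $aFiles[$i], $CALG_MD5)
    If @error Then ContinueLoop ; skip files that could not be hashed
    Local $sHash = String($dHash)
    If $oSeen.Exists($sHash) Then
        ; Same content as an earlier file: report the pair, do not delete.
        $sDupes &= $aFiles[$i] & " = " & $oSeen.Item($sHash) & @CRLF
    Else
        $oSeen.Add($sHash, $aFiles[$i])
    EndIf
Next
_Crypt_Shutdown()

MsgBox(0, "dupes by hash", $sDupes)
```

Note that the listing order decides which file of a pair counts as "first seen", so it is worth checking the reported pairs by eye before deciding which copy to keep.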

mikell Posted December 9, 2019 (edited December 9, 2019 by mikell)

Assuming that the files are true duplicates (and can safely be deleted), this could work:

    #include <File.au3>

    $a = _FileListToArray(@ScriptDir, "download*", $FLTA_FILES)
    $dupes = ""
    For $i = 1 To $a[0]
        ; Strip a trailing "(n)" to get the name of the presumed original
        $tmp = StringRegExpReplace($a[$i], '\h*(\(\d+\)\h*(?=\.\w+$|$))?', "")
        ; Report the file if the stripped name exists and differs from the current name
        If FileExists($tmp) And Not ($tmp == $a[$i]) Then $dupes &= $a[$i] & @CRLF ;FileDelete($a[$i])
    Next
    MsgBox(0, "dupes", $dupes)

Deye Posted December 9, 2019

Maybe try giving this a go

ViciousXUSMC Posted December 9, 2019

9 hours ago, mikell said: Assuming that the files are true duplicates (and can safely be deleted) this could work ...

If strings alone are not enough, also consider a secondary check such as FileGetSize().
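
A minimal sketch of that secondary check. _SameSize is a hypothetical helper name, and the two file names in the example call are made up for illustration.

```autoit
; Hypothetical helper: True only if both files exist and have the same size in bytes.
Func _SameSize($sCandidate, $sOriginal)
    If Not FileExists($sCandidate) Or Not FileExists($sOriginal) Then Return False
    Return FileGetSize($sCandidate) = FileGetSize($sOriginal)
EndFunc

; Example call with two made-up file names in the working directory.
ConsoleWrite(_SameSize("download(1).pdf", "download.pdf") & @CRLF)
```

In mikell's loop the condition could then be extended with `And _SameSize($a[$i], $tmp)`. Equal size does not prove equal content, so this is a cheap filter rather than a guarantee.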

gcue (Author) Posted December 10, 2019

17 hours ago, mikell said: Assuming that the files are true duplicates (and can safely be deleted) this could work ...

Works great! Except it doesn't work when there's a space between the file name and the (x) (e.g. download file (1).pdf). Thank you very much.
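
One possible reason for the space problem, as far as I can tell: in the posted pattern the whole "(n)" group is optional, so the leading `\h*` can match on its own and every space in the name gets stripped ("download file (1).pdf" becomes "downloadfile.pdf", which does not exist). The sketch below is a variation, not mikell's original: it makes the "(n)" part mandatory so only whitespace directly around the counter is removed. Variable names are my own.

```autoit
#include <File.au3>

; Variation on the posted pattern: the "(n)" suffix is required, not optional.
Local Const $sPattern = '\h*\(\d+\)\h*(?=\.\w+$|$)'

Local $aFiles = _FileListToArray(@ScriptDir, "download*", $FLTA_FILES)
Local $sDupes = ""
If IsArray($aFiles) Then
    For $i = 1 To $aFiles[0]
        ; Name the file would have without its trailing "(n)" counter
        Local $sOriginal = StringRegExpReplace($aFiles[$i], $sPattern, "")
        ; A dupe is a numbered file whose un-numbered original also exists
        If Not ($sOriginal == $aFiles[$i]) And FileExists(@ScriptDir & "\" & $sOriginal) Then
            $sDupes &= $aFiles[$i] & @CRLF
            ;FileDelete(@ScriptDir & "\" & $aFiles[$i])
        EndIf
    Next
EndIf
MsgBox(0, "dupes", $sDupes)
```

With this pattern "download file (1).pdf" maps to "download file.pdf", while a name without a counter is left unchanged and is therefore never reported. The FileDelete line stays commented out until the list has been checked.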

jugador Posted December 10, 2019 (edited December 10, 2019 by jugador)

On 12/9/2019 at 9:24 AM, gcue said: have a trail of download(1) download(2) download (3) etc. I would like to just retain the

    #include <File.au3>

    $a = _FileListToArray(@ScriptDir, "*")
    $dupes = ""
    For $i = 1 To $a[0]
        ; Flag every file whose name contains an opening parenthesis
        $tmp = StringInStr($a[$i], "(") > 0 ? True : False
        If $tmp = True Then $dupes &= $a[$i] & @CRLF
    Next
    MsgBox(0, "", $dupes)

Deye Posted December 10, 2019 (edited December 14, 2019 by Deye)

Updated Here

gcue (Author) Posted December 12, 2019

Thanks!

Deye Posted December 12, 2019

And updated a bit ;)