asgarcymed, posted November 28, 2007

I want to create an AutoIt script that finds true duplicate files (same size and same MD5 hash/checksum). Getting each file's size is relatively easy; getting its MD5 hash is not. However, I found an external ActiveX (COM) component that makes it easy to get the MD5 hash. I use the "Dictionary Object", since I have Windows Script Host installed on my PC and AutoIt lacks a simple equivalent.

My strategy: use the combination [$size & $md5] as the dictionary key, and the full path (which includes the file name) as the dictionary item. I use the dictionary's "Exists" method to check whether a key is already present, since a dictionary does not allow two equal keys. If a true duplicate file exists, its key would equal one that is already stored. That is how I "hunt" the true duplicate files.

In theory I think I am correct; in practice I am doing something wrong, because my script never catches true duplicate files (and I deliberately copied the same file many times, just to test/debug the script). Here it is:

    #Include <File.au3>
    #Include <Array.au3>

    $path = "L:\mais eBooks\0000\TESTES"
    $files_list_array = _FileListToArray($path, "*.*")

    For $n = 1 To $files_list_array[0]
        $file_size_bytes = FileGetSize($path & "\" & $files_list_array[$n])
        $md5_object = ObjCreate("XStandard.MD5")
        ; ActiveX (COM) component to easily get the MD5 hash/checksum. It is freeware.
        ; If interested, look at: http://www.xstandard.com/en/documentation/xmd5/
        $md5_hash = $md5_object.GetCheckSumFromFile($path & "\" & $files_list_array[$n])
        $dict = ObjCreate("Scripting.Dictionary")
        $dict.CompareMode = 1 ; "Text Mode"
        $dict_key = $file_size_bytes & $md5_hash
        $dict_item = $path & "\" & $files_list_array[$n]
        If Not $dict.Exists($dict_key) Then
            $dict.Add($dict_key, $dict_item)
            ; MsgBox(0, "", $dict_key & Chr(13) & $dict_item)
        ElseIf $dict.Exists($dict_key) Then
            FileWrite(@DesktopDir & "\Dupes.csv", $dict_item & "," & $dict_key & Chr(13))
        EndIf
    Next

Can you please help me correct/debug this script? Thanks. Regards.
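
For illustration, here is a minimal sketch of the key check that the strategy above relies on. It uses made-up placeholder keys rather than real size/hash values; the key strings and the variable names $oDict and $aKeys are assumptions for this example only. Note that the dictionary is created once, before the loop, so it still holds the keys added in earlier iterations:

```autoit
; Minimal sketch: how Scripting.Dictionary reports a repeated key.
; The keys are hypothetical placeholders, not real size/hash values.
Local $oDict = ObjCreate("Scripting.Dictionary")
Local $aKeys[3] = ["1024-aaa", "2048-bbb", "1024-aaa"]

For $i = 0 To UBound($aKeys) - 1
    If $oDict.Exists($aKeys[$i]) Then
        ; The key was stored in an earlier iteration, so this is a repeat.
        ConsoleWrite("Repeated key: " & $aKeys[$i] & @CRLF)
    Else
        ; First time this key is seen: remember it.
        $oDict.Add($aKeys[$i], "item " & $i)
    EndIf
Next
```

The third key repeats the first, so it is the only one reported. The check can only work if the same dictionary object survives from one iteration to the next.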
PsaltyDS, posted November 28, 2007 (edited)

Create your objects for the dictionary and the MD5 function only once, outside of the loop. Your script creates a new, empty dictionary on every pass, so Exists can never find a key that was added earlier.

The file size is not needed in the key. Two differing files may have the same size, but in practice they will not have the same MD5 hash. Just work with the hash, as it is the only reliably distinguishing property:

    #Include <File.au3>
    #Include <Array.au3>

    ; Create both objects once, before the loop
    $md5_object = ObjCreate("XStandard.MD5") ; ActiveX MD5 hash/checksum
    $dict = ObjCreate("Scripting.Dictionary")
    $dict.CompareMode = 1 ; "Text Mode"

    $path = "L:\mais eBooks\0000\TESTES"
    $files_list_array = _FileListToArray($path, "*.*")

    For $n = 1 To $files_list_array[0]
        $sFile = $path & "\" & $files_list_array[$n]
        $dict_key = $md5_object.GetCheckSumFromFile($sFile)
        If $dict.Exists($dict_key) Then
            ; Hash already seen: log this file as a duplicate (@CRLF ends the CSV row)
            FileWrite(@DesktopDir & "\Dupes.csv", $sFile & "," & $dict_key & @CRLF)
        Else
            ; First file with this hash: remember it
            $dict.Add($dict_key, $sFile)
        EndIf
    Next

Not tested, of course...

Edited November 28, 2007 by PsaltyDS
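
As a hedged variation on the reply above: the same loop can be sketched without the external COM component, assuming your AutoIt release ships the Crypt.au3 UDF with _Crypt_HashFile (releases from around the time of this thread may not). The folder path and all variable names are hypothetical, and results go to the console rather than to a CSV file:

```autoit
#include <Crypt.au3>
#include <File.au3>

; Hypothetical folder; replace with your own.
Local $sPath = @ScriptDir & "\test"

; Created once, before the loop, so keys persist between iterations.
Local $oDict = ObjCreate("Scripting.Dictionary")

; Flag 1 = return files only (no folders).
Local $aFiles = _FileListToArray($sPath, "*", 1)
If @error Then
    ConsoleWrite("No files found in " & $sPath & @CRLF)
    Exit 1
EndIf

_Crypt_Startup()
For $n = 1 To $aFiles[0]
    Local $sFile = $sPath & "\" & $aFiles[$n]
    ; Assumption: _Crypt_HashFile returns the MD5 digest as binary data.
    Local $dHash = _Crypt_HashFile($sFile, $CALG_MD5)
    If @error Then ContinueLoop ; skip files that could not be hashed
    Local $sKey = String($dHash) ; text form of the digest, used as the key
    If $oDict.Exists($sKey) Then
        ConsoleWrite("Duplicate: " & $sFile & " matches " & $oDict.Item($sKey) & @CRLF)
    Else
        $oDict.Add($sKey, $sFile)
    EndIf
Next
_Crypt_Shutdown()
```

Each duplicate is reported together with the first file that produced the same hash, which the dictionary stores as the item for that key.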
asgarcymed (author), posted November 28, 2007

You are 100% correct! Your script works perfectly. Thank you very much! Regards.