cookiemonster Posted April 25, 2014
I already know how to MD5-check a folder's contents using _Crypt_HashFile, but what I would like to do is MD5 the folder as a whole rather than each file in it. Has anyone got any suggestions to point me in the right direction? Is it even possible?

Example: I have the folder c:\test with 5 files in it. At the moment I get the MD5 for all 5 files; I want to MD5 the folder 'test' so I have one MD5 for the folder rather than 5 (one for each file).
Unc3nZureD Posted April 25, 2014
You can't MD5 a whole folder. However, you can enumerate the files with _FileListToArray and MD5 each one. After that you can concatenate all the hash strings into a single string. For example:

7d57a619199f55893319bcaf78ddbb94
0d84a154b2c0520b0c9db876b2108e9d
148d2bee8401710e51588117bd5769ff
61f052ffe4903319e89648a0a8881e78

becomes:

7d57a619199f55893319bcaf78ddbb940d84a154b2c0520b0c9db876b2108e9d148d2bee8401710e51588117bd5769ff61f052ffe4903319e89648a0a8881e78

Then you compute its MD5 again and get:

053e630c05a34677d9c13734c6bad6ab

So you can get an MD5 hash which depends on the folder's contents.
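Unc3nZureD's concatenate-and-rehash idea is language-agnostic; here is a minimal sketch of it in Python (the thread's own scripts are AutoIt, so this is purely illustrative). Note that because the digests are concatenated in listing order, renaming a file can change the result, which is the weakness sahsanu raises next.

```python
import hashlib
import os

def folder_md5(path):
    """Concatenate the MD5 hex digest of every file in `path`,
    then MD5 the combined string to get one hash for the folder."""
    combined = ""
    for name in sorted(os.listdir(path)):  # sort for a stable order
        full = os.path.join(path, name)
        if os.path.isfile(full):
            with open(full, "rb") as f:
                combined += hashlib.md5(f.read()).hexdigest()
    return hashlib.md5(combined.encode("ascii")).hexdigest()
```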
sahsanu Posted April 25, 2014
However you can enumerate the files with _FileListToArray and MD5 each. After that you can add all the strings into a whole string.

Unc3nZureD, there is a problem with that method: if the order of the files changes (a file renamed, for example), you will get a different string and of course a different hash. Well, it all depends on what the OP really needs ;-).

cookiemonster, I've just created this as an example. It only takes the MD5 hashes of the files into account, so it doesn't matter whether a file name changes or a file has been moved to another dir. But again, I don't know whether that is what you want; if you do care about the file names, use Unc3nZureD's suggestion or a mix of both ;-).

#include <File.au3>
#include <Crypt.au3>

$sPath = @ScriptDir & "\test\"
$aList = _FileListToArray($sPath, "*", 1, True)
;$aList = _FileListToArrayRec($sPath, "*", 1, 1, 0, 2) ; use this instead if you want it to be recursive

_Crypt_Startup()
Local $MD5 = 0
For $i = 1 To $aList[0]
    $aList[$i] = Hex(_Crypt_HashFile($aList[$i], $CALG_MD5))
    Local $aTemp = StringRegExp($aList[$i], "(.{4}+)", 3)
    Local $sumTemp = 0
    For $j = 0 To UBound($aTemp) - 1
        $sumTemp += Dec($aTemp[$j])
    Next
    $MD5 += $sumTemp
Next
$MD5 = _Crypt_HashData($MD5, $CALG_MD5)
_Crypt_Shutdown()
ConsoleWrite("MD5 (well, it isn't but...) for directory " & $sPath & " is: " & Hex($MD5) & @CRLF)

Cheers, sahsanu
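sahsanu's trick of summing numeric values derived from each file's hash makes the result independent of file order and file names. A rough Python equivalent of the same idea (illustrative only; the original above is AutoIt):

```python
import hashlib
import os

def folder_md5_unordered(path):
    """Treat each file's MD5 digest as a big integer and sum them all;
    since addition is commutative, renaming or reordering files without
    changing their contents leaves the final hash unchanged."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            with open(os.path.join(root, name), "rb") as f:
                total += int(hashlib.md5(f.read()).hexdigest(), 16)
    return hashlib.md5(str(total).encode("ascii")).hexdigest()
```

As with the AutoIt version, a rename or move does not change the result; only a change to some file's bytes does.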
guinness Posted April 25, 2014
Instead of hashing the whole of each file, how about reading only a small percentage of it? PS: That actually answers the OP's question as well.
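guinness's partial-read idea trades certainty for speed: hashing only the first part of each file catches most changes at a fraction of the I/O. A sketch of that in Python (illustrative only; the percentage parameter mirrors the one jguinch later builds into _FolderCheckSum):

```python
import hashlib
import os

def quick_md5(path, percent=25):
    """Hash only the first `percent` of a file's bytes: a faster but
    weaker fingerprint, since changes past the cutoff go unnoticed."""
    size = os.path.getsize(path)
    to_read = max(1, size * percent // 100) if size else 0
    with open(path, "rb") as f:
        return hashlib.md5(f.read(to_read)).hexdigest()
```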
jguinch Posted April 25, 2014
there is a problem with that method, in case the order of the files changes (a file renamed for example)

You're right. But his solution still seems a good approach to me, with some changes. As Unc3nZureD suggests, you can generate an array with all the MD5 checksums and then simply sort the array (so the file names no longer matter). But wait: what about an empty folder? Should it be part of the checksum calculation? You could convert the folder name with StringToBinary? (Just an idea.)
Unc3nZureD Posted April 26, 2014
Well, currently I'm arguing against myself. Consider the following example. A folder contains:

- apple (1)
- banana (2)
- cake (3)

renamed to:

- apple -> qwerty (3)
- banana -> still banana (2)
- cake -> apple (1)

So ordering by name isn't good. However, I've got an idea: you could order by FILE SIZE. That has a really low chance of conflict (nearly impossible without deliberate manual manipulation). If the folder is empty, the function could return 0 or the MD5 of the folder name.
jguinch Posted April 26, 2014
Unc3nZureD, my suggestion was: you can generate an array with all the MD5 checksums, and then simply sort the array. I meant that the array will contain only the MD5 of each file (or folder name), so there will not be any problem with sorting (I think)... What do you think about this?
Unc3nZureD Posted April 26, 2014
Oh, really? Yeah, that's true. Agreed, that's a good idea.
jguinch Posted April 26, 2014
Good idea... yes and no. Because if we use the checksum of folder names, what about a renamed folder? cookiemonster should explain the goal and decide whether or not the file and folder names are important.
Unc3nZureD Posted April 26, 2014
Well, the MD5 of a folder could simply be 0.
JohnOne Posted April 26, 2014
Good idea... yes and no. Because if we use the checksum of folder names, what about a renamed folder? cookiemonster should explain the goal and decide whether or not the file and folder names are important.

Agreed; you'd expect the MD5 to change if a folder or file is renamed. I'd guess that the OP wants to check whether any contents have been altered. List to array, sort the array, loop and hash, then hash all the hashes.
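JohnOne's recipe (list to array, sort, loop and hash, hash all the hashes) can be sketched as follows in Python (illustrative; the thread itself works in AutoIt). Sorting the digests rather than the file names, as jguinch's function does, makes the result independent of both names and enumeration order:

```python
import hashlib
import os

def folder_md5_sorted(path):
    """Hash every file, sort the hex digests, then hash the joined,
    sorted list; sorting the digests (not the file names) makes the
    result independent of enumeration order and of renames."""
    digests = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            with open(os.path.join(root, name), "rb") as f:
                digests.append(hashlib.md5(f.read()).hexdigest())
    return hashlib.md5("".join(sorted(digests)).encode("ascii")).hexdigest()
```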
jguinch Posted April 26, 2014 Share Posted April 26, 2014 (edited) As I found the subject interesting and funny, I tried to create a function _FolderCheckSum It offers options for recursivity, algorithm to use and percentage of the files to read (suggested bu Guinness in #4) expandcollapse popup#include <Array.au3> #include <crypt.au3> Local $checksum = _FolderCheckSum(@Desktopdir, 1, 25, $CALG_MD5) MsgBox(0, "", "Checksum for the folder " & @Desktopdir & " : " & @CRLF & $checksum) ; #FUNCTION# ==================================================================================================================== ; Name ..........: _FolderCheckSum ; Syntax ........: _FolderCheckSum($sDir[, $iRecur = 0[, $iPercent = Default[, $iALG_ID = $CALG_MD5]]]) ; Parameters ....: $sDir - Folder full path. ; $iRecur - [optional] 1 - Search in all subfolders (unlimited recursion) ; 0 - Do not search in subfolders (Default) ; $iPercent - [optional] Percentage of the file size to read for the hash. Default is 25 ; $iALG_ID - [optional] Hash ID to use (see crypt.au3). 
Default is $CALG_MD5 ; Return values .: A hash of the whole files (from a combination of hash of all files) ; =============================================================================================================================== Func _FolderCheckSum($sDir, $iRecur = 0, $iPercent = Default, $iALG_ID = $CALG_MD5) If NOT FileExists($sDir) Then Return SetError(1, 0, -1) If ($iPercent > 100) Or ($iPercent < 0) Or $iPercent = Default Then $iPercent = 25 Local $aDirs[1] = [ StringRegExpReplace($sDir, "\\$", "") ], $aFiles[1] = [0] Local $iCountDir = 0, $iCountFile = 0, $n = 0 Local $hSearch, $sFileName Local $iRead Local $sResult While 1 $hSearch = FileFindFirstFile( $aDirs[$n] & "\*.*" ) If $hSearch <> -1 Then While 1 $sFileName = FileFindNextFile($hSearch) If @error Then ExitLoop If @Extended Then If $iRecur Then $iCountDir += 1 If $iCountDir >= UBound($aDirs) Then Redim $aDirs[ UBound($aDirs) * 2] $aDirs[$iCountDir] = StringRegExpReplace($aDirs[$n], "\\$", "") & "\" & $sFileName EndIf Else $iCountFile += 1 If $iCountFile >= UBound($aFiles) Then Redim $aFiles[ UBound($aFiles) * 2] $iRead = ($iPercent / 100) * FileGetSize($aDirs[$n] & "\" & $sFileName) $aFiles[$iCountFile] = _Crypt_HashData(FileRead($aDirs[$n] & "\" & $sFileName, $iRead), $iALG_ID) EndIf WEnd EndIf FileClose($hSearch) If $n = $iCountDir Then ExitLoop $n += 1 WEnd If $iCountFile = 0 Then Return SetError(2, 0, 0) _ArraySort($aFiles) For $i = 1 To $iCountFile $sResult &= @CRLF & StringReplace($aFiles[$i], "0x", "") Next $sResult = StringRegExpReplace($sResult, "^\R", "") Return _Crypt_HashData($sResult, $iALG_ID) EndFunc (i'm happy de show you my own non-recursive function for subfolders) Edited April 30, 2014 by jguinch Spoiler Network configuration UDF, _DirGetSizeByExtension, _UninstallList Firefox ConfigurationArray multi-dimensions, Printer Management UDF Link to comment Share on other sites More sharing options...
cookiemonster Posted April 28, 2014 (Author)
As I found the subject interesting and fun, I tried to create a function, _FolderCheckSum ...

Hey, so I've been away this weekend, hence no responses. Friday night I started getting somewhere with doing an MD5 on all the files in the folder, then MD5'ing the result. I have just tried the quoted code, but I can't seem to specify the directory to it. I've put:

Local $checksum = _FolderCheckSum("c:\Test", 1, 25, $CALG_MD5)
MsgBox(0, "", "Checksum for the folder " & "c:\Test" & " : " & @CRLF & $checksum)

But this gives me a -1 response.
cookiemonster Posted April 28, 2014 (Author)
No matter what I change @DesktopDir to, e.g. "c:\newfolder", it still gives me a response of -1.
cookiemonster Posted April 29, 2014 (Author) (edited)
Hey, so I thought I'd post a bit more information on what I'm trying to achieve, seeing as it was requested. I have four folders: C:\Folder1, C:\Folder2, C:\Folder3, C:\Folder4. Each folder has a range of subfolders and files. The aim is, one folder at a time: get the MD5 for all files, then get the MD5 for that combined string, and check it against the hard-coded MD5 in the script. If they are identical, move on to the next folder's MD5 check. If they are not identical, call a function to delete the folder and then download the original from a centralised server. Once downloaded, it needs to run the check again before continuing to the next folder's MD5 check, to ensure it has downloaded all the necessary folders and files. Does that make sense? Anyone have any questions for further clarification?

Until recently I only had to check one folder, which had around 200 files, so each file's MD5 was hard-coded into the .au3. But now I have four folders totalling around 1000 files, so I don't really want to carry on the way I was going, as it would add a lot of bulk to the code and is messy. Whichever way I do it, I need 100% checks of the files, so I think I need to do a 100% MD5 on the files.

Edited April 29, 2014 by cookiemonster