gcue

comparing files after a backup

9 posts in this topic

Here is what I have. It seems to work OK, but it is really slow with large directories.

I know there are other utilities out there, but I'd like to keep it in AutoIt.

Any ideas for improvement?

Thanks.

#include <Array.au3>
#include <Date.au3>

$target_dir = "C:\PC Migrator\PCName\Documents and Settings\User\My Documents"
$source_dir = "C:\Documents and Settings\User\My Documents"

$results = Compare($target_dir, $source_dir)

_ArrayDisplay($results)

Func Compare($sDir1, $sDir2)

    $target_dir_array = Get_Files($sDir1)
    $source_dir_array = Get_Files($sDir2)

    If Not (IsArray($target_dir_array) And IsArray($source_dir_array)) Then Return -1

    $iLen = StringLen($sDir1) + 1
    For $iN = 1 To UBound($target_dir_array) - 1
        $target_dir_array[$iN][0] = StringTrimLeft($target_dir_array[$iN][0], $iLen)
    Next

    $iLen = StringLen($sDir2) + 1
    For $iN = 1 To UBound($source_dir_array) - 1
        $source_dir_array[$iN][0] = StringTrimLeft($source_dir_array[$iN][0], $iLen)
    Next

    For $x = 1 To UBound($source_dir_array) - 1
        For $y = 1 To UBound($target_dir_array) - 1
            If $target_dir_array[$y][0] = $source_dir_array[$x][0] Then
                If $source_dir_array[$x][1] <> 0 Then
                    If $target_dir_array[$y][1] = $source_dir_array[$x][1] And $target_dir_array[$y][2] = $source_dir_array[$x][2] Then
                        $source_dir_array[$x][3] = "MATCH"
                    Else
                        $source_dir_array[$x][3] = "MISMATCH"
                        $source_dir_array[$x][4] = "SOURCE SIZE: " & $source_dir_array[$x][1]
                        $source_dir_array[$x][5] = "SOURCE DATE: " & $source_dir_array[$x][2]
                        $source_dir_array[$x][6] = "TARGET SIZE: " & $target_dir_array[$y][1]
                        $source_dir_array[$x][7] = "TARGET DATE: " & $target_dir_array[$y][2]
                    EndIf
                Else
                    $source_dir_array[$x][3] = ""
                EndIf
            EndIf
        Next
        If $source_dir_array[$x][3] = "-" Then
            $source_dir_array[$x][3] = "MISSING"
        EndIf
    Next

    Return $source_dir_array

EndFunc   ;==>Compare


Func Get_Files($sDir)

    $hPID = Run(@ComSpec & ' /c dir /b /s "' & $sDir & '"', @ScriptDir, @SW_HIDE, 0x2)
    ProcessWaitClose($hPID)
    $asSplit = StringSplit(StdoutRead($hPID), @CRLF, 1)

    Dim $results[1][1]

    For $x = 1 To UBound($asSplit) - 1
        If $asSplit[$x] <> "" Then
;~          ConsoleWrite($asSplit[$x] & @CRLF)

            $size = FileGetSize($asSplit[$x])
            $modify_date_array = FileGetTime($asSplit[$x], 0, 0)

            If Not @error Then
                $modify_time = _DateTimeFormat($modify_date_array[0] & "/" & $modify_date_array[1] & "/" & $modify_date_array[2] & " " & $modify_date_array[3] & ":" & $modify_date_array[4] & ":" & $modify_date_array[5], 3)
                $modify_date = _DateTimeFormat($modify_date_array[0] & "/" & $modify_date_array[1] & "/" & $modify_date_array[2], 1)
                $modify_stamp = $modify_date & ", " & $modify_time
            Else
                $modify_stamp = "UNABLE TO GET MOD DATE"
            EndIf

            ReDim $results[UBound($results) + 1][8]

            $results[UBound($results) - 1][0] = $asSplit[$x]
            $results[UBound($results) - 1][1] = $size
            $results[UBound($results) - 1][2] = $modify_stamp
            $results[UBound($results) - 1][3] = "-"
            $results[UBound($results) - 1][4] = ""
            $results[UBound($results) - 1][5] = ""
            $results[UBound($results) - 1][6] = ""
            $results[UBound($results) - 1][7] = ""
        EndIf
    Next

    Return $results

EndFunc   ;==>Get_Files


#2 ·  Posted (edited)

Hi,

1) For large directories it might be faster to use _ArraySearch() to compare your arrays.

2) BugFix wrote a function that gets file or folder names with Scripting.FileSystemObject. It might be faster than dir and StringSplit. It also returns all files with their full paths directly in an array.

This would change the following lines in your Get_Files function:

$hPID = Run(@ComSpec & ' /c dir /b /s "' & $sDir & '"', @ScriptDir, @SW_HIDE, 0x2)
    ProcessWaitClose($hPID)
    $asSplit = StringSplit(StdoutRead($hPID), @CRLF, 1)

to

$asSplit = _GetFilesFolder_Rekursiv ($sDir, "*", 0)

;==================================================================================================
; Function Name:   _GetFilesFolder_Rekursiv($sPath [, $sExt='*' [, $iDir=-1 [, $iRetType=0 ,[$sDelim='0']]]])
; Description:     recursive listing of files and/or folders
; Parameter(s):    $sPath     Base path of listing ('.' -current path, '..' -parent path)
;                  $sExt      Extension for file selection '*' or -1 for all (Default)
;                  $iDir      -1 Files+Folder(Default), 0 only Files, 1 only Folder
;      optional:   $iRetType  0 for Array, 1 for String as Return
;      optional:   $sDelim    Delimiter for string return
;                             0 -@CRLF (Default)  1 -@CR  2 -@LF  3 -';'  4 -'|'
; Return Value(s): Array (Default) or string with the found paths of files and/or folders
;                  Array[0] includes count of found files/folder
; Author(s):       BugFix (bugfix@autoit.de)
;==================================================================================================
Func _GetFilesFolder_Rekursiv($sPath, $sExt='*', $iDir=-1, $iRetType=0, $sDelim='0')
    Global $oFSO = ObjCreate('Scripting.FileSystemObject')
    Global $strFiles = ''
    Switch $sDelim
        Case '1'
            $sDelim = @CR
        Case '2'
            $sDelim = @LF
        Case '3'
            $sDelim = ';'
        Case '4'
            $sDelim = '|'
        Case Else
            $sDelim = @CRLF
    EndSwitch
    If ($iRetType < 0) Or ($iRetType > 1) Then $iRetType = 0
    If $sExt = -1 Then $sExt = '*'
    If ($iDir < -1) Or ($iDir > 1) Then $iDir = -1
    _ShowSubFolders($oFSO.GetFolder($sPath),$sExt,$iDir,$sDelim)
    If $iRetType = 0 Then
        Local $aOut
        $aOut = StringSplit(StringTrimRight($strFiles, StringLen($sDelim)), $sDelim, 1)
        If $aOut[1] = '' Then
            ReDim $aOut[1]
            $aOut[0] = 0
        EndIf
        Return $aOut
    Else
        Return StringTrimRight($strFiles, StringLen($sDelim))
    EndIf
EndFunc

Func _ShowSubFolders($Folder, $Ext='*', $Dir=-1, $Delim=@CRLF)
    If Not IsDeclared("strFiles") Then Global $strFiles = ''
    If ($Dir = -1) Or ($Dir = 0) Then
        For $file In $Folder.Files
            If $Ext <> '*' Then
                If StringRight($file.Name, StringLen($Ext)) = $Ext Then _
                    $strFiles &= $file.Path & $Delim
            Else
                $strFiles &= $file.Path & $Delim
            EndIf
        Next
    EndIf
    For $Subfolder In $Folder.SubFolders
        If ($Dir = -1) Or ($Dir = 1) Then $strFiles &= $Subfolder.Path & '\' & $Delim
        _ShowSubFolders($Subfolder, $Ext, $Dir, $Delim)
    Next
EndFunc

;-))

Stefan

Edited by 99ojo
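
A side note on the original Get_Files(): it calls ReDim once per file, and resizing the array on every row is likely a good part of the slowdown on large directories. Below is a minimal sketch of sizing the array once and trimming it at the end. Build_Rows and $aRows are made-up names for illustration, $asSplit is assumed to be a StringSplit-style list with the count in element 0, and the date column is left out to keep it short.

```autoit
; Sketch only: allocate the result array once instead of calling ReDim per file.
; Build_Rows and $aRows are hypothetical names, not part of the original script.
Func Build_Rows(ByRef $asSplit)
    Local $aRows[$asSplit[0] + 1][8] ; row 0 stays unused, as in Get_Files()
    Local $iCount = 0

    For $x = 1 To $asSplit[0]
        If $asSplit[$x] = "" Then ContinueLoop ; skip blank lines in the listing
        $iCount += 1
        $aRows[$iCount][0] = $asSplit[$x]              ; full path
        $aRows[$iCount][1] = FileGetSize($asSplit[$x]) ; size in bytes
        $aRows[$iCount][3] = "-"                       ; same "not compared yet" marker as the original
    Next

    ReDim $aRows[$iCount + 1][8] ; a single ReDim at the end drops the unused rows
    Return $aRows
EndFunc   ;==>Build_Rows
```

The remaining columns keep AutoIt's default empty string, which matches what Get_Files() stores in columns 4 to 7.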


Cool, that function works faster.

Going to try the _ArraySearch idea too. Thanks!
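
For what it's worth, _ArraySearch still scans the array from start to end on every call, so the comparison stays roughly "source rows times target rows". A different technique (not the one suggested above) is to index the target paths in a Scripting.Dictionary and look each source path up once. This is only a sketch under assumptions: both arrays use the 8-column layout from Get_Files(), the paths have already been trimmed to relative form as Compare() does, and Compare_Indexed and $oIndex are names invented for this example.

```autoit
; Sketch only. Column layout assumed from Get_Files():
; 0 = relative path, 1 = size, 2 = date stamp, 3 = status; row 0 is unused.
; Compare_Indexed and $oIndex are hypothetical names.
Func Compare_Indexed(ByRef $aSource, ByRef $aTarget)
    Local $oIndex = ObjCreate("Scripting.Dictionary")
    If Not IsObj($oIndex) Then Return SetError(1, 0, -1)
    $oIndex.CompareMode = 1 ; 1 = text compare: keys match case-insensitively

    ; Pass 1: remember the row number of every target path.
    For $iT = 1 To UBound($aTarget) - 1
        If Not $oIndex.Exists($aTarget[$iT][0]) Then $oIndex.Add($aTarget[$iT][0], $iT)
    Next

    ; Pass 2: one dictionary lookup per source row instead of an inner loop.
    Local $iRow
    For $iS = 1 To UBound($aSource) - 1
        If Not $oIndex.Exists($aSource[$iS][0]) Then
            $aSource[$iS][3] = "MISSING"
        ElseIf $aSource[$iS][1] = 0 Then
            $aSource[$iS][3] = "" ; size 0 (folders, empty files): not compared
        Else
            $iRow = $oIndex.Item($aSource[$iS][0])
            If $aTarget[$iRow][1] = $aSource[$iS][1] And $aTarget[$iRow][2] = $aSource[$iS][2] Then
                $aSource[$iS][3] = "MATCH"
            Else
                $aSource[$iS][3] = "MISMATCH"
                $aSource[$iS][4] = "SOURCE SIZE: " & $aSource[$iS][1]
                $aSource[$iS][5] = "SOURCE DATE: " & $aSource[$iS][2]
                $aSource[$iS][6] = "TARGET SIZE: " & $aTarget[$iRow][1]
                $aSource[$iS][7] = "TARGET DATE: " & $aTarget[$iRow][2]
            EndIf
        EndIf
    Next

    Return $aSource
EndFunc   ;==>Compare_Indexed
```

The status values are the same ones Compare() writes, so the result can still go straight into _ArrayDisplay.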


#4 ·  Posted (edited)

try this

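#include <File.au3> ; provides _FileListToArray, used in _watch()
Global $UPDATE="test" ; name of the callback that update() invokes for each change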
Func test($a,$b,$c,$d)
    ConsoleWrite(StringFormat("]Change detected %s\%s : %s (%s)" , $a,$b,$c,$d) & @CRLF)
EndFunc
Func diff($path,$a,$b) ; $a, $b = index arrays from watch(); returns 1 if any difference was found
    Local $return = 0
    Local $i=1,$j=1,$_a,$_b
    While $i<=$a[0] And $j<=$b[0]
        If $a[$i]=$b[$j] Then
            $i+=1
            $j+=1
        Else
            $return = 1
            $_a = StringSplit($a[$i],":")
            $_b = StringSplit($b[$j],":")
            If $_a[1]=$_b[1] Then
                update($path,$_a[1],Number($_a[2]<$_b[2]),__Max($_a[2],$_b[2])) ; timestamps differ
                $i+=1
                $j+=1
            Else ; the file names differ
                While $_a[1]<$_b[1] ; a has files that b does not
                    update($path,$_a[1],0,$_a[2])
                    $i+=1
                    If $i>$a[0] Then ExitLoop
                    $_a=StringSplit($a[$i],":")
                WEnd
                While $_a[1]>$_b[1] ; b has files that a does not
                    update($path,$_b[1],1,$_b[2])
                    $j+=1
                    If $j>$b[0] Then ExitLoop
                    $_b=StringSplit($b[$j],":")
                WEnd
                ; compare timestamps only while both entries are in range and name the same file
                If $i<=$a[0] And $j<=$b[0] And $_a[1]=$_b[1] And $_a[2]<>$_b[2] Then
                    update($path,$_a[1],Number($_a[2]<$_b[2]),__Max($_a[2],$_b[2]))
                    $i+=1
                    $j+=1
                EndIf
            EndIf
        EndIf
    WEnd
    While $i<=$a[0]
        $return = 1
        $_a = StringSplit($a[$i],":")
        update($path,$_a[1],0,$_a[2])
        $i+=1
    WEnd
    While $j<=$b[0]
        $return = 1
        $_b = StringSplit($b[$j],":")
        update($path,$_b[1],1,$_b[2])
        $j+=1
    WEnd
    Return $return
EndFunc
Func update($path,$file,$direction,$stamp,$line=@ScriptLineNumber) ; direction: 1 => the first is older, 0 => the second is older
    Call($UPDATE,$path,$file,$direction,$stamp)
EndFunc
Func watch($folder)
    Local $files[1000]
    If StringRight($folder,1)<>"\" Then $folder&="\"
    _watch($folder,$files)
    ReDim $files[$files[0]+1]
    Return $files
EndFunc
Func _watch($folder,ByRef $test,$dir="")
    $list = _FileListToArray($folder&$dir,"*",1)
    $folders = _FileListToArray($folder&$dir,"*" , 2)
    If IsArray($list) Then
        For $i=1 To $list[0]
            If FileGetSize($folder & $dir & $list[$i]) <> 0 Then ; no empty files dude!!
                $test[0]+=1
                If $test[0]=UBound($test) Then ReDim $test[UBound($test)+1000]
                $test[$test[0]]=$dir & $list[$i]&":"&FileGetTime($folder & $dir & $list[$i],0,1)
            EndIf
        Next
    EndIf
    If IsArray($folders) Then
        For $i=1 To $folders[0]
            _watch($folder,$test,$dir & $folders[$i] & "\")
        Next
    EndIf
EndFunc
Func __Max($a,$b)
    If $a>$b Then Return $a
    Return $b
EndFunc

Call watch(folder) to create an index.

Later, call watch(folder) again and use diff() to compare the two indexes. It works very fast for me.

$start = TimerInit()
$list = watch("d:\")
MsgBox(0 ,"" , TimerDiff($start) & ", " & UBound($list))

3.3 seconds for 16k files on my D drive
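
A rough usage sketch, assuming watch(), diff(), update() and the test() callback from the code above are in the same script. The two folder paths are placeholders. Note that diff() walks both lists in step, so it appears to rely on both indexes listing their entries in the same ascending order.

```autoit
; Sketch only: the folder paths are hypothetical, replace them with real ones.
Local $aSource = watch("C:\source")
Local $aBackup = watch("C:\backup")

; diff() calls the function named in $UPDATE for every change it finds
; and returns 1 if the two indexes differ, 0 otherwise.
If diff("C:\source", $aSource, $aBackup) Then
    ConsoleWrite("Differences found" & @CRLF)
Else
    ConsoleWrite("No differences found" & @CRLF)
EndIf
```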

Edited by Xand3r

Only two things are infinite, the universe and human stupidity, and I'm not sure about the former. -Albert Einstein
Practice makes perfect! But nobody's perfect, so why practice at all?
http://forum.ambrozie.ro


OK, here's the entire script.

It synchronizes two folders in real time.

It works over TCP because I made it to sync some AutoIt scripts on several virtual machines.

sync.rar



Hey 99ojo.

I've been using this but have found a few bugs with it.

I know you said BugFix wrote this, but I haven't been able to track down the post you got it from.

Bugs:

1. It can't process paths that contain more than ~255 characters.

2. When comparing c:\dir1\*.html it looks at c:\dir\html


#7 ·  Posted (edited)

Hi,

Quote: "cant process paths that contain more than ~255 characters"

This is a well-known Windows path-length restriction (or bug) that affects the FileSystemObject; see:

http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx

http://support.microsoft.com/kb/320081/en-us

I don't know whether a recursive search with the FileFind functions (FileFindFirstFile/FileFindNextFile) has the same restriction. You might search the forum for recursive file find and try one of the examples. There are lots of them.

But there must be a solution, since robocopy has no such restriction. Maybe one of the experts knows one.

Quote: "when comparing c:\dir1\*.html it looks at c:\dir\html"

This works perfectly for me:

#include <array.au3>
$sDir = "c:\dir1"
$asSplit = _GetFilesFolder_Rekursiv ($sDir, "html", 0)
_ArrayDisplay ($asSplit)
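
One possible cause, and this is only a guess: _ShowSubFolders compares the end of each file name with $sExt using StringRight, so the function expects a plain suffix such as "html" rather than a wildcard pattern such as "*.html". The lines below just replay that comparison on made-up file names to show the effect.

```autoit
; Replays the suffix test from _ShowSubFolders; the file names are made up.
Local $sName = "index.html"
ConsoleWrite((StringRight($sName, StringLen("html")) = "html") & @CRLF)        ; True
ConsoleWrite((StringRight($sName, StringLen("*.html")) = "*.html") & @CRLF)    ; False: the wildcard never matches
ConsoleWrite((StringRight("page.xhtml", StringLen("html")) = "html") & @CRLF)  ; True: "html" also matches .xhtml
```

Passing ".html" with the leading dot should avoid the last case.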

;-))

Stefan

Edited by 99ojo


#8 ·  Posted (edited)

Hi,

Can you try this:

Func _GetFilesFolder_Rekursiv($sPath, $sExt='*', $iDir=-1, $iRetType=0, $sDelim='0')
    $sPath = "\\?\" & $sPath ; added line: the \\?\ prefix makes paths longer than MAX_PATH (260 characters) work
    ; ... rest of the function unchanged

The file and folder names in the array then look like \\?\<original path>. I have sent BugFix a PM.

He may change his function _GetFilesFolder.... to suit our purpose.
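
Because Compare() trims StringLen($sDir) + 1 characters from each path, the extra four characters of the prefix would shift that trim. A small helper could remove the prefix before the paths are stored for comparison, while the prefixed path is still used for FileGetSize and FileGetTime. _StripLongPathPrefix is a hypothetical name and not part of BugFix's function.

```autoit
; Hypothetical helper: returns $sPath without a leading "\\?\" prefix.
Func _StripLongPathPrefix($sPath)
    If StringLeft($sPath, 4) = "\\?\" Then Return StringTrimLeft($sPath, 4)
    Return $sPath
EndFunc   ;==>_StripLongPathPrefix

ConsoleWrite(_StripLongPathPrefix("\\?\C:\dir1\page.html") & @CRLF) ; C:\dir1\page.html
ConsoleWrite(_StripLongPathPrefix("C:\dir1\page.html") & @CRLF)     ; C:\dir1\page.html (unchanged)
```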

;-))

Stefan

Edited by 99ojo


Haven't had a chance to test yet. Will let you know.

Thanks for the follow-up.
