proposed UDF: _FileCompare()


Celeri
## UPDATED October 23rd, 2005 ##

Also attached UDF as .ZIP file.

## UPDATED again October 23rd, 2005 ##

Fixed a few typos and, following Valik's advice, stopped trying to make it 3.1.1 "compatible".

Since AutoIT3 Beta 3.1.1.77, it's possible to handle binary strings ...

Just what I needed for my very own file compare UDF ...

I've given it a good run for its money, but if someone can test it on another setup it would be great.

A typical call to this UDF would look something like this (actually this is the example script I will submit):

; Comparing two files...
$first = "C:\FirstFile.DAT"
$Second = "C:\SecondFile.DAT"

If Not _FileCompareBinary ($first, $Second) Then
    $ext = @extended
    $error = @error
    If $error = 1 Then
        ConsoleWrite("Wrong version of AutoIT (minimum = 3.1.1.77)" & @CR)
    ElseIf $error = 2 Then
        ConsoleWrite("At least one of the two files doesn't exist" & @CR)
    ElseIf $error = 3 Then
        ConsoleWrite("The files are different sizes" & @CR)
    ElseIf $error = 4 Then
        ConsoleWrite("Bad buffer size" & @CR)
    ElseIf $error = 5 Then
        ConsoleWrite("Cannot compare complete file (increase BufferSize)" & @CR)
    ElseIf $error = 6 Then
        ConsoleWrite("Cannot open file for read" & @CR)
    ElseIf $error = 7 Then
        ConsoleWrite("Content doesn't match (compare fail)" & @CR)
        ConsoleWrite("Mismatch at byte " & $ext & @CR)
    Else
        ConsoleWrite("Internal error, contact function author" & @CR)
        ConsoleWrite("Error code " & $error & @TAB & "Extended code " & $ext & @CR)
    EndIf
Else
    ConsoleWrite("Congratulations, both files identical!" & @CR)
EndIf
Exit

And here's the function. And a few comments to go with it:

- Made a check to make sure people use the proper version of autoIT3. I actually only check to see if AutoIT can handle chr(0) (it couldn't in earlier versions of AutoIT3).

- A small BufferSize can make a compare crawl to a halt. 8192 bytes seemed perfect but you can change it if you want (optional). I didn't check but I assume using any number that is not a multiple of 512 (sector size) will also negatively affect performance.

- If the two files are different sizes, there's no use in comparing them, is there?

- If one insists on using a 1 byte BufferSize, _FileCompare() will refuse to compare files larger than 999999999 bytes. Just in case. Btw, such a comparison would probably take forever B)

- To modify a file (to test this script) there are a number of solutions but I use Edit from DOS and open a file in binary form.

- Used ConsoleWrite for messages in my example. Can obviously be changed to MsgBox ...

- Put a liberal amount of comments in function to help people follow my code.

- Used the command "Return 0" ... I didn't need to but you never know ... :o

- You can use a larger buffer size than the actual file size. It doesn't matter.
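The buffer-size remarks above are easy to check for yourself. Here is a rough timing sketch, assuming a current AutoIt release (binary file mode, TimerInit/TimerDiff, array initializers). `_TimeChunkedRead` is a made-up helper name and using @AutoItExe as the test file is only a convenience. It times raw chunked reads, not the full compare, and disk caching will skew repeated runs.

```autoit
#include <FileConstants.au3>

; Hypothetical helper: returns the milliseconds needed to read $sFile in $iChunk-byte chunks
Func _TimeChunkedRead($sFile, $iChunk)
    Local $hFile = FileOpen($sFile, $FO_READ + $FO_BINARY)
    If $hFile = -1 Then Return SetError(1, 0, -1)
    Local $hTimer = TimerInit()
    While 1
        FileRead($hFile, $iChunk)
        If @error Then ExitLoop ; non-zero @error means end of file or a read failure
    WEnd
    Local $fElapsed = TimerDiff($hTimer)
    FileClose($hFile)
    Return $fElapsed
EndFunc

; Assumption: the AutoIt interpreter itself serves as a handy test file
Local $aSizes[5] = [1, 512, 1000, 8192, 65536]
For $i = 0 To UBound($aSizes) - 1
    ConsoleWrite($aSizes[$i] & " bytes: " & Round(_TimeChunkedRead(@AutoItExe, $aSizes[$i])) & " ms" & @CRLF)
Next
```

Comparing the 1000-byte and 512-byte rows gives a first hint about whether sector-aligned sizes really matter on a given machine.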

;===============================================================================
;
; Function Name:   _FileCompareBinary()
; Description:      Compare two files to see if their content match
; Parameter(s):     $s_1stFile        - Complete path to first file to compare
;                   $s_2ndFile        - Complete path to second file to compare
;                   $i_BufferSize    - BufferSize to use during compare (optional).
; Requirement(s):   Requires AutoIT 3.1.1.77 minimum
; Return Value(s):  On Success - Returns 1
;                   On Failure - Returns 0 and sets the @error flag in the following fashion:
;                    @error = 1 - Wrong version of AutoIT (minimum = 3.1.1.77)
;                    @error = 2 - At least one of the two files doesn't exist
;                    @error = 3 - Both files are different size
;                    @error = 4 - Bad buffer size
;                    @error = 5 - Cannot compare complete file (increase BufferSize)
;                    @error = 6 - Cannot open file for read
;                    @error = 7 - Content doesn't match (compare fail)
;                    @extended = Offset (in bytes) of first difference
; Remarks:            Low buffer values result in poor performance
; Author(s):        Louis Horvath (celeri@videotron.ca)
;
;===============================================================================
Func _FileCompareBinary($s_1stFile, $s_2ndFile, $i_BufferSize = 8192); Compare two files to see if their content match
    Local $h_1stFile, $h_2ndFile, $b_Source, $b_Dest, $i_Loop1, $i_Loop2, $i_ChunkOffset, $i_ByteOffset
    
; Basic sanity checks
    If Not StringLen(Chr(0)) Then
        SetError(1); Cannot detect character "0" then wrong version of AutoIT ...
        Return 0
    EndIf
    If Not FileExists($s_1stFile) Or Not FileExists($s_2ndFile) Then
        SetError(2); At least one of the two files doesn't exist
        Return 0
    EndIf
    If FileGetSize($s_1stFile) <> FileGetSize($s_2ndFile) Then
        SetError(3); Both files are different size (duh!)
        Return 0
    EndIf
    If (Not IsInt($i_BufferSize)) Or $i_BufferSize < 1 Or $i_BufferSize > 65536 Then
        SetError(4); Bad buffer size
        Return 0
    EndIf
    If FileGetSize($s_1stFile) > $i_BufferSize * 999999999 Then
        SetError(5); Cannot compare complete file (increase BufferSize)
        Return 0
    EndIf
    
    $h_1stFile = FileOpen($s_1stFile, 0); Open the first file
    If $h_1stFile = -1 Then
        SetError(6); Cannot open file
        Return 0
    EndIf
    $h_2ndFile = FileOpen($s_2ndFile, 0); Open the second file
    If $h_2ndFile = -1 Then
        FileClose($h_1stFile); Cannot open file
        SetError(6)
        Return 0
    EndIf
    
; Actual compare loop
    For $i_Loop1 = 1 To Int((FileGetSize($s_1stFile) + $i_BufferSize - 1) / $i_BufferSize); One pass per chunk (rounded up), so the loop stops at end of file
        $b_Source = FileRead($h_1stFile, $i_BufferSize); Get chunk of first file
        $b_Dest = FileRead($h_2ndFile, $i_BufferSize); Get chunk of  second file
        If Not ($b_Source = $b_Dest) Then; If they don't match ...
            FileClose($h_1stFile); Close the first
            FileClose($h_2ndFile); And the second
        ; Find actual byte where difference is found
            $i_ChunkOffset = $i_BufferSize* ($i_Loop1 - 1); How many chunks have been processed * buffer size
            For $i_Loop2 = 1 To $i_BufferSize; Do at least one run
                If StringMid($b_Source, $i_Loop2, 1) <> StringMid($b_Dest, $i_Loop2, 1) Then; Is this byte different?
                    $i_ByteOffset = $i_ChunkOffset + $i_Loop2; Yup so that's where the difference is.
                    ExitLoop
                EndIf
            Next
            SetError(7, $i_ByteOffset); Ok tell program calling function that files don't match.
            Return 0; 0 = Something went wrong;)
        EndIf
    Next
    FileClose($h_1stFile); Close first file
    FileClose($h_2ndFile); Close second file
    Return 1; Return "Yes, both file compare"
EndFunc  ;==>_FileCompareBinary

And in case IE plays tricks on you when you cut and paste / Save As ... I've attached a zip file.

Will be updated if I make a booboo (which is highly likely ahahaha).

Any positive criticism accepted.

_FileCompareBinary__.zip

Edited by Celeri

"I am endeavoring, ma'am, to construct a mnemonic circuit using stone knives and bearskins." - Spock
My UDFs: Deleted - they were old and I'm lazy ... :)
My utilities: Comment stripper, Policy lister 1.07, AutoIT Speed Tester (new!)


Cool. B)

Just tried on two exes and it worked great.

However I suggest removing the check for the autoit version as all the other udfs I've seen don't do this check and it should work fine under previous versions for comparing plain text files.

Anyway, Nice Job. :o


Thanks :graduated:

Mind you this was *meant* to be a binary compare :x

After sleeping on it I was meaning to change its name to _FileCompareBinary()

I might do a text compare func but I would do a separate function ... perhaps _FileCompareText()

(although this program will obviously work on text files also)

So that explains the check.

I also meant to send back the offset to the first different byte in @extended.

BTW I'm really happy with the speed on this function. Up to par with most other compare programs :)

Keep you posted :D

P.S.: I'll check out your functions too!

P.P.S.: Interesting, putting links to posted functions in the signature ... smart :x


I have to agree with SolidSnake on removing the @AutoitVersion check, I'll give you an example of why:

If a person creates a script that includes your UDF and uses Reshacker to set the exe version information, your UDF won't work (compile with beta). Here's an extract from one of my scripts to show the point:

#Region Compiler directives section
;** This is a list of compiler directives used by CompileAU3.exe.
;** comment the lines you don't need or else it will override the default settings
;#Compiler_Prompt=y               ;y=show compile menu
;** AUT2EXE settings
;#Compiler_AUT2EXE=
#Compiler_Icon=wrench.ico                   ;Filename of the Ico file to use
;#Compiler_OutFile=               ;Target exe filename.
#Compiler_Compression=4          ;Compression parameter 0-4  0=Low 2=normal 4=High
#Compiler_Allow_Decompile=y       ;y= allow decompile
;#Compiler_PassPhrase=             ;Password to use for compilation
;** Target program Resource info
#Compiler_Res_Comment=Tool to Aid Admins.
#Compiler_Res_Description=Software tool to aid Admins
#Compiler_Res_Fileversion=1.9.1.2
#Compiler_Res_LegalCopyright=
; free form resource fields ... max 15
#Compiler_Res_Field=Email|custompcs@charter.net ;Free format fieldname|fieldvalue
#Compiler_Res_Field=Release Date|09/28/2005   ;Free format fieldname|fieldvalue
#Compiler_Res_Field=Update Date|10/21/2005  ;Free format fieldname|fieldvalue
#Compiler_Res_Field=Internal Name|AdminTool.exe ;Free format fieldname|fieldvalue
#Compiler_Res_Field=Status|Beta  ;Free format fieldname|fieldvalue
#Compiler_Run_AU3Check=y            ;Run au3check before compilation
; The following directives can contain:
;   %in% , %out%, %icon% which will be replaced by the fullpath\filename.
;   %scriptdir% same as @ScriptDir and %scriptfile% = filename without extension.
#Compiler_Run_Before=              ;process to run before compilation - you can have multiple records that will be processed in sequence
#Compiler_Run_After=move "%out%" "%scriptdir%"  ;process to run After compilation - you can have multiple records that will be processed in sequence
#EndRegion

_Main()

Func _Main()
 MsgBox(0,"test",@AutoItVersion)
EndFunc

Gary


if the person creates a script that includes your udf and they use reshacker to set the exe version information your udf won't work

That's a very interesting observation Gary -- I would almost consider this undesirable behaviour. In fact I might ask about this in the Developers forum.

@Celeri:

I found your script very helpful, thanks for posting it. Suggestion (may be dumb, I'm a newbie): why not set the buffer size in the script automatically so that it will always be the correct size? Maybe like this (untested, may not work, I am not sure what would be good values):

$i_BufferSize = 8192
While FileGetSize($s_1stFile) > $i_BufferSize * 999999999
    $i_BufferSize = $i_BufferSize + 8192; Grow the buffer until the whole file fits
WEnd
If $i_BufferSize > 65536 Then SetError(5); File too big

Also, can max buffer size be larger than 2^16 in 32 bit machines?

Edited by peter1234

First of all I modified the first post of this message with an UPDATED version of my script. Take a look if you're interested. Now for the comments and stuff B)

I have to agree with SolidSnake on removing the @AutoitVersion check, I'll give you an example of why:

if the person creates a script that includes your udf and they use reshacker to set the exe version information your udf won't work, (compile with beta) here's an extract from one of my scripts ...

Gary

Well I have to agree with LxP on that one. The Macro does say "@AutoITVersion". If it was called @ExeVersion or @ProgramVersion it would make more sense. Although I understand you can't predict EVERYTHING :o

Doesn't matter anyways, I came up with another solution. Since the major difference was being able to detect Chr(0), I just added this check: If Not StringLen(Chr(0)) ... then it's not the right version (for example, version 3.1.1 treats Chr(0) as a non-character and returns a blank string). This is an important check since I got fooled myself many times (can you imagine!) running my script with F5 instead of Alt-F5 (run Beta).

Well anyways I also added something interesting, when the file compare fails the program returns the exact offset (in bytes) to the first different byte. Now here's the part that sucks, in order to return the offset I wanted to use @extended. Well guess what I can't simply write :

SetExtended($Offset)
SetError(7)

It'll return @Error 7 but @extended is blank. So I have to use

SetError(7,$Offset)

So guess what ... AutoIT3 version 3.1.1 ... throws out an error. It can't handle more than one parameter with SetError().

Oh well if it comes to that I'll simply take the offset calculation out.
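For anyone hitting the same problem, here is a minimal sketch of the one-statement form. It assumes a current AutoIt release, where SetError() takes an optional extended value and an optional return value. `_DemoMismatch` is a made-up name for illustration, not part of the UDF. As far as I can tell, the two-line version fails because SetError(7) on its own puts @extended back to its default of 0.

```autoit
; Hypothetical helper, for illustration only
Func _DemoMismatch($iOffset)
    ; Sets @error = 7 and @extended = $iOffset, then returns 0, all in one statement
    Return SetError(7, $iOffset, 0)
EndFunc

Local $vResult = _DemoMismatch(1234)
; Copy both macros right away: the next function call resets them
Local $iError = @error, $iExtended = @extended
ConsoleWrite("Result " & $vResult & ", @error " & $iError & ", @extended " & $iExtended & @CRLF)
```

Copying @error and @extended into variables first is the same habit the example script in the first post uses before it starts calling ConsoleWrite().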

@Peter1234

why not set buffer size in script automatically so that it will always be correct size?

Well it is. I think you don't understand what I meant by buffer. So here's a quick explanation.

Since it would be ridiculous to try to compare a whole file in one shot, I compare it a "chunk" at a time. In this case, the default "chunk" size is 8192 bytes. 8192 is a multiple of 512 which is the size of a sector on a hard drive. I found, after testing on many devices, it was the best possible buffer size.

Now just for kicks, try using 1 as a buffer size ... It should take you a while :graduated:
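The chunk-at-a-time idea can be sketched like this. It is not the UDF from the first post, just an illustration that assumes a current AutoIt release (binary file mode, BinaryLen, BinaryMid, Ceiling). `_ChunkCompareSketch` and its parameter names are made up. The chunks are compared through their hex-string form, which is a deliberately conservative choice on my part.

```autoit
#include <FileConstants.au3>

; Hypothetical sketch of a chunk-by-chunk compare.
; Returns 1 if the files match, otherwise 0 with @error set
; (@error = 7 also puts the 1-based offset of the first difference in @extended).
Func _ChunkCompareSketch($sFileA, $sFileB, $iChunk = 8192)
    Local $iSize = FileGetSize($sFileA)
    If $iSize <> FileGetSize($sFileB) Then Return SetError(3, 0, 0)

    Local $hA = FileOpen($sFileA, $FO_READ + $FO_BINARY)
    If $hA = -1 Then Return SetError(6, 0, 0)
    Local $hB = FileOpen($sFileB, $FO_READ + $FO_BINARY)
    If $hB = -1 Then
        FileClose($hA)
        Return SetError(6, 0, 0)
    EndIf

    Local $bA, $bB, $bDiffer = False, $iOffset = 0
    ; One pass per chunk; the last chunk may be shorter than $iChunk
    For $i = 1 To Ceiling($iSize / $iChunk)
        $bA = FileRead($hA, $iChunk)
        $bB = FileRead($hB, $iChunk)
        If String($bA) <> String($bB) Then
            $bDiffer = True
            ; Walk the chunk to find the first differing byte
            For $j = 1 To BinaryLen($bA)
                If String(BinaryMid($bA, $j, 1)) <> String(BinaryMid($bB, $j, 1)) Then ExitLoop
            Next
            $iOffset = ($i - 1) * $iChunk + $j
            ExitLoop
        EndIf
    Next

    FileClose($hA)
    FileClose($hB)
    If $bDiffer Then Return SetError(7, $iOffset, 0)
    Return 1
EndFunc
```

Calling it with a chunk size of 1 makes the outer loop run once per byte, which is why such a compare crawls.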

Oh, one last thing, my script is now slightly slower. Well it had to be since I now have to calculate the offset. If enough people complain about it I will just take that function out (too bad for that @extended thing also).

Edited by Celeri


Ok, unless something happens, final version posted.

You now need minimum 3.1.1.55 to compile the script and minimum 3.1.1.77 to run it B)

- I took out line 75: SetExtended($i_ByteOffset) since you cannot set @error and @extended on different lines.


@Celeri, very useful UDF. Nice to see AutoIt users adding to the infinite potential of this scripting language. Any additions please post.. :o

Cheers.. B)

Thanks!

Actually before beta .77 it ... almost worked :graduated:

Got fooled for a while until I checked the changelog ... goes to show you never can be too careful ...

Anyways I'm now working on a base64 converter in order to send attachments with this UDF I'm tinkering with :) ... it's a real you-know-what ... looks so easy but I just can't seem to get it right ... Now that my kid's asleep I'm doing an all-nighter to get this one working :D

Hopefully it'll come in handy for someone ...

P.S.: This might be obvious for some people but one of the main reasons I'm posting these UDF is because I am so grateful for the responses I have been getting on this forum ... one of the best places around!


@Celeri, it will be nice to see what a nice piece of coding turns out. Post with the goodies when you get it done.

Cheers.. B)
