Compare 2 Text Files


dfdub

I am writing a script to compare the lines within the following 2 text files and print the lines that do not match.

File1:

100001
100002
100003
999998
100005
100006
999999
100008
100009
100000

File2:

100001
100002
100003
100004
100005
100006
100007
100008
100009
100000

My code:

$File1Input = FileOpen("input1.txt", 0)
$File2Input = FileOpen("input2.txt", 0)

While 1
    local $flag
    $Linex = FileReadLine($File1Input)
    
    For $a=0 to 10 step 1
        $Liney = FileReadLine($File2Input)
        
        If $Linex = $Liney Then
            WinActivate("notes.txt - Notepad")
            Send($Linex & " has a match.")
            Send("{enter}")
            ExitLoop
            SetError(0)
        EndIf
        
        if @error = -1 Then
            WinActivate("notes.txt - Notepad")
            Send($Linex & " does NOT have a match.")
            Send("{enter}")
            SetError(0)
            ExitLoop
            
        EndIf
        
    Next
    
WEnd

FileClose($File1Input)
FileClose($File2Input)

I am simply pulling one line from file 1 and comparing it against all lines within file 2. If they match, I print "xxx has a match" and exit the loop. If they do not match, it compares to the next line until it reaches EOF, at which point it prints "xxxx does not have a match". It seems to work until it reaches the first line that does not match. After that, it reports every remaining line as not matching. Here is my output:

100001 has a match.
100002 has a match.
100003 has a match.
999998 does NOT have a match.
100005 does NOT have a match.
100006 does NOT have a match.
999999 does NOT have a match.
100008 does NOT have a match.
100009 does NOT have a match.
100000 does NOT have a match.

Unfortunately, I think that @error is being stored incorrectly. I have read the @error and SetError() documentation and still can't figure out how to set @error correctly... is this even my issue?


If file 2 has fewer lines than file 1, this will keep it from erroring out.

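#include <file.au3>

Dim $array1
Dim $array2

; Read each file into an array; element [0] holds the line count
_FileReadToArray("text1.txt", $array1)
_FileReadToArray("text2.txt", $array2)

; Compare the two files line by line, at the same position in each file
For $X = 1 to $array1[0]
    If $X <= $array2[0] Then
        If $array1[$X] = $array2[$X] Then
            $array1[$X] &= " has a match."
        Else
            $array1[$X] &= " does NOT have a match."
        EndIf
    Else
        ExitLoop ; file 2 has run out of lines
    EndIf
Next

; Write the annotated lines back to text1.txt, starting at element 1
_FileWriteFromArray("text1.txt",$array1, 1)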

Edit: Added ExitLoop in case file 1 is much longer than file 2.

$sToMatch = "100001" & @CRLF & _
            "100002" & @CRLF & _
            "100003" & @CRLF & _
            "999998" & @CRLF & _
            "100005" & @CRLF & _
            "100006" & @CRLF & _
            "999999" & @CRLF & _
            "100008" & @CRLF & _
            "100009" & @CRLF & _
            "100000";Would normally be the fileread to the file that stores the chars to find
            
$sMatchFrom = "100001" & @CRLF & _
                "100002" & @CRLF & _
                "100003" & @CRLF & _
                "100004" & @CRLF & _
                "100005" & @CRLF & _
                "100006" & @CRLF & _
                "100007" & @CRLF & _
                "100008" & @CRLF & _
                "100009" & @CRLF & _
                "100000";File to read to find the info

$sOutPut = _myFileReturnInfo($sToMatch, $sMatchFrom)
MsgBox(64, "Info", $sOutPut)

;Actual function
Func _myFileReturnInfo($sFile1, $sFile2)
    ;Might have your file reads here or whatever
    Local $aSplit = StringSplit(StringStripCR($sFile1), @LF);Create file 1 array
    ;With RegExp, we don't really need a big function
    Local $sHoldText
    For $i = 1 To $aSplit[0]
        If StringRegExp($sFile2, "(?s)(?i)(?m:^|\n)" & $aSplit[$i] & "(?m:$|\r)") Then
            $sHoldText &= $aSplit[$i] & " has a match." & @CRLF
        Else
            $sHoldText &= $aSplit[$i] & " does NOT have a match." & @CRLF
        EndIf
    Next
    Return StringTrimRight($sHoldText, 2);trim off the last carriage return + line feed
EndFunc

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.

Thanks, but obviously this isnt what I'm shooting for. The file I need to compare is 3000 lines long.

Um... Enlighten me on the "obvious": why wouldn't it work? It's much faster and more precise than anything else that is likely to be suggested here.

So I guess I'll send you my 3000 line file and you hardcode it all into the script? Or did you parse that out real quick with some other way that I am unfamiliar with?

Why didn't you try the script I posted?

I glanced over and assumed you hard coded the data into the script. I ran it against 500 lines of data (after reading comments) and it worked perfectly. Sorry for the smart ass response.

I guess my other question is: what is wrong with my original logic? It may not have been the most efficient way to tackle this in AutoIt, yet it almost worked :/

Any ideas?

I thought I left enough comments to make sense.

I used your example file "A" (the one you want to match) and your example file "B", which you left for us to understand what you were doing.

I simply made them strings.

If you ran the example, you'd see that the result was what you were looking for.

I even left a note, that you would replace the "strings" I put in there with "FileRead"

So it would be as simple as:

$sToMatch = "Filename to read to an array and seperate.txt"

$sMatchFrom = "This is the file that contains all the data you want to compare.txt"

$sOutPut = _myFileReturnInfo(FileRead($sToMatch), FileRead($sMatchFrom))
MsgBox(64, "Info", $sOutPut)

;Actual function
Func _myFileReturnInfo($sFile1, $sFile2)
    ;Might have your file reads here or whatever
    Local $aSplit = StringSplit(StringStripCR($sFile1), @LF);Create file 1 array
    ;With RegExp, we don't really need a big function
    Local $sHoldText
    For $i = 1 To $aSplit[0]
        If StringRegExp($sFile2, "(?s)(?i)(?m:^|\n)" & $aSplit[$i] & "(?m:$|\r)") Then
            $sHoldText &= $aSplit[$i] & " has a match." & @CRLF
        Else
            $sHoldText &= $aSplit[$i] & " does NOT have a match." & @CRLF
        EndIf
    Next
    Return StringTrimRight($sHoldText, 2);trim off the last carriage return + line feed
EndFunc

Look at the function to see why this is going to be faster.

You said the files can be 3000 lines long. With the nested-loop method you presented, that is up to 3000 * 3000 comparisons if both files have 3000 lines.

The function simply parses the first file into an array (let's say 3000 elements, for argument's sake), then uses a regular expression to look for each value as a complete line, from line start to line end, in the second file.

This saves you the other 3000 script-level loops per line. The regex engine still has to scan the second file, but it does so in compiled code rather than in interpreted AutoIt, so it should be considerably faster.
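
One caveat with the pattern in the function above: each line from file 1 is pasted into the regular expression as-is, so a line containing characters such as "." or "(" would be read as pattern syntax rather than literal text. A minimal sketch of one way around that, using the regex engine's \Q...\E literal quoting. The helper name _LineExists is hypothetical and not part of the original function, and unlike the original pattern this sketch is case-sensitive:

```autoit
; Hypothetical helper: returns True if $sLine occurs as a complete line in $sText.
Func _LineExists($sText, $sLine)
    ; Normalise line endings to @LF so the pattern only deals with one separator.
    Local $sClean = StringStripCR($sText)
    ; \Q...\E makes the regex engine treat $sLine as literal text.
    ; Assumption: $sLine itself does not contain the sequence "\E".
    Local $sPattern = "(?:\A|\n)\Q" & $sLine & "\E(?:\n|\z)"
    Return StringRegExp($sClean, $sPattern) = 1
EndFunc

; "1.5" is matched literally, so it is found in the first text but not in the second.
ConsoleWrite(_LineExists("100001" & @CRLF & "1.5" & @CRLF & "100003", "1.5") & @CRLF)
ConsoleWrite(_LineExists("100001" & @CRLF & "125" & @CRLF & "100003", "1.5") & @CRLF)
```

For purely numeric lines like the sample data, the original pattern and this one should behave the same; the quoting only matters once the files contain punctuation.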

Yeah, I was definitely aware of the loop redundancy in my original logic... I was just thinking back to my old C++ classes and how we used to write triple-nested 'for' loops to sort numbers. I am still new to the handy functions in AutoIt, and I apologize again for reacting so quickly to your original post.

But I still can't figure out why my original script was failing after finding the first non-match :/

The @error = -1 check only captures the FileReadLine end-of-file condition, so it should go right after that call.

I don't know exactly what you're trying to do with the error, but I'd suggest looking over your logic again.

I was using the error to identify the end of file, and thus display 'XXXXXX does not have a match'

Yeah, I know... Just looks like a logic issue (as I think I mentioned before)... See if this helps:
$File1Input = FileOpen("input1.txt", 0)
$File2Input = FileOpen("input2.txt", 0)

While 1
    local $flag
    $Linex = FileReadLine($File1Input)
    If @error = -1 Then ExitLoop
    
    For $a=0 to 10 step 1
        $Liney = FileReadLine($File2Input)
        If @error = -1 Then
            WinActivate("notes.txt - Notepad")
            Send($Linex & " does NOT have a match." & "{ENTER}")
            ExitLoop
        EndIf
        
        If $Linex = $Liney Then
            WinActivate("notes.txt - Notepad")
            Send($Linex & " has a match." & "{ENTER}")
            ExitLoop
        EndIf        
    Next
    
WEnd
But keep in mind... this is the slowest method possible.

Edit: Had my own logic issue :)

Still get the same output:

100000 has a match.
100001 has a match.
100002 has a match.
100003 has a match.
999998 does NOT have a match.
100005 does NOT have a match.
999999 does NOT have a match.
100007 does NOT have a match.
100008 does NOT have a match.
100009 does NOT have a match.

I know it really doesn't matter at this point, since you have pointed out a much faster method. But I get a little obsessive over something that I can't figure out. I am almost positive the @error needs to be stored in a variable somewhere and/or it needs to be reset to 0. Notice the remarks from the helpfile:

When entering a function @error is set to 0. Unless SetError() is called, then @error will remain 0 after the function has ended. This means that in order for @error to be set after a function, it must be explicitly set. This also means you may need to backup the status of @error in a variable if you are testing it in a While-WEnd loop.

I know this pertains to what I was trying to do but... :)
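
The helpfile remark quoted above can be sketched as follows: since @error is reset whenever another function is entered, it is copied into a variable on the line directly after FileReadLine and only the copy is tested afterwards. This is a minimal illustration, not a fix for the script in this thread; the helper name _CountLines and its variables are hypothetical:

```autoit
; Hypothetical helper: counts the lines of a file, backing up @error after each read.
Func _CountLines($sPath)
    Local $hFile = FileOpen($sPath, 0) ; 0 = read mode
    If $hFile = -1 Then Return SetError(1, 0, 0) ; could not open the file

    Local $iCount = 0, $sLine, $iErr
    While 1
        $sLine = FileReadLine($hFile)
        $iErr = @error ; copy immediately; a later function call would reset @error
        If $iErr Then ExitLoop ; -1 = end of file, 1 = read error
        $iCount += 1
    WEnd

    FileClose($hFile)
    Return $iCount
EndFunc
```

Testing the copy rather than @error itself means extra calls such as WinActivate or Send can be placed between the read and the check without changing the result.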

Then obviously I have no idea what you were trying to do lol... maybe someone else will chime in to help you.

Nah, you already helped a ton. Your code works great for what I am trying to do. I'm just backtracking a little and trying to figure out what was wrong with my original logic (besides it being very slow!)

BTW, what logic issues were you having?
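
For anyone landing here with the same question, one likely explanation for the nested-loop output is that file 2 is never rewound. FileReadLine on a file handle continues from wherever the previous read stopped, so once the inner loop has run to the end of file 2 for the first non-matching line, every later read of file 2 hits end of file at once and every remaining line is reported as unmatched. The early "matches" only appear because both files happen to be in the same order. A minimal sketch of a fix, assuming an AutoIt version that provides FileSetPos and plain ANSI text files named as in the first post. The function name _ReportMatches and the switch from Send to a returned string are choices made here for illustration:

```autoit
; Hypothetical rewrite of the nested-loop approach from the first post.
Func _ReportMatches($sPath1, $sPath2)
    Local $hFile1 = FileOpen($sPath1, 0) ; 0 = read mode
    Local $hFile2 = FileOpen($sPath2, 0)
    If $hFile1 = -1 Or $hFile2 = -1 Then Return SetError(1, 0, "")

    Local $sReport = "", $sLineX, $sLineY, $bFound
    While 1
        $sLineX = FileReadLine($hFile1)
        If @error Then ExitLoop ; end of file 1 (or read error): stop the outer loop

        FileSetPos($hFile2, 0, 0) ; rewind file 2 (origin 0 = start of file)
        $bFound = False
        While 1
            $sLineY = FileReadLine($hFile2)
            If @error Then ExitLoop ; end of file 2: this line has no match
            If $sLineX == $sLineY Then ; == forces a case-sensitive string comparison
                $bFound = True
                ExitLoop
            EndIf
        WEnd

        If $bFound Then
            $sReport &= $sLineX & " has a match." & @CRLF
        Else
            $sReport &= $sLineX & " does NOT have a match." & @CRLF
        EndIf
    WEnd

    FileClose($hFile1)
    FileClose($hFile2)
    Return $sReport
EndFunc

ConsoleWrite(_ReportMatches("input1.txt", "input2.txt"))
```

The sketch also replaces the fixed "For $a=0 to 10" counter with a loop that runs to the end of file 2, and it leaves the outer loop when file 1 is exhausted. On an older AutoIt build without FileSetPos, closing and reopening file 2 before each pass should have the same effect. This is still the slow quadratic method; the regex function earlier in the thread remains the better choice for 3000-line files.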
