PCI

< Solved > Urgent help needed please - Compare files

25 posts in this topic

#1 ·  Posted (edited)

Hi everyone, I hope some of the masters and MVPs can help me with this.

I have 2 files to compare: file1.txt and file2.txt.

Both files have about 20000 lines, and it's hard for me to go through them line by line.

Here's an example of the lines :

212121212121ýxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

313131313131ýxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

I need to read the first string (212121212121, before the ý) from file1.txt and find the same string in file2.txt. Then, if anything else on that line differs between the two files, the whole line should be copied to another file, result.txt.

I'm really sorry, I got stuck on this because I couldn't figure out whether I should use FileRead/_FileReadToArray, FileReadLine or StringMid.

Please help me, at least with how to start my script.

Thank you so much

PCI

#2 ·  Posted (edited)

This is a basic script to do what you want.

I have made the assumption that the 2 files contain the same number of lines and that the lines are in the same order. If this is not the case then something a little more complex will be required.

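#include <File.au3>

Global $aArray1 = 0
Global $aArray2 = 0
Global $hDiff = -1

; Read each file into a 1-based array (element [0] holds the line count)
_FileReadToArray("C:\File1.txt", $aArray1)
_FileReadToArray("C:\File2.txt", $aArray2)

; Mode 10 = overwrite (2) + create the directory path if needed (8)
$hDiff = FileOpen("C:\Diff.txt", 10)

For $i = 1 To UBound($aArray1) - 1
    ; Compare the lines at the same position in both files
    If $aArray1[$i] <> $aArray2[$i] Then
        FileWriteLine($hDiff, "File1:" & $aArray1[$i] & @CRLF & "File2:" & $aArray2[$i])
    EndIf
Next

FileClose($hDiff)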

Edit: Fixed typo


"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to build bigger and better idiots. So far, the universe is winning."- Rick Cook

Thank you so much, Bowmore. The issue I have is that I will sometimes find the same first string on several lines, but with differences, like:

File1.txt

10757_160491010_0_1ý6.0000ýýNETýITEMýýý2ý1ýTestingýTestingýNýFULLý2012-01-01ý3000-12-31

File2.txt

10757_160491010_0_1ý6.0000ýýNETýITEMýýý2ý1ýTestingýTestingýNýFULLý2012-01-11ý3000-12-31

10757_160491010_0_1ý6.0000ýýNETýITEMýýý2ý1ýTestingýTestingýNýFULLý2012-01-31ý3000-12-31

So it's important not to compare by line position but by the first field, before the first ý.

Thank you for your valuable time.

Sorry I did not read your first post carefully enough. This version should check all the lines where the first part matches and then write the lines to new file if anything else on the lines from file1 and file2 are different.

#include <File.au3>

Global $aArray1 = 0
Global $aArray2 = 0
Global $hDiff = -1

_FileReadToArray("C:\File1.txt", $aArray1)
_FileReadToArray("C:\File2.txt", $aArray2)
$hDiff = FileOpen("C:\Diff.txt", 10)

For $i = 1 To UBound($aArray1) - 1
    ; Key = everything before the first "ý" (case-sensitive search, first occurrence)
    $sID1 = StringLeft($aArray1[$i], StringInStr($aArray1[$i], "ý", 1, 1) - 1)
    For $j = 1 To UBound($aArray2) - 1
        $sID2 = StringLeft($aArray2[$j], StringInStr($aArray2[$j], "ý", 1, 1) - 1)
        If $sID1 == $sID2 Then ; Check if the first part is the same
            ; then check if anything else on the lines is different
            If $aArray1[$i] <> $aArray2[$j] Then
                FileWriteLine($hDiff, "File1: Line" & $i & ":" & $aArray1[$i] & @CRLF & "File2: Line" & $j & ":" & $aArray2[$j])
            EndIf
        EndIf
    Next
Next

FileClose($hDiff)
Thank you, Mr JohnOne.

I looked at StringLeft, but I cannot use a fixed count because the length of my string before the ý can vary.

I guess my question was badly formulated?

Do you want me to delete my post and repost it in the help section http://www.autoitscript.com/forum/forum/2-general-help-and-support/ ?

Thank you, JohnOne, for your input.

PCI
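
On the point about the length before the ý varying: the count passed to StringLeft does not have to be fixed, because StringInStr can locate the delimiter first. The lines below are a minimal sketch, not a definitive implementation. The sample line is taken from this thread, and the variable names are made up for illustration. It also assumes the script file is saved in an encoding in which AutoIt reads the ý character correctly.

```autoit
; Sample line copied from the thread; the variable names are hypothetical
Local $sLine = "10757_160491010_0_1ý6.0000ýýNETýITEM"

; Position of the first "ý" (1 = case-sensitive search); 0 means not found
Local $iPos = StringInStr($sLine, "ý", 1)
If $iPos > 0 Then ConsoleWrite(StringLeft($sLine, $iPos - 1) & @CRLF) ; 10757_160491010_0_1

; Alternative: split on "ý"; element [0] is the field count, [1] is the first field
Local $aFields = StringSplit($sLine, "ý")
ConsoleWrite($aFields[1] & @CRLF) ; 10757_160491010_0_1
```

Either form returns the key whatever its length; StringSplit is convenient if later fields are needed as well.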

Thank you, Bowmore. I'm currently testing your script and will post the results very soon, as it is taking a while to check the 20000 lines.

Thank you so much again

#8 ·  Posted (edited)

Thank you, Bowmore, for the help; I appreciate it a lot.

Here's what I got after 977 seconds of comparing 13000 lines.

In Diff.txt :

File1: Line571:212121212121ýXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXY

File2: Line415:212121212121ýXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

File1: Line571:212121212121ýXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXY

File2: Line945:212121212121ýXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Question: why is the same line compared twice in the above example (line 571)?

Is it because the same string was found more than once?

The reason some lines are compared twice is that the first part of line 571 in file 1 matches the first part of both line 415 and line 945 in file 2, so line 571 in file 1 gets compared with each of those lines.

PS: If this is something you are going to have to do on a regular basis, the script can be made considerably faster by sorting the arrays and walking the index on the second array manually rather than looping through the entire array for each line. You could also add a little GUI for the user to select the input and output files and show progress.
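
As a hedged illustration of how the full scan of file 2 for every line of file 1 could be avoided: the sketch below does not implement the sort-and-walk idea described above. It swaps in a different technique, a Scripting.Dictionary (a Windows COM object) that maps each key in file 2 to the line numbers where that key occurs, so each line of file 1 needs one lookup. The file paths follow the earlier scripts in this thread; the helper name _GetKey and the other variable names are assumptions made for illustration, and error handling is minimal.

```autoit
#include <File.au3>

Global $aFile1 = 0, $aFile2 = 0
; Paths as used earlier in the thread; adjust as needed
If Not _FileReadToArray("C:\File1.txt", $aFile1) Then Exit 1
If Not _FileReadToArray("C:\File2.txt", $aFile2) Then Exit 1

; Index file 2: key -> "|"-separated list of line numbers carrying that key
Global $oIndex = ObjCreate("Scripting.Dictionary")
Global $sKey, $aHits, $iLine2
For $j = 1 To $aFile2[0]
    $sKey = _GetKey($aFile2[$j])
    If $oIndex.Exists($sKey) Then
        $oIndex.Item($sKey) = $oIndex.Item($sKey) & "|" & $j
    Else
        $oIndex.Add($sKey, String($j))
    EndIf
Next

Global $hDiff = FileOpen("C:\Diff.txt", 10) ; 10 = overwrite + create path
For $i = 1 To $aFile1[0]
    $sKey = _GetKey($aFile1[$i])
    If Not $oIndex.Exists($sKey) Then ContinueLoop ; key occurs only in file 1
    $aHits = StringSplit($oIndex.Item($sKey), "|")
    For $k = 1 To $aHits[0]
        $iLine2 = Number($aHits[$k])
        ; Same test as the nested-loop script: same key, different line content
        If $aFile1[$i] <> $aFile2[$iLine2] Then
            FileWriteLine($hDiff, "File1: Line" & $i & ":" & $aFile1[$i] & @CRLF & "File2: Line" & $iLine2 & ":" & $aFile2[$iLine2])
        EndIf
    Next
Next
FileClose($hDiff)

; Hypothetical helper: the part of the line before the first "ý"
Func _GetKey($sLine)
    Local $iPos = StringInStr($sLine, "ý", 1) ; 1 = case-sensitive; 0 if not found
    If $iPos = 0 Then Return $sLine ; assumption: no delimiter means the whole line is the key
    Return StringLeft($sLine, $iPos - 1)
EndFunc
```

A key that appears on several lines of file 2 is still compared once per occurrence, so the repeated entries for line 571 would remain; only the number of lines visited per lookup should shrink.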

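This is more of a question than a suggestion. Rather than reading partial strings, would it be faster or slower to use _ArraySearch() after sorting both arrays first? This way wouldn't identify partial matches. Something like this (untested):

; Fragment: relies on $aArray1, $aArray2 and $hDiff from the script above, plus #include <Array.au3>
_ArraySort($aArray1, 0, 1) ; start at index 1 so the line count in [0] stays in place
_ArraySort($aArray2, 0, 1)

; Note: after sorting, $i and $match are positions in the sorted arrays, not original line numbers
For $i = 1 To UBound($aArray1) - 1
    $match = _ArraySearch($aArray2, $aArray1[$i], 1)
    If @error Then ContinueLoop ; no identical line found in file 2
    FileWriteLine($hDiff, "File1: Line" & $i & ":" & $aArray1[$i] & " File2: Line " & $match)
Next

It could be that this is a horrendous way to do it; as I say, I really don't know, but I'd be interested to hear.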

Thank you Bowmore for your inputs

I will try it and keep you posted.

Thank you so much !

Is it normal that comparing the 2 text files, 13000 lines each, takes 14 or 15 minutes to complete?

Please advise

Thank you

I'm stuck on the file compare, which takes way too long.

Here's what i'm stuck on :

1- Compare the lines in fileA with the lines in fileB

2- If a line in fileA exists in fileB, do not display it in the log.

3- If a line in fileA exists in fileB but has differences, output "Difference :" with the line number and the line

4- If a line in fileA does not exist in fileB, output "Missing :" with the line number and the line content

5- If a line in fileB does not exist in fileA, output "Missing :" with the line number and the line content

Please advise ,

I'm losing my hair ;)

PCI, simple beginner
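
Steps 4 and 5 of the list above (lines missing from one file or the other) could be sketched as follows. This is only an illustration under stated assumptions: "exists" is taken to mean that the key before the first ý occurs in the other file, the file and log paths are invented, and the function names _IndexKeys, _ReportMissing and _GetKey are hypothetical. It uses a Scripting.Dictionary (a Windows COM object) as a set of keys.

```autoit
#include <File.au3>

Global $aA = 0, $aB = 0
; Assumed paths, for illustration only
If Not _FileReadToArray("C:\fileA.txt", $aA) Then Exit 1
If Not _FileReadToArray("C:\fileB.txt", $aB) Then Exit 1

Global $oKeysA = _IndexKeys($aA)
Global $oKeysB = _IndexKeys($aB)
Global $hLog = FileOpen("C:\Compare.log", 10) ; 10 = overwrite + create path

_ReportMissing($hLog, $aA, $oKeysB, "Missing in fileB")
_ReportMissing($hLog, $aB, $oKeysA, "Missing in fileA")
FileClose($hLog)

; Returns a dictionary whose keys are the key fields found in $aLines
Func _IndexKeys(ByRef $aLines)
    Local $oDict = ObjCreate("Scripting.Dictionary")
    For $i = 1 To $aLines[0]
        $oDict.Item(_GetKey($aLines[$i])) = $i ; value = last line number seen for this key
    Next
    Return $oDict
EndFunc

; Writes every line of $aLines whose key is absent from $oOtherKeys
Func _ReportMissing($hFile, ByRef $aLines, $oOtherKeys, $sLabel)
    For $i = 1 To $aLines[0]
        If Not $oOtherKeys.Exists(_GetKey($aLines[$i])) Then
            FileWriteLine($hFile, $sLabel & " : line " & $i & " : " & $aLines[$i])
        EndIf
    Next
EndFunc

; Hypothetical helper: the part of the line before the first "ý"
Func _GetKey($sLine)
    Local $iPos = StringInStr($sLine, "ý", 1) ; 1 = case-sensitive; 0 if not found
    If $iPos = 0 Then Return $sLine ; assumption: no delimiter means the whole line is the key
    Return StringLeft($sLine, $iPos - 1)
EndFunc
```

Step 3 (same key, different content) could reuse the same dictionaries: look up the key, then compare the two lines, as the nested-loop script earlier in the thread does, but without scanning the whole second file for each line.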

Why not use something like WinMerge if it's that urgent?


Unfortunately, WinMerge does not give me the flexibility to choose what to compare and which differences within a line I need to check.

For now I was hoping to learn to automate any process with AutoIt and, hopefully, to have solid coding skills in the future.

Thank you

Have you actually coded anything? The only code I see in this thread has been provided by people other than yourself.


Yes, I did code some routines, but they never worked for me ;)

Here's my code:

#include <GUIConstantsEx.au3>
#include <WindowsConstants.au3>
#include <EditConstants.au3>
#include <misc.au3>
#include <file.au3>
;~ #NoTrayIcon
;=================================
Global $aArray1 = 0
Global $aArray2 = 0
Global $hDiff = -1
;=================================
_Singleton(@ScriptName, 0)
Dim $iNumber = 0
$VTC = ""
  If $VTC = "" Then _VTC_GUI()
Func _VTC_GUI()
$VTC_Production_Input=0
$VTC_Integration_Input=0
GuiCreate(" File Compare - Testing",520,128,-1,-1,$WS_BORDER,$WS_EX_ACCEPTFILES)
$VTC_Production_Label=GUICtrlCreateLabel("  Files - PRODUCTION ", 15, 12)
$VTC_Production_Input=GUICtrlCreateInput("", 180, 10, 210, 20)
GUICtrlSetData ($VTC_Production_Input, $VTC)
GUICTRLSetState ( $VTC_Production_Input, $GUI_DROPACCEPTED)
$VTC_Production_Browse_Button=GUICtrlCreateButton("Browse",400,8)
$VTC_Integration_Label=GUICtrlCreateLabel( "  Files - INTEGRATION ", 15, 42)
$VTC_Integration_Input=GUICtrlCreateInput("", 180, 40, 210, 20)
$VTC_Integration_Browse_Button=GUICtrlCreateButton("Browse",400,38)
GUICTRLSetState ( $VTC_Integration_Input, $GUI_DROPACCEPTED)
$Comparefiles=GuiCtrlCreateButton("Compare  Files",110,69)
$ConfigurationExitWithoutSaving=GuiCtrlCreateButton("Exit Without Comparing",260,69)
GUISetState()
While 1
  $msg=GuiGetMsg()
  If $msg=$VTC_Production_Browse_Button Then
   $VTC_Production_Browse_ButtonInput = FileOpenDialog("Select Production  File","", "All (*.Rep;*.Txt)")
   GUICtrlSetData($VTC_Production_Input, $VTC_Production_Browse_ButtonInput, "0")
  EndIf
  If $msg=$VTC_Integration_Browse_Button Then
   $VTC_Integration_Browse_ButtonInput = FileOpenDialog ("Select Integration  File","", "All (*.Rep;*.Txt)")
   GUICtrlSetData($VTC_Integration_Input, $VTC_Integration_Browse_ButtonInput, "0")
  EndIf
  If $msg=$Comparefiles Then
   ProgressOn("Compare Files", "Comparing Files...", "0% Complete", Default, (@DesktopHeight / 2) - (@DesktopHeight / 6) , 10)
   ;==================================================================================================================================
   $Production_VTC_Read = GUICTRLRead($VTC_Production_Input)
   $Integration_VTC_Read = GUICTRLRead($VTC_Integration_Input)
   _FileReadToArray($Production_VTC_Read,$aArray1)
   _FileReadToArray($Integration_VTC_Read,$aArray2)
   $hDiff = FileOpen("C:\Temp\Differences.txt",10)
   For $i = 1 To UBound($aArray1) - 1
    $iNumber = Round($i / UBound($aArray1) * 100, 2)
    ProgressSet($iNumber, $iNumber & "% Complete")
    If Mod($i, 5) = 0 Then
     $msg = GUIGetMsg()
     If $msg=$ConfigurationExitWithoutSaving Then
      $ExitDialog = MsgBox(36, "Are You Sure?", "Are you sure you want to exit?")
      If $ExitDialog = 6 then Exit
     EndIf
    EndIf
    $sID1 = StringLeft($aArray1[$i],StringInStr($aArray1[$i],"ý",1,12)-1)
    For $j = 1 To UBound($aArray2) - 1
     $sID2 = StringLeft($aArray2[$j],StringInStr($aArray2[$j],"ý",1,12)-1)
     If $sID1 == $sID2 Then ;Check if the first part is the same
      ;then check if anything else on the lines is different
      If $aArray1[$i] <> $aArray2[$j] Then
       FileWriteLine($hDiff, " File Production  Line #"  & $i & ": " & $aArray1[$i] & @CRLF &  " File Integration Line #"  & $j & ": " & $aArray2[$j] & @CRLF & @CRLF)
      EndIf
     EndIf
     ;$msg=GuiGetMsg()
    Next
   Next
   ProgressOff()
   FileClose($hDiff)
   Run("C:\Program Files\IDM Computer Solutions\UltraEdit\uedit32.exe c:\temp\Differences.txt","","")
   ;==================================================================================================================================
  EndIf
      If $msg=$ConfigurationExitWithoutSaving Then
      $ExitDialog = MsgBox(36, "Are You Sure?", "Are you sure you want to exit?")
      If $ExitDialog = 6 then Exit
      EndIf
  Wend
EndFunc
Exit

Try this:

#include <SQLite.au3>
#include <SQLite.Dll.au3>
#include <Array.au3>
Main()

Func Main()
    ; init SQLite
    _SQLite_Startup()
    _SQLite_SafeMode(0)  ; speed up SQLite UDF

    ; create a :memory: DB
    Local $hDB = _SQLite_Open()
    _SQLite_Exec($hDB,  "CREATE TABLE Strings (StrKey CHAR, Source INTEGER, Line integer, StrRest CHAR);")
    Local $dir = @ScriptDir & "\"
    Local $file[2] = ["file1.txt", "file2.txt"]
    If @error Then Return
    ; process input files
    Local $txtstr, $strrestpos
    _SQLite_Exec($hDB, "begin;")
    For $i = 0 to 1
        ConsoleWrite("Processing file " & $dir & $file[$i] & @LF)
        _FileReadToArray($dir & $file[$i], $txtstr)
        ; process input lines
        If Not @error Then
            For $j = 1 To $txtstr[0]
                $strrestpos = StringInStr($txtstr[$j], 'ý', 2)
                _SQLite_Exec($hDB, "insert into Strings (Source, Line, StrKey, StrRest) values (" & _
                        $i & "," & _
                        $j & "," & _
                        _SQLite_FastEscape(StringLeft($txtstr[$j], $strrestpos - 1)) & "," & _
                        _SQLite_FastEscape(StringMid($txtstr[$j], $strrestpos)) & ");")
            Next
        EndIf
    Next
    _SQLite_Exec($hDB,  "CREATE index ixstrkey on Strings (StrKey collate nocase, Source, Line);")
    _SQLite_Exec($hDB, "commit;")
    ; create log file
    Local $nrows, $ncols, $hlog
    ConsoleWrite("Creating log file" & @LF)
    $hlog = FileOpen($dir & "compare.log", 2)
    ; log orphan lines in files
    For $i = 0 To 1
        $j = Int($i = 0)
        _SQLite_GetTable($hDB, "select 'Line ' || line || '" & @CRLF & "' || Strkey || strrest from Strings X where Source = " & $i & " and " & _
                "not exists (select 1 from Strings Y where Y.Strkey = X.Strkey and Y.Source != X.Source) order by line;", _
                $txtstr, $nrows, $ncols)
        If $nrows Then
            FileWriteLine($hlog, "Orphan lines in " & $file[$i] & " :")
            _FileWriteFromArray($hlog, $txtstr, 2)
            FileWriteLine($hlog, @CRLF)
        EndIf
    Next
    ; log differences
    _SQLite_GetTable($hDB, "select '" & $file[0] & "  line ' || X.line || '" & @CRLF & "' || X.Strkey || X.strrest || '" & @CRLF & _
            $file[1] & "  line ' || Y.line || '" & @CRLF & "' || Y.Strkey || Y.strrest || '" & @CRLF & "' " & _
            "from Strings X join Strings Y on X.Source = 0 and Y.Source = 1 and x.strkey = y.strkey and " & _
            "X.Strrest != Y.Strrest order by X.line;", _
            $txtstr, $nrows, $ncols)
    If $nrows Then
        FileWriteLine($hlog, "Differences :")
        _FileWriteFromArray($hlog, $txtstr, 2)
    EndIf
    FileClose($hlog)
EndFunc


#20 ·  Posted (edited)

Thank you so much for your feedback and help, jchd. I'm trying it, but I think there's something broken in the code; correct me if I'm wrong.

Between lines 26 and 33 I get an error: (27) : ==> Unknown function name.:

                $strrestpos = StringInStr($txtstr[$j], 'ý', 2)
                _SQLite_Exec($hDB, "insert into Strings (Source, Line, StrKey, StrRest) values (" & _
                        $i & "," & _
                        $j & "," & _
                        _SQLite_FastEscape(StringLeft($txtstr[$j], $strrestpos - 1)) & "," & _
                        _SQLite_FastEscape(StringMid($txtstr[$j], $strrestpos)) & ");")
            Next

Thank you

PCI
