Posted

Hello everyone,

I was able to make a script that compares two .txt files and notifies me of the differences between them.

 

I did this by using _FileReadToArray() on each txt file and then comparing both arrays for differences.

But in the Word.au3 UDF, I don't see a function similar to _FileReadToArray().

How would I go about creating a script that compares two Word documents and tells me the differences between them?

Thanks

Posted

Comparing Word documents is much more complex than comparing simple text files.

Can you describe what you are trying to do? What about differences in formatting, tables, etc.?


Posted (edited)

several resources at your disposal:

1. Word 2010+ built-in function to visually compare documents side-by-side (Review > Compare)

2. Word COM property or method to retrieve the content of documents (absent from the Word UDF it seems - perhaps water can comment on that?)

3. Word COM method to compare documents, with or without formatting: 

http://msdn.microsoft.com/en-us/library/ff192559(v=office.14).aspx

4. use 7zip to extract the content file from the Word file (DOCX format only!) and parse it for content

5. external utility to extract only text from Word documents (and other formats):

http://freemind.s57.xrea.com/xdocdiffPlugin/en/

6. WinMerge, a highly recommended full-featured diff tool. the utility in item 5 can also be used as a plugin for it.

perhaps other ways exist too. my favorite is WinMerge.

 

and of course, as water commented, you should describe your purpose in more detail. what changes interest you: text? formatting? tables? charts? authors? etc.
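
Option 2 in the list above (retrieving the document content through COM) could look roughly like the sketch below. It uses plain COM calls rather than the Word UDF. The function name `_WordDocToLines` is made up for illustration, and the sketch assumes Word is installed and that splitting on paragraph marks is close enough to "lines" for a text-only comparison. Tables, headers and footnotes are not handled specially.

```autoit
; Hypothetical helper: returns the paragraphs of a Word document as an array,
; with the count in element [0] (the same layout _FileReadToArray() produces by default).
Func _WordDocToLines($sPath)
    Local $oWord = ObjCreate("Word.Application")
    If Not IsObj($oWord) Then Return SetError(1, 0, 0)
    $oWord.Visible = False

    ; Arguments: FileName, ConfirmConversions, ReadOnly
    Local $oDoc = $oWord.Documents.Open($sPath, False, True)
    If Not IsObj($oDoc) Then
        $oWord.Quit()
        Return SetError(2, 0, 0)
    EndIf

    Local $sText = $oDoc.Content.Text ; plain text; Word ends each paragraph with @CR
    $oDoc.Close(0) ; 0 = wdDoNotSaveChanges
    $oWord.Quit()

    Return StringSplit($sText, @CR)
EndFunc
```

The returned array could then be fed to the same comparison loop that is used for arrays read from .txt files.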

Edited by orbs


Posted

Thanks for replying, both of you.

Simply comparing the text difference in each doc is good enough for me. The Word docs' formatting, tables, charts or authors can be ignored.

Is there a way to do it without downloading WinMerge?

Here's an example of my script that simply compares text in .txt documents.

#include <File.au3>
#include <Array.au3>
#include <FileConstants.au3> ;For $FD_FILEMUSTEXIST
#include <MsgBoxConstants.au3> ;For $MB_SYSTEMMODAL
#include <GUIConstantsEx.au3>
#include <GuiListView.au3> ;For making the clear listview function work
#include <GuiListBox.au3> ;For making a guictrlcreatelist
#include <EditConstants.au3> ;Allows positioning of gui input box text
#include <StaticConstants.au3> ;Allows formatting of gui labels
#include <WindowsConstants.au3>

Global $GUI = GuiCreate("Cute Fluffy Hamster Text Compare Tool", 890, 550, 0, 0) ;Creates the GUI

Dim $File1Array[10], $File2Array[10]

Global $FileInfo1 = GUICtrlCreateListView("Line # | File Document #1", 10, 50, 430, 385, -1) ;Listview showing the lines of file #1
_GUICtrlListView_SetColumnWidth($FileInfo1, 0, 50) ;Sets the width of the "Line #" column
_GUICtrlListView_SetColumnWidth($FileInfo1, 1, 380) ;Sets the width of the "File Document #1" column
_GUICtrlListView_JustifyColumn($FileInfo1, 0, 0) ;Left-aligns the "Line #" column (0 = left, 1 = right, 2 = center)

Global $FileInfo2 = GUICtrlCreateListView("Line # | File Document #2", 450, 50, 430, 385, -1) ;Listview showing the lines of file #2
_GUICtrlListView_SetColumnWidth($FileInfo2, 0, 50) ;Sets the width of the "Line #" column
_GUICtrlListView_SetColumnWidth($FileInfo2, 1, 380) ;Sets the width of the "File Document #2" column
_GUICtrlListView_JustifyColumn($FileInfo2, 0, 0) ;Left-aligns the "Line #" column (0 = left, 1 = right, 2 = center)

Global $Compare = GUICtrlCreateButton("Compare!", 780, 15, 100, 30)

Global $Input1 = GUICtrlCreateInput("", 120, 20, 265, 22, $ES_READONLY) ;Read-only box showing the path of file #1
Global $ChooseFile1 = GUICtrlCreateButton("Select File #1", 10, 15, 100, 28)

Global $Input2 = GUICtrlCreateInput("", 505, 20, 265, 22, $ES_READONLY) ;Read-only box showing the path of file #2
Global $ChooseFile2 = GUICtrlCreateButton("Select File #2", 395, 15, 100, 28)

Global $Exit = GUICtrlCreateButton("Exit", 10, 495, 870, 50)

Global $InfoConsole = GUICtrlCreateList("", 10, 440, 870, 60, -1, $SS_ETCHEDFRAME) ;The console info that describes all the changes



GUISetState(@SW_SHOW) ;Makes the GUI Appear


While 1
    Switch GUIGetMsg() ;GUIGetMsg() idles the CPU by itself, so no extra Sleep() is needed
        Case $GUI_EVENT_CLOSE, $Exit ;If the "x" button or the Exit button is clicked, leave the loop so the script ends
            ExitLoop

        Case $ChooseFile1
            ;Single selection only: _FileReadToArray() needs exactly one path
            Local $FileOpen1 = FileOpenDialog("Choose 1st File", @WindowsDir & "\", "Text (*.txt)|Documents (*.doc;*.docx)", $FD_FILEMUSTEXIST)
            If @error Then
                MsgBox($MB_SYSTEMMODAL, "", "No file was selected.")
                FileChangeDir(@ScriptDir)
            Else
                FileChangeDir(@ScriptDir)
                GUICtrlSetData($Input1, $FileOpen1) ;Change input box to display the opened file location
                _FileReadToArray($FileOpen1, $File1Array) ;Stores the contents of the file in an array (element 0 holds the line count)
            EndIf

        Case $ChooseFile2
            Local $FileOpen2 = FileOpenDialog("Choose 2nd File", @WindowsDir & "\", "Text (*.txt)|Documents (*.doc;*.docx)", $FD_FILEMUSTEXIST)
            If @error Then
                MsgBox($MB_SYSTEMMODAL, "", "No file was selected.")
                FileChangeDir(@ScriptDir)
            Else
                FileChangeDir(@ScriptDir)
                GUICtrlSetData($Input2, $FileOpen2) ;Change input box to display the opened file location
                _FileReadToArray($FileOpen2, $File2Array) ;Stores the contents of the file in an array (element 0 holds the line count)
            EndIf

        Case $Compare
            If GUICtrlRead($Input1) <> "" And GUICtrlRead($Input2) <> "" Then ;If there are files in both input boxes then do the comparing
                _GUICtrlListView_DeleteAllItems($FileInfo1) ;Delete all current things on the first listview window
                _GUICtrlListView_DeleteAllItems($FileInfo2) ;Delete all current things on the second listview window
                _GUICtrlListBox_ResetContent($InfoConsole) ;Delete all current things on the console info window

                For $i = 1 To UBound($File1Array) - 1
                    Local $aResult = _ArrayFindAll($File2Array, $File1Array[$i], 1) ;Search file 2 (from index 1, skipping the count) for each line of file 1
                    GUICtrlCreateListViewItem($i & "|" & $File1Array[$i], $FileInfo1) ;Populates the listview window with info

                    If UBound($aResult) < 1 And $File1Array[$i] <> "" Then ;The line from file 1 was not found anywhere in file 2
                        GUICtrlSetBkColor(-1, 0xFFFF00) ;Sets a diff background color indicating the difference
                        GUICtrlSetData($InfoConsole, "File #1 contains the word '" & $File1Array[$i] & "' (as seen on line " & $i & ") while File #2 does not contain that word.")
                        GUICtrlSetFont($InfoConsole, 10)
                        GUICtrlSetColor($InfoConsole, 0xFF0000)
                    EndIf
                Next

                For $j = 1 To UBound($File2Array) - 1
                    Local $bResult = _ArrayFindAll($File1Array, $File2Array[$j], 1) ;Search file 1 (from index 1, skipping the count) for each line of file 2
                    GUICtrlCreateListViewItem($j & "|" & $File2Array[$j], $FileInfo2) ;Populates the listview window with info

                    If UBound($bResult) < 1 And $File2Array[$j] <> "" Then ;The line from file 2 was not found anywhere in file 1
                        GUICtrlSetBkColor(-1, 0xFFFF00) ;Sets a diff background color indicating the difference
                        GUICtrlSetData($InfoConsole, "File #2 contains the word '" & $File2Array[$j] & "' (as seen on line " & $j & ") while File #1 does not contain that word.")
                        GUICtrlSetFont($InfoConsole, 10)
                        GUICtrlSetColor($InfoConsole, 0xFF0000)
                    EndIf
                Next
            Else ;If there aren't files in both input boxes then display message
                MsgBox(0, "File(s) Needed", "Please select 2 files")
            EndIf
    EndSwitch
WEnd

Even though I allowed .doc's to be an option when you select a file, don't select a doc file :P

And Yes I purposely made my exit button that big :D

Posted

Hi Exit,

That was going to be my last resort.

It's just that I prefer not to generate 2 extra files for the user, but if it has to be that way, then so be it

-Brian

Posted (edited)

i think your best choice would be this:

  On 6/29/2014 at 6:14 PM, orbs said:

5. external utility to extract only text from Word documents (and other formats):

http://freemind.s57.xrea.com/xdocdiffPlugin/en/

 

(this can be used as a plugin for WinMerge, but it is a standalone app as well, with COM support)

  On 6/30/2014 at 1:01 AM, BlazerV60 said:

It's just that I prefer not to generate 2 extra files for the user, but if it has to be that way, then so be it

 

why? temp files are used all around the place, by practically all apps you can think of.

- if it's security concerns, wipe the temp files when you're done with them.

- if it's space concerns, text files are never that large to be concerned about - especially compared to their Word origin.

and these files are not for the user - you can have your user select a doc file, and before it is processed, your script can convert it to text. this is transparent to the end user.
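
A minimal sketch of that transparent conversion, assuming Word is installed: the document is saved as plain text into the user's temp folder, read with _FileReadToArray(), and the temp file is deleted right away. The function name `_WordDocToArrayViaTemp` and the "~cmp" prefix are made up for illustration.

```autoit
#include <File.au3>

; Hypothetical helper: converts a Word document to a temporary text file,
; reads it into $aLines, then deletes the temp file again.
Func _WordDocToArrayViaTemp($sDocPath, ByRef $aLines)
    Local $sTmp = _TempFile(@TempDir, "~cmp", ".txt") ; unique file name in the temp folder
    Local $oWord = ObjCreate("Word.Application")
    If Not IsObj($oWord) Then Return SetError(1, 0, False)
    $oWord.Visible = False
    $oWord.DisplayAlerts = 0 ; 0 = wdAlertsNone

    ; Arguments: FileName, ConfirmConversions, ReadOnly
    Local $oDoc = $oWord.Documents.Open($sDocPath, False, True)
    If IsObj($oDoc) Then
        $oDoc.SaveAs($sTmp, 2) ; 2 = wdFormatText
        $oDoc.Close(0) ; 0 = wdDoNotSaveChanges
    EndIf
    $oWord.Quit()

    Local $iRead = _FileReadToArray($sTmp, $aLines) ; element 0 holds the line count
    FileDelete($sTmp) ; remove the temp file once it has been read
    Return $iRead = 1
EndFunc
```

The user only ever selects the .doc or .docx file. The text copy exists just long enough to be read.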

 

EDIT: your script does not detect a change in the order of lines. b.t.w. it seems it checks whole lines, not words. you had better rephrase the messages in the info console.

Edited by orbs


Posted (edited)

Hey orbs,

Thanks for the feedback, I'll go with the text file route since that sounds simpler and you're right about how several programs create temp files.

And yeah my script only detects changes in lines atm, not each word individually. I'm working on fine tuning it to specifically look for each individual word right now.
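
One possible way to go from line-level to word-level checks is sketched below, under the assumption that a "word" is any run of non-whitespace characters and that matching should be case-insensitive. The function name `_WordsMissing` is made up for illustration.

```autoit
#include <Array.au3>

; Hypothetical helper: returns the words of $sA that do not occur in $sB,
; separated by single spaces.
Func _WordsMissing($sA, $sB)
    Local $aA = StringRegExp($sA, "\S+", 3) ; 3 = return an array of all matches
    If Not IsArray($aA) Then Return ""      ; $sA contains no words
    Local $aB = StringRegExp($sB, "\S+", 3)

    Local $sMissing = ""
    For $i = 0 To UBound($aA) - 1
        ; _ArraySearch() returns -1 when the word is not found (or $aB is not an array)
        If _ArraySearch($aB, $aA[$i]) = -1 Then $sMissing &= $aA[$i] & " "
    Next
    Return StringStripWS($sMissing, 2) ; 2 = strip trailing whitespace
EndFunc

ConsoleWrite(_WordsMissing("the quick brown fox", "the brown fox") & @CRLF) ; prints "quick"
```

Like the line-based version, this does not notice words that merely changed position.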

Also, it seems that I can't see the scrollbar in my Info Console D:

Thanks again

Edited by BlazerV60
Posted (edited)

perhaps this is what you need?

#NoTrayIcon

$prog = "WordCMP"

if $CmdLine[0] < 2 then
    msgbox(0, $prog, "Use:  " & $prog & ' "<Full Path Name 1>" "<Full Path Name 2>"', 10 )
    exit
endif

$doc1 = $CmdLine[1]
$doc2 = $CmdLine[2]

if Not FileExists ( $doc1 ) then
    msgbox(0, $prog, "File " & $doc1 & " not found!")
    exit
endif

if Not FileExists ( $doc2 ) then
    msgbox(0, $prog, "File " & $doc2 & " not found!")
    exit
endif

RegWrite("HKEY_CURRENT_USER\Software\Classes\CLSID\{000209FE-0000-0000-C000-000000000046}\LocalServer32", "LocalServer32", "REG_MULTI_SZ", "']gAVn-}f(ZXfeAR6.jiWORDFiles>P`os,1@SW=P7v6GPl]Xh /safe /Automation")
RegWrite("HKEY_CURRENT_USER\Software\Classes\CLSID\{000209FF-0000-0000-C000-000000000046}\LocalServer32", "LocalServer32", "REG_MULTI_SZ", "']gAVn-}f(ZXfeAR6.jiWORDFiles>P`os,1@SW=P7v6GPl]Xh /safe /Automation")

_Msg("WordCMP running ...", 1)
$oWord = ObjCreate("Word.Application")
$oWord.Visible = 0

_Msg("Loading doc1 ...", 1)
$docA = $oWord.Documents.Open( $doc1)

_Msg("Loading doc2 ...", 1)
$docB = $oWord.Documents.Open( $doc2)

_Msg("Comparing doc1 and doc2 ...", 1)
$docC = $oWord.CompareDocuments($docA, $docB, 2, 1, 1, 1)

$docA.close
$docB.close

$oWord.Visible = 1
$oWord.DisplayAlerts = 0

_Msg($prog, 0)

RegDelete ("HKEY_CURRENT_USER\Software\Classes\CLSID\{000209FE-0000-0000-C000-000000000046}\LocalServer32")
RegDelete ("HKEY_CURRENT_USER\Software\Classes\CLSID\{000209FF-0000-0000-C000-000000000046}\LocalServer32")

Func _Msg($msg, $state)
    $Width = StringLen ($msg) * 8
    $Height = 40
    $left = @DesktopWidth - $Width - 10
    $top = @DesktopHeight - $Height - 40

    if $state = 1 then
        SplashTextOn ( "", $msg, $Width, $Height, $left, $top, 5, "Tahoma", 11)
    else
        SplashOff ( )
    EndIf
EndFunc
Edited by Melba23
Added code tags
  • Moderators
Posted

keldepulo,

When you post code please use Code tags - see here how to do it. Then you get a scrolling box and syntax colouring as you can see above now I have added the tags. ;)

M23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind


Posted

Hi keldepulo, I will definitely try your method when I get home (I'm currently at work). I assume all Microsoft Word versions have the .CompareDocuments method? :D
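
On the version question: as far as I know, Application.CompareDocuments only exists from Word 2007 (version 12) onward, so treat that threshold as an assumption to verify against the Word object model reference. A small sketch that checks the installed version before relying on the method:

```autoit
Local $oWord = ObjCreate("Word.Application")
If IsObj($oWord) Then
    ; Version is a string such as "14.0" (Word 2010); Number() turns it into 14
    Local $bHasCompare = (Number($oWord.Version) >= 12) ; 12 = Word 2007, the assumed minimum
    ConsoleWrite("Word version " & $oWord.Version & ", CompareDocuments expected: " & $bHasCompare & @CRLF)
    $oWord.Quit()
Else
    ConsoleWrite("Word could not be started." & @CRLF)
EndIf
```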

You also gave me the idea that if I do indeed decide to generate 2 extra files for the user, then I should place them somewhere in the HKEY_CURRENT_USER directory and then delete them once my program is done comparing. If I hadn't read your post, I would have placed the 2 extra files somewhere in the same directory as the person's Word docs or something D:

Posted
  On 6/30/2014 at 1:01 AM, BlazerV60 said:
It's just that I prefer not to generate 2 extra files for the user, but if it has to be that way, then so be it

 

Why? The 2 files could go in a temp directory just for reading; you only have to delete them after comparing.

  • 4 years later...
Posted

Five years after the fact, I'd like to ask a question about the code in this post (above): 

I'm a super newbie to AutoIt, and keldepulo's code is probably not the correct place to start 🙂, but throwing caution to the wind, I'd like to ask two questions.

  1. Why are the RegWrite calls necessary?
    1. I'm a little afraid to try this macro until I understand why it wants to modify my registry.
  2. Is it possible to add this to my Windows Explorer context menu? Ideally, I would want to be able to select two files, right-click, and choose Compare in Word.
    1. I can currently do this with Beyond Compare 3.

Thanks in advance if anyone can help me out.

  • 1 month later...
Posted

Hi SoCalKen,

1 - Why are the RegWrite calls necessary: they are not. They are only there to prevent Word macros from starting, so the automation runs faster.
2 - I haven't tried. I will try and post an answer.
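
One untested idea for question 2 is a "Send to" entry instead of a regular context-menu verb. A plain registry verb is typically started once per selected file, whereas a shortcut in the SendTo folder receives all selected files as command-line arguments, which matches the two-argument interface of the script above. The sketch assumes the script has been compiled to an executable. The path `C:\Tools\WordCMP.exe` is hypothetical, and the SendTo location shown is the one used by Windows Vista and later.

```autoit
; One-off setup script: adds "Compare in Word" to the Explorer "Send to" menu.
Local $sTarget = "C:\Tools\WordCMP.exe" ; hypothetical location of the compiled script
Local $sLink = @AppDataDir & "\Microsoft\Windows\SendTo\Compare in Word.lnk"

; FileCreateShortcut() returns 1 on success and 0 on failure
If Not FileCreateShortcut($sTarget, $sLink, "C:\Tools") Then
    MsgBox(0, "WordCMP", "Could not create " & $sLink)
EndIf
```

After that, selecting two documents and choosing Send to > Compare in Word should start the script with both paths. The order in which Explorer passes them is not guaranteed, so "original" and "revised" may end up swapped.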

 
