Jump to content
Servant

How can I get the headings from a MS Word document?

Recommended Posts

Servant

How can I get a list of all the headings in a Microsoft Word document by using AutoIt?

I tried:

#include <Word.au3>
#include <MsgBoxConstants.au3>

Global Const $wdRefTypeHeading = 1 ; Heading

$Headings = $oDoc.GetCrossReferenceItems($wdRefTypeHeading)
$Count = UBound($Headings)

MsgBox($MB_SYSTEMMODAL, "Debug", $Count)

But it did not function well..

For example, it just get 1 heading from my rich document that have many headings!

I also tried this:

#include <Word.au3>
#include <MsgBoxConstants.au3>

$Count = $oDoc.Paragraphs.Count

For $i = 0 To $Count - 1
      $oRange = _Word_DocRangeSet($oDoc, -1, $wdParagraph, $i, $wdParagraph, 1)

      If StringInStr($oRange.text, "Header Text") Then
         MsgBox($MB_SYSTEMMODAL, "Debug", $oRange.Style)
      EndIf
Next

And this:

#include <Word.au3>
#include <MsgBoxConstants.au3>

$Count = $oDoc.Paragraphs.Count

For $i = 0 To $Count - 1
      $oRange = _Word_DocRangeSet($oDoc, -1, $wdSentence, $i, $wdSentence, 1)

      If StringInStr($oRange.text, "Header Text") Then
         MsgBox($MB_SYSTEMMODAL, "Debug", $oRange.Style)
      EndIf
Next 

But the Range.Style property didn't work in AutoIt..

Could someone help me how to get a list of all the headings in a Word document?

Share this post


Link to post
Share on other sites
water

#include <Word.au3> 
#include <MsgBoxConstants.au3>
Global Const $wdRefTypeHeading = 1 ; Heading 
$Headings = $oDoc.GetCrossReferenceItems($wdRefTypeHeading)
$Count = UBound($Headings)
MsgBox($MB_SYSTEMMODAL, "Debug", $Count)

Where do you set $oDoc in this code?


My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2018-12-03 - Version 1.4.11.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (2018-10-31 - Version 1.3.4.1) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
PowerPoint (2017-06-06 - Version 0.0.5.0) - Download - General Help & Support
Excel - Example Scripts - Wiki
Word - Wiki
 
Tutorials:

ADO - Wiki

 

Share this post


Link to post
Share on other sites
Servant

In the based index of a Word document:

$oWord = _Word_Create()
$oDoc = _Word_DocGet($oWord, 1)

Actually, it's working properly on a very simple Word document..

Maybe the Document.GetCrossReferenceItems method considered the styles of the headings from my rich and big document, as the styles of their parent styles, such as numbered items, etc... Because the headings on my document are also have another styles.

Share this post


Link to post
Share on other sites
water

I will test as soon as I'm in my office again.

  • Like 1

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2018-12-03 - Version 1.4.11.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (2018-10-31 - Version 1.3.4.1) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
PowerPoint (2017-06-06 - Version 0.0.5.0) - Download - General Help & Support
Excel - Example Scripts - Wiki
Word - Wiki
 
Tutorials:

ADO - Wiki

 

Share this post


Link to post
Share on other sites
water

The document you posted has defined new heading styles (they are named _Headingx - note the leading "_"). That's why the GetCrossReferenceItems method doesn't list this "headings".

  • Like 1

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2018-12-03 - Version 1.4.11.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (2018-10-31 - Version 1.3.4.1) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
PowerPoint (2017-06-06 - Version 0.0.5.0) - Download - General Help & Support
Excel - Example Scripts - Wiki
Word - Wiki
 
Tutorials:

ADO - Wiki

 

Share this post


Link to post
Share on other sites
water

This script searches for non-standard styles. Unfortunately you can't use wildcards to search for styles. You need to specify each style individually.

#include <Word.au3>
$oWord = _Word_Create()
Global $sDocument = @ScriptDir & "\Beginning_sample.docx"
$oDoc = _Word_DocOpen($oWord, $sDocument)
; Find first "_Heading 1"
$oRangeFound = _Word_DocFind($oDoc, Default, Default, Default, Default, Default, Default, Default, Default, Default, "_Heading 1")
ConsoleWrite($oRangeFound.Text & @LF)
; Find all "_Heading 1" till end of document
While 1
    $oRangeRound = _Word_DocFindEX($oDoc, Default, Default, $oRangeFound, Default, Default, Default, Default, Default, Default, "_Heading 1")
    If @error <> 0 Then ExitLoop
    ConsoleWrite($oRangeFound.Text & @LF)
WEnd
_Word_DocClose($oDoc)
_Word_Quit($oWord)

Func _Word_DocFindEX($oDoc, $sFindText = Default, $vSearchRange = Default, $oFindRange = Default, $bForward = Default, $bMatchCase = Default, $bMatchWholeWord = Default, $bMatchWildcards = Default, $bMatchSoundsLike = Default, $bMatchAllWordForms = Default, $vFormat = Default)
    Global $bFormat = False
    If $sFindText = Default Then $sFindText = ""
    If $vSearchRange = Default Then $vSearchRange = 0
    If $bForward = Default Then $bForward = True
    If $bMatchCase = Default Then $bMatchCase = False
    If $bMatchWholeWord = Default Then $bMatchWholeWord = False
    If $bMatchWildcards = Default Then $bMatchWildcards = False
    If $bMatchSoundsLike = Default Then $bMatchSoundsLike = False
    If $bMatchAllWordForms = Default Then $bMatchAllWordForms = False
    If Not IsObj($oDoc) Then Return SetError(1, 0, 0)
    Switch $vSearchRange
        Case -1
            $vSearchRange = $oDoc.Application.Selection.Range
        Case 0
            $vSearchRange = $oDoc.Range()
        Case Else
            If Not IsObj($vSearchRange) Then Return SetError(2, 0, 0)
    EndSwitch
    If $oFindRange = Default Then
        $oFindRange = $vSearchRange.Duplicate()
    Else
        If Not IsObj($oFindRange) Then Return SetError(3, 0, 0)
        If $bForward = True Then
            $oFindRange.Start = $oFindRange.End ; Search forward
            $oFindRange.End = $vSearchRange.End
        Else
            $oFindRange.End = $oFindRange.Start ; Search backward
            $oFindRange.Start = $vSearchRange.Start
        EndIf
    EndIf
    $oFindRange.Find.ClearFormatting()
    If $vFormat <> Default Then
        $bFormat = True
        $oFindRange.Find.Style = $vFormat
    EndIf
    $oFindRange.Find.Execute($sFindText, $bMatchCase, $bMatchWholeWord, $bMatchWildcards, $bMatchSoundsLike, _
            $bMatchAllWordForms, $bForward, $WdFindStop, $bFormat)
    If @error Or Not $oFindRange.Find.Found Then Return SetError(4, 0, 0)
    Return $oFindRange
EndFunc   ;==>_Word_DocFind
  • Like 1

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2018-12-03 - Version 1.4.11.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (2018-10-31 - Version 1.3.4.1) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
PowerPoint (2017-06-06 - Version 0.0.5.0) - Download - General Help & Support
Excel - Example Scripts - Wiki
Word - Wiki
 
Tutorials:

ADO - Wiki

 

Share this post


Link to post
Share on other sites
Servant

How do you know the names of their heading styles?

And upon testing this code with my posted document:

$oRangeFound = _Word_DocFind($oDoc, Default, Default, Default, Default, Default, Default, Default, Default, Default, "_Heading 1")
ConsoleWrite($oRangeFound.Text & @LF)

It produce an error:

==> Variable must be of type "Object".:
ConsoleWrite($oRangeFound.Text & @LF)
ConsoleWrite($oRangeFound^ ERROR

Share this post


Link to post
Share on other sites
water

I opened the document and checked the used style of the heading.(it is being displayed in the ribbon).
 
My bad. You need to replace
"_Word_DocFind" with "_Word_DocFindEx".

  • Like 1

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2018-12-03 - Version 1.4.11.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (2018-10-31 - Version 1.3.4.1) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
PowerPoint (2017-06-06 - Version 0.0.5.0) - Download - General Help & Support
Excel - Example Scripts - Wiki
Word - Wiki
 
Tutorials:

ADO - Wiki

 

Share this post


Link to post
Share on other sites
rossati

Hello

 

Thank Water, I used your script It works enough well, but it loops on a paragraph, my be becaus after there is a table.

Best regards

wordContextIndex.au3

cIndex.doc

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Similar Content

    • rudi
      By rudi
      Hello,
      Propably not an absolute clean approach, (not checking/caring about little / big endian), but it's doing, what I need: Return the last modified time stamp including the milliseconds:
       
      #include <Date.au3> $file = "c:\temp\test.txt" ; file must already exist $TSLastModMs = GetFileLastModWithMs($file) ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $TSLastModMs = ' & $TSLastModMs & @CRLF & '>Error code: ' & @error & @CRLF) ;### Debug Console Func GetFileLastModWithMs($_FullFilePathName) local $h = _WinAPI_CreateFile($_FullFilePathName, 2, 2) local $aTS = _Date_Time_GetFileTime($h) _WinAPI_CloseHandle($h) local $aDate = _Date_Time_FileTimeToArray($aTS[2]) ; [2] = LastModified Return StringFormat("%04d-%02d-%02d %02d:%02d:%02d.%03d", $aDate[2], $aDate[0], $aDate[1], $aDate[3], $aDate[4], $aDate[5], $aDate[6]) EndFunc ;==>GetFileLastModWithMs >Running AU3Check (3.3.14.5) from:C:\Program Files (x86)\AutoIt3 input:C:\temp\filetime.au3 +>12:10:00 AU3Check ended.rc:0 >Running:(3.3.14.5):C:\Program Files (x86)\AutoIt3\autoit3.exe "C:\temp\filetime.au3" --> Press Ctrl+Alt+Break to Restart or Ctrl+Break to Stop @@ Debug(8) : $TSLastModMs = 2018-09-07 10:09:54.073 >Error code: 0 +>12:10:00 AutoIt3.exe ended.rc:0 +>12:10:00 AutoIt3Wrapper Finished. >Exit code: 0 Time: 0.9068 Regards, Rudi.
       
      --- original posting - what was my problem, below ---
       
      doing some search I found postings, stating that 2s will be the smallest time resolution for filegettime(). 2s seem to be fact for FAT as FS, NTFS has a much finer granularity.
       
      This posting states, that NTFS has a granularity of 100ns:
      https://superuser.com/questions/937380/get-creation-time-of-file-in-milliseconds
      is it possible to get more than just the "second" information? The reason, why I need this is, that I need to sort files by their creation sequence, and it can happen, that two files are created within the same second, so I cannot resolve their creation order without "millsecond info".
       
      Regards, Rudi.

       
      edit: I just tried PowerShell, there it's possible to retrieve even more than millisecond information:
       
      Millisecond : 336
      Ticks       : 636719150403363219
      TimeOfDay   : 11:04:00.3363219
       
      PS C:\Users\Rudi> echo test > test.txt PS C:\Users\Rudi> $(Get-ChildItem .\test.txt).creationtime | format-list Date : 07.09.2018 00:00:00 Day : 7 DayOfWeek : Friday DayOfYear : 250 Hour : 11 Kind : Local Millisecond : 336 Minute : 4 Month : 9 Second : 0 Ticks : 636719150403363219 TimeOfDay : 11:04:00.3363219 Year : 2018 DateTime : Freitag, 7. September 2018 11:04:00 regards, Rudi.
    • colombeen
      By colombeen
      Hi guys,
      I'm trying to get some information using WMI, from the Win32_EncryptableVolume class.
      I exec my query, filter out the C-drive, but when I need more info using the objects methods, I only get 1 value back and I can't seem to retrieve the other out params that should be there.
      A very minimal version of what I'm trying to do (no error checking etc, very basic). You need to start SciTE as admin or you won't see any results in the console!
      #RequireAdmin $strComputer = @ComputerName $objWMIService = ObjGet("winmgmts:{impersonationLevel=impersonate}!\\" & $strComputer & "\root\CIMV2\Security\MicrosoftVolumeEncryption") $objWMIQuery = $objWMIService.ExecQuery("SELECT * FROM Win32_EncryptableVolume WHERE DriveLetter='C:'", "WQL", 0) For $objDrive In $objWMIQuery ConsoleWrite("> " & $objDrive.GetConversionStatus() & @CRLF) ConsoleWrite("> " & $objDrive.GetConversionStatus().ConversionStatus & @CRLF) ConsoleWrite("> " & $objDrive.GetConversionStatus().EncryptionPercentage & @CRLF) Next The result from the console is : 
      > 0 > > What I'm expecting to get back is : 
      > 0 > 0 > 0 When using powershell I get this (run as admin is required!!!) : 
      PS C:\WINDOWS\system32> (Get-WmiObject -namespace "Root\cimv2\security\MicrosoftVolumeEncryption" -ClassName "Win32_Encryptablevolume" -Filter "DriveLetter='C:'").GetConversionStatus() ... ConversionStatus : 0 EncryptionFlags : 0 EncryptionPercentage : 0 ReturnValue : 0 ... All I seem to be getting is the ReturnValue when I use the method.
      I've tried this on multiple methods, always ending up with the same result
      Anyone here who has experience with this type of thing?
       
      Greetz
      colombeen
    • lavascript
      By lavascript
      I have a Word document containing a 9-column table where row 1 is the column headers. My goal is to read the table into a 2d array, remove some rows, update some fields, and add a few rows to the end. The resulting array will likely be a different length. Next, I want to write the data back into the table. If it's easier, I can write the data to a new document from a template containing the same table header with a blank 2nd row.
      Here's my early attempt:
      Local $oWord = _Word_Create() Local $oDoc = _Word_DocOpen($oWord, $sFile) Local $aData = _Word_DocTableRead($oDoc, 1) $aData[3][5] = "Something else" Local $oRange = _Word_DocRangeSet($oDoc, 0) $oRange = _Word_DocRangeSet($oDoc, $oRange, $wdCell, 9) _Word_DocTableWrite($oRange,$aData) This, unfortunately, writes the entire array into the first cell of row 2. What am I doing wrong?
       
    • Subz
      By Subz
      Backstory:
      Our Microsoft Office Templates shared folder was changed from a DFS share to an Isilon share. example:
      Old Server: \\Domain.com\Office\Templates
      New Server: \\Templates.domain.com\Office\Templates
      The team making the changes overlooked that several hundred thousand documents, had been attached to the old template documents.  So when you open a document which has been attached, it will take a couple of minutes to open, while it tries to locate the old server path.  I've been asked to come in and fix it, so after several hours found that the data is being held in document.zip\word\_rels\settings.xml.rels, I now need to replace the old server path with the new server path.  I didn't want to use dom as that would take too long and found a tool wtc https://github.com/NeosIT/wtc which  works perfectly, takes about 8 minutes to scan a single directory with 4000 documents and fix them.  The problem is the documents are all held on sharepoint and they want to retain the file timestamp, which is easy enough, but they also don't want to keep the "Modified By" apparently they don't like seeing all the documents appearing as "Modified by: Subz"  Anyone know of way to retain the "Modified By" info,
    • FrancescoDiMuro
      By FrancescoDiMuro
      Good evening everyone
      I am working with Word UDF ( thanks @water! ), and, especially, with the function _Word_DocFindReplace().
      The replace does work everywhere in the document, but, it does not work in Headers or Footers.
      Am I missing something or am I forced to use the code below?
      I have already looked in the Help file ( about _Word_DocFindReplace() ), but there are no mentions about replace text in Headers/Footers.
      Sub FindAndReplaceFirstStoryOfEachType() Dim rngStory As Range For Each rngStory In ActiveDocument.StoryRanges With rngStory.Find .Text = "find text" .Replacement.Text = "I'm found .Wrap = wdFindContinue .Execute Replace:=wdReplaceAll End With Next rngStory End Sub Thanks everyone in advance


      Best Regards.
×