Jump to content
Servant

How can I get the headings from a MS Word document?

Recommended Posts

Servant

How can I get a list of all the headings in a Microsoft Word document by using AutoIt?

I tried:

#include <Word.au3>
#include <MsgBoxConstants.au3>

Global Const $wdRefTypeHeading = 1 ; Heading

$Headings = $oDoc.GetCrossReferenceItems($wdRefTypeHeading)
$Count = UBound($Headings)

MsgBox($MB_SYSTEMMODAL, "Debug", $Count)

But it did not function well..

For example, it just get 1 heading from my rich document that have many headings!

I also tried this:

#include <Word.au3>
#include <MsgBoxConstants.au3>

$Count = $oDoc.Paragraphs.Count

For $i = 0 To $Count - 1
      $oRange = _Word_DocRangeSet($oDoc, -1, $wdParagraph, $i, $wdParagraph, 1)

      If StringInStr($oRange.text, "Header Text") Then
         MsgBox($MB_SYSTEMMODAL, "Debug", $oRange.Style)
      EndIf
Next

And this:

#include <Word.au3>
#include <MsgBoxConstants.au3>

$Count = $oDoc.Paragraphs.Count

For $i = 0 To $Count - 1
      $oRange = _Word_DocRangeSet($oDoc, -1, $wdSentence, $i, $wdSentence, 1)

      If StringInStr($oRange.text, "Header Text") Then
         MsgBox($MB_SYSTEMMODAL, "Debug", $oRange.Style)
      EndIf
Next 

But the Range.Style property didn't work in AutoIt..

Could someone help me how to get a list of all the headings in a Word document?

Share this post


Link to post
Share on other sites
water

#include <Word.au3> 
#include <MsgBoxConstants.au3>
Global Const $wdRefTypeHeading = 1 ; Heading 
$Headings = $oDoc.GetCrossReferenceItems($wdRefTypeHeading)
$Count = UBound($Headings)
MsgBox($MB_SYSTEMMODAL, "Debug", $Count)

Where do you set $oDoc in this code?


My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2018-06-01 - Version 1.4.9.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (2018-01-27 - Version 1.3.3.1) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2015-04-01 - Version 0.4.0.0) - Download - General Help & Support - Example Scripts
Excel - Example Scripts - Wiki
Word - Wiki
PowerPoint (2015-06-06 - Version 0.0.5.0) - Download - General Help & Support

Tutorials:
ADO - Wiki

 

Share this post


Link to post
Share on other sites
Servant

In the based index of a Word document:

$oWord = _Word_Create()
$oDoc = _Word_DocGet($oWord, 1)

Actually, it's working properly on a very simple Word document..

Maybe the Document.GetCrossReferenceItems method considered the styles of the headings from my rich and big document, as the styles of their parent styles, such as numbered items, etc... Because the headings on my document are also have another styles.

Share this post


Link to post
Share on other sites
water

I will test as soon as I'm in my office again.

  • Like 1

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2018-06-01 - Version 1.4.9.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (2018-01-27 - Version 1.3.3.1) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2015-04-01 - Version 0.4.0.0) - Download - General Help & Support - Example Scripts
Excel - Example Scripts - Wiki
Word - Wiki
PowerPoint (2015-06-06 - Version 0.0.5.0) - Download - General Help & Support

Tutorials:
ADO - Wiki

 

Share this post


Link to post
Share on other sites
water

The document you posted has defined new heading styles (they are named _Headingx - note the leading "_"). That's why the GetCrossReferenceItems method doesn't list this "headings".

  • Like 1

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2018-06-01 - Version 1.4.9.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (2018-01-27 - Version 1.3.3.1) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2015-04-01 - Version 0.4.0.0) - Download - General Help & Support - Example Scripts
Excel - Example Scripts - Wiki
Word - Wiki
PowerPoint (2015-06-06 - Version 0.0.5.0) - Download - General Help & Support

Tutorials:
ADO - Wiki

 

Share this post


Link to post
Share on other sites
water

This script searches for non-standard styles. Unfortunately you can't use wildcards to search for styles. You need to specify each style individually.

#include <Word.au3>
$oWord = _Word_Create()
Global $sDocument = @ScriptDir & "\Beginning_sample.docx"
$oDoc = _Word_DocOpen($oWord, $sDocument)
; Find first "_Heading 1"
$oRangeFound = _Word_DocFind($oDoc, Default, Default, Default, Default, Default, Default, Default, Default, Default, "_Heading 1")
ConsoleWrite($oRangeFound.Text & @LF)
; Find all "_Heading 1" till end of document
While 1
    $oRangeRound = _Word_DocFindEX($oDoc, Default, Default, $oRangeFound, Default, Default, Default, Default, Default, Default, "_Heading 1")
    If @error <> 0 Then ExitLoop
    ConsoleWrite($oRangeFound.Text & @LF)
WEnd
_Word_DocClose($oDoc)
_Word_Quit($oWord)

Func _Word_DocFindEX($oDoc, $sFindText = Default, $vSearchRange = Default, $oFindRange = Default, $bForward = Default, $bMatchCase = Default, $bMatchWholeWord = Default, $bMatchWildcards = Default, $bMatchSoundsLike = Default, $bMatchAllWordForms = Default, $vFormat = Default)
    Global $bFormat = False
    If $sFindText = Default Then $sFindText = ""
    If $vSearchRange = Default Then $vSearchRange = 0
    If $bForward = Default Then $bForward = True
    If $bMatchCase = Default Then $bMatchCase = False
    If $bMatchWholeWord = Default Then $bMatchWholeWord = False
    If $bMatchWildcards = Default Then $bMatchWildcards = False
    If $bMatchSoundsLike = Default Then $bMatchSoundsLike = False
    If $bMatchAllWordForms = Default Then $bMatchAllWordForms = False
    If Not IsObj($oDoc) Then Return SetError(1, 0, 0)
    Switch $vSearchRange
        Case -1
            $vSearchRange = $oDoc.Application.Selection.Range
        Case 0
            $vSearchRange = $oDoc.Range()
        Case Else
            If Not IsObj($vSearchRange) Then Return SetError(2, 0, 0)
    EndSwitch
    If $oFindRange = Default Then
        $oFindRange = $vSearchRange.Duplicate()
    Else
        If Not IsObj($oFindRange) Then Return SetError(3, 0, 0)
        If $bForward = True Then
            $oFindRange.Start = $oFindRange.End ; Search forward
            $oFindRange.End = $vSearchRange.End
        Else
            $oFindRange.End = $oFindRange.Start ; Search backward
            $oFindRange.Start = $vSearchRange.Start
        EndIf
    EndIf
    $oFindRange.Find.ClearFormatting()
    If $vFormat <> Default Then
        $bFormat = True
        $oFindRange.Find.Style = $vFormat
    EndIf
    $oFindRange.Find.Execute($sFindText, $bMatchCase, $bMatchWholeWord, $bMatchWildcards, $bMatchSoundsLike, _
            $bMatchAllWordForms, $bForward, $WdFindStop, $bFormat)
    If @error Or Not $oFindRange.Find.Found Then Return SetError(4, 0, 0)
    Return $oFindRange
EndFunc   ;==>_Word_DocFind
  • Like 1

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2018-06-01 - Version 1.4.9.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (2018-01-27 - Version 1.3.3.1) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2015-04-01 - Version 0.4.0.0) - Download - General Help & Support - Example Scripts
Excel - Example Scripts - Wiki
Word - Wiki
PowerPoint (2015-06-06 - Version 0.0.5.0) - Download - General Help & Support

Tutorials:
ADO - Wiki

 

Share this post


Link to post
Share on other sites
Servant

How do you know the names of their heading styles?

And upon testing this code with my posted document:

$oRangeFound = _Word_DocFind($oDoc, Default, Default, Default, Default, Default, Default, Default, Default, Default, "_Heading 1")
ConsoleWrite($oRangeFound.Text & @LF)

It produce an error:

==> Variable must be of type "Object".:
ConsoleWrite($oRangeFound.Text & @LF)
ConsoleWrite($oRangeFound^ ERROR

Share this post


Link to post
Share on other sites
water

I opened the document and checked the used style of the heading.(it is being displayed in the ribbon).
 
My bad. You need to replace
"_Word_DocFind" with "_Word_DocFindEx".

  • Like 1

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2018-06-01 - Version 1.4.9.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (2018-01-27 - Version 1.3.3.1) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2015-04-01 - Version 0.4.0.0) - Download - General Help & Support - Example Scripts
Excel - Example Scripts - Wiki
Word - Wiki
PowerPoint (2015-06-06 - Version 0.0.5.0) - Download - General Help & Support

Tutorials:
ADO - Wiki

 

Share this post


Link to post
Share on other sites
rossati

Hello

 

Thank Water, I used your script It works enough well, but it loops on a paragraph, my be becaus after there is a table.

Best regards

wordContextIndex.au3

cIndex.doc

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Similar Content

    • lavascript
      By lavascript
      I have a Word document containing a 9-column table where row 1 is the column headers. My goal is to read the table into a 2d array, remove some rows, update some fields, and add a few rows to the end. The resulting array will likely be a different length. Next, I want to write the data back into the table. If it's easier, I can write the data to a new document from a template containing the same table header with a blank 2nd row.
      Here's my early attempt:
      Local $oWord = _Word_Create() Local $oDoc = _Word_DocOpen($oWord, $sFile) Local $aData = _Word_DocTableRead($oDoc, 1) $aData[3][5] = "Something else" Local $oRange = _Word_DocRangeSet($oDoc, 0) $oRange = _Word_DocRangeSet($oDoc, $oRange, $wdCell, 9) _Word_DocTableWrite($oRange,$aData) This, unfortunately, writes the entire array into the first cell of row 2. What am I doing wrong?
       
    • Subz
      By Subz
      Backstory:
      Our Microsoft Office Templates shared folder was changed from a DFS share to an Isilon share. example:
      Old Server: \\Domain.com\Office\Templates
      New Server: \\Templates.domain.com\Office\Templates
      The team making the changes overlooked that several hundred thousand documents, had been attached to the old template documents.  So when you open a document which has been attached, it will take a couple of minutes to open, while it tries to locate the old server path.  I've been asked to come in and fix it, so after several hours found that the data is being held in document.zip\word\_rels\settings.xml.rels, I now need to replace the old server path with the new server path.  I didn't want to use dom as that would take too long and found a tool wtc https://github.com/NeosIT/wtc which  works perfectly, takes about 8 minutes to scan a single directory with 4000 documents and fix them.  The problem is the documents are all held on sharepoint and they want to retain the file timestamp, which is easy enough, but they also don't want to keep the "Modified By" apparently they don't like seeing all the documents appearing as "Modified by: Subz"  Anyone know of way to retain the "Modified By" info,
    • FrancescoDiMuro
      By FrancescoDiMuro
      Good evening everyone
      I am working with Word UDF ( thanks @water! ), and, especially, with the function _Word_DocFindReplace().
      The replace does work everywhere in the document, but, it does not work in Headers or Footers.
      Am I missing something or am I forced to use the code below?
      I have already looked in the Help file ( about _Word_DocFindReplace() ), but there are no mentions about replace text in Headers/Footers.
      Sub FindAndReplaceFirstStoryOfEachType() Dim rngStory As Range For Each rngStory In ActiveDocument.StoryRanges With rngStory.Find .Text = "find text" .Replacement.Text = "I'm found .Wrap = wdFindContinue .Execute Replace:=wdReplaceAll End With Next rngStory End Sub Thanks everyone in advance


      Best Regards.
    • Seminko
      By Seminko
      I'm trying to get data from http://poe.trade/ - disclaimer, although this site is about a game, my script will not in any way interact directly with the game in any way. The script is just to get data from the site.
      To explain how it works - you submit a POST request and a custom URL is returned, then you do a GET request on that URL and you get the final URL you want.
       
      First issue:
      Now, I've tried doing so by using https://apitester.com/ and the first phase works. Here's how it looks like at APITester:
      Request Headers POST /search HTTP/1.1 Host: poe.trade Accept: */* User-Agent: Rigor API Tester Content-Length: 43 Content-Type: application/x-www-form-urlencoded Request Body online=x&name=kaom%27s%20heart&league=incursion When I submit this, the response I get is this:
      <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> <title>Redirecting...</title> <h1>Redirecting...</h1> <p>You should be redirected automatically to target URL: <a href="http://poe.trade/search/ioritewoteteme">http://poe.trade/search/ioritewoteteme</a>. If not click the link. So I then do a GET request for 'http://poe.trade/search/ioritewoteteme', which results in this response:
      <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> <title>Redirecting...</title> <h1>Redirecting...</h1> <p>You should be redirected automatically to target URL: <a href="http://poe.trade/search/inamotezuakito">http://poe.trade/search/inamotezuakito</a>. If not click the link. Great, this link (http://poe.trade/search/inamotezuakito) is exactly what we want.
      However, when I try to do the same in autoit, the result is quite different:
      Global Const $HTTP_STATUS_OK = 200 $test = HttpPost("http://poe.trade/search", "/online=x&name=kaom%27s%20heart&league=incursion") ClipPut($test) MsgBox(1, "", $test) Func HttpPost($sURL, $sData = "") Local $oHTTP = ObjCreate("WinHttp.WinHttpRequest.5.1") $oHTTP.Open("POST", $sURL, False) If (@error) Then Return SetError(1, 0, 0) $oHTTP.SetRequestHeader("Host", "poe.trade") $oHTTP.SetRequestHeader("User-Agent", "Rigor API Tester") $oHTTP.SetRequestHeader("Accept", "*/*") $oHTTP.SetRequestHeader("Content-Type", "application/x-www-form-urlencoded") $oHTTP.Send($sData) If (@error) Then Return SetError(2, 0, 0) If ($oHTTP.Status <> $HTTP_STATUS_OK) Then Return SetError(3, 0, 0) Return SetError(0, 0, $oHTTP.ResponseText) EndFunc The code above returns: ' 謟 '
      Any ideas as to what I am doing incorrectly?
       
      Second issue:
      Once I get the final link using APITester and do a GET on that i get a bunch of hieroglyphs. A friend of mine advised that the data is GZiped, which is a pain in the butt to be honest. However, apparently curl can uncompres that.
      How would I go about it?
       
      Thanks
    • Benandro
      By Benandro
      Hello,
      im working on a Script that should change a high amount of Word Templates at once.
      Target is to open each Templatefile (.dotx) in a specific folder and do the following steps:
      Add a page break at the end of the document (works) Add a text on the created Page (works) Change the headerstyle to blank for the new page and the following (missing) Add a heading between two specific headings (missing) Can please someone help me to add the 2 functions to the script?
       
      #include <word.au3> #include <File.au3> #include <array.au3> ; wdGoToDirection Const $wdGoToNext = 2 ; wdGoToItem Const $wdGoToPage = 1 ; Created a logfile for tracking/error reporting on my local desktop, though anywhere would work. Needs to be changed or it will error. Global $LogFile = FileOpen("c:\logfiles\test.log", 1) ; This is the network path, change it or this will error as it is. ListFiles ("D:\Templates\") Global $loopend=$aFileList[0] ; Creates an instance of Word for the program to use. Logs any errors associated. Global $oWord = _Word_Create(False, False) If @error <> 0 Then Exit _FileWriteLog($LogFile, "Error creating a new Word application object. @error = " & @error & ", @extended = " & @extended & @crlf) If @extended = 1 Then _FileWriteLog ($LogFile, "MS Word was not running when _Word_Create was called." & @CRLF) Else _FileWriteLog ($LogFile, "MS Word was already running when _Word_Create was called." & @CRLF) EndIf ; Logs and begins loop _FileWriteLog ($LogFile, "Beginning Loop." & @CRLF) For $looper = 1 to $loopend Step +1 _FileWriteLog ($LogFile, "Modifying file: " & $aFileList[$looper], " ") OpenAndModify ("D:\Templates\" & $aFileList[$looper]) Next ; Closes instance of Word _Word_Quit ($oWord) _FileWriteLog ($LogFile, "Program Completed.") ; Begins Function section ; Two functions, Listfiles and OpenAndModify Func ListFiles($FolderPath) ; Function puts all files in the network folder into an array. Logs any errors. _FileWriteLog ($LogFile, "Getting File Information for: " & $FolderPath & @crlf) Global $aFileList = _FileListToArray($FolderPath, "*") If @error = 1 Then _FileWriteLog($LogFile, "Path was invalid." & @crlf) EndIf If @error = 4 Then _FileWriteLog ($LogFile, "No file(s) were found." & @crlf) EndIf EndFunc Func OpenAndModify ($sDocument) ; Function opens file and changes the Page Setup ; Opens the Document Local $oDoc = _Word_DocOpen ($oWord, $sDocument, Default, Default, Default) If @error <> 0 Then _FileWriteLog ($LogFile, "Error opening " & $sDocument & " @error = " & @error & ", @extended = " & @extended & @crlf) & Exit ; Changes Tray Settings ;$oDoc.PageSetup.FirstPageTray = 0 ;$oDoc.PageSetup.OtherPagesTray = 0 ; Add a link to the end of the document and set parameters ; ScreenTip and TextToDisplay Local $oRange = _Word_DocRangeSet($oDoc, -2); Go to end of document $oRange.InsertBreak($wdPageBreak) ;MsgBox($MB_SYSTEMMODAL, "Word UDF: _Word_DocRangeSet Example", "Inserted a break.") $oRange.Text = "«Text»" ; Add a space at the end of the document $oRange = _Word_DocRangeSet($oDoc, -2) If @error Then Exit MsgBox($MB_SYSTEMMODAL, "Word UDF: _Word_DocLinkAdd Example", _ "Error adding a link to the document." & @CRLF & "@error = " & @error & ", @extended = " & @extended) MsgBox($MB_SYSTEMMODAL, "Word UDF: _Word_DocLinkAdd Example", "Baustein wurde an das Ende des Dokuments eingefügt.") ; Saves the document _Word_DocSave($oDoc) _FileWriteLog ($LogFile, "Modification of" & $sDocument & " complete." & @CRLF) EndFunc  
×