Servant

How can I get the headings from a MS Word document?

How can I get a list of all the headings in a Microsoft Word document by using AutoIt?

I tried:

#include <Word.au3>
#include <MsgBoxConstants.au3>

Global Const $wdRefTypeHeading = 1 ; Heading

$Headings = $oDoc.GetCrossReferenceItems($wdRefTypeHeading)
$Count = UBound($Headings)

MsgBox($MB_SYSTEMMODAL, "Debug", $Count)

But it did not work correctly.

For example, it returns only 1 heading from my large document, which has many headings.

I also tried this:

#include <Word.au3>
#include <MsgBoxConstants.au3>

$Count = $oDoc.Paragraphs.Count

For $i = 0 To $Count - 1
      $oRange = _Word_DocRangeSet($oDoc, -1, $wdParagraph, $i, $wdParagraph, 1)

      If StringInStr($oRange.text, "Header Text") Then
         MsgBox($MB_SYSTEMMODAL, "Debug", $oRange.Style)
      EndIf
Next

And this:

#include <Word.au3>
#include <MsgBoxConstants.au3>

$Count = $oDoc.Paragraphs.Count

For $i = 0 To $Count - 1
      $oRange = _Word_DocRangeSet($oDoc, -1, $wdSentence, $i, $wdSentence, 1)

      If StringInStr($oRange.text, "Header Text") Then
         MsgBox($MB_SYSTEMMODAL, "Debug", $oRange.Style)
      EndIf
Next 

But the Range.Style property didn't work in AutoIt.

Could someone help me get a list of all the headings in a Word document?

#include <Word.au3> 
#include <MsgBoxConstants.au3>
Global Const $wdRefTypeHeading = 1 ; Heading 
$Headings = $oDoc.GetCrossReferenceItems($wdRefTypeHeading)
$Count = UBound($Headings)
MsgBox($MB_SYSTEMMODAL, "Debug", $Count)

Where do you set $oDoc in this code?


My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2017-04-18 - Version 1.4.8.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (NEW 2017-02-27 - Version 1.3.1.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2015-04-01 - Version 0.4.0.0) - Download - General Help & Support - Example Scripts
Excel - Example Scripts - Wiki
Word - Wiki
PowerPoint (2015-06-06 - Version 0.0.5.0) - Download - General Help & Support

Tutorials:
ADO - Wiki

 


I get it by its index among the open Word documents:

$oWord = _Word_Create()
$oDoc = _Word_DocGet($oWord, 1)

Actually, it works properly on a very simple Word document.

Maybe the Document.GetCrossReferenceItems method treats the headings in my large, richly formatted document as having their parent styles (numbered items, etc.), because the headings in my document also have other styles applied.


I will test as soon as I'm in my office again.


The document you posted defines new heading styles (they are named _Headingx - note the leading "_"). That's why the GetCrossReferenceItems method doesn't list these "headings".
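One way to list headings regardless of how the styles are named is to walk the paragraphs and look at each paragraph's style name. This is only a rough sketch, not tested against the posted document. It assumes the document is already open as document 1 in a running Word instance (as in the snippet above) and that every heading style has "Heading" somewhere in its name, which will not hold for a localized Word or for other custom names. Note that Paragraph.Style returns a Style object, so the name has to be read from its NameLocal property.

```autoit
#include <Word.au3>

; Assumption: Word is running and the document of interest is document 1.
Local $oWord = _Word_Create()
Local $oDoc = _Word_DocGet($oWord, 1)
If @error Then Exit ConsoleWrite("No document found, @error = " & @error & @LF)

For $oPara In $oDoc.Paragraphs
    ; Paragraph.Style is a Style object; NameLocal holds its display name
    Local $sStyle = $oPara.Style.NameLocal
    ; Substring match (assumption): catches "Heading 1" as well as "_Heading 1"
    If StringInStr($sStyle, "Heading") Then
        ; Strip the trailing paragraph mark before printing
        ConsoleWrite($sStyle & ": " & StringStripWS($oPara.Range.Text, 3) & @LF)
    EndIf
Next
```

Walking every paragraph can be slow on a big document, but it does not depend on knowing the exact style names in advance.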


This script searches for non-standard styles. Unfortunately you can't use wildcards to search for styles. You need to specify each style individually.

#include <Word.au3>
$oWord = _Word_Create()
Global $sDocument = @ScriptDir & "\Beginning_sample.docx"
$oDoc = _Word_DocOpen($oWord, $sDocument)
; Find first "_Heading 1"
$oRangeFound = _Word_DocFind($oDoc, Default, Default, Default, Default, Default, Default, Default, Default, Default, "_Heading 1")
ConsoleWrite($oRangeFound.Text & @LF)
; Find all "_Heading 1" till end of document
While 1
    $oRangeFound = _Word_DocFindEX($oDoc, Default, Default, $oRangeFound, Default, Default, Default, Default, Default, Default, "_Heading 1")
    If @error <> 0 Then ExitLoop
    ConsoleWrite($oRangeFound.Text & @LF)
WEnd
_Word_DocClose($oDoc)
_Word_Quit($oWord)

Func _Word_DocFindEX($oDoc, $sFindText = Default, $vSearchRange = Default, $oFindRange = Default, $bForward = Default, $bMatchCase = Default, $bMatchWholeWord = Default, $bMatchWildcards = Default, $bMatchSoundsLike = Default, $bMatchAllWordForms = Default, $vFormat = Default)
    Local $bFormat = False ; True only when a style is passed in $vFormat
    If $sFindText = Default Then $sFindText = ""
    If $vSearchRange = Default Then $vSearchRange = 0
    If $bForward = Default Then $bForward = True
    If $bMatchCase = Default Then $bMatchCase = False
    If $bMatchWholeWord = Default Then $bMatchWholeWord = False
    If $bMatchWildcards = Default Then $bMatchWildcards = False
    If $bMatchSoundsLike = Default Then $bMatchSoundsLike = False
    If $bMatchAllWordForms = Default Then $bMatchAllWordForms = False
    If Not IsObj($oDoc) Then Return SetError(1, 0, 0)
    Switch $vSearchRange
        Case -1
            $vSearchRange = $oDoc.Application.Selection.Range
        Case 0
            $vSearchRange = $oDoc.Range()
        Case Else
            If Not IsObj($vSearchRange) Then Return SetError(2, 0, 0)
    EndSwitch
    If $oFindRange = Default Then
        $oFindRange = $vSearchRange.Duplicate()
    Else
        If Not IsObj($oFindRange) Then Return SetError(3, 0, 0)
        If $bForward = True Then
            $oFindRange.Start = $oFindRange.End ; Search forward
            $oFindRange.End = $vSearchRange.End
        Else
            $oFindRange.End = $oFindRange.Start ; Search backward
            $oFindRange.Start = $vSearchRange.Start
        EndIf
    EndIf
    $oFindRange.Find.ClearFormatting()
    If $vFormat <> Default Then
        $bFormat = True
        $oFindRange.Find.Style = $vFormat
    EndIf
    $oFindRange.Find.Execute($sFindText, $bMatchCase, $bMatchWholeWord, $bMatchWildcards, $bMatchSoundsLike, _
            $bMatchAllWordForms, $bForward, $WdFindStop, $bFormat)
    If @error Or Not $oFindRange.Find.Found Then Return SetError(4, 0, 0)
    Return $oFindRange
EndFunc   ;==>_Word_DocFindEX

How did you find out the names of the heading styles?

And upon testing this code with my posted document:

$oRangeFound = _Word_DocFind($oDoc, Default, Default, Default, Default, Default, Default, Default, Default, Default, "_Heading 1")
ConsoleWrite($oRangeFound.Text & @LF)

It produces an error:

==> Variable must be of type "Object".:
ConsoleWrite($oRangeFound.Text & @LF)
ConsoleWrite($oRangeFound^ ERROR


I opened the document and checked which style the heading uses (it is displayed in the ribbon).
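If checking the ribbon by hand is not practical, the style names that are actually applied can probably be collected by script as well. The following is a hedged sketch under the same assumption that the document is open as document 1 in a running Word instance. It uses a Scripting.Dictionary purely as a set of names already seen, which is my choice for illustration and not part of the Word UDF.

```autoit
#include <Word.au3>

; Assumption: Word is running and the document of interest is document 1.
Local $oWord = _Word_Create()
Local $oDoc = _Word_DocGet($oWord, 1)
If @error Then Exit ConsoleWrite("No document found, @error = " & @error & @LF)

; Dictionary used as a set of the style names seen so far
Local $oSeen = ObjCreate("Scripting.Dictionary")
For $oPara In $oDoc.Paragraphs
    Local $sStyle = $oPara.Style.NameLocal
    If Not $oSeen.Exists($sStyle) Then $oSeen.Add($sStyle, 1)
Next

; Print each distinct paragraph style name once
For $sName In $oSeen.Keys()
    ConsoleWrite($sName & @LF)
Next
```

Any name in the output that looks like a heading style (for example one with a leading "_") can then be passed to the find function as the style to search for.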
 
My bad. You need to replace "_Word_DocFind" with "_Word_DocFindEx".


Hello

 

Thanks, Water. I used your script and it works quite well, but it loops on one paragraph, maybe because a table follows it.

Best regards

wordContextIndex.au3

cIndex.doc
