Grabbing Text from Word


I've got a script that opens Word. I need the script to find the word Author: in the document, then get all the text to the right of that word but not include Author: itself. So if this was in the document...

Author: johndoe

I'd need to grab johndoe and assign it to a variable.

Any ideas would be appreciated.


You may want to check out my Word Automation Library (check signature for link). When I get home tonight I will see what I can whip up for you.

I converted some VBA code that works in Word, but it's not returning the right value for me...

$myword = ObjGet("","word.application")
If Not IsObj($myword) Then
    $myword = ObjCreate("word.application")
EndIf
    $file = FileOpenDialog("Choose word document",@MyDocumentsDir,"*.doc (Word Documents)","","*.doc")
    $mydoc = $myword.Documents.Open($file)
    MsgBox(0,"Author",$myword.activedocument.builtindocumentproperties(3))

Hi,

Here's one way:

;DocContent.au3
#include <Word.au3>
$s_FilePath1 = @ScriptDir & "\try mydoc.doc"
$oWordApp = _WordCreate ("",0,0)
$oDoc = _WordDocOpen ($oWordApp, $s_FilePath1)
$oContent = _WordGetText($oDoc)
_WordQuit ($oWordApp, 0)
If $oContent Then
    $oContentLine = StringSplit($oContent, @CRLF)
    For $i = 1 To UBound($oContentLine) - 1
        If StringInStr($oContentLine[$i], "Author:") Then
            $ar_var = StringSplit($oContentLine[$i], "Author:", 1)
            $var = $ar_var[2]
            MsgBox(0, "", "$var=" & $var)
        EndIf
    Next

EndIf

Func _WordGetText(ByRef $o_object)
    If Not IsObj($o_object) Then
        __WordErrorNotify ("Error", "_WordGetText", "$_WordStatus_InvalidDataType")
        SetError($_WordStatus_InvalidDataType, 1)
        Return 0
    EndIf
    ;
    If Not __WordIsObjType ($o_object, "document") Then
        __WordErrorNotify ("Error", "_WordGetText", "$_WordStatus_InvalidObjectType")
        SetError($_WordStatus_InvalidObjectType, 1)
        Return 0
    EndIf
    ;
    Local $return
    
    $return = $o_object.content.Text
    If Not $return Then
        __WordErrorNotify ("Warning", "_WordGetText", "$_WordStatus_NoMatch")
        SetError($_WordStatus_NoMatch)
        Return 0
    Else
        SetError($_WordStatus_Success)
        Return $return
    EndIf
EndFunc   ;==>_WordGetText
Best, Randall
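The string handling in the script above can be tried on its own, without Word running. The following is a minimal sketch, assuming the document text uses @CR or @LF as line breaks; `_TextAfterLabel` and the sample text are hypothetical and are not part of Word.au3.

```autoit
; Hypothetical helper (not part of Word.au3): returns the text that follows
; $sLabel on the first line containing it, or "" with @error = 1 if not found.
Func _TextAfterLabel($sText, $sLabel)
    ; With the default flag, StringSplit treats each character of the
    ; delimiter string as a separator, so this splits on @CR and on @LF.
    Local $aLines = StringSplit($sText, @CRLF)
    For $i = 1 To $aLines[0]
        Local $iPos = StringInStr($aLines[$i], $sLabel)
        If $iPos Then
            ; Keep everything after the label, then trim leading and
            ; trailing whitespace (flag 3 = 1 + 2).
            Return StringStripWS(StringMid($aLines[$i], $iPos + StringLen($sLabel)), 3)
        EndIf
    Next
    SetError(1)
    Return ""
EndFunc   ;==>_TextAfterLabel

; Made-up sample text standing in for the document content
Local $sSample = "My Title" & @CR & "Author: johndoe" & @CR & "Body text"
Local $sAuthor = _TextAfterLabel($sSample, "Author:")
ConsoleWrite("Author=" & $sAuthor & @CRLF)
```

Trimming with StringStripWS removes the space that would otherwise stay in front of the name when the line reads "Author: johndoe".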

Hi, or with these ones, which use ObjGet, so it shouldn't matter whether the file is already open or not.

[big_daddy, could you use something like these 3 funcs in your "Word.au3"?]

1._WordDocObjGet

2._WordQuit2

3._WordGetText

;DocContent.au3
#include <Word.au3>
$s_FilePath1=@ScriptDir & "\try mydoc.doc"
;$oWordApp = _WordCreate ("",0,0)
;$oDoc = _WordDocOpen ($oWordApp, $s_FilePath1)
$oDoc = _WordDocObjGet($s_FilePath1, 0, 0)
$oContent = _WordGetText($oDoc)
_WordQuit2 ($oDoc, 0)
If $oContent Then
    $oContentLine = StringSplit($oContent, @CRLF)
    For $i = 1 To UBound($oContentLine) - 1
        If StringInStr($oContentLine[$i], "Author:") Then
            $ar_var = StringSplit($oContentLine[$i], "Author:", 1)
            $var = $ar_var[2]
            MsgBox(0, "", "$var=" & $var)
        EndIf
    Next
EndIf
Func _WordDocObjGet($s_FilePath = "blank.doc", $b_tryAttach = 0, $b_visible = 1, $b_takeFocus = 1)
;===============================================================================
;
; Function Name:    _WordDocObjGet()
; Description:      Get a Word.Document object for a file using ObjGet
; Parameter(s):     $s_FilePath     - Optional: specifies the file to get
;                   $b_tryAttach    - Optional: specifies whether to try to attach to an existing window
;                                       0 = (Default) do not try to attach
;                                       1 = Try to attach to an existing window
;                   $b_visible      - Optional: specifies whether the window will be visible
;                                       0 = Window is hidden
;                                       1 = (Default) Window is visible
;                   $b_takeFocus    - Optional: specifies whether to bring the attached window to focus
;                                       0 =  Do Not Bring window into focus
;                                       1 = (Default) bring window into focus
; Requirement(s):   AutoIt3 Beta with COM support (post 3.1.1)
; Return Value(s):  On Success  - Returns an object variable pointing to a Word.Document object
;                   On Failure  - Returns 0 and sets @ERROR
;                   @ERROR      - 0 ($_WordStatus_Success) = No Error
;                               - 1 ($_WordStatus_GeneralError) = General Error
;                               - 3 ($_WordStatus_InvalidDataType) = Invalid Data Type
;                               - 4 ($_WordStatus_InvalidObjectType) = Invalid Object Type
;                   @Extended   - Set to true (1) or false (0) depending on the success of $b_tryAttach
; Author(s):        randallc, Bob Anthony (Code based off IE.au3)
;
;===============================================================================
;
    Local $o_Result, $o_object, $o_win, $result, $b_mustUnlock = 0
    
    If Not $b_visible Then $b_takeFocus = 0 ; Force takeFocus to 0 for hidden window
    If $s_FilePath = "blank" or $s_FilePath = -1 Then $b_tryAttach = 0 ; There is currently no way of attaching to a blank document
    
    If $b_tryAttach Then
        Local $o_Result = _WordAttach($s_FilePath)
        If IsObj($o_Result) Then
            If $b_takeFocus Then
                $o_win = $o_Result.ActiveWindow
                $h_hwnd = __WordGetHWND($o_win)
                If IsHWnd($h_hwnd) Then WinActivate($h_hwnd)
            EndIf
            SetError($_WordStatus_Success)
            SetExtended(1)
            Return $o_Result
        EndIf
    EndIf
    
    If Not $b_visible Then
        $result = __WordLockSetForegroundWindow($WORD_LSFW_LOCK)
        If $result Then $b_mustUnlock = 1
    EndIf
    
;~  Local $o_object = ObjCreate("Word.Application")
    $o_object = ObjGet($s_FilePath)
    If Not IsObj($o_object) Then
        __WordErrorNotify("Error", "_WordDocObjGet", "", "Word Object GET Failed")
        SetError($_WordStatus_GeneralError)
        Return 0
    EndIf
    
    $o_object.activate
    $o_object.application.visible = $b_visible
;~  $o_object.visible = $b_visible
    
    If $b_mustUnlock Then
        $result = __WordLockSetForegroundWindow($WORD_LSFW_UNLOCK)
        If Not $result Then __WordErrorNotify("Warning", "_WordDocObjGet", "", "Foreground Window Unlock Failed!")
        ; If the unlock doesn't work we will have created an unwanted modal window
    EndIf
    
;~  If $s_FilePath = "blank" Then
;~      _WordDocAdd($o_object, 0)
;~  ElseIf $s_FilePath <> "" Then
;~      _WordDocOpen($o_object, $s_FilePath)
;~  EndIf
    SetError(@error)
    Return $o_object
EndFunc   ;==>_WordDocObjGet
Func _WordQuit2(ByRef $o_object, $i_SaveChanges = -2, $i_OriginalFormat = 1, $b_RouteDocument = 0)
;===============================================================================
;
; Function Name:    _WordQuit2()
; Description:      Close the window and remove the object reference to it
; Parameter(s):     $o_object           - Object variable of a Word.Application OR Word.document
;                   $i_SaveChanges      - Optional: specifies the save action for the document
;                                            0 = Do not save changes
;                                           -1 = Save changes
;                                           -2 = (Default) Prompt to save changes
;                   $i_OriginalFormat   - Optional: specifies the save format for the document
;                                           0 = Word Document
;                                           1 = (Default) Original Document Format
;                                           2 = Prompt User
;                   $b_RouteDocument    - Optional: specifies whether to route the document to the next recipient
;                                           0 = (Default) do not route
;                                           1 = route to next recipient
; Requirement(s):   AutoIt3 Beta with COM support (post 3.1.1)
; Return Value(s):  On Success  - Returns 1
;                   On Failure  - Returns 0 and sets @ERROR
;                   @ERROR      - 0 ($_WordStatus_Success) = No Error
;                               - 1 ($_WordStatus_GeneralError) = General Error
;                               - 3 ($_WordStatus_InvalidDataType) = Invalid Data Type
;                               - 4 ($_WordStatus_InvalidObjectType) = Invalid Object Type
;                   @Extended   - Contains invalid parameter number
; Author(s):        Bob Anthony (Code based off IE.au3)
;
;===============================================================================
    If Not IsObj($o_object) Then
        __WordErrorNotify("Error", "_WordQuit2", "$_WordStatus_InvalidDataType")
        SetError($_WordStatus_InvalidDataType, 1)
        Return 0
    EndIf
    ;
    If Not __WordIsObjType($o_object, "application") Then
        $o_object=$o_object.application
        If Not __WordIsObjType($o_object, "application") Then
            __WordErrorNotify("Error", "_WordQuit2", "$_WordStatus_InvalidObjectType")
            SetError($_WordStatus_InvalidObjectType, 1)
            Return 0
        EndIf
    EndIf
    
    $o_object.Quit ($i_SaveChanges, $i_OriginalFormat, $b_RouteDocument)
    $o_object = 0
    SetError($_WordStatus_Success)
    Return 1
EndFunc   ;==>_WordQuit2
Func _WordGetText(ByRef $o_object)
    If Not IsObj($o_object) Then
        __WordErrorNotify ("Error", "_WordGetText", "$_WordStatus_InvalidDataType")
        SetError($_WordStatus_InvalidDataType, 1)
        Return 0
    EndIf
    ;
    If Not __WordIsObjType ($o_object, "document") Then
        __WordErrorNotify ("Error", "_WordGetText", "$_WordStatus_InvalidObjectType")
        SetError($_WordStatus_InvalidObjectType, 1)
        Return 0
    EndIf
    ;
    Local $return
    
    $return = $o_object.content.Text
    If Not $return Then
        __WordErrorNotify ("Warning", "_WordGetText", "$_WordStatus_NoMatch")
        SetError($_WordStatus_NoMatch)
        Return 0
    Else
        SetError($_WordStatus_Success)
        Return $return
    EndIf
EndFunc   ;==>_WordGetText
Best, Randall

Hi,

@big_daddy: Or, probably better for opening docs, can you show an example of how to easily open a document if it

1. may or may not already be open?

2. may or may not already exist?

(Will it be possible in 1 line with all those parameters you have already, or does it require a multiline "if..then create file, close doc", etc.?)

Best, randall


@randallc - I hope I understand you correctly, but let me know if I've misunderstood.

The second example for _WordCreate() does this.

; *******************************************************
; Example 2 - Attempt to attach to an existing word window with the specified document open.
;               Create a new word window and open that document if one does not already exist.
; *******************************************************
;
#include <Word.au3>
$oWordApp = _WordCreate (@ScriptDir & "\Test.doc", 1)
; Check @extended return value to see if attach was successful
If @extended Then
    MsgBox(0, "", "Attached to Existing Window")
Else
    MsgBox(0, "", "Created New Window")
EndIf

If the file specified does not already exist, it will be created. However, if this does happen, it will be reported to the console.


Hi,

@Big_Daddy

Sorry, I missed that...

I am trying to make some demos of funcs which accept either a file path name or an object, so that they are more compatible with the rest of AutoIt and the COM functions are hidden rather than in our face. Can you tell me whether you think there is any validity in this?

Many Thanks, Randall

e.g.

_WordQuit2($s_FilePath1

_WordGetText($s_FilePath1

_WordCreate2($s_FilePath

_WordDocObjGet($s_FilePath1

;DocContent.au3
#include <Word2.au3>
$s_FilePath = @ScriptDir & "\try mydoc.doc"
$oContent = _WordGetText ($s_FilePath, 1, 0, 1, 1)
;SYNTAX _WordGetText($s_FilePath1[or objectdoc], $b_tryAttach = 0, $b_visible = 1, $b_takeFocus = 1, $i_Quit = 1)
If $oContent Then
    $oContentLine = StringSplit($oContent, @CRLF)
    For $i = 1 To UBound($oContentLine) - 1
        If StringInStr($oContentLine[$i], "Author:") Then
            $ar_var = StringSplit($oContentLine[$i], "Author:", 1)
            $var = $ar_var[2]
            ExitLoop
        EndIf
    Next
EndIf
MsgBox(0, "", "$var=" & $var)

Hi.

1. Check you have "Word.au3" in your current directory.

2. Check you have "Word2.au3" in your current directory.

3. Start a new script with the code above (attached below).

4. Change the doc file name in this last script so that it reads from the file with "Author:" in it.

If there are still problems, please post the printout from your console.

Best, randall

[PS: There may be attach problems from Word.au3; I have a workaround if you need it (I can post Word3.au3!). It should only be a problem if the Word doc is already open.]


OK... it is still not working. If I use a hard-coded path (as you have in the example) it works for the most part. When I use a variable as the path I get an error thrown back from Word.au3... see the attached gif for the error message I get when running the script below.

First here is a description of what I need it to do.....

First I need the program to open the first document in a folder (there are hundreds of docs in this folder)

I need it to grab the first line of text in the document and store it in a variable. (This is always the title of the document)

I then need the script to parse down the document until it finds the word Author:. It needs to grab any text to the right of that and store it in another variable.

I then need the script to open the Word Properties of the document and insert the first variable in the Title section and then insert the second variable in the Author section. Close that properties window and save and close the document file.

I then need the script to loop back and find the next document in that same directory and do the whole process over again until it has gone through all the documents in that folder, at which time it bails out of the script and displays a done window.

Here is just some basic script that I've tried, just to get it not to display the attached error. I can't get past this error.

;DocContent.au3
#include <Word.au3>

; Shows the filenames of all files in the current directory.
FileChangeDir("C:\documents")
$search = FileFindFirstFile("*.doc")

; Check if the search was successful
If $search = -1 Then
    MsgBox(0, "Error", "No files/directories matched the search pattern")
    Exit
EndIf

While 1
    $file = FileFindNextFile($search)
    If @error Then ExitLoop

    $s_FilePath = $file
    $oWordApp = _WordCreate ("",0,0)
    $oDoc = _WordDocOpen ($oWordApp, $s_FilePath)
    $oContent = _WordGetText($oDoc)
    _WordQuit ($oWordApp, 0)
    If $oContent Then
        $oContentLine = StringSplit($oContent, @CRLF)
        For $i = 1 To UBound($oContentLine) - 1
            If StringInStr($oContentLine[$i], "Author:") Then
                $ar_var = StringSplit($oContentLine[$i], "Author:", 1)
                $var = $ar_var[2]
                MsgBox(0, "", "$var=" & $var)
            EndIf
        Next
    EndIf
    MsgBox(4096, "File:", $file)
WEnd

; Close the search handle
FileClose($search)


Now that you have thoroughly described what you need, I think I can help you.

Any help you can provide would be great. I have been able to get pieces and parts of this to work when they're separate, but I'm having trouble getting it all to work together in the flow I have described without errors.


The code below does exactly as you described. However, it will only work once on the same document! I have no idea why this is, but that's just how it is. I beat my head on the desk for like ten minutes trying to figure it out. :)

#include <Word.au3>

; Shows the filenames of all files in the current directory.
FileChangeDir("C:\test\")
$search = FileFindFirstFile("*.doc")

; Check if the search was successful
If $search = -1 Then
    MsgBox(0, "Error", "No files/directories matched the search pattern")
    Exit
EndIf

$oWordApp = _WordCreate ("", 0, 0)

While 1
    $bFoundTitle = False
    $bFoundAuthor = False
    
    $file = FileFindNextFile($search)
    If @error Then ExitLoop
    
    ConsoleWrite(@WorkingDir & "\" & $file & @CR)
    
    $oDoc = _WordDocOpen ($oWordApp, @WorkingDir & "\" & $file)
    $oContent = $oDoc.Range.Text
    If $oContent <> "" Then
        $aContent = StringSplit($oContent, @CRLF)
        If Not @error Then
            For $i = 1 To UBound($aContent) - 1
                If Not $bFoundTitle And $aContent[$i] <> "" Then
                    $sTitle = $aContent[$i]
                    ConsoleWrite(@TAB & $sTitle & @CR)
                    $bFoundTitle = True
                EndIf
                If StringInStr($aContent[$i], "Author:") Then
                    $sAuthor = StringReplace($aContent[$i], "Author: ", "")
                    ConsoleWrite(@TAB & $sAuthor & @CR)
                    $bFoundAuthor = True
                    ExitLoop
                EndIf
            Next
        EndIf
    EndIf
    
    If $bFoundTitle And $bFoundAuthor Then
        $oDoc.BuiltInDocumentProperties (1) = $sTitle
        $oDoc.BuiltInDocumentProperties (3) = $sAuthor
        ConsoleWrite(@TAB & "Properties Set" & @CR)
    EndIf
    _WordDocClose ($oDoc, -1, 1)
WEnd

; Close the search handle
FileClose($search)
_WordQuit ($oWordApp)

BTW - You should really check out SciTE, see my signature for the link.

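One detail worth checking in a script like the one above: StringReplace only removes the label, so any text that precedes "Author:" on the same line stays in the result, whereas splitting on the whole label (StringSplit with flag 1) keeps only what follows it. A small sketch of the difference, using a made-up sample line:

```autoit
; Made-up sample line with text in front of the label
Local $sLine = "Written by Author: johndoe"

; StringReplace removes the label but keeps the text before it
Local $sReplaced = StringReplace($sLine, "Author: ", "")

; StringSplit with flag 1 uses the whole string "Author:" as one delimiter;
; element [2] is the text after the label, trimmed here with StringStripWS
Local $aParts = StringSplit($sLine, "Author:", 1)
Local $sAfter = StringStripWS($aParts[2], 3)

ConsoleWrite("StringReplace: " & $sReplaced & @CRLF)
ConsoleWrite("StringSplit:   " & $sAfter & @CRLF)
```

If the label always starts the line, both approaches give the same name.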

Hi,

"When I use a variable as the path I get an error thrown back from Word.au3"

Also, you left out the directory in the path:

$oContent = _WordGetText ($s_Dir&$file, 1, 0, 1, 0)
    ;SYNTAX _WordGetText($s_FilePath1[or objectdoc], $b_tryAttach = 0, $b_visible = 1, $b_takeFocus = 1, $i_Quit = 0)
;DocContentAO.au3
#include <Word2.au3>
Local $s_Dir = @ScriptDir & "\", $ar_DocProps[4]
FileChangeDir($s_Dir)
$search = FileFindFirstFile("*.doc")
If $search = -1 Then
    MsgBox(0, "Error", "No files/directories matched the search pattern")
    Exit
EndIf
While 1
    $file = FileFindNextFile($search)
    If @error Then ExitLoop
    If Not StringInStr($file, "~") Then
        $s_Content = _WordGetText ($s_Dir & $file,1,0,0,0,0)
        ;SYNTAX; _WordGetText($s_FilePath1, $b_tryAttach = 1, $b_visible = 0, $b_takeFocus = 1, $i_Quit = 0, $i_Close = 1)
        If $s_Content Then
            $ar_ContentLine = StringSplit($s_Content, @CRLF)
            $ar_DocProps[1] = $ar_ContentLine[1]
            For $i = 1 To UBound($ar_ContentLine) - 1
                If StringInStr($ar_ContentLine[$i], "Author:") Then
                    $ar_var = StringSplit($ar_ContentLine[$i], "Author:", 1) ;"stringreplace gives text before Author as well as after"
                    $ar_DocProps[3] = $ar_var[2]
                    ExitLoop
                EndIf
            Next
            _WordSetBuiltInDocumentProperties($s_Dir & $file,$ar_DocProps)
            ;SYNTAX; _WordSetBuiltInDocumentProperties($s_File,$ar_Props, $i_Close = 1)
        EndIf
    EndIf
WEnd
FileClose($search)
_WordQuit2 ("",0)
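The point about the missing directory comes from how FileFindNextFile works: it returns only the file name, so the folder has to be prepended before the path is handed to Word. A minimal sketch of that step, where `_ListDocPaths` is a hypothetical helper (not part of Word.au3 or Word2.au3) that returns the full paths separated by @LF:

```autoit
; Hypothetical helper: collects the full paths of all *.doc files in $sDir.
; Returns an @LF-separated string, or "" with @error = 1 if nothing matched.
Func _ListDocPaths($sDir)
    If StringRight($sDir, 1) <> "\" Then $sDir &= "\"
    Local $sPaths = ""
    Local $hSearch = FileFindFirstFile($sDir & "*.doc")
    If $hSearch = -1 Then
        SetError(1)
        Return ""
    EndIf
    While 1
        ; FileFindNextFile returns the bare file name, without the folder
        Local $sFile = FileFindNextFile($hSearch)
        If @error Then ExitLoop
        ; Skip Word's temporary files, whose names start with "~"
        If StringLeft($sFile, 1) = "~" Then ContinueLoop
        $sPaths &= $sDir & $sFile & @LF
    WEnd
    FileClose($hSearch)
    Return $sPaths
EndFunc   ;==>_ListDocPaths

ConsoleWrite(_ListDocPaths(@ScriptDir))
```

Each returned path can then be passed to a function such as _WordDocOpen in place of the bare file name.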
