myk3

Reading a Word document and searching for sections

7 posts in this topic

I am trying to parse a Word document based on the sections in the document.

The Word document is broken down into sections 1.0 - 9.0.

I am currently using the following:

; Ask for the document name (without the extension)
$test = InputBox("File Name", "Please input the word document")
BlockInput(1) ; block user input while the script drives Word
$word1 = $test & ".doc"
$word = ObjCreate("Word.Application")
$word.Visible = True
$word.Documents.Open(@ScriptDir & "\" & $word1)

; Find "SUBJECT", searching forward
$word.Selection.Find.ClearFormatting()
$word.Selection.Find.Text = "SUBJECT"
$word.Selection.Find.Replacement.Text = ""
$word.Selection.Find.Forward = True
$word.Selection.Find.Wrap = 1 ; wdFindContinue
$word.Selection.Find.Execute()
; Turn on extend mode so the next find stretches the selection to "1.0"
$word.Selection.Extend()
$word.Selection.Find.ClearFormatting()
$word.Selection.Find.Text = "1.0"
$word.Selection.Find.Replacement.Text = ""
$word.Selection.Find.Forward = True
$word.Selection.Find.Wrap = 2 ; wdFindAsk
$word.Selection.Find.Execute()

; Copy the selection and type it into Notepad
WinActivate($word1)
ControlSend($word1, "", "", "^c")
Sleep(1000)
Run("notepad.exe")
WinWaitActive("Untitled - ")
Send(ClipGet(), 1) ; flag 1 = raw, so characters such as ! + ^ # are not treated as keys
Send("{BS 9}") ; delete the last 9 characters
Sleep(1000)
BlockInput(0) ; re-enable user input

When I try to search for a different part of the file, it fails and copies the entire file.




#2 ·  Posted (edited)

Welcome to AutoIt ;)

Any chance you can upload and link us to a copy of the Word file (or a sample word file that has the same structure)? There are several things you may mean by "sections", so it would help to be able to run your script ourselves and see what happens.

Also, I'm not sure from your description what you want it to do at each section...copy the whole section? Erase the section break?

Thanks

Edited by james3mg

"There are 10 types of people in this world - those who can read binary, and those who can't."
"We've heard that a million monkeys at a million keyboards could produce the complete works of Shakespeare; now, thanks to the Internet, we know that is not true." ~Robert Wilensky
0101101 1001010 1100001 1101101 1100101 1110011 0110011 1001101 10001110000101 0000111 0001000 0001110 0001101 0010010 1010110 0100001 1101110


#3 ·  Posted (edited)


I will create a "sample" Word doc to use. It is by no means identical to the real ones, as some docs have more incremental subsections, the way 8.1, 8.2 and 8.3 do here.

If you run the script, it asks you for the name of the file (don't add the extension), and the file should be saved in the same location as the au3.

I would like to copy the section and then, if possible, save it in a variable. I think I could use ClipGet for that. Later, once I get this worked out, I want to export this to an Access database we already have set up.
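One way to get a section into a variable without going through the clipboard might be to work with Range objects instead of the Selection. This is only a sketch: it assumes test1.doc sits next to the script and contains both markers, and the variable names ($oWord, $oStart, $oEnd, $sSection) are made up for illustration. Note that "1.0" would also match inside text such as "11.0", so a real document may need a more specific search string.

```autoit
; Sketch only: assumes test1.doc is in the script folder and holds both markers.
Local $oWord = ObjCreate("Word.Application")
If Not IsObj($oWord) Then Exit
$oWord.Visible = True
Local $oDoc = $oWord.Documents.Open(@ScriptDir & "\test1.doc")

; Find the first marker. On success the range shrinks to the matched text.
Local $oStart = $oDoc.Content
If Not $oStart.Find.Execute("SUBJECT") Then
    MsgBox(16, "Not found", "SUBJECT")
    Exit
EndIf

; Look for the second marker only in the text after the first one.
Local $oEnd = $oDoc.Range($oStart.End, $oDoc.Content.End)
If Not $oEnd.Find.Execute("1.0") Then
    MsgBox(16, "Not found", "1.0")
    Exit
EndIf

; Text from the start of "SUBJECT" up to, but not including, "1.0".
Local $sSection = $oDoc.Range($oStart.Start, $oEnd.Start).Text
ConsoleWrite($sSection & @CRLF)
```

Because nothing is selected or copied, this approach should not need BlockInput, window activation or Sleep calls.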

test1.doc

Edited by myk3


To go to a section in Word, try:

; wdGoToDirection
Global Const $wdGoToAbsolute = 1
Global Const $wdGoToFirst = 1
Global Const $wdGoToLast = -1
Global Const $wdGoToNext = 2
Global Const $wdGoToPrevious = 3
Global Const $wdGoToRelative = 2

; wdGoToItem
Global Const $wdGoToBookmark = -1
Global Const $wdGoToSection = 0
Global Const $wdGoToPage = 1
Global Const $wdGoToTable = 2
Global Const $wdGoToLine = 3
Global Const $wdGoToFootNote = 4
Global Const $wdGoToEndNote = 5
Global Const $wdGoToComment = 6
Global Const $wdGoToField = 7
Global Const $wdGoToGraphic = 8
Global Const $wdGoToObject = 9
Global Const $wdGoToEquation = 10
Global Const $wdGoToHeading = 11
Global Const $wdGoToPercent = 12
Global Const $wdGoToSpellingError = 13
Global Const $wdGoToGrammaticalError = 14
Global Const $wdGoToProofReadingError = 15

; ...

; What:=wdGoToSection, Which:=wdGoToFirst, Count:=2
$oDoc.GoTo($wdGoToSection, $wdGoToFirst, 2)

Don't have Word (blessed with OOo) on this box, so untested.
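If the document really does use Word section breaks, the Sections collection may be easier than GoTo. The sketch below is equally untested against the sample file; it assumes Word is already running with the document open, and the variable names are made up for illustration. Numbered headings such as "1.0" are not Word sections by themselves, so this only helps when real section breaks are present.

```autoit
; Sketch only: attaches to a running Word instance with the document open.
Local $oWord = ObjGet("", "Word.Application")
If Not IsObj($oWord) Then Exit
Local $oDoc = $oWord.ActiveDocument

; Print the first 60 characters of every Word section.
For $i = 1 To $oDoc.Sections.Count
    Local $sText = $oDoc.Sections.Item($i).Range.Text
    ConsoleWrite("Section " & $i & ": " & StringLeft($sText, 60) & @CRLF)
Next
```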

;)


Valuater's AutoIt 1-2-3, Class... Is now in Session!
For those who want somebody to write the script for them: RentACoder
"Any technology distinguishable from magic is insufficiently advanced." -- Geek's corollary to Clarke's law


#5 ·  Posted (edited)

Seems to work for me: it highlights

SUBJECT: (U) This is the subject of this doc

1.0

then exits. So you can add Send("^c") and $found=ClipGet() to the end of your script, and it will give you the whole first section, as I understand it.

Or maybe I misunderstood what you're looking for? ;)

Edit: I did notice as I went to close the Word window that it acts as if I'm holding down Shift. That is, if I click somewhere, the highlighting is cut back or extended to include everything from the beginning of the document to wherever I clicked. Is there a $word.Selection method that cancels out your call to .Extend? Maybe that's your problem, if your cursor was for some reason sent to the end of the document?
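For what it's worth, the Word object model has a Selection.ExtendMode property that can be switched off, and Selection.Collapse shrinks the selection to a single point. A rough sketch, assuming Word is still running with the extended selection in place (the names $oWord and $found are just for illustration), and not verified against the sample document:

```autoit
; Sketch only: attaches to the running Word instance left behind by the script.
Local $oWord = ObjGet("", "Word.Application")
If Not IsObj($oWord) Then Exit

$oWord.Selection.Copy()              ; copy the extended selection through COM instead of ^c
$oWord.Selection.ExtendMode = False  ; switch extend mode off again
$oWord.Selection.Collapse(0)         ; 0 = wdCollapseEnd: shrink the selection to its end point

Local $found = ClipGet()
ConsoleWrite($found & @CRLF)
```

Collapsing to the end leaves the cursor just after the copied text, so a following search would start from there rather than from the top of the document.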

Edited by james3mg


I have no clue whether Word has a function to cancel the selection. This is a learning experience for me.


Does anyone know how to select a bookmark in a Word doc? I think if I can select a bookmark, I can parse the document more easily.
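A possible starting point, using the Bookmarks collection of the Word object model. This is a sketch under assumptions: Word is already running with the document open, and "Section1" is a hypothetical bookmark name that would have to exist in the document and span the text of interest.

```autoit
; Sketch only: "Section1" is a hypothetical bookmark name.
Local $oWord = ObjGet("", "Word.Application")
If Not IsObj($oWord) Then Exit
Local $oDoc = $oWord.ActiveDocument
Local $sName = "Section1"

If $oDoc.Bookmarks.Exists($sName) Then
    Local $oBookmark = $oDoc.Bookmarks.Item($sName)
    $oBookmark.Select()                       ; highlight the bookmark in Word
    Local $sText = $oBookmark.Range.Text      ; or read its text directly
    ConsoleWrite($sText & @CRLF)
Else
    ConsoleWrite("Bookmark not found: " & $sName & @CRLF)
EndIf
```

Reading Range.Text avoids the selection and the clipboard entirely, so the Select call is only needed if the bookmark should be visibly highlighted.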

