Fangs78

Get text from MS Word

5 posts in this topic

#1 ·  Posted (edited)

So...

I just heard about AutoIt and found out that it can help me a lot with text manipulation in the Windows environment. I was dreading that I would have to write my own Win32 API wrapper for VB.NET (with accessibility.dll thrown in) to do what I wanted, but lo, along comes AutoItX to the rescue.

I've tested it a bit now and I can get editable text from a lot of different windows, but when I tested MS Word and Excel, nothing of interest came back (only static control text, etc.).

I've tried both WinGetText and ControlGetText, but maybe I'm not using them right?

I'm pretty sure I can get the text from the MS Word document with AutoItX, but unsure how.

So I'm just checking to see if this is truly possible, before I venture into HOW...or maybe someone has an example? B)

I tried to search the forums, but I'm either a "search-disabled-person" or there isn't a topic about this here.

Thanks for any replies

Edited by Fangs78
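
For readers following along, here is a minimal sketch of how the two functions mentioned above are normally called. It assumes a classic Notepad window titled "Untitled - Notepad" whose edit control is named "Edit1"; both names are assumptions for illustration and will differ on other systems or newer Notepad versions. As far as I know, Word draws its document area in its own window class rather than in a standard edit control, which would explain why the same calls return only static control text there.

```autoit
; Sketch only: assumes a classic Notepad window with this exact title is open
; and that its edit control has the ClassnameNN "Edit1".
Local $sTitle = "Untitled - Notepad"

; WinGetText returns the text of the window's standard controls.
Local $sWindowText = WinGetText($sTitle)

; ControlGetText reads the text of one specific control.
Local $sEditText = ControlGetText($sTitle, "", "Edit1")

MsgBox(0, "Notepad text", "WinGetText: " & $sWindowText & @CRLF & "ControlGetText: " & $sEditText)
```

If either call comes back empty, the AutoIt Window Info tool can be used to check the actual title and control name.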


To extract text from Word, you could use the new COM support, which is available in the beta of AutoIt.

Don't forget to change the file path in the script so that it points to your own document.

HTH,

Francis

; AutoIt 3.1 beta with COM support
; Francis Lennert

; Locals
Local $oWord, $oWordDocuments, $oWordDocument, $oWordContent, $TextDoc

$oWord = ObjCreate("Word.Application") ; Connect to Word and receive a Word object
$oWord.Visible = 1 ; Ask Word to show its window
$oWordDocuments = $oWord.Documents ; Ask the Word object for its collection of documents
$oWordDocument = $oWordDocuments.Open("C:\MyDoc.doc") ; Open MyDoc.doc, add it to the collection and receive the document object
$oWordContent = $oWordDocument.Content ; Ask the document object for its content object
$TextDoc = $oWordContent.Text ; Extract the text of the content object into an AutoIt variant

; Show the text from the document (flag 0 = OK button only)
MsgBox(0, "The text is:", $TextDoc)
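
A possible variation, since the original question was about reading text from a Word window that is already open: instead of starting a new instance with ObjCreate, the script can try to attach to the running one with ObjGet. This is only a sketch under the assumption that Word is already running with a document open; it is not part of Francis's script, and the variable names are illustrative.

```autoit
; Sketch only: assumes Word is already running with at least one document open.
Local $oWord = ObjGet("", "Word.Application") ; Attach to the running instance
If Not IsObj($oWord) Then
    MsgBox(16, "Word", "No running Word instance found.")
    Exit
EndIf

If $oWord.Documents.Count = 0 Then
    MsgBox(48, "Word", "Word is running but no document is open.")
    Exit
EndIf

; ActiveDocument.Content is a range covering the main body of the document
Local $sText = $oWord.ActiveDocument.Content.Text
MsgBox(0, "Active document text", $sText)
```

Like the script above, this needs a version of AutoIt with COM support and an installed copy of MS Word.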

#3 ·  Posted (edited)

Thanks for responding!

I found a way to do this using the Office API from MS, but if this works without the DLLs from Microsoft (and with all later versions of Office), it would be much better!

Does it require the Office API to do this?

By the way, does AutoItX have this functionality yet (in the beta), or only AutoIt3?

I don't want to rely on external scripting in my VB.NET app to do this...

Thanks again!

Edited by Fangs78


Hello

This works only with the beta of AutoIt (not AutoItX), and you must have MS Word installed on the computer.

(I am not sure the developers of AutoItX will add a way to connect to COM in the ActiveX control, because if you can use AutoItX from your development tool, you can connect to other COM applications such as Word and Excel at the same time.)

How do you use the MS DLLs to extract text?

If it's not a secret

;=)

Francis

You can extract text using the MS DLLs by doing what you showed me earlier, in any language really. I guess that is what AutoIt3 does: it calls the MS DLL functions directly.
