alanstone, posted January 29, 2010 (edited January 30, 2010)

Hi,

To save Word documents as UTF-8 text files, from what I understand from the Word 2003 VBA help or http://msdn.microsoft.com/en-us/library/aa220734(office.11).aspx, this corresponds to oDoc.SaveAs with:

    .FileFormat = wdFormatUnicodeText
    .Encoding = msoEncodingUTF8

So I've got...

    #include <Word.au3>

    Local $doc = @ScriptDir & "\" & "doc2txt_test.doc"
    Local $txt = @ScriptDir & "\" & "doc2txt_test.txt"
    Local $oWord, $oWordDocs, $oWordDoc

    $oWord = ObjCreate("Word.Application")
    $oWord.Visible = True
    $oWordDocs = $oWord.Documents
    $oWordDoc = $oWordDocs.Open($doc)
    $oWordDoc.SaveAs($txt, ???) ; <--- wd*/mso* constants don't seem to work, what do you put there?

Is this then saved as UTF-8 with or without a BOM? (I need it without a BOM.)

- AutoIt 3.3.4.0
- Windows XP Home SP3
daluu, posted January 30, 2010 (edited January 30, 2010)

alanstone wrote:
> $oWordDoc.SaveAs($txt, ???) ; <--- wd*/mso* constants don't seem to work, what do you put there?

You might want to use the actual values of the constants instead of the constant names. The values can vary depending on the version of Word, so you'd have to add logic to check for it via the $oWord.Version property.

You can find the constant values in the Object Browser of Word's Visual Basic Editor. Simply look for wdFormatUnicodeText, etc., select it, and the bottom pane of the Object Browser will tell you what the value is.

For Word 2003, and likely Word XP (2002), wdFormatUnicodeText = 7. So substitute the actual integer value for the constant.

For older Word versions, you'd have to check the Object Browser in those versions of Word.
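A minimal sketch of the version check suggested above, written in plain older-style JavaScript with JScript in mind. The table `KNOWN_CONSTANTS` and the helpers `majorVersion` and `lookupConstant` are hypothetical names introduced for illustration. Only the Word 2003 value comes from this thread; entries for other versions would need to be verified in that version's Object Browser first.

```javascript
// Hypothetical lookup: major Word version -> constant values that have been verified.
// Only the Word 2003 entry (major version 11) is taken from this thread.
var KNOWN_CONSTANTS = {
  11: { wdFormatUnicodeText: 7 }
};

// Word reports its version as a string such as "11.0"; keep the major part.
function majorVersion(versionString) {
  var n = parseInt(String(versionString), 10);
  return isNaN(n) ? null : n;
}

// Returns the numeric value, or null when no verified value is recorded
// for that version, so the caller can stop instead of guessing.
function lookupConstant(versionString, name) {
  var major = majorVersion(versionString);
  var table = major === null ? undefined : KNOWN_CONSTANTS[major];
  if (!table || !table.hasOwnProperty(name)) {
    return null;
  }
  return table[name];
}
```

Returning null for unknown versions is a deliberate choice here: it forces the script to fail visibly rather than pass an unverified number to SaveAs.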
alanstone, posted January 30, 2010

Thanks for this precious hint. So now I have...

    #include <File.au3>
    #include <Word.au3>

    Local $doc = @ScriptDir & "\" & "doc2txt_test.doc"
    Local $txt = @ScriptDir & "\" & "doc2txt_test.txt"
    Local $oWord, $oWordDocs, $oWordDoc

    ; Word 2003 constant values,
    ; found through Word's Visual Basic Script Editor Object Browser
    Local $def = 0           ; default value
    Local $fileformat = 7    ; wdFormatUnicodeText constant value
    Local $encoding = 65001  ; (&HFDE9) msoEncodingUTF8 constant value

    $oWord = ObjCreate("Word.Application")
    $oWord.Visible = True
    $oWordDocs = $oWord.Documents
    $oWordDoc = $oWordDocs.Open($doc)

    With $oWordDoc
        .FileName = @ScriptDir & "\" & "doc2txt_test.txt" ; <-- generates an error (*)
        ; .FileName = $txt ; <-- generates a similar error
        ; .FileName = '"' & $txt & '"' ; <-- generates a similar error
        .FileFormat = $fileformat
        .Encoding = $encoding
        .Close
    EndWith
    $oWord.Quit

(*) The error:

    J:\$AutoIt\doc2txt_utf8.au3 (32) : ==> The requested action with this object has failed.:
    .FileName = @ScriptDir & "\" & "doc2txt_test.txt"
    .FileName = @ScriptDir & "\" & "doc2txt_test.txt"^ ERROR
daluu, posted January 31, 2010 (edited January 31, 2010)

alanstone wrote:
> Thanks for this precious hint.
>
>     With $oWordDoc
>         .FileName = @ScriptDir & "\" & "doc2txt_test.txt" ; <-- generates an error (*)
>         ; .FileName = $txt ; <-- generates a similar error
>         ; .FileName = '"' & $txt & '"' ; <-- generates a similar error
>         .FileFormat = $fileformat
>         .Encoding = $encoding
>         .Close
>     EndWith
>     $oWord.Quit
>
> (*) J:\$AutoIt\doc2txt_utf8.au3 (32) : ==> The requested action with this object has failed.:
> .FileName = @ScriptDir & "\" & "doc2txt_test.txt"
> .FileName = @ScriptDir & "\" & "doc2txt_test.txt"^ ERROR

Hmm... sorry, can't really help you there. I've never used that save feature, nor AutoIt to do Word automation. I use JScript (a version of JavaScript) + COM because it offers error handling for the Word COM object.

One suggestion: do your intended automation (saving the file in UTF-8 Unicode plain text format) inside Word with the macro recorder, then clean up the VBA code and port it to AutoIt. If it works inside Word, the recorded code should work, and if ported correctly, the ported code should work as well.

But sometimes the Word COM object may misbehave, which is why I use JScript to fail "gracefully" (or you could ignore the error instead, etc.). It may be that the SaveAs function doesn't always succeed and you have to try several times. I don't recall specifically, as I worked on that a few years back. I wasn't saving to Unicode text but rather to HTML format. I ran across similar errors until I switched from VBScript to JScript to manipulate the Word object. With a JScript try/catch block you can catch the error and fail gracefully (e.g. close the Word document, release the Word object, etc.) or keep trying in a loop until the save succeeds. With VBScript, the COM error isn't detected and handled (well) because VBScript can't really handle COM errors; it primarily handles errors on the VBScript side. The error I had, and likely yours, may be on the Word side.

Not sure if AutoIt can handle COM errors well, or whether the error message details are useful enough to find a real fix or workaround.
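The "try in a loop, then fail gracefully" idea described above can be sketched as a small helper. This is only an illustration, not code from the thread: `tryRepeatedly`, its parameters, and the shape of the result object are all hypothetical. The action passed in would be whatever wraps the real SaveAs call.

```javascript
// Calls action() up to maxAttempts times and stops at the first success.
// onFailure (optional) runs once after the last failed attempt, which is
// the place for cleanup such as closing the document and quitting Word.
function tryRepeatedly(action, maxAttempts, onFailure) {
  var lastError = null;
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { ok: true, value: action(), attempts: attempt };
    } catch (err) {
      lastError = err; // remember the error and try again
    }
  }
  if (typeof onFailure === "function") {
    onFailure(lastError);
  }
  return { ok: false, error: lastError, attempts: maxAttempts };
}
```

Under Windows Script Host, the action would be a function whose body is the SaveAs call on the Word document object, and onFailure would do the cleanup. Keeping the attempt count bounded avoids hanging forever if the failure is permanent, as a wrong property or argument would be.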
alanstone, posted January 31, 2010 (edited January 31, 2010)

This makes more sense...

    Local $def = 0 ; default value
    With $oWordDoc
        .SaveAs($txt, $fileformat, $def, "", $def, "", $def, $def, $def, $def, $def, $encoding)
        .Close
    EndWith
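The BOM question from the first post is not answered in this thread. One way to settle it for a given output file is to look at its first three bytes: a UTF-8 BOM is the sequence EF BB BF. The sketch below works on a plain array of byte values; `hasUtf8Bom` and `stripUtf8Bom` are hypothetical helper names, and how the bytes are read from or written back to disk is left out because it depends on the scripting host.

```javascript
// True when the byte sequence starts with the UTF-8 BOM (EF BB BF).
function hasUtf8Bom(bytes) {
  return bytes.length >= 3 &&
    bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF;
}

// Returns a copy of the bytes without a leading UTF-8 BOM.
// Input without a BOM is copied unchanged.
function stripUtf8Bom(bytes) {
  var start = hasUtf8Bom(bytes) ? 3 : 0;
  var out = [];
  for (var i = start; i < bytes.length; i++) {
    out.push(bytes[i]);
  }
  return out;
}
```

If the saved file turns out to carry a BOM, a post-processing step along these lines would remove it without touching the rest of the text.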
alanstone, posted January 31, 2010

daluu wrote:
> I use JScript (a version of Javascript) + COM

Do you mean this http://msdn.microsoft.com/en-us/library/hbxc2t98(VS.85).aspx ?
daluu, posted February 1, 2010

alanstone wrote:
> This makes more sense...
>
>     Local $def = 0 ; default value
>     With $oWordDoc
>         .SaveAs($txt, $fileformat, $def, "", $def, "", $def, $def, $def, $def, $def, $encoding)
>         .Close
>     EndWith

Yes, that does make more sense. Supplying the extra blank/zero-valued parameters like this is also required when using Word automation via .NET. To make matters worse, the exact number of blank/zero-valued parameters required varies by version of Word. And with .NET at least, you have to compile/link against the version of Word installed on your development machine, so you can't support multiple versions. Because of that, I switched over to VBScript/JScript, having originally started in .NET.

Interestingly, with VBScript and JScript over the Word COM interface, you don't have to specify all those blank/zero-valued parameters. You just follow the API as presented in the Word VBScript editor / Object Browser, which makes things a lot easier.
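To make the positional call quoted above easier to read, here is a rough sketch that builds the same twelve-slot argument list from named values. The parameter order is the Word 2003 `Document.SaveAs` order as I understand it from the documentation linked in the first post, and it should be checked against the Object Browser for the Word version in use. `SAVEAS_PARAMS` and `buildSaveAsArgs` are hypothetical names, and the fill values (empty string for the two password slots, 0 elsewhere) simply mirror the call in this thread.

```javascript
// Assumed parameter order of Document.SaveAs in Word 2003, up to Encoding.
// Verify against the Object Browser before relying on it.
var SAVEAS_PARAMS = [
  "FileName", "FileFormat", "LockComments", "Password", "AddToRecentFiles",
  "WritePassword", "ReadOnlyRecommended", "EmbedTrueTypeFonts",
  "SaveNativePictureFormat", "SaveFormsData", "SaveAsAOCELetter", "Encoding"
];

// Builds the positional argument list. Slots not given in `named` are filled
// with "" for the password parameters and 0 for everything else.
function buildSaveAsArgs(named) {
  var args = [];
  for (var i = 0; i < SAVEAS_PARAMS.length; i++) {
    var name = SAVEAS_PARAMS[i];
    if (named.hasOwnProperty(name)) {
      args.push(named[name]);
    } else if (name === "Password" || name === "WritePassword") {
      args.push("");
    } else {
      args.push(0);
    }
  }
  return args;
}
```

For example, `buildSaveAsArgs({ FileName: "out.txt", FileFormat: 7, Encoding: 65001 })` yields twelve values in the same positions as the AutoIt call, with Encoding landing in the twelfth slot.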
daluu, posted February 1, 2010

alanstone wrote:
> daluu wrote:
> > I use JScript (a version of Javascript) + COM
>
> Do you mean this http://msdn.microsoft.com/en-us/library/hbxc2t98(VS.85).aspx ?

Yes, that is correct. Here's a snippet of code from my SaveAs-to-HTML function, for your reference.

    WordObj = new ActiveXObject("Word.Application");
    if(WordObj.Version > 9.0){ //Office 2003 & XP
        HTMLFrmt = 8;
    }else{ //should check values for Office 97 & Office 2000, etc.
        HTMLFrmt = 18;
    }
    webFilename = "someName.htm";

    function SaveActiveDocAsHTML(pFilePath){
        var ParentPath, webFile;
        filesys = new ActiveXObject("Scripting.FileSystemObject");
        ParentPath = filesys.GetParentFolderName(pFilePath);
        webFile = ParentPath + "\\" + webFilename;
        try{
            WordObj.Activedocument.SaveAs(webFile,HTMLFrmt); //WordObj.WdSaveFormat.wdFormatHTML
        }catch(err){
            WordErrHandler("Could not save the file as HTML/webpage. Operation aborted.");
        }
        filesys = null;
    }

    //WORDERRHANDLER
    //description: outputs supplied error message, closes Word, disposes Word object, exits script
    //input: error message
    //output: none
    function WordErrHandler(msg){
        //WScript.Echo("MS Word threw an exception");
        WScript.Echo(msg);
        CloseWord();
        WScript.Quit(true);
    }

    //CLOSEWORD
    //description: quits Word & clears the Word Application Object
    //input: none
    //output: none
    function CloseWord(){
        WordObj.Quit(false);
        WordObj = null;
    }