frank10 Posted November 9, 2021 Posted November 9, 2021
I'm trying to find/replace some chars into a .docx like this:
#include <Word.au3>
Local $oWord = _Word_Create()
Local $oDoc = _Word_DocOpen($oWord, "test.docx")
Local $oRangeFound, $oRangeText, $oSearchRange = _Word_DocRangeSet($oDoc, -1)
; find at least 2 spaces only after numbers: "wordA 123 wordB wordC" --> "wordA 123 wordB wordC"
$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*) {2,}", "\1", 2, 0, 0,0,1)
It does not work... How to do it?
water Posted November 9, 2021 Posted November 9, 2021 _Word_DocFindReplace does not support Regular Expressions. I'm not sure it is possible with Word at all. My UDFs and Tutorials: Spoiler UDFs: Active Directory (NEW 2024-07-28 - Version 1.6.3.0) - Download - General Help & Support - Example Scripts - Wiki ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki Task Scheduler (2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki Standard UDFs: Excel - Example Scripts - Wiki Word - Wiki Tutorials: ADO - Wiki WebDriver - Wiki
frank10 Posted November 9, 2021 Author Posted November 9, 2021 (edited)
But I tried with this and it works:
$oRangeFound = _Word_DocFindReplace($oDoc, "NUTRITION([0-9]*)", "NUTRITION#: \1", 2, 0, 0,0,1)
; NUTRITION23 kcal --> NUTRITION#: 23 kcal
Also here they make some examples... https://translationjournal.net/journal/15msw.htm
Edited November 9, 2021 by frank10
Skysnake Posted November 9, 2021 Posted November 9, 2021 The only way I can think, and I have never done this, is to open the entire document as an object and step through in portions. Run the Regex on a portion at a time...? Skysnake Why is the snake in the sky?
JockoDundee Posted November 9, 2021 Posted November 9, 2021
8 hours ago, frank10 said: How to do it?
Word doesn’t seem to support “*” greedy character. What happens if you just omit it:
$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]) {2,}", "\1", 2, 0, 0,0,1)
Code hard, but don’t hard code...
frank10 Posted November 10, 2021 Author Posted November 10, 2021 (edited)
10 hours ago, JockoDundee said: Word doesn’t seem to support “*” greedy character. What happens if you just omit it:
$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]) {2,}", "\1", 2, 0, 0,0,1)
Doesn't work. It's not the * that causes the problem; it seems to be the "{2,}". In fact this works:
$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*)[ ]*", "\1°", 2, 0, 0,0,1)
BUT as * means 0 or more, it also changes numbers followed by one space... (strangely, it should also change a number followed by no space, but it does not...)
123word --> no change
123 word --> 123°word
123 word --> 123° word
What I want is to change only 2 spaces or more... Of course I can do workarounds like:
$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*) ", "\1 ", 2, 0, 0,0,1)
$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*) ", "\1 ", 2, 0, 0,0,1)
But it would be a lot better to find a regExp that does this, also for other future uses...
Edited November 10, 2021 by frank10
mikell Posted November 10, 2021 Posted November 10, 2021 https://wordmvp.com/FAQs/General/UsingWildcards.htm
frank10 Posted November 10, 2021 Author Posted November 10, 2021
4 minutes ago, mikell said: https://wordmvp.com/FAQs/General/UsingWildcards.htm
Yes, they say that you can use {2,}, but in fact it doesn't work... this:
$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*)[ ]{2,}", "\1°", 2, 0, 0,0,1)
ConsoleWrite("__err:" & @error & "__extended:" & @extended & @CRLF)
gives:
__err:3__extended:-2147352567
water Posted November 10, 2021 Posted November 10, 2021
Extended -2147352567 (decimal) is 0x80020009 (hex) and stands for DISP_E_EXCEPTION ("Exception occurred"), a generic COM error. You need a full COM error handler to get more detailed information about the error. Unfortunately this is a bit complex because of the way AutoIt handles COM errors in the Word UDF.
#include <Word.au3>
#include <MsgBoxConstants.au3>
#include <StringConstants.au3> ; needed for $STR_STRIPTRAILING

; Register the handler so COM errors raised by Word end up in __Word_COMErrFuncEX
Global $oError = ObjEvent("AutoIt.Error", "__Word_COMErrFuncEX")

; Add your code to find/replace text here by calling the modified _Word_DocFindReplaceEX function!

Exit

; #FUNCTION# ====================================================================================================================
; Author ........: water (based on the Word UDF written by Bob Anthony)
; Modified ......:
; ===============================================================================================================================
Func _Word_DocFindReplaceEX($oDoc, $sFindText = Default, $sReplaceWith = Default, $iReplace = Default, $vSearchRange = Default, $bMatchCase = Default, $bMatchWholeWord = Default, $bMatchWildcards = Default, $bMatchSoundsLike = Default, $bMatchAllWordForms = Default, $bForward = Default, $iWrap = Default, $bFormat = Default)
    If $sFindText = Default Then $sFindText = ""
    If $sReplaceWith = Default Then $sReplaceWith = ""
    If $iReplace = Default Then $iReplace = $WdReplaceAll
    If $vSearchRange = Default Then $vSearchRange = 0
    If $bMatchCase = Default Then $bMatchCase = False
    If $bMatchWholeWord = Default Then $bMatchWholeWord = False
    If $bMatchWildcards = Default Then $bMatchWildcards = False
    If $bMatchSoundsLike = Default Then $bMatchSoundsLike = False
    If $bMatchAllWordForms = Default Then $bMatchAllWordForms = False
    If $bForward = Default Then $bForward = True
    If $iWrap = Default Then $iWrap = $WdFindContinue
    If $bFormat = Default Then $bFormat = False
    If Not IsObj($oDoc) Then Return SetError(1, 0, 0)
    ; -1 = current selection, 0 = whole document, otherwise a Range object
    Switch $vSearchRange
        Case -1
            $vSearchRange = $oDoc.Application.Selection.Range
        Case 0
            $vSearchRange = $oDoc.Range()
        Case Else
            If Not IsObj($vSearchRange) Then Return SetError(2, 0, 0)
    EndSwitch
    Local $oFind = $vSearchRange.Find
    $oFind.ClearFormatting()
    $oFind.Replacement.ClearFormatting()
    Local $bReturn = $oFind.Execute($sFindText, $bMatchCase, $bMatchWholeWord, $bMatchWildcards, $bMatchSoundsLike, _
            $bMatchAllWordForms, $bForward, $iWrap, $bFormat, $sReplaceWith, $iReplace)
    ; @error = 3 covers both a COM error (@extended <> 0) and "nothing found" (@extended = 0)
    If @error Or Not $bReturn Then Return SetError(3, @error, 0)
    Return 1
EndFunc   ;==>_Word_DocFindReplaceEX

; COM error handler: shows everything the AutoIt.Error object knows about the failure
Func __Word_COMErrFuncEX()
    Local $bHexNumber = Hex($oError.number, 8)
    Local $sError = "COM Error Encountered in " & @ScriptName & @CRLF & _
            "@AutoItVersion = " & @AutoItVersion & @CRLF & _
            "@AutoItX64 = " & @AutoItX64 & @CRLF & _
            "@Compiled = " & @Compiled & @CRLF & _
            "@OSArch = " & @OSArch & @CRLF & _
            "@OSVersion = " & @OSVersion & @CRLF & _
            "Scriptline = " & $oError.scriptline & @CRLF & _
            "NumberHex = 0x" & $bHexNumber & @CRLF & _
            "Number = " & $oError.number & @CRLF & _
            "WinDescription = " & StringStripWS($oError.WinDescription, $STR_STRIPTRAILING) & @CRLF & _
            "Description = " & StringStripWS($oError.description, $STR_STRIPTRAILING) & @CRLF & _
            "Source = " & $oError.Source & @CRLF & _
            "HelpFile = " & $oError.HelpFile & @CRLF & _
            "HelpContext = " & $oError.HelpContext & @CRLF & _
            "LastDllError = " & $oError.LastDllError
    MsgBox($MB_ICONERROR, "Debug Info", $sError)
EndFunc   ;==>__Word_COMErrFuncEX
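Before wiring up the full handler, the @error/@extended pair alone already separates "nothing matched" from a real COM failure. A minimal sketch, assuming the UDF sets @error = 3 with @extended = 0 when Find.Execute simply finds nothing and with the COM error code otherwise (which is what the SetError(3, @error, 0) line in the function above does). The helper name _DescribeFindError is made up for illustration and is not part of Word.au3:

```autoit
; Hypothetical helper (not part of Word.au3): turn the @error/@extended pair
; returned by _Word_DocFindReplace into a readable message.
Func _DescribeFindError($iError, $iExtended)
    If $iError = 0 Then Return "OK"
    If $iError <> 3 Then Return "Parameter error: @error = " & $iError
    If $iExtended = 0 Then Return "Find.Execute ran, but nothing matched"
    ; Show the COM code in the hex form used in Microsoft's documentation
    Return "COM error 0x" & Hex($iExtended, 8)
EndFunc   ;==>_DescribeFindError

; The values reported earlier in this thread
ConsoleWrite(_DescribeFindError(3, -2147352567) & @CRLF)
```

In a real script, copy @error and @extended into local variables on the line directly after the _Word_DocFindReplace call, because the next function call resets them.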
frank10 Posted November 10, 2021 Author Posted November 10, 2021 Ok, that error says: "Description = The Find What text contains a Pattern Match expression which is not valid." But how should it be written? ([0-9]*)[ ]{2,} As for the links above, it seems correct...
water Posted November 10, 2021 Posted November 10, 2021
I get the impression that MS Word does not fully support Regular Expressions (https://vlasovstudio.com/regent/documentation/Microsoft-Word-Wildcards-as-Regular-Expressions.html). Unfortunately I'm not familiar with the wildcards supported by MS Word. But I suggest trying the pattern in MS Word itself before using _Word_DocFindReplace.
Solution mikell Posted November 10, 2021 Solution Posted November 10, 2021 4 hours ago, frank10 said: Yes, they say that you can use {2,}, but in fact it doesn't work... 5 hours ago, frank10 said: BUT as * means 0 or more, it changes also numbers followed by one space... (strangely it should change also if number is followed by no space, instead it does not change this...) So did you try this ? "([0-9]*)[ ][ ]*" JockoDundee 1
frank10 Posted November 10, 2021 Author Posted November 10, 2021 (edited)
3 hours ago, mikell said: So did you try this ? "([0-9]*)[ ][ ]*"
Thank you mikell: good catch! The only thing is that yours also gives:
123a --> no change
123 aa --> no change
123 aaa --> ok
123 aaa --> ok
wordA 123 wordB wordC --> NOT ok, it also changes wordB wordC...
Whereas with this:
"([0-9])[ ][ ]*"
it's perfect!
123a --> no change
123 aa --> no change
123 aaa --> ok
123 aaa --> ok
wordA 123 wordB wordC --> no change
Good workaround.
Edited November 10, 2021 by frank10
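For anyone landing here later, the pieces of this thread can be put together roughly as follows. This is only a sketch under a few assumptions: test.docx sits next to the script (the first post uses a bare file name), the degree sign is just the visible marker used in the tests above and should be swapped for the real replacement text, and the document is saved in place on close. It may also help to know that in Word's wildcard mode `*` stands for "any string of characters" rather than being a regex quantifier, which would explain why these patterns do not behave like PCRE. As far as I know, Word also takes the separator inside the braces from the Windows list separator, so on systems where that is a semicolon "{2;}" might be accepted where "{2,}" is rejected; that was not tested in this thread.

```autoit
#include <Word.au3>

; Sketch only: file location and replacement text are assumptions based on the posts above.
Local $oWord = _Word_Create()
If @error Then
    ConsoleWrite("_Word_Create failed, @error = " & @error & @CRLF)
    Exit
EndIf

Local $oDoc = _Word_DocOpen($oWord, @ScriptDir & "\test.docx")
If @error Then
    ConsoleWrite("_Word_DocOpen failed, @error = " & @error & @CRLF)
    _Word_Quit($oWord)
    Exit
EndIf

; Pattern confirmed by frank10: one digit, one space, then "[ ]*".
; ChrW(176) is the degree sign used as a marker in the tests above.
; Arguments: replace all, whole document (0), no case match, no whole-word match, wildcards on.
_Word_DocFindReplace($oDoc, "([0-9])[ ][ ]*", "\1" & ChrW(176), $WdReplaceAll, 0, False, False, True)
Local $iErr = @error, $iExt = @extended ; capture both before the next function call resets them
If $iErr Then ConsoleWrite("__err:" & $iErr & "__extended:" & $iExt & @CRLF)

_Word_DocClose($oDoc, $WdSaveChanges)
_Word_Quit($oWord)
```

Named constants and True/False are used here instead of the bare 2, 0, 0,0,1 from the original calls so the wildcard flag is easy to spot; the values passed are the same.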