word find chars like regexp


frank10
Solved by mikell

I'm trying to find and replace some characters in a .docx file like this:

#include <Word.au3>

Local $oWord = _Word_Create()
Local $oDoc = _Word_DocOpen($oWord, "test.docx")
Local $oRangeFound, $oRangeText, $oSearchRange = _Word_DocRangeSet($oDoc, -1)

; find at least 2 spaces only after numbers: "wordA 123  wordB  wordC" --> "wordA 123 wordB  wordC"
$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*) {2,}", "\1", 2, 0, 0,0,1)

It doesn't work.

How can I do this?

 


_Word_DocFindReplace does not support Regular Expressions.
I'm not sure it is possible with Word at all.

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2022-02-19 - Version 1.6.1.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
OutlookEX (2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download
Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki
PowerPoint (2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki
Task Scheduler (NEW 2022-07-28 - Version 1.6.0.1) - Download - General Help & Support - Wiki

Standard UDFs:
Excel - Example Scripts - Wiki
Word - Wiki

Tutorials:
ADO - Wiki
WebDriver - Wiki

 


10 hours ago, JockoDundee said:

Word doesn’t seem to support “*” greedy character.

What happens if you just omit it:

$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]) {2,}", "\1", 2, 0, 0,0,1)

 

Doesn't work.

It's not the * that causes the problem; it seems to be the "{2,}".

In fact this works:

$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*)[ ]*", "\1°", 2, 0, 0,0,1)

But since * means 0 or more, it also changes numbers followed by a single space. (Strangely, it should also change a number followed by no space, but it doesn't.)

123word     --> no change
123 word    --> 123°word
123  word   --> 123° word



What I want is to change only runs of 2 or more spaces.

Of course I can use workarounds like:

$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*)   ", "\1 ", 2, 0, 0,0,1)
$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*)  ", "\1 ", 2, 0, 0,0,1)

But it would be much better to find a regexp-like pattern for this, also for future uses.

Edited by frank10

4 minutes ago, mikell said:

Yes, they say that you can use {2,}, but in fact it doesn't work...

this:

$oRangeFound = _Word_DocFindReplace($oDoc, "([0-9]*)[ ]{2,}", "\1°", 2, 0, 0,0,1)
ConsoleWrite("__err:" & @error & "__extended:" & @extended & @CRLF)

gives:

__err:3__extended:-2147352567

 


Extended -2147352567 (decimal) is 0x80020009 (hex), which is DISP_E_EXCEPTION ("Exception occurred"), a generic COM error.
You need a full COM error handler to get more detailed information about the error.
Unfortunately, this is a bit complex because of the way the Word UDF handles COM errors.

#include <Word.au3>
#include <MsgBoxConstants.au3>
#include <StringConstants.au3> ; defines $STR_STRIPTRAILING used in the error handler
Global $oError = ObjEvent("AutoIt.Error", "__Word_COMErrFuncEX")

; Add your code to find/replace text here by calling the modified _Word_DocFindReplaceEX function!

Exit

; #FUNCTION# ====================================================================================================================
; Author ........: water (based on the Word UDF written by Bob Anthony)
; Modified ......:
; ===============================================================================================================================
Func _Word_DocFindReplaceEX($oDoc, $sFindText = Default, $sReplaceWith = Default, $iReplace = Default, $vSearchRange = Default, $bMatchCase = Default, $bMatchWholeWord = Default, $bMatchWildcards = Default, $bMatchSoundsLike = Default, $bMatchAllWordForms = Default, $bForward = Default, $iWrap = Default, $bFormat = Default)
    If $sFindText = Default Then $sFindText = ""
    If $sReplaceWith = Default Then $sReplaceWith = ""
    If $iReplace = Default Then $iReplace = $WdReplaceAll
    If $vSearchRange = Default Then $vSearchRange = 0
    If $bMatchCase = Default Then $bMatchCase = False
    If $bMatchWholeWord = Default Then $bMatchWholeWord = False
    If $bMatchWildcards = Default Then $bMatchWildcards = False
    If $bMatchSoundsLike = Default Then $bMatchSoundsLike = False
    If $bMatchAllWordForms = Default Then $bMatchAllWordForms = False
    If $bForward = Default Then $bForward = True
    If $iWrap = Default Then $iWrap = $WdFindContinue
    If $bFormat = Default Then $bFormat = False
    If Not IsObj($oDoc) Then Return SetError(1, 0, 0)
    Switch $vSearchRange
        Case -1
            $vSearchRange = $oDoc.Application.Selection.Range
        Case 0
            $vSearchRange = $oDoc.Range()
        Case Else
            If Not IsObj($vSearchRange) Then Return SetError(2, 0, 0)
    EndSwitch
    Local $oFind = $vSearchRange.Find
    $oFind.ClearFormatting()
    $oFind.Replacement.ClearFormatting()
    Local $bReturn = $oFind.Execute($sFindText, $bMatchCase, $bMatchWholeWord, $bMatchWildcards, $bMatchSoundsLike, _
            $bMatchAllWordForms, $bForward, $iWrap, $bFormat, $sReplaceWith, $iReplace)
    If @error Or Not $bReturn Then Return SetError(3, @error, 0)
    Return 1
EndFunc   ;==>_Word_DocFindReplaceEX

; Shows the details of the COM error object in a message box
Func __Word_COMErrFuncEX()
    Local $sHexNumber = Hex($oError.number, 8)
    Local $sError = "COM Error Encountered in " & @ScriptName & @CRLF & _
            "@AutoItVersion = " & @AutoItVersion & @CRLF & _
            "@AutoItX64 = " & @AutoItX64 & @CRLF & _
            "@Compiled = " & @Compiled & @CRLF & _
            "@OSArch = " & @OSArch & @CRLF & _
            "@OSVersion = " & @OSVersion & @CRLF & _
            "Scriptline = " & $oError.scriptline & @CRLF & _
            "NumberHex = 0x" & $sHexNumber & @CRLF & _
            "Number = " & $oError.number & @CRLF & _
            "WinDescription = " & StringStripWS($oError.WinDescription, $STR_STRIPTRAILING) & @CRLF & _
            "Description = " & StringStripWS($oError.description, $STR_STRIPTRAILING) & @CRLF & _
            "Source = " & $oError.Source & @CRLF & _
            "HelpFile = " & $oError.HelpFile & @CRLF & _
            "HelpContext = " & $oError.HelpContext & @CRLF & _
            "LastDllError = " & $oError.LastDllError
    MsgBox($MB_ICONERROR, "Debug Info", $sError)
EndFunc   ;==>__Word_COMErrFuncEX
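
As a rough sketch, the placeholder comment in the script above could be replaced with something like the following. It is not self-contained: it relies on the _Word_DocFindReplaceEX function and the COM error handler defined above. The file location (test.docx next to the script) is an assumption; the pattern is the failing one from the earlier post.

```autoit
; Sketch: paste in place of the "Add your code ..." placeholder line.
; Assumption: test.docx is located in the script directory.
Local $oWord = _Word_Create()
Local $oDoc = _Word_DocOpen($oWord, @ScriptDir & "\test.docx")
If @error Then
    ConsoleWrite("Could not open the document, @error=" & @error & @CRLF)
    _Word_Quit($oWord)
    Exit
EndIf

; 8th parameter True = treat the search text as a Word wildcard pattern
_Word_DocFindReplaceEX($oDoc, "([0-9]*)[ ]{2,}", "\1°", $WdReplaceAll, 0, False, False, True)
; Save both values at once, because the next function call resets them
Local $iErr = @error, $iExt = @extended
If $iErr Then ConsoleWrite("@error=" & $iErr & ", @extended=0x" & Hex($iExt, 8) & @CRLF)

_Word_DocClose($oDoc) ; closes without saving by default
_Word_Quit($oWord)
```

If Find.Execute raises a COM error, the handler should show its message box with the detailed description before the ConsoleWrite line runs.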

 


 


I get the impression that MS Word does not fully support regular expressions (https://vlasovstudio.com/regent/documentation/Microsoft-Word-Wildcards-as-Regular-Expressions.html).
Unfortunately, I'm not familiar with the wildcards supported by MS Word, but I suggest trying the pattern in MS Word itself before using _Word_DocFindReplace.


 


  • Solution
4 hours ago, frank10 said:

Yes, they say that you can use {2,}, but in fact it doesn't work...

 

5 hours ago, frank10 said:

But since * means 0 or more, it also changes numbers followed by a single space. (Strangely, it should also change a number followed by no space, but it doesn't.)

So did you try this?
"([0-9]*)[ ][ ]*"


3 hours ago, mikell said:

 

So did you try this?
"([0-9]*)[ ][ ]*"

Thank you mikell, good catch!

The only issue is that with your pattern I also get:

123a                        --> no change
123 aa                      --> no change
123  aaa                    --> ok
123   aaa                   --> ok
wordA 123 wordB   wordC     --> NOT ok: it also changes wordB   wordC

With this one instead:

"([0-9])[ ][ ]*"

it's perfect!

123a                        --> no change
123 aa                      --> no change
123  aaa                    --> ok
123   aaa                   --> ok
wordA 123 wordB   wordC     --> no change

 

Good workaround.
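
For reference, a minimal sketch of a complete script using the final pattern might look like this. The file location and the replacement string "\1 " (the digit followed by a single space) are assumptions, since the thread does not show the exact replacement used with this pattern. Note that in Word's wildcard syntax * appears to stand for "any string of characters" rather than "zero or more of the previous item", which would explain the earlier results, so a run of more than two spaces may need more than one pass.

```autoit
#include <Word.au3>

; Sketch only: the file path and the "\1 " replacement are assumptions.
Local $oWord = _Word_Create()
If @error Then Exit 1
Local $oDoc = _Word_DocOpen($oWord, @ScriptDir & "\test.docx")
If @error Then
    _Word_Quit($oWord)
    Exit 2
EndIf

; Pattern from the thread; the 8th parameter True enables Word wildcards.
_Word_DocFindReplace($oDoc, "([0-9])[ ][ ]*", "\1 ", $WdReplaceAll, 0, False, False, True)
; Save both values at once, because the next function call resets them
Local $iErr = @error, $iExt = @extended
If $iErr Then ConsoleWrite("Nothing replaced or Find failed: @error=" & $iErr & ", @extended=" & $iExt & @CRLF)

_Word_DocSave($oDoc)
_Word_DocClose($oDoc)
_Word_Quit($oWord)
```

Named constants such as $WdReplaceAll and True/False are used here instead of the bare 2, 0 and 1 of the original call, which makes it easier to see which parameter switches wildcards on.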

Edited by frank10
