leuce

Regex: how to exclude a word from the found string

8 posts in this topic

Hello everyone

I'm trying to perform some regex find/replace on an XML file in which only some tag pairs must be affected, and the clue to know which tag pairs must be affected comes *after* the tag pair.

Here's a simplified example of my text:

<target>some text</target> some text <target>some text </target> some text <target>some text</target> <reference> some text <target> some text </target>some text

I want to replace "<target>some text</target> <reference>" with "<target>{{some text}}</target> <reference>".  But I can't figure out how to tell the regex not to include all the other "target"s that come before the text that I really want to match.

My limited regex knowledge tells me that this ought to work:

"(<target>)(.+?)^(<target>)(</target> <reference>)", "$1{{$2}}$4"

...but it doesn't.  If I omit the ^(<target>), then AutoIt matches everything from the first <target> all the way up to the <reference>.

Can this be done?

Thanks

Samuel

 




By the way, I solved my problem by using StringReverse:

#include <MsgBoxConstants.au3>
$foo = "<target>some text</target> some text <target>some text </target> some text <target>some text</target> <reference> some text <target> some text </target>some text"
$foofoo = StringReverse ($foo)
$bar = StringRegExpReplace ($foofoo, "(>ecnerefer< >tegrat/<)(txet emos)(>tegrat<)", "$1}}$2{{$3")
$barbar = StringReverse ($bar)
MsgBox (0, "", $barbar, 0)

...but I'm thinking that this must be possible using regex instead of a kludge.


#3 ·  Posted (edited)

 

$foo = "<target>some text</target> some text <target>some text </target> some text <target>some text</target> <reference> some text <target> some text </target>some text"
msgbox(0, '' , stringreplace($foo , "<target>some text</target> <reference>" , "<target>{{some text}}</target> <reference>"))

 

Edited by iamtheky

2 minutes ago, iamtheky said:

 

$foo = "<target>some text</target> some text <target>some text </target> some text <target>some text</target> <reference> some text <target> some text </target>some text"
msgbox(0, '' , stringreplace($foo , "<target>some text</target> <reference>" , "<target>{{some text}}</target> <reference>"))

 

Thanks, but "some text" in my example is a placeholder for some text.  I don't know in advance what that text is going to be.


leuce,

You need to make the RegEx non-greedy - then it looks for the shortest match:

$sText = "<target>some text</target> some text <target>some text </target> some text <target>some text</target> <reference> some text <target> some text </target>some text"

$sNewText = StringRegExpReplace($sText, "(?U)(<target>)(some text)(</target> <reference>)", "$1{{$2}}$3")

ConsoleWrite($sNewText & @CRLF)

M23
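
To illustrate what "non-greedy" changes, here is a minimal sketch. The sample string and variable names are made up for the demonstration and are not from the posts above. It assumes the standard StringRegExp flag 1, which returns an array with the captured groups of the first match.

```autoit
; Sketch: (?U) inverts greediness for the whole pattern, so a plain .+
; behaves like .+? (illustrative tags and variable names, not from the thread).
Local $sDemo = "<a>1</a><a>2</a>"

; Flag 1 returns an array holding the captured groups of the first match.
Local $aGreedy = StringRegExp($sDemo, "<a>(.+)</a>", 1)
Local $aLazy = StringRegExp($sDemo, "(?U)<a>(.+)</a>", 1)

ConsoleWrite("Greedy:   " & $aGreedy[0] & @CRLF) ; 1</a><a>2
ConsoleWrite("Ungreedy: " & $aLazy[0] & @CRLF)   ; 1
```

Because (?U) flips the default, writing .+? under (?U) makes that quantifier greedy again, so it is probably best to use either (?U) or the ? suffix, not both.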


8 minutes ago, Melba23 said:

leuce,

You need to make the RegEx non-greedy - then it looks for the shortest match:

$sText = "<target>some text</target> some text <target>some text </target> some text <target>some text</target> <reference> some text <target> some text </target>some text"

$sNewText = StringRegExpReplace($sText, "(?U)(<target>)(some text)(</target> <reference>)", "$1{{$2}}$3")

ConsoleWrite($sNewText & @CRLF)

M23

Thanks, Melba23, but "some text" in my example is just some text.  So I'd have to update your suggestion with something like .+?, i.e. "anything".  But this:

"(?U)(<target>)(.+)(</target> <reference>)", "$1{{$2}}$3")

yields this:

<target>{{some text</target> some text <target>some text </target> some text <target>some text}}</target> <reference> some text <target> some text </target>some text

instead of this:

<target>some text</target> some text <target>some text </target> some text <target>{{some text}}</target> <reference> some text <target> some text </target>some text
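
The result above follows from how the regex engine searches: a lazy quantifier only shortens the right-hand end of a match, while the start is still the leftmost position from which any match can succeed. A minimal sketch with a shortened, made-up sample string (A and B stand in for "some text"):

```autoit
; Sketch: laziness does not move the start of the match.
; Shortened, illustrative sample; A and B are placeholders.
Local $sText = "<target>A</target> x <target>B</target> <reference> y"

; Starting at the first <target>, .+? can still reach the only
; "</target> <reference>" by growing one character at a time,
; so the engine never gets to try the second <target>.
Local $sLazy = StringRegExpReplace($sText, "(<target>)(.+?)(</target> <reference>)", "$1{{$2}}$3")

ConsoleWrite($sLazy & @CRLF)
; Prints: <target>{{A</target> x <target>B}}</target> <reference> y
```

To get the intended result, the middle group has to be prevented from crossing another tag, not merely made lazy.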

 


leuce,

I see the problem, but I am afraid I have no idea how to solve it. We will need to wait for one of the gurus to pass by.

M23



leuce,

Having just said that, this seems to work:

$sText = "<target> some text </target> some text <target> some text </target> some text <target>some text</target> <reference> some text <target> some text </target>some text"

$sNewText = StringRegExpReplace($sText, "(?U)(<target>)([^<]*)(</target> <reference>)", "$1{{$2}}$3")

ConsoleWrite($sText & @CRLF & $sNewText & @CRLF)

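The RegEx is told not to allow any "<" characters in the captured text, so a match cannot run across an earlier "</target>" or "<target>" tag. That leaves only the tag pair immediately before "<reference>" as a possible match. The limitation is that it will not work if the text between the tags can itself contain a "<" character.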

M23
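
If the text between the tags may contain "<" (for example inline markup), one possible alternative is a negative lookahead, which is the usual way to "exclude a word" in PCRE-style regex. Note that ^ outside a character class is a start-of-string anchor, not a negation, which is why the ^(<target>) attempt in the first post cannot work. The following is a sketch only; the sample string with its <b> tag is invented for illustration, and it assumes the target text sits on one line (add (?s) if it can span line breaks).

```autoit
; Sketch: exclude the word "<target>" from the captured text with a
; negative lookahead. Sample string is illustrative, not from the thread.
Local $sText = "<target>one</target> x <target>two <b>bold</b></target> <reference> y"

; (?:(?!<target>).)*?  consumes one character at a time, but only while
; the next characters are not the start of another "<target>" tag.
Local $sPattern = "(<target>)((?:(?!<target>).)*?)(</target> <reference>)"

Local $sNewText = StringRegExpReplace($sText, $sPattern, "$1{{$2}}$3")

ConsoleWrite($sNewText & @CRLF)
; Prints: <target>one</target> x <target>{{two <b>bold</b>}}</target> <reference> y
```

A match that starts at the first <target> fails as soon as it reaches the second <target>, so the engine moves on and matches only the pair directly before <reference>. The [^<]* version is simpler and likely faster when the text is known to contain no "<".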

