Regex: how to exclude a word from the found string


Hello everyone

I'm trying to perform some regex find/replace on an XML file in which only some tag pairs must be affected, and the clue to know which tag pairs must be affected comes *after* the tag pair.

Here's a simplified example of my text:

<target>some text</target> some text <target>some text </target> some text <target>some text</target> <reference> some text <target> some text </target>some text

I want to replace "<target>some text</target> <reference>" with "<target>{{some text}}</target> <reference>".  But I can't figure out how to tell the regex not to include all the other "target"s that come before the text that I really want to match.

My limited regex knowledge tells me that this ought to work:

"(<target>)(.+?)^(<target>)(</target> <reference>)", "$1{{$2}}$4"

...but it doesn't.  If I omit the ^(<target>), then AutoIt matches everything from the first <target> all the way up to the <reference>.

Can this be done?

Thanks

Samuel

 


By the way, I solved my problem by using StringReverse:

#include <MsgBoxConstants.au3>
$foo = "<target>some text</target> some text <target>some text </target> some text <target>some text</target> <reference> some text <target> some text </target>some text"
$foofoo = StringReverse ($foo)
$bar = StringRegExpReplace ($foofoo, "(>ecnerefer< >tegrat/<)(txet emos)(>tegrat<)", "$1}}$2{{$3")
$barbar = StringReverse ($bar)
MsgBox (0, "", $barbar, 0)

...but I'm thinking that this must be possible using regex instead of a kludge.


 

$foo = "<target>some text</target> some text <target>some text </target> some text <target>some text</target> <reference> some text <target> some text </target>some text"
msgbox(0, '' , stringreplace($foo , "<target>some text</target> <reference>" , "<target>{{some text}}</target> <reference>"))

 

Edited by iamtheky



Thanks, but "some text" in my example is a placeholder for some text.  I don't know in advance what that text is going to be.


leuce,

You need to make the RegEx non-greedy - then it looks for the shortest match:

$sText = "<target>some text</target> some text <target>some text </target> some text <target>some text</target> <reference> some text <target> some text </target>some text"

$sNewText = StringRegExpReplace($sText, "(?U)(<target>)(some text)(</target> <reference>)", "$1{{$2}}$3")

ConsoleWrite($sNewText & @CRLF)

M23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

My UDFs:

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ---------- Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 


Thanks, Melba23, but "some text" in my example is just some text.  So I'd have to update your suggestion with something like .+?, i.e. "anything".  But this:

"(?U)(<target>)(.+)(</target> <reference>)", "$1{{$2}}$3")

yields this:

<target>{{some text</target> some text <target>some text </target> some text <target>some text}}</target> <reference> some text <target> some text </target>some text

instead of this:

<target>some text</target> some text <target>some text </target> some text <target>{{some text}}</target> <reference> some text <target> some text </target>some text
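A note on why the lazy quantifier does not help here: the regex engine reports the leftmost possible match first, and laziness only decides how far that match extends once its starting point is fixed. A match can start at the very first <target>, so it does, and the lazy group then grows until it reaches the first "</target> <reference>". A minimal sketch, using a shortened made-up sample string (the variable names are only for illustration):

```autoit
; Shortened sample: two <target> pairs, only the second is followed by <reference>
Local $sSample = "<target>A</target> x <target>B</target> <reference>"

; Lazy group, no (?U). Flag 1 returns an array holding the captured groups of the first match.
Local $aMatch = StringRegExp($sSample, "<target>(.+?)</target> <reference>", 1)

If Not @error Then
    ; The capture starts right after the FIRST <target>, so it spans both pairs
    ; rather than containing only "B"
    ConsoleWrite($aMatch[0] & @CRLF)
EndIf
```

So the pattern has to be told what the captured text may not contain; making it lazy is not enough on its own.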

 


leuce,

I see the problem, but I am afraid I have no idea how to solve it. We will need to wait for one of the gurus to pass by.

M23


leuce,

Having just said that, this seems to work:

$sText = "<target> some text </target> some text <target> some text </target> some text <target>some text</target> <reference> some text <target> some text </target>some text"

$sNewText = StringRegExpReplace($sText, "(?U)(<target>)([^<]*)(</target> <reference>)", "$1{{$2}}$3")

ConsoleWrite($stext & @CRLF & $sNewText & @CRLF)

The RegEx is told not to allow any "<" characters in the captured text. A match therefore cannot run across an earlier </target> tag, so only the <target> pair that sits directly before <reference> is matched. This will not work if the text itself contains any "<" characters.
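If the text between the tags can itself contain "<" (inline markup, for example), one possible variation is to forbid only the literal word "<target>" inside the capture by putting a negative lookahead in front of every character. This is a sketch rather than a tested fix for the real file: the sample string is made up, and it assumes the text never spans a line break (by default "." does not match newlines, so (?s) would be needed for that case).

```autoit
; Made-up sample: the wanted text contains other tags, so [^<]* would not match it
Local $sText = "<target>a</target> x <target>some <b>bold</b> text</target> <reference> y"

; (?:(?!<target>).)*? consumes one character at a time, but only while the
; upcoming characters are not the start of another "<target>"
Local $sPattern = "(<target>)((?:(?!<target>).)*?)(</target> <reference>)"

Local $sNewText = StringRegExpReplace($sText, $sPattern, "$1{{$2}}$3")

; Only the pair directly before <reference> should end up wrapped in {{ }}
ConsoleWrite($sText & @CRLF & $sNewText & @CRLF)
```

An attempt that starts at an earlier <target> fails because the capture is not allowed to step over the next "<target>", so the engine moves on until it reaches the pair that is followed by <reference>.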

M23
