fett8802

{SOLVED} Need help with StringRegExpReplace

31 posts in this topic

#1 ·  Posted (edited)

Hello All,

So I'm trying to generate an HTML file based on user input. It's an added feature to a program I've been creating for my workplace. The user will write their code like this:

For bolded text: *This is bolded text*

Now, what I need my function to do is replace ONLY the *'s that are directly next to something that isn't a blank space. It also needs to turn *T into '<b>T' and t* into 't</b>'. However, it should NOT replace an * all by itself.

Now, I don't need the function written (as it is going to be doing a LOT of substitution), but I've seen other people get help on StringRegExpReplace and it's always kind of confused me.

I didn't see an exact way to do it in the help file, and was hoping some of you more knowledgeable than me on this particular function could give me some help.

Thanks for your help!

-Fett

Edited by fett8802

[sub]My UDF[/sub][sub] - Basics and Time extensions. Great for those new at AutoIt, also contains some powerful time extensions for pros.[/sub][sub]ScrabbleIt[/sub][sub] - Scrabble done in pure AutoIt. (In Progress)[/sub][sub]Nerd Party Extreme | My Portfolio | [email="fett8802@gmail.com"]Contact Me[/email][/sub]



fett8802,

I am not very good at SREs, so this can surely be bettered: :x

$sText = "*This is * bolded* *As i*s this*"

$sText = StringRegExpReplace($sText, "( |\A)(\*)(\S)", "$1<b>$3")
$sText = StringRegExpReplace($sText, "(\S)(\*)( |\z)", "$1</b>$3")

ConsoleWrite($sText & @CRLF)

Explanation of first SRE:

( |\A) = Capturing group of either a space or the beginning of a string
(\*)   = Capturing group of a single "*"
(\S)   = Capturing group of any non-whitespace character

$1<b>$3 = First captured group followed by "<b>" and the third group

The second SRE is essentially the same, but with "end of string" and the order reversed. You can see they ignore a * that stands alone between spaces or sits in the middle of a word.
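
To see the two passes end to end, here is a minimal sketch that wraps them in a helper. The name _StarsToBold is made up for illustration and is not part of any UDF; the patterns are the two shown above, and the expected result in the last comment was traced by hand rather than taken from a run.

```autoit
; Hypothetical helper (the name is an assumption, not an existing function).
Func _StarsToBold($sText)
    ; Pass 1: "*" preceded by a space or the start of the string and followed by non-whitespace.
    $sText = StringRegExpReplace($sText, "( |\A)(\*)(\S)", "$1<b>$3")
    ; Pass 2: "*" preceded by non-whitespace and followed by a space or the end of the string.
    $sText = StringRegExpReplace($sText, "(\S)(\*)( |\z)", "$1</b>$3")
    Return $sText
EndFunc

ConsoleWrite(_StarsToBold("*This is * bolded* *As i*s this*") & @CRLF)
; Expected: <b>This is * bolded</b> <b>As i*s this</b>
```

The order of the passes does not matter much here, because neither replacement introduces a new asterisk for the other pass to trip over.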

No doubt a real SRE guru will be along in a minute to give you a sexy solution, but this does at least work! :P

M23


Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind._______My UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 


Thanks Melba!

So, (for my understanding) this string: "( |\A)(\*)(\S)"

is calling " *a" or " *f" etc. , not all " " and "*" and "d", but all three together in a row as a group of three because of the quotes?

No wonder those strings looked uber-confusing in other people's code. Yikes.

Anyway, thanks for the lesson Melba, it is much appreciated!

-Fett



fett8802,

Nothing to do with the quotes - they are just a required part of the SRER syntax. It is the juxtaposition of the 3 elements that is the key - it will only match when it finds all 3 together. :x
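
One way to convince yourself of this is to list what the pattern actually finds. The sketch below uses StringRegExp with flag 3 (return an array of all matches) and a non-capturing group so that each array entry is the full matched text. The test string is invented for the purpose, and the expected output is hand-traced.

```autoit
; Flag 3 returns every match; with no capturing groups each entry is the whole match.
Local $aHits = StringRegExp("*a * b* *c", "(?: |\A)\*\S", 3)
If @error Then
    ConsoleWrite("No match" & @CRLF)
Else
    For $i = 0 To UBound($aHits) - 1
        ConsoleWrite("[" & $aHits[$i] & "]" & @CRLF)
    Next
EndIf
; Should print [*a] and [ *c]. The lone "*" and the "*" right after "b"
; are skipped because one of the three required neighbours is missing.
```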

M23



Absolutely beautiful. That solves all of my problems. You have made my day!

Hopefully I can use SRE's more in the future.

-Fett



This can be tricky, but I'd gamble on the following:

StringRegExpReplace($s, '\*(\w+)\*', '<b>$1</b>')

For instance with the input

$s = 'this * *is* text but not all * ** *** are replaced even *this is not * and *neither* this *--* but that number *666* is (but not *1.23*)!'

you get

'this * <b>is</b> text but not all * ** *** are replaced even *this is not * and <b>neither</b> this *--* but that number <b>666</b> is (but not *1.23*)!'

You may want to play with multiline option and such, if necessary.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


#7 ·  Posted (edited)

I am finding it easier to learn some SRER by busting each step out individually first to make sure I've covered all the bases:

$string = "* *b *1 ** *B *0 *"


$out1 = StringRegExpReplace ($string , "\*" , "<b>")    ; changes all asterisks to <b> tags

msgbox (0, '' , $string & @CRLF & $out1)

$out2 = StringRegExpReplace ($out1 , "<b><b>" , "**") ; finds any doubled up <b> tags and changes them back to asterisks

msgbox (0, '' , $string & @CRLF & $out2)

$out3 = StringRegExpReplace ($out2 , "<b>\z" , " *") ; checks the end for <b> tag and changes it as there would be nothing after

msgbox (0, '' , $string & @CRLF & $out3)

$out4 = StringRegExpReplace ($out3 , "<b>\h" , "* ") ; checks for <b> tags with white space after and changes them back

msgbox (0, '' , $string & @CRLF & $out4)
Edited by iamtheky

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)


Ok, so I guess my question now (for learning) is, why does it need to be "$1<b>$3" and not just "</b>"



Explanation of my solution:

\* match an asterisk which must be immediately followed by ...

(\w+) one or more word characters, i.e. letters, digits or underscores (they are captured as $1), immediately followed by ...

\* another asterisk

The replace part is then obvious.

It would be easy to be more specific about what constitutes a word (look at how the pattern matches various numbers).
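
As one possible illustration of being more specific, the sketch below widens the captured part so it may contain spaces, which the original "*This is bolded text*" example would need. This variant is an assumption of mine, not the pattern given above: the captured text must start and end with a character that is neither "*" nor whitespace. The sample string is invented and the expected output is hand-traced.

```autoit
; Assumed variant: allow spaces inside the bold span, but not at its edges.
Local $sIn = "*This is bolded text* but 2 * 3 stays"
Local $sOut = StringRegExpReplace($sIn, "\*([^*\s](?:[^*]*[^*\s])?)\*", "<b>$1</b>")
ConsoleWrite($sOut & @CRLF)
; Expected: <b>This is bolded text</b> but 2 * 3 stays
```

The trade-off is that any two asterisks hugging text, however far apart on the line, now pair up, so whether this is preferable to \w+ depends on the input you expect.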



iamtheky, your second expression (the one that produces $out2) would be better as

$out2 = StringRegExpReplace($out1, "(?i)(<b>)\1+", "$1")

The next one should be

$out3 = StringRegExpReplace($out2, "(?i)(<b>)\s*$", "$1")

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


fett8802,

why does it need to be "$1<b>$3" and not just "</b>"

I assume that was addressed to me. :x

The SRER searches for "something", "*", "something". If we simply replaced all of that with just "</b>", we would lose the 2 "something"s on either side. So we made sure that we captured them (by putting brackets around them) and then we replace all 3 sections by the 2 "something"s with a new section in the middle. That sounds horribly complicated - but is actually reasonably simple if you take it slowly. :P
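
A small side-by-side may make this concrete. The input string is made up, and the results in the comments were worked out by hand: the pattern matches the three characters " *h", so whatever the replacement does not put back is gone.

```autoit
Local $sIn = "say *hi"
; Replacing the whole three-part match with "<b>" drops the space and the "h".
ConsoleWrite(StringRegExpReplace($sIn, "( |\A)(\*)(\S)", "<b>") & @CRLF)       ; say<b>i
; Putting groups 1 and 3 back keeps them and swaps only the "*".
ConsoleWrite(StringRegExpReplace($sIn, "( |\A)(\*)(\S)", "$1<b>$3") & @CRLF)   ; say <b>hi
```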

As I said earlier, I am not an expert on SREs by any means. I freely admit that I find them the hardest thing I have ever tried to learn in computing. So you should also look carefully at jchd's suggestions - he is far better at SREs than I ever will be. :shifty:

M23



Don't believe that one second. The actual reason is that I've lost all my hair trying RegExps while Melba has kept all of it. To me this guy is just lazy, hairy but lazy :x



The SRER searches for "something", "*", "something". If we simply replaced all of that with just "</b>", we would lose the 2 "something"s on either side. So we made sure that we captured them (by putting brackets around them) and then we replace all 3 sections by the 2 "something"s with a new section in the middle. That sounds horribly complicated - but is actually reasonably simple if you take it slowly. :x

That actually makes perfect sense. So, does the number to the right of the $ denote the "something" order?

Like, ( |\A)(\*)(\S) ----- ( |\A) is the $1, (\*) is the $2, and (\S) is the $3? So then I could put any length in $2 as long as I denote $1 and $3? That makes sense.

Thanks Melba!



fett8802,

does the number to the right of the $ denote the "something" order?

Not quite - it is the index of the capturing group (the ones in parentheses). You can also have non-capturing groups, perhaps to aid the location of the searched-for characters in the string, but obviously there is no way to reuse them later. :x
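
To show how the numbering follows the capturing groups only, here is a short sketch with an invented input. A group written as (?:...) still groups but is not counted, so the same character ends up under a different index. Both results were traced by hand.

```autoit
Local $sIn = "*bold"
; Three capturing groups: the character after "*" is group 3.
ConsoleWrite(StringRegExpReplace($sIn, "(\A)(\*)(\S)", "$1<b>$3") & @CRLF)   ; <b>bold
; (?:...) does not capture and the "*" is left ungrouped,
; so the character after "*" becomes group 1.
ConsoleWrite(StringRegExpReplace($sIn, "(?:\A)\*(\S)", "<b>$1") & @CRLF)     ; <b>bold
```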

Told you it was complex! :P

M23



As I said earlier, I am not an expert on SREs by any means. I freely admit that I find them the hardest thing I have ever tried to learn in computing.

Excuses, excuses. jchd has the right pattern and I'm sure you know how to test it by now. New version released today.

For a beginner you were at least on the right track but as jchd pointed out it can be done with a single SRER instead of two.


George


George,

Downloaded thanks. :x

I agree with you (as nearly always :P ) - as soon as I saw jchd's version I knew it was the way to go. But I may as well use my feeble efforts as learning points. After all you guys have forgotten what it was like when you started - I still have the fresh scar tissue on my frontal lobes! :shifty:

M23



We all have that scar. I actually don't even like SREs but in the long run it's well worth the effort to learn them. I hated them in other languages as well but I finally saw that they were by far the better way to accomplish some tasks and usually save several lines of code in the process.


George


#18 ·  Posted (edited)

I need to quit but I'll expand a bit more tomorrow for those interested.

Edited by jchd


Thanks GEO. I am enjoying these threads that gauge the vastness of my SRER ineptitude (it's hella-vast).



#20 ·  Posted (edited)

Does the AutoIt RegEx engine support positive look ahead and look behind? If so, you could use:

$out1 = StringRegExpReplace ($string , "\B\*(?=[^*\s])"  , "<b>" ) ; changes an asterisk to <b> when it is not preceded by a word character and is followed by something other than an asterisk or whitespace
$out1 = StringRegExpReplace ($out1   , "(?<=[^*\s])\*\B" , "</b>") ; changes an asterisk to </b> when it is preceded by something other than an asterisk or whitespace and is not followed by a word character

I haven't tried the above code, but the regex would work in other languages like Perl.

It won't convert "**", "hello*there", nor "* hello there *", but I don't know if that's a good or bad thing based on the requirements. :x
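
For anyone who wants to try it: AutoIt's regular expression functions are built on PCRE, which supports both lookahead and lookbehind, so the two patterns should be usable as written. The sketch below simply feeds them the test string from earlier in the thread. The result in the comment was traced by hand, not taken from a run, so treat it as a prediction.

```autoit
Local $sIn = "*This is * bolded* *As i*s this*"
; Opening tag: "*" with no word character before it and a character
; that is neither "*" nor whitespace after it (lookahead, not consumed).
Local $sOut = StringRegExpReplace($sIn, "\B\*(?=[^*\s])", "<b>")
; Closing tag: the mirror image, using a lookbehind.
$sOut = StringRegExpReplace($sOut, "(?<=[^*\s])\*\B", "</b>")
ConsoleWrite($sOut & @CRLF)
; Traced by hand: <b>This is * bolded</b> <b>As i*s this</b>
```

Because the lookarounds only inspect the neighbouring characters without consuming them, no capturing groups or $1/$3 back-references are needed in the replacement.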

Edited by GPinzone

Gerard J. Pinzone, gpinzone AT yahoo.com
