
StringRegExpReplace



Ok, I'm running into an issue with SRE-Replace.

I'm wanting to replace a specific character that isn't already in a word... I thought I had a quick fix with:

Local $sFindString = 'a'
Local $sString = 'a bat is a nice idea if you are playing baseball a'
$sString = StringRegExpReplace($sString, '(\w*\W*[^a-zA-Z0-9]+)' & $sFindString & '([^a-zA-Z0-9]+\W*\w*)', '!')
MsgBox(0, '', $sString)
But... that doesn't seem to work. Maybe the 3 hrs of sleep in the last 2 days is catching up with me, but does anyone have an idea how to replace only the 3 stand-alone a's? I purposely put an "a" at the beginning, at the end, and in the middle with a space on each side, as those are the scenarios I can see happening.

I've tried \n* ... \n ... \N ... \N? ... \N?* for the line feeds but nada. I know as soon as I see it I'll be peeved from fatigue (a poet and didn't know it).

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.


Hi SmOke_n;

It is a pain that the AutoIt RegExp engine does not use something to indicate "start of line" (^) and "end of line" ($). It might just come down to habit, though.

Anyhow maybe you can use the "start of word" and "end of word" indicators?

Local $sFindString = 'a'
Local $sString = 'a bat is a nice idea if you are playing baseball and have sun a'
; Original attempt, kept for reference:
;$sString = StringRegExpReplace($sString, '(\w*\W*[^a-zA-Z0-9]+)' & $sFindString & '([^a-zA-Z0-9]+\W*\w*)', '!')
; \< and \> are the start-of-word and end-of-word indicators
$sString = StringRegExpReplace($sString, '(\<' & $sFindString & '\>)', '!')
MsgBox(0, '', $sString)
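
A note for readers on a current AutoIt release: there the StringRegExp functions are built on PCRE, and \b is the usual word-boundary indicator. A minimal sketch under that assumption; the \Q...\E quoting around the search text is an addition for illustration and not part of the post above:

```autoit
; Sketch only: assumes a current AutoIt release with the PCRE-based StringRegExpReplace.
Local $sFindString = 'a'
Local $sString = 'a bat is a nice idea if you are playing baseball a'

; \b matches at a word boundary; \Q...\E makes the search text literal.
Local $sResult = StringRegExpReplace($sString, '\b\Q' & $sFindString & '\E\b', '!')

ConsoleWrite($sResult & @LF) ; ! bat is ! nice idea if you are playing baseball !
```

Only the three stand-alone a's are touched; the a's inside "bat", "idea", "are", "playing" and "baseball" have a word character on at least one side, so \b does not match there.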

Local $sFindString = '9a1'
Local $sString = '9a1 bat is 9a1 nice idea if you are playing baseball and have sun 9a1'

\d* or ([0-9]+) doesn't seem to work in that situation in front of \< or behind it. I guess special characters could throw it off as well. You could always StringReplace them first, but then you run into the problem that they are not "word" characters.
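
For later readers: a word-boundary test only helps when the search text itself begins and ends with a word character. For search text such as '"a', one option under PCRE is to spell out the boundary with lookarounds. A sketch, assuming a current AutoIt release; treating only letters and digits as word characters is an assumption borrowed from the character classes in the first post:

```autoit
; Sketch only: assumes the PCRE-based StringRegExpReplace of current AutoIt releases.
Local $sFindString = '"a'
Local $sString = '9a1 bat is "a" nice idea if you are playing baseball and have sun "a'

; (?<!...) and (?!...) check the neighbouring characters without consuming them,
; so the neighbours never become part of the text that gets replaced.
Local $sPattern = '(?<![a-zA-Z0-9])\Q' & $sFindString & '\E(?![a-zA-Z0-9])'

ConsoleWrite(StringRegExpReplace($sString, $sPattern, '!') & @LF)
```

Because the lookarounds consume nothing, two matches can share the same separator, which a pattern that captures the separators on both sides cannot do.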


Ugh!!

What a crappy work around

Local $sFindString = '9a1', $aWord = _StripNonWordChars($sFindString)
Local $sString = '9a1 bat is 9a1 nice idea if you are playing baseball and have sun 9a1'
$sString = StringRegExpReplace($sString, $aWord[1] & '(\<' & $aWord[2] & '\>)' & $aWord[3], '!')
MsgBox(0, '', $sString)

Local $sFindString = '"a"', $aWord = _StripNonWordChars($sFindString)
Local $sString = '9a1 bat is "a" nice idea if you are playing baseball and have sun 9a1'
$sString = StringRegExpReplace($sString, $aWord[1] & '(\<' & $aWord[2] & '\>)' & $aWord[3], '!')
MsgBox(0, '', $sString)

Local $sFindString = '"<a"', $aWord = _StripNonWordChars($sFindString)
Local $sString = '9a1 bat is "a" nice idea if you are playing baseball and have sun "<a"'
$sString = StringRegExpReplace($sString, $aWord[1] & '(\<' & $aWord[2] & '\>)' & $aWord[3], '!')
MsgBox(0, '', $sString)

Local $sFindString = '"a', $aWord = _StripNonWordChars($sFindString)
Local $sString = '9a1 bat is "a" nice idea if you are playing baseball and have sun "a'
$sString = StringRegExpReplace($sString, $aWord[1] & '(\<' & $aWord[2] & '\>)' & $aWord[3], '!')
MsgBox(0, '', $sString)

[_StripNonWordChars() UDF listing: not recoverable]

This will serve 1 of my needs, but not both... SRE is so powerful and a time saver... this is a shame :nuke:

Edit:

Blah... Scratch that!

I just thought of a scenario like 9a1a1, in which case $aWord[3] could be 1a1 when it should be 1. I would need a StringReverse to go backwards, then another to straighten it out... pfft... forget it, I'm looking at an even worse workaround.

Edit2:

Seem to have fixed the StripChar UDF... didn't need StringReverse, just reversed the loop :P

Edited by SmOke_N


Hi again,

This is how I would like to write this:

StringRegExpReplace($string, '(\W+|^\W*)(' & $sFindString & ')(\W+|\W*$)', '\1' & $replacement & '\3')

Func StringRegExpReplaceChar()
    Local $sFindString = 'a' ; NOTE: the replacement string should not conflict with the string to find.
    Local $sString = '9a1 bat is "a" nice idea if you are playing baseball and have sun 9a1'
    ; Replace at start of data
    $sString = StringRegExpReplace($sString, '^' & $sFindString & '(\W+)', '!\1', 1)
    ; Replace in middle of data
    $sString = StringRegExpReplace($sString, '(\W+)' & $sFindString & '(\W+)', '\1!\2', 0)
    ; Replace at end of data
    $sString = StringRegExpReplace($sString, '(\W+)' & $sFindString & '$', '\1!', 1)
    MsgBox(0, '', $sString)
EndFunc

Probably not an efficient use of the regexp engine, but what can you do :P

PS: I did not test your _StripNonWordChars UDF, but I understand it works for you, so the above is just a humble suggestion for later reference.
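
For later readers: the one-line pattern at the top of this post can be tried as written on a current AutoIt release. A sketch, assuming the PCRE-based engine; $sReplacement and the sample string are placeholders chosen for illustration:

```autoit
; Sketch only: assumes the PCRE-based StringRegExpReplace of current AutoIt releases.
Local $sFindString = 'a'
Local $sReplacement = '!' ; placeholder replacement text
Local $sString = 'a bat is a nice idea if you are playing baseball a'

; Groups 1 and 3 capture the neighbouring non-word characters (or nothing at the
; start/end of the string) and are written back around the replacement.
Local $sPattern = '(\W+|^\W*)(' & $sFindString & ')(\W+|\W*$)'

ConsoleWrite(StringRegExpReplace($sString, $sPattern, '\1' & $sReplacement & '\3') & @LF)
```

One limitation of this shape: group 3 consumes the separator after a match, so in a string such as 'a a' the second a has no separator left in front of it and is skipped.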


Thanks, I'll compare them in a bit :P


Took another hard look at the samples you provided in your workaround and had to rework my own code a bit to make all of the samples pass as I understand they should. Note only that I add spaces to replace the chars matching \W when $stripNonWordChars is passed to my StringRegExpReplaceChar function (hmm, a really stupid name, but every child needs one :P ). The first and last space should probably be stripped away.

Func testStringRegExpReplaceChar()
    Local $ret, $expect, $sFindString, $sString
    
    $sFindString = 'a'
    
    $sString = '9a1 bat is 9a1 nice idea if you are playing baseball and have sun 9a1'
    $expect = '9!1 bat is 9!1 nice idea if you are playing baseball and have sun 9!1'
    $ret = StringRegExpReplaceChar($sString, $sFindString, '!', 0)
    If $ret <> $expect Then MsgBox(16, "Test 1",'$ret:=' & $ret & @CRLF & "< == >" & @crlf & '$expect:=' & $expect) 
    
    $expect = ' ! bat is ! nice idea if you are playing baseball and have sun ! '
    $ret = StringRegExpReplaceChar($sString, $sFindString, ' ! ', 1) ;NOTE the extra spaces to replace \W chars
    If $ret <> $expect Then MsgBox(16, "Test 2",'$ret:=' & $ret & @CRLF & "< == >" & @crlf & '$expect:=' & $expect) 

    $expect = '9!1 bat is "!" nice idea if you are playing baseball and have sun "<!"'
    $sString = '9a1 bat is "a" nice idea if you are playing baseball and have sun "<a"'
    $ret = StringRegExpReplaceChar($sString, $sFindString, '!', 0)
    If $ret <> $expect Then MsgBox(16, "Test 3",'$ret:=' & $ret & @CRLF & "< == >" & @crlf & '$expect:=' & $expect) 
    
    $expect = ' ! bat is ! nice idea if you are playing baseball and have sun ! '
    $ret = StringRegExpReplaceChar($sString, $sFindString, ' ! ', 1) ;NOTE the extra spaces to replace \W chars
    If $ret <> $expect Then MsgBox(16, "Test 4",'$ret:=' & $ret & @CRLF & "< == >" & @crlf & '$expect:=' & $expect) 
    
    $sString = '9a1 bat is "a" nice idea if you are playing baseball and have sun "a'
    $expect = '9!1 bat is "!" nice idea if you are playing baseball and have sun "!'
    $ret = StringRegExpReplaceChar($sString, $sFindString, '!', 0)
    If $ret <> $expect Then MsgBox(16, "Test 5",'$ret:=' & $ret & @CRLF & "< == >" & @crlf & '$expect:=' & $expect) 
    
    $expect = ' ! bat is ! nice idea if you are playing baseball and have sun ! '
    $ret = StringRegExpReplaceChar($sString, $sFindString, ' ! ', 1) ;NOTE the extra spaces to replace \W chars
    If $ret <> $expect Then MsgBox(16, "Test 6",'$ret:=' & $ret & @CRLF & "< == >" & @crlf & '$expect:=' & $expect)     
    
    ;SmOke_N: this was your second test in the workaround
    $sString = '9a1 bat is "a" nice idea if you are playing baseball and have sun 9a1'
    $expect = '9!1 bat is "!" nice idea if you are playing baseball and have sun 9!1'
    $ret = StringRegExpReplaceChar($sString, $sFindString, '!', 0)
    If $ret <> $expect Then MsgBox(16, "Test 7",'$ret:=' & $ret & @CRLF & "< == >" & @crlf & '$expect:=' & $expect) 
    
    $expect = ' ! bat is ! nice idea if you are playing baseball and have sun ! '
    $ret = StringRegExpReplaceChar($sString, $sFindString, ' ! ', 1) ;NOTE the extra spaces to replace \W chars
    If $ret <> $expect Then MsgBox(16, "Test 8",'$ret:=' & $ret & @CRLF & "< == >" & @crlf & '$expect:=' & $expect)     
    
    $sString = 'a bat is "a" nice idea if you are playing baseball and have sun 9a1'
    $expect = '! bat is "!" nice idea if you are playing baseball and have sun 9!1'
    $ret = StringRegExpReplaceChar($sString, $sFindString, '!', 0)
    If $ret <> $expect Then MsgBox(16, "Test 7",'$ret:=' & $ret & @CRLF & "< == >" & @crlf & '$expect:=' & $expect) 
    
    $expect = ' ! bat is ! nice idea if you are playing baseball and have sun ! '
    $ret = StringRegExpReplaceChar($sString, $sFindString, ' ! ', 1) ;NOTE the extra spaces to replace \W chars
    If $ret <> $expect Then MsgBox(16, "Test 8",'$ret:=' & $ret & @CRLF & "< == >" & @crlf & '$expect:=' & $expect)     
    
    $sString = 'a bat is "a" nice idea if you are playing baseball and have sun a'
    $expect = '! bat is "!" nice idea if you are playing baseball and have sun !'
    $ret = StringRegExpReplaceChar($sString, $sFindString, '!', 0)
    If $ret <> $expect Then MsgBox(16, "Test 9",'$ret:=' & $ret & @CRLF & "< == >" & @crlf & '$expect:=' & $expect) 
    
    $expect = ' ! bat is ! nice idea if you are playing baseball and have sun ! '
    $ret = StringRegExpReplaceChar($sString, $sFindString, ' ! ', 1) ;NOTE the extra spaces to replace \W chars
    If $ret <> $expect Then MsgBox(16, "Test 10",'$ret:=' & $ret & @CRLF & "< == >" & @crlf & '$expect:=' & $expect)        
    
    ConsoleWrite("All done!" & @LF)
EndFunc

Func StringRegExpReplaceChar($data, $find, $replace, $stripNonWordChars = 0)
    If Not $stripNonWordChars Then
        ; Keep the surrounding non-word characters by writing the captured groups back.
        $data = StringRegExpReplace($data, '^' & $find & '(\W+)', $replace & '\1', 1) ; $find at the very start of the string (to cover Test 7 and 9)
        $data = StringRegExpReplace($data, '^(\W*)' & $find & '(\W+)', '\1' & $replace & '\2', 1) ; NOTE: (\W*) will fail \2
        $replace = '\1' & $replace & '\2'
    Else
        ; Drop the surrounding non-word characters together with the match.
        $data = StringRegExpReplace($data, '^' & $find & '(\W+)', $replace, 1) ; to cover Test 7 and 9
        $data = StringRegExpReplace($data, '^(\W*)' & $find & '(\W+)', $replace, 1) ; NOTE: (\W*) will fail \2
    EndIf

    $data = StringRegExpReplace($data, '(\W+)' & $find & '(\W+)', $replace, 0) ; every occurrence in the middle
    $data = StringRegExpReplace($data, '(\W+)' & $find & '(\W*)$', $replace, 1) ; did not work with Test 5
    $data = StringRegExpReplace($data, '(\W+)' & $find & '$', $replace, 1) ; hmm, this should not make it better
    Return $data
EndFunc

Hi SmOke_n;

It is a pain that the AutoIt RegExp engine does not use something to indicate "start of line" (^) and "end of line" ($). It might just come down to habit, though.

Actually, they are in there. Did I not document them properly? I will make sure that they work in my next batch of testing, once I get all the repeating bugs fixed.

Uten: I will test your patterns with the others to make sure things are working the way they should when I have everything fixed.

Edited by Nutster

David Nuttall
Nuttall Computer Consulting

An Aquarius born during the Age of Aquarius

AutoIt allows me to re-invent the wheel so much faster.

I'm off to write a wizard, a wonderful wizard of odd...


Yes, I came from the Unix world of AWK, Perl, etc.

It was/is a transition to use AutoIt's regular expressions.

I was under the assumption that regular expressions were standard.

I know Perl has added its own extensions on top of regex.

The book on my desk was Jeffrey Friedl's "Mastering Regular Expressions".

I also miss vi's and Vim's regex. SciTE is more robust when programming AutoIt.

ViM


LOL, I keep reading this over and over trying to figure out the hidden meaning :P:nuke:


Looked at this briefly and changed your test code to this:

Local $sFindString = 'a'
Local $sString = 'a bat is a nice idea if you are playing baseball a'
$sString = StringRegExpReplace($sString, '(\<|\s)+' & $sFindString & '(\>|\s)+', '!')
MsgBox(0, '', $sString)

This gets the leading a and the one in the middle of the string, but does not replace the last one. Not sure why; maybe there is a problem with \>, which should represent the end of a word. I also tried $ for end of line, but that did not work either. Leaving work now; I'll look at it tonight if I have time.

Hope this helps,

-Don

"some people live for the rules, I live for exceptions"
Wallpaper Changer - Easily Change Your Windows Wallpaper


@SmOke_N,

You're right. I was in a hurry and posted and ran. Oops. I left out something in the middle.

I'm having trouble with copy/paste from my Opera. Sorry.

Anyway, it was Uten's reference to "^" and "$" that prompted my reply.

Edited by vim

Maybe it has been mentioned already, but if not, there is one point I think needs to be cleared up. ^ indicates the start of the string and $ indicates the end of the string. I keep seeing the word "line" used but that, IMO, is a bad use of terminology. A line is a string, but a string can be more than just a line. Regular expressions (in AutoIt) operate on strings, not on lines. If you need them to work based on a line, then naturally you will need to use StringSplit() on CR and/or LF to split the string up into lines. Otherwise the engine just sees a big string which may contain CR and/or LF.

Perhaps this information is already clear in the minds of the people discussing this. However, as I said, the terminology being used is "lines" when it needs to be "strings" since a string can contain more than one line and the two indicators in question operate on strings.
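
The string-versus-line distinction can be sketched as follows. This assumes a current AutoIt release with the PCRE-based engine, where the (?m) option is available; the sample text and variable names are made up for illustration:

```autoit
; Sketch only: assumes the PCRE-based StringRegExp functions of current AutoIt releases.
Local $sText = 'a one' & @CRLF & 'a two' & @CRLF & 'a three'

; By default ^ anchors at the start of the whole string,
; so only the "a" on the first line is replaced.
ConsoleWrite(StringRegExpReplace($sText, '^a', '!') & @CRLF)

; With the (?m) option, ^ also matches right after each newline inside the string.
ConsoleWrite(StringRegExpReplace($sText, '(?m)^a', '!') & @CRLF)

; The StringSplit() route described above: drop the CRs, split on LF,
; then treat every line as its own string.
Local $aLines = StringSplit(StringStripCR($sText), @LF)
For $i = 1 To $aLines[0] ; element 0 holds the number of lines
    ConsoleWrite(StringRegExpReplace($aLines[$i], '^a', '!') & @CRLF)
Next
```

The split-and-loop version needs no engine option at all, which makes it the safer choice when the behaviour of the installed engine is in doubt.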


You're right, @Valik.

I use line (and it is in my head) because the programs, I learned regexp in, split the data into lines for me. I will try to change my habit of writing line when I should write string.
