StringRegExp - replace every n-th occurrence of a char


Hello,

I have a string that contains, for example, the char A many times. I want to replace every n-th occurrence of that char with another char. Is that possible using just regular expressions?

The substrings around the char A might have different lengths. My current approach is splitting the string and looping through it.

Ha, regex fun! :) One way:

$string1 = "xyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyz"
$string2 = "AxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzAxyzA"

doit($string1, "A", "!", 3) ; replace every third
doit($string2, "A", "!", 1) ; replace all

Func doit($string, $char, $replaceby, $n)
    MsgBox(0, 0, "Result: " & StringRegExpReplace($string, "((.*?" & $char & "){" & ($n - 1) & "}.*?)" & $char, "$1" & $replaceby))
EndFunc   ;==>doit

NOTE: This breaks if you supply regex metacharacters to the function, you'd have to safeguard against this if necessary.

Roses are FF0000, violets are 0000FF... All my base are belong to you.

Replacing occurrence $occur of $out by $in:

$string = "xyz?xyz?xyz?xyz?xyz?xyz?xyz?xyz"
Msgbox(0,"", StringReplaceOcc($string, "?", "00", 4) )

Func StringReplaceOcc($string, $out, $in, $occur)
  Return StringRegExpReplace ($string, '(?s)^(?:.*?\K\Q' & $out & '\E){' & $occur & '}' , $in )  
EndFunc

Edit: works with special chars

Edited by mikell

Special characters in the search string, sure... But not in the replace string ;) Try replacing stuff with "\" or "$1".

But pre- and suffixing \Q and \E is useful of course. Also your regex is much more elegant, as always :)

Edited by SadBunny

You guys are awesome - this is how I did it earlier:

Func _String_ReplaceOccurence(Const $_c_sStrng, Const $_c_iOccrnc, Const $_c_sChr, Const $_c_sRplc)
        ;|------------------------------------------|
        ;| Replace occurrence of substring          |
        ;|------------------------------------------|
        ;| $_c_sStrng = String to process           |
        ;| $_c_iOccrnc = n-th occurrence to replace |
        ;| $_c_sChr = Char to replace               |
        ;| $_c_sRplc = Char to insert               |
        ;|------------------------------------------|

        ;|------------------------------------------------------------------------------------------------------------|
        ;|--------------------------------------------- Local variables ----------------------------------------------|
        ;|------------------------------------------------------------------------------------------------------------|
        Local $_l_aString = StringSplit($_c_sStrng, $_c_sChr, 3)
        Local $_l_iLoop, $_l_String, $_l_iOccrnc = $_c_iOccrnc < 1 ? 1 : $_c_iOccrnc

        ;|------------------------------------------------------------------------------------------------------------|
        ;|-------------------------------------------- Loop through splits -------------------------------------------|
        ;|------------------------------------------------------------------------------------------------------------|
        For $_l_iLoop = 0 To UBound($_l_aString) - 1
            $_l_String &= $_l_aString[$_l_iLoop]
            If $_l_iLoop < UBound($_l_aString) - 1 Then $_l_String &= (Mod($_l_iLoop + 1, $_l_iOccrnc) = 0 ? $_c_sRplc : $_c_sChr)
        Next

        Return $_l_String
    EndFunc   ;==>_String_ReplaceOccurence

That works fine too. (But wow, them be some phat comments :) )

My version seems very slightly faster than mikell's version (of course tested without the tralala addition): on my machine, an average of 7 seconds versus 7.5 seconds for a million tries per attempt. If you remove the \Q \E, which makes the fight fairer, mikell's version comes in at 6.5 seconds. Adding the tralala trick of course makes his much slower (it takes roughly twice as long), but is well worth the time for safety's sake. Your own version clocks in at about 47 seconds.

But hey, that's for a million tries, so for normal use the difference is negligible. No one is saying that you should use the one or the other. Whatever makes the most sense to you :)
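
For anyone who wants to repeat the comparison, here is a rough timing sketch. It is not the harness used for the numbers above: the helper name _TimeIt, the run count and the test string are assumptions made for illustration, and only mikell's StringReplaceOcc is shown so the snippet runs on its own.

```autoit
; Rough timing sketch using the built-in TimerInit/TimerDiff.
; _TimeIt and its parameters are hypothetical, not taken from the thread.
Local $sTest = "xyz?xyz?xyz?xyz?xyz?xyz?xyz?xyz"
Local $iRuns = 100000 ; arbitrary; raise it for more stable numbers

ConsoleWrite("StringReplaceOcc: " & _TimeIt($sTest, $iRuns) & " s" & @CRLF)

Func _TimeIt($sString, $iCount)
    Local $hTimer = TimerInit()
    For $i = 1 To $iCount
        StringReplaceOcc($sString, "?", "00", 4)
    Next
    ; TimerDiff returns milliseconds; convert to seconds
    Return Round(TimerDiff($hTimer) / 1000, 2)
EndFunc   ;==>_TimeIt

; mikell's function from earlier in the thread, copied so the sketch is self-contained
Func StringReplaceOcc($string, $out, $in, $occur)
    Return StringRegExpReplace($string, '(?s)^(?:.*?\K\Q' & $out & '\E){' & $occur & '}', $in)
EndFunc   ;==>StringReplaceOcc
```

Swap the call inside the loop for any of the other functions to compare them on the same input.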


@mikell: nice expression. You can do it in one go by just escaping the backslash and the $.

The help page for StringRegExpReplace says:

If a "\" needs to be in the replaced string it must be doubled. This is a consequence of the back-references mechanism.
The "\" and "$" replacement formats are the only valid back-references formats supported.

$string = "xyz?xyz?xyz?xyz?xyz?xyz?xyz?xyz"

Msgbox(0,"", StringReplaceOcc($string, "?", "0^+*\.$0", 4) )

Func StringReplaceOcc($string, $out, $in, $occur)
    Return StringRegExpReplace ($string, '(?s)^(?:.*?\K\Q' & $out & '\E){' & $occur & '}' , StringRegExpReplace($in, "([\\$])", "\\$1") )  
EndFunc
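
Note that StringReplaceOcc is anchored with ^, so it replaces only the single n-th occurrence, while the original question asked for every n-th one. As a minimal sketch, dropping the anchor should let the same pattern repeat across the whole string. The function name _ReplaceEveryNth and the guard clause are assumptions for illustration; the replacement escaping is reused from the post above rather than verified independently.

```autoit
; Hypothetical helper combining the ideas from this thread:
; replace EVERY $n-th occurrence of $sFind in $sText with $sReplace.
Func _ReplaceEveryNth($sText, $sFind, $sReplace, $n)
    ; Nothing sensible to do for an empty search string or a count below 1
    If StringLen($sFind) = 0 Or $n < 1 Then Return $sText

    ; Escape "\" and "$" so the replacement is taken literally (as in the post above)
    Local $sSafe = StringRegExpReplace($sReplace, "([\\$])", "\\$1")

    ; (?s)    lets "." also match line breaks
    ; \Q..\E  quotes the search text, so regex metacharacters are harmless
    ; \K      discards what was matched so far, leaving only the n-th hit in the match
    ; No ^ anchor: after each match the search resumes and counts n more hits
    Local $sPattern = '(?s)(?:.*?\K\Q' & $sFind & '\E){' & $n & '}'

    Return StringRegExpReplace($sText, $sPattern, $sSafe)
EndFunc   ;==>_ReplaceEveryNth

MsgBox(0, "", _ReplaceEveryNth("xyzAxyzAxyzAxyzAxyzAxyz", "A", "!", 3))
```

One caveat: \Q...\E quoting ends early if the search text itself contains \E, so that case would need separate handling.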