
StringRegExpReplace - can individual replacements be unique?


trailwalker

Hello all, I haven't done any AutoIt scripting in a while, but I can't find this particular question anywhere.

I have text files that the script will open and parse, using regexes to look for certain keywords. No problem. I need the replacements to be unique, because that will allow the script to go back, re-parse the document, and easily find specific instances of each of many keywords. Basically I want to add unique text combinations that act as "markers" inside the text to pinpoint a specific location. These can later be removed with another regex, leaving the text intact. So...

Is there a way to add a unique qualifier to the replacement text for each one using StringRegExpReplace? Even a simple sequential number added would suffice.

Or, I could use StringRegExp and get an array of matches and loop through them, but how can I go to the specific match in the body of original text and perform the replacement when there can be many matches that are identical and I need each replacement to be unique?

<match>text text text text text text <match> text text
text text text text text <match that is different> text
text text text <match still different> text text text
text <match>text text.

This is what I'm after:

<match><MARKER1>text text text text text
text <match><MARKER2> text text text text text
text text <match that is different><MARKER3> text
text text text <match still different><MARKER4> text
text text text <match><MARKER5>text text.

Thanks in advance.

JohnQSmith

Use a counter. Start each search from where the previous search left off, incrementing the counter after each replacement.
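
A minimal sketch of that idea, assuming StringRegExp's optional offset parameter is used to resume each search and that @extended holds the 1-based position just past the match (as the help file's loop example implies). The function name _MarkWithCounter and the <MARKERn> format are made up for illustration:

```autoit
; Hypothetical helper: appends <MARKER1>, <MARKER2>, ... after each match.
; Assumes $sPattern never matches an empty string (the loop would not advance).
Func _MarkWithCounter($sText, $sPattern)
    Local $iOffset = 1, $iNext, $iCount = 0, $sOut = ""
    While 1
        ; Flag 2 returns the full match; the 4th parameter is the start offset.
        StringRegExp($sText, $sPattern, 2, $iOffset)
        If @error Then ExitLoop ; no further match
        $iNext = @extended ; assumed: position just past the end of this match
        $iCount += 1
        ; Copy everything up to and including the match, then add the marker.
        $sOut &= StringMid($sText, $iOffset, $iNext - $iOffset) & "<MARKER" & $iCount & ">"
        $iOffset = $iNext
    WEnd
    ; Append whatever follows the last match.
    Return $sOut & StringMid($sText, $iOffset)
EndFunc

ConsoleWrite(_MarkWithCounter("<match>text <match that is different> text", "<match.*?>") & @LF)
```

Because the output is built from slices of the original string, identical matches never get confused with one another; each one is handled exactly once, in order.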


Whenever someone says "pls" because it's shorter than "please", I say "no" because it's shorter than "yes".

trailwalker

JohnQSmith wrote: "Use a counter. Start each search from where the previous search left off, incrementing the counter after each replacement."

Do you mean using the offset? I thought about that but it is based not on the matches but the number of characters from the left of the string. ... or am I not understanding your response correctly?

jchd

You could build something along this line:

Local $text = "<match>text text text text text text <match> text text text text text text text <match that is different> text " & _
    "text text text <match still different> text text text text <match>text text."
; $0 passes the whole matched text to MyFunc (the pattern has no capture group, so $1 would be empty)
ConsoleWrite(Execute('"' & StringRegExpReplace($text, "<match>|<match .*? different>", '" & MyFunc("$0") & "') & '"') & @LF)

Func MyFunc($s)
    Static $i = 0
    $i += 1
    ; Keep the original match and append a numbered marker
    Return $s & "<MYMARKER" & $i & ">"
EndFunc

Probably you'd change the static $i into something you can reset across different runs.
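
One way that could look, as a rough sketch: the optional $bReset parameter is an assumption added here for illustration, and this variant appends the marker to whatever match text it is given.

```autoit
; Hypothetical variant of MyFunc with a resettable counter.
; Call MyFunc("", True) before each new run to start numbering at 1 again.
Func MyFunc($s, $bReset = False)
    Static $i = 0
    If $bReset Then
        $i = 0
        Return ""
    EndIf
    $i += 1
    Return $s & "<MYMARKER" & $i & ">"
EndFunc

ConsoleWrite(MyFunc("<match>") & @LF) ; first marker of run one
MyFunc("", True)                      ; reset between runs
ConsoleWrite(MyFunc("<match>") & @LF) ; numbering starts over
```

Keeping the counter Static inside the function avoids a global variable; the reset call is the only extra step between runs.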


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

trailwalker

jchd: Thanks! This will be great and I can modify it to do just what I want. The only problem is that since the Execute command builds the string somehow 'on-the-fly', now it breaks when I connect it up to my text file, because it contains quotation marks. Is there a way around that?

JohnQSmith

trailwalker wrote: "Do you mean using the offset? I thought about that but it is based not on the matches but the number of characters from the left of the string. ... or am I not understanding your response correctly?"

Apparently I didn't understand the complete workings of the functions before I started commenting. I put the following together.

It creates an array holding the results of the RegEx just for counting purposes (how many results are there?). It then does a replacement using the same RegEx as before with the addition of a negative lookahead so it doesn't replace things it has already replaced.

I'm just having issues figuring out why my backreference isn't working and it's overwriting the finds instead of replacing them with themselves plus additional marker text.

$inputString = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse " & _
"sollicitudin velit eu enim imperdiet ultricies. Phasellus et facilisis mi. Sed " & _
"metus odio, condimentum a consequat id, fermentum a odio. Sed ornare viverra " & _
"nisi, et tincidunt ipsum venenatis pellentesque. Aliquam sapien erat, euismod a " & _
"ultrices et, gravida ut massa. Vestibulum semper, velit id pulvinar venenatis, " & _
"libero ipsum vestibulum massa, nec lobortis mauris lorem et tortor. Nullam " & _
"tortor magna, feugiat vel imperdiet id, egestas at eros. Aliquam dui purus, " & _
"sagittis quis malesuada sed, facilisis id dui. Ut eget quam eget urna fermentum " & _
"porttitor malesuada ut nibh. Integer lobortis libero arcu, feugiat pulvinar " & _
"velit. Ut eleifend orci nec leo tempor imperdiet. Donec pulvinar lectus ac quam " & _
"mollis porttitor."

$outputString = $inputString
$results = StringRegExp($inputString, "(?i:l[^s]*?s)", 3)
$count = 1
For $value In $results
   $outputString = StringRegExpReplace( $outputString, "(?i:l[^s]*?s)(?!<Marker\d{1,}>)", "\1<Marker" & $count & ">", 1)
   $count += 1
Next
MsgBox(0,"", "In: " & $inputString & @CRLF & @CRLF & "Out: " & $outputString & @CRLF)
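
One likely explanation for the backreference problem, for what it's worth: `(?i:...)` is a non-capturing group, so the pattern defines no group 1 for the replacement to refer to. A sketch of the same loop using `$0` (the whole match) instead, with a shortened sample string; the variable names are changed only for readability:

```autoit
Local $sInput = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
Local $sOutput = $sInput
; Flag 3 returns every match; the array is only used to count them.
Local $aResults = StringRegExp($sInput, "(?i:l[^s]*?s)", 3)
If Not @error Then
    For $i = 1 To UBound($aResults)
        ; $0 is the whole match; the lookahead skips matches already marked.
        $sOutput = StringRegExpReplace($sOutput, "(?i:l[^s]*?s)(?!<Marker\d+>)", "$0<Marker" & $i & ">", 1)
    Next
EndIf
ConsoleWrite($sOutput & @LF)
```

Wrapping the pattern in plain parentheses and keeping a group-1 backreference should work just as well; `$0` simply avoids touching the pattern.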

jchd

trailwalker,

Try escaping double quotes by doubling them before processing the text subject.

$text = StringReplace($text, '"', '""')

should be fine (untested)

Of course you need to perform the inverse operation afterwards.

trailwalker

That works like a charm, jchd. Since I will have a lot of these to run separately, I will try to wrap this into a little function that I can call for each set of matches I need to run and locators to place. It should be nice and clean. I'll post it later.

JohnQ, I ran into the same thing and forgot that in this AutoIt function the backreference is $1 instead of \1. I may have to go that route for part of this, but I want to avoid it if possible so that I don't overlook any matches where there may be two markers at the same location because the text was matched for two different reasons.

Malkey

Here are two methods that appear to work.

Local $sText = "<match>text text text text text text <match> text text " & _
        "text text text text text <match that is different> text " & _
        "text text text <match still different> text text text " & _
        "text <match>text text."
Local $sTextMod = '"' & StringRegExpReplace($sText, "(<match.*?>)", '" & MyFunc() & "') & '"'

;ConsoleWrite($sTextMod & @LF)
ConsoleWrite(Execute($sTextMod) & @LF)


Func MyFunc()
    Static $iCount = 0
    $iCount += 1
    Return "<match><MARKER" & $iCount & ">"
EndFunc   ;==>MyFunc

;=============== OR ============================================================================

Local $sText = "<match>text text text text text text <match> text text " & _
        "text text text text text <match that is different> text " & _
        "text text text <match still different> text text text " & _
        "text <match>text text."
Local $iCount = 0, $iMark = 1

While 1
    $iCount += 1
    $iMark = StringInStr($sText, "<match", 0, $iCount)
    If $iMark = 0 Then ExitLoop
    $sText = StringRegExpReplace($sText, "(.{0," & $iMark & "})(<match.*?>)", "\1<match><MARKER" & $iCount & ">", 1)
WEnd
ConsoleWrite($sText & @LF)

UEZ

Here is another method:

Local $sText = "<match>text text text text text text <match> text text " & _
        "text text text text text <match that is different> text " & _
        "text text text <match still different> text text text " & _
        "text <match>text text."

Global $sNew = StringRegExpReplace($sText, "<match.*?>", "<match><MARKER°°°>")
Global $c = 1

Do
    $sNew = StringReplace($sNew, "°°°", $c, 1)
    $c += 1
Until Not @extended

ConsoleWrite( $sNew & @LF)

Br,

UEZ

JohnQSmith

Nice one UEZ. Just needs a little tweak for the <match> backreference.
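
A sketch of what that tweak might look like, assuming the goal is to keep each original tag rather than rewrite it as a plain <match>. It uses `$0` (the whole match) in the replacement and stores @extended right after StringReplace; the shortened sample text and variable names are illustrative only:

```autoit
Local $sText = "<match>text <match that is different> text <match>text."
; $0 re-inserts whatever was matched, so differing tags are preserved.
Local $sNew = StringRegExpReplace($sText, "<match.*?>", "$0<MARKER°°°>")
Local $iCount = 0, $iReplaced
Do
    ; Replace only the first remaining placeholder with the next number.
    $sNew = StringReplace($sNew, "°°°", $iCount + 1, 1)
    $iReplaced = @extended ; number of replacements made (0 when none are left)
    $iCount += $iReplaced
Until $iReplaced = 0
ConsoleWrite($sNew & @LF)
```

As in the original, this relies on the placeholder "°°°" not occurring anywhere in the source text.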


trailwalker

Thanks to all. UEZ, this is a simple method I hadn't thought of! I'm off this for a couple of days and will post what I end up going with depending on what fits best in the various types of matches I will need to "mark" in the text using a reusable function.

trailwalker

Hi again, just posting the result and what I ended up doing for this, and it works great! Thanks to all of you who responded. This gets called a number of times and in different ways, so I wrapped the pieces up in separate functions that I can call with the desired options.

; example insertion into the text
$text = InsertTags($text, $regex, 0, 2)

Func InsertTags($text, $regexp, $bef_aft = 0, $return = 1)
    ; Double the quotes so the text survives being wrapped in a string literal for Execute
    $text = StringReplace($text, '"', '""')
    Switch $return
        Case 1 ; normal: pass capture group 1 from the regex
            $text = Execute('"' & StringRegExpReplace($text, $regexp, '" & MarkLocator("$1", $bef_aft) & "') & '"')
        Case 2 ; pass capture groups 1 & 7 from the regex
            $text = Execute('"' & StringRegExpReplace($text, $regexp, '" & MarkLocator("$1$7", $bef_aft) & "') & '"')
        Case 3 ; pass capture groups 1 & 3 from the regex
            $text = Execute('"' & StringRegExpReplace($text, $regexp, '" & MarkLocator("$1$3", $bef_aft) & "') & '"')
    EndSwitch
    $text = StringReplace($text, '""', '"')
    Return $text
EndFunc

Func MarkLocator($s, $loc = 0, $get = 0)
    Global Static $n = 0
    If $get = 0 Then
        $n += 1
        If $loc = 1 Then
            Return $s & "<!--X" & $n & "-->" ; marker after the match
        Else
            Return "<!--X" & $n & "-->" & $s ; marker before the match
        EndIf
    Else
        Return $n ; only report the current counter value
    EndIf
EndFunc
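
For completeness, a small sketch of the other half of the round trip: finding one specific marker again and stripping all of them once they are no longer needed. The sample string is invented; only the `<!--Xn-->` marker format comes from the code above.

```autoit
; Sample text as InsertTags might leave it (markers of the form <!--Xn-->).
Local $sMarked = 'some <!--X1-->keyword and another <!--X2-->keyword'

; Locate one specific marker, e.g. number 2 (StringInStr returns 0 if absent).
Local $iPos = StringInStr($sMarked, "<!--X2-->")
ConsoleWrite("Marker 2 starts at position " & $iPos & @LF)

; Strip every marker; \d+ matches the counter digits.
Local $sClean = StringRegExpReplace($sMarked, "<!--X\d+-->", "")
ConsoleWrite($sClean & @LF)
```

Searching for the full marker including the closing `-->` keeps marker 2 from being confused with markers 20, 21 and so on.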

jchd

Nice to see you now have working code.

