
Can I add on a back reference to StringRegExp to select that group only?


Guy_

I'm trying to dive a bit deeper into back references, but I have only used them successfully in StringRegExpReplace before.

I'm now wondering if this too is conceptually possible...? (or the best way to do something similar from just one pattern)

I would have a user defined regex pattern that neatly selects ID and Product name, tagged as named groups <id> and <product>.

Can I in the program catch just the group I want into a variable by adding something behind the user RegExp pattern?

I was hoping to add a "forget everything, just give me this group," e.g. by means of adding '\K{id}', '\g{id}' etc. (tried many variations).

Local $sText, $sUserRegx, $a_IDs, $a_Products

$sText = "ID:1000" & @CRLF & "Computer"

; \R matches the whole CRLF pair; a single \v would only consume the CR
$sUserRegx = "^ID:(?<id>\d{4})\R(?<product>.*)"

; pseudo code: the bracketed part is what I'm looking for

; $a_IDs      = StringRegExp($sText, $sUserRegx & [GIVE ME <id> ONLY], 3)
; $a_Products = StringRegExp($sText, $sUserRegx & [GIVE ME <product> ONLY], 3)

Thanks!  :)

Edited by Guy_

Naming the captured patterns doesn't help here.

Local $sText, $sUserRegx, $aResult, $sID, $sProduct

$sText = "ID:1000" & @CRLF & "Computer"
$sUserRegx = "^ID:(\d+)\R(.+)"

$aResult = StringRegExp($sText, $sUserRegx, 3)
$sID = $aResult[0]
$sProduct = $aResult[1]

MsgBox(0, "Captured fields", "ID = " & $sID & @CRLF & "Product = " & $sProduct)

 

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


As usual, the question can never be clear enough... Sorry :)

The problem (I think) with your proposal is that when the Product comes before ID, the results are now reversed?

Therefore I want the ability to use tags... (so I maybe could avoid using 2 separate regexp patterns, or other shenanigans :) )

I feel it would be an elegant solution.

Are you 99% sure that what I proposed is not possible...?

So you cannot use something like "forget what went on before (like \K, which does keep named groups), just show me this group now" inside a StringRegExp?

Merci  :)

Edited by Guy_

I might be reasoning wrongly here...

I'll try to wrap my head around my own ideas some more first...  ;)

Esp. since, when the program gets the data, it might already be split into two different array rows...

But if I could capture the full thing in one row and then use the first proposed technique, that would work for me. I'm guessing, though, that the groups I specifically wanted captured cannot be kept in one row...


$sText = "Computer" & @CR & "ID:1000"
;~ $sText = "ID:1000" & @CR & "Computer"
$sUserRegx = "(ID:\d+)"



$aResult = execute('assign("sID" , StringTrimLeft(StringRegExp(StringstripCR($sText), $sUserRegx, 3)[0] , 3)) assign("Product" , StringRegExpReplace(StringstripCR($sText), $sUserRegx , ""))')

MsgBox(0, "Captured fields", "ID = " & eval("sID") & @CRLF & "Product = " & eval("Product"))

 

If there is going to be a hint like "ID:" in the string, capture it. Then you have the data you need to find your first value AND to find the remainder, which is the other value.

If the strings are going to be a mixture of designators, then you will probably be building a bunch of cases to handle that stuff.

 

Edited by iamtheky


Interesting code iamtheky, but I'll have to brush up on my Chinese for that!  ;)

I'm not sure it is what I'm asking.

The idea is that there might be hundreds of scenarios that need their specific pattern. Also, sometimes, the ID will not be there, so I cannot depend on an easy selection and the program has to know what was the provided one.

I just got a new idea that might work... Maybe I can manipulate the pattern of the first post so I do capture both bits in one row, and then apply my pattern on that row, but maybe using ...
 

StringRegExpReplace( that whole row, my pattern, named group <ID>)
StringRegExpReplace( that whole row, my pattern, named group <Product>)

 

Edited by Guy_

Yes, I think that idea of mine can work  :)

What I'll try is replacing both '?<id>' and '?<product>' with '?:' and putting parentheses around the whole pattern.

I'll now have both bits in one row, to which I'll add some surrounding Returns maybe.

Then I'll use the idea from the post above (with the original untouched pattern version).

Edited by Guy_

53 minutes ago, Guy_ said:

The idea is that there might be hundreds of scenarios that need their specific pattern. Also, sometimes, the ID will not be there, so I cannot depend on an easy selection and the program has to know what was the provided one.

I just got a new idea that might work... Maybe I can manipulate the pattern of the first post so I do capture both bits in one row, and then apply my pattern on that row, but maybe using ...

Hundreds of patterns can't be handled by just one, especially when you mention that some variable and unknown part of what you want to capture isn't there. How can "the program know what was the provided one"?

In the replace part of StringRegExpReplace, you can't refer to named captured patterns, only to capture numbers using $1, $2, $3 ... or \1, \2, \3, ... These are the only constructs available there.

I'm not saying that what you have to do isn't possible, regexp or not, but that you ought to make explicit, in plain English pseudocode, what you have to do in every possible case.
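
To make the numbered form concrete, here is a minimal sketch that reuses the sample text from earlier in the thread. The variable names are only for illustration. The pattern matches the whole string, so each replace call keeps just one numbered group:

```autoit
Local $sText = "ID:1000" & @CRLF & "Computer"
Local $sPattern = "^ID:(\d+)\R(.+)"

; $1 is the first capture group (the ID), $2 the second (the product)
Local $sID = StringRegExpReplace($sText, $sPattern, "$1")
Local $sProduct = StringRegExpReplace($sText, $sPattern, "$2")

ConsoleWrite("ID = " & $sID & @CRLF & "Product = " & $sProduct & @CRLF)
```

This only works while the pattern consumes the whole input; any text outside the match is left in place by the replace.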


You can use a look-ahead assertion to "revert" the capture order, for example :

#Include <Array.au3>

; Local $sText = "ID:1000" & @CRLF & "Computer"
Local $sText = "Computer" & @CRLF & "ID:1000"

Local $aItems = StringRegExp($sText, "(?|ID:(\d+)\R(\V+)|(?=\V+\RID:(\d+))(\V+))", 3)
_ArrayDisplay($aItems)

 

Edited by jguinch

"Hundreds of patterns can't be handled by just one..."
> I mean: imagine there are 1000 scenarios for, for example, an ID and Product name on a page.

Someone can make 1000 regex patterns for how to capture one or both of those.

I would love it if that could be done in just one pattern per scenario and the program could easily figure out what is what (by the tags).

"In the replace part of StringRegExpReplace, you can't refer to named captured patterns, "

> Aaargh... :(  So much info on named groups and then you can't really use them in a replace...
I felt I was almost there with this solution, but couldn't get the replace back refs to work...

Even though for example, I see (for Perl) ...

"$+{name} inserts the capture in the replacement string." (www.rexegg.com/regex-capture.html)

(For AutoIt we have to look in this one, I believe though?  http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html )

$sText = "ID:1000" & @CRLF & "Computer"

$sUserRegx = "^ID:(?<id>\d{4})\R(?<product>.*)"

; inside the code...

$sUserRegx_TEMP = '(?:' & StringRegExpReplace( $sUserRegx, '\?<id>|\?<product>', '?:') & ')'

$a_allBitsInOneRow = StringRegExp($sText, $sUserRegx_TEMP, 3)

If Not @error Then
    $sSafeRow   = @CRLF & $a_allBitsInOneRow[0] & @CRLF ; adding room in case the original pattern depends on it
    $i_ID       = StringRegExpReplace($sSafeRow, $sUserRegx_TEMP, '$+{id}')
    $s_Product  = StringRegExpReplace($sSafeRow, $sUserRegx_TEMP, '$+{product}')

    MsgBox(0,"", "Product: " & $s_Product & @CRLF & "ID: " & $i_ID)
EndIf
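
Since the replacement string cannot resolve names, one possible workaround is to translate a group name into its position and then pick that slot from the array StringRegExp returns. The sketch below is only an illustration: `_CaptureByName` is a hypothetical helper, not an AutoIt built-in, and it assumes every capturing group in the user pattern is written as `(?<name>...)`, so the order of the names equals the order of the captures.

```autoit
; Hypothetical helper, not a built-in: returns what the named group $sName
; captured in the first match of $sPattern against $sText.
; Assumption: every capturing group in $sPattern is written as (?<name>...).
Func _CaptureByName($sText, $sPattern, $sName)
    ; Collect the group names in the order they open in the pattern
    Local $aNames = StringRegExp($sPattern, "\(\?<(\w+)>", 3)
    If @error Then Return SetError(1, 0, "")

    ; Flag 1: captures of the first match, in group-number order
    Local $aCaptures = StringRegExp($sText, $sPattern, 1)
    If @error Then Return SetError(2, 0, "")

    For $i = 0 To UBound($aNames) - 1
        ; == compares case-sensitively, as group names are case-sensitive
        If $aNames[$i] == $sName And $i < UBound($aCaptures) Then Return $aCaptures[$i]
    Next
    Return SetError(3, 0, "")
EndFunc   ;==>_CaptureByName

Local $sText = "Computer" & @CRLF & "ID:1000"
Local $sUserRegx = "(?<product>\V+)\RID:(?<id>\d{4})"

ConsoleWrite("ID: " & _CaptureByName($sText, $sUserRegx, "id") & @CRLF)
ConsoleWrite("Product: " & _CaptureByName($sText, $sUserRegx, "product") & @CRLF)
```

The user pattern stays untouched and the order of ID and product no longer matters, but a pattern that mixes named and unnamed capturing groups would break the name-to-position mapping.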

 

Edited by Guy_

50 minutes ago, jguinch said:

"You can use a look-ahead assertion to "revert" the capture order, for example..."

Meaning: there is always a trick to make group 1 into group 2, and vice versa?  :)  I kinda expected that, but figured it might get messy & less elegant (or above my paygrade). I'll study your example, thank you  :)

Upd.: Ok, what I think you are doing is combining 2 separate patterns into one, so you can always easily decide the order. That is still pretty good indeed. I may have to use that instead... ^_^ It might be for the better in general. Thanks!

The drawback might be that patterns are gonna get longer, cos it seems to me you often need to define both objects to precisely get one, so with that already there, adding group names would take less space than sometimes kinda doing the same thing twice, but in reverse. But so be it for now.

Edited by Guy_

1 hour ago, Guy_ said:

So much info on named groups and then you can't really use them in a replace...

There is no "replace" primitive in PCRE, only a match engine.

You can always use alternation to handle "this then that" or "that then this". You can get an array of 4 captures, with two of them being empty strings. With the groups ordered as this, that, that, this, concatenating results 0 and 3 yields this, and doing the same with results 1 and 2 yields that.
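
A rough sketch of that alternation idea, using the sample data from this thread (the pattern and variable names are illustrative). As far as I know, trailing groups that took no part in the match can be left out of the returned array, so the sketch pads it to four elements before concatenating:

```autoit
Local $sText = "Computer" & @CRLF & "ID:1000"

; Groups 1-2 cover "ID then product", groups 3-4 cover "product then ID"
Local $sPattern = "ID:(\d+)\R(\V+)|(\V+)\RID:(\d+)"

Local $aCaptures = StringRegExp($sText, $sPattern, 1)
If Not @error Then
    ; Pad to four slots; the added elements are empty strings
    ReDim $aCaptures[4]

    ; Only one slot of each pair is filled, so concatenation picks it
    Local $sID = $aCaptures[0] & $aCaptures[3]
    Local $sProduct = $aCaptures[1] & $aCaptures[2]

    ConsoleWrite("ID = " & $sID & @CRLF & "Product = " & $sProduct & @CRLF)
EndIf
```

Swapping the two parts of `$sText` should give the same result, since the first alternative then matches instead of the second.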


1 hour ago, jchd said:

There is no "replace" primitive in  PCRE, only a match engine.

I will look at the basics again. Been out of it a while... ;)
But it does not seem unreasonable: if you can StringRegExpReplace with $1, $2, ... what is so different about doing it with a named group, really?
Anyway, I think I will be happy trying The Jguinch Method  ;)  But thank you everyone!


Again, the replace machinery isn't part of the legacy PCRE library API. Support for named groups and many more constructs would require PCRE to make the replacement code work closely with the match process. PCRE by itself isn't that ambitious. It's quite possible to add support for fancy constructs to a "replace" piece of code, but that requires strong (and non-trivial) links between the matching code and the replacing code.

Such a thing would be slightly easier to implement using PCRE2 (a more recent release of the PCRE library), but as far as AutoIt goes, don't expect that much in a short lapse of time.

Most of the time, named groups and named subroutines are only used within the matching pass.
