Posted

Regexp Experts,

I am looking for a month in an HTML page.

When I run the expression

(?s)(?i)>(sep.*?)<
I get the expected results, but when I expand the expression to use alternation, such as
(?s)(?i)>((sep|oct).*?)<
I get the expected line and another line with "sep" in the array.

I am not looking for a fix, just an explanation as I am trying to understand this cryptic language.

Thanks,

kylomas

Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

Posted

AZJIO,

One more question, please. Does "?:" work as documented here?

Non-capturing group. Behaves just like a normal group, but does not record the matching characters in the array nor can the matched text be used for back-referencing.

I do not understand this. If you have time, a simple explanation, please?

kylomas


Posted

With the StringRegExp function, text that matches the pattern inside a non-capturing group does not appear in the resulting array.

With the StringRegExp or StringRegExpReplace functions, a non-capturing group does not generate a back-reference.

See example.

;Non-capturing group behaviour -  Behaves just like a normal group, but does not record the matching characters in the array

Local $STestString = "abc#123 def#456 ghi#789"

; The two capture groups, "([a-z]+)" and "([0-9]+)", match the letters and the numbers.
; Both letters and numbers appear in the array.
Local $sRes = "Capture groups" & @CRLF
Local $aTheArray = StringRegExp($STestString, "([a-z]+)#([0-9]+)", 3)
For $i = 0 To UBound($aTheArray) - 1
    $sRes &= "$aTheArray[" & $i & "] = " & $aTheArray[$i] & @CRLF
Next
MsgBox(0, "Results 1 Array - Both letters and numbers are in capture groups", $sRes)

; The non-capturing group, "(?:[0-9]+)", matches the numbers. Note: The matching characters,
; the numbers, do not appear in the array.
$sRes = "With one non-capturing group" & @CRLF
$aTheArray = StringRegExp($STestString, "([a-z]+)#(?:[0-9]+)", 3)
For $i = 0 To UBound($aTheArray) - 1
    $sRes &= "$aTheArray[" & $i & "] = " & $aTheArray[$i] & @CRLF
Next
MsgBox(0, "Results 2 Array - Matching numbers are in a non-capturing group", $sRes)


;============================== Back-referencing =========================================
;Non-capturing group behaviour - nor can the matched text be used for back-referencing.

; Again two capture groups, "([a-z]+)" and "([0-9]+)".
$sRes = "Back-referencing captured groups" & @CRLF
$sRes &= StringRegExpReplace($STestString, "([a-z]+)#([0-9]+)", 'Back-reference 1 = "\1" and Back-reference 2 = "\2" or "$2" or "${2}"' & @CRLF)
MsgBox(0, "Results 3 Back-referencing two captured groups used.", $sRes)

; Again, the second group matches the numbers, but it cannot be used for back-referencing.
; There is no group 2, so "\2", "$2" and "${2}" have nothing to refer to.
$sRes = "Back-referencing - Second group is a non-captured group" & @CRLF
$sRes &= StringRegExpReplace($STestString, "([a-z]+)#(?:[0-9]+)", 'Back-reference 1 = "\1" and Back-reference 2 = "\2" or "$2" or "${2}"' & @CRLF)
MsgBox(0, "Results 4 Back-referencing - Second group, numbers, is a non-captured group", $sRes)
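
The same idea applies to the month pattern from the first post. The sketch below is only an illustration: the HTML fragment in $sHtml is made up (the original page was not posted), and the variable names are hypothetical. It runs the alternation once with a capturing inner group and once with "?:". With the capturing version, each match should add two elements to the array, the outer group and the inner "(sep|oct)" group, which is where the extra "sep" line comes from. With "(?:sep|oct)", only the outer group should be recorded.

```autoit
; Hypothetical HTML fragment, assumed for illustration only.
Local $sHtml = "<td>September 5</td><td>October 12</td>"

; Capturing alternation: the outer group and the inner (sep|oct) group
; each contribute an element to the array for every match.
Local $aCapturing = StringRegExp($sHtml, "(?s)(?i)>((sep|oct).*?)<", 3)

; Non-capturing alternation: (?:sep|oct) still groups the alternatives,
; but only the outer group is recorded.
Local $aNonCapturing = StringRegExp($sHtml, "(?s)(?i)>((?:sep|oct).*?)<", 3)

Local $sOut = "Capturing alternation:" & @CRLF
For $i = 0 To UBound($aCapturing) - 1
    $sOut &= "[" & $i & "] = " & $aCapturing[$i] & @CRLF
Next

$sOut &= @CRLF & "Non-capturing alternation:" & @CRLF
For $i = 0 To UBound($aNonCapturing) - 1
    $sOut &= "[" & $i & "] = " & $aNonCapturing[$i] & @CRLF
Next

MsgBox(0, "Alternation with and without ?:", $sOut)
```

The parentheses around "sep|oct" are still needed to limit the scope of the alternation; "?:" only stops that group from being stored.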
