#2696 assigned Bug

StringRegExp* return non-participating groups

Reported by: jchd18
Owned by: Jon
Milestone:
Component: AutoIt
Version: 3.3.11.4
Severity: None
Keywords:
Cc:

Description

Non-participating groups appear in the results returned by our PCRE wrappers (the StringRegExp* functions), but the wrappers should ignore them.

Long version: pattern subroutines created by (?(DEFINE) ...) are named groups as far as PCRE is concerned, so they get numbered as usual and appear in the ovector parameter of the internal pcre_exec() call.

Of course they never actually hold a result: the corresponding ovector pair is (-1, -1). This differs from a capturing group that really matched an empty string, whose ovector pair has valid, equal start and end offsets (length 0). We should test for (-1, -1) and ignore these meaningless entries.
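The distinction between "did not participate" and "matched an empty string" can be sketched outside AutoIt. The snippet below is only an illustration, using Python's `re` module as a stand-in for PCRE: its `Match.span()` also reports (-1, -1) for a group that did not contribute to the match. Python's `re` does not support (?(DEFINE) ...), so a {0} repetition is used to obtain a non-participating group, and `participating_groups` is a hypothetical helper name, not part of AutoIt or PCRE.

```python
import re

def participating_groups(match):
    """Return (group_number, text) pairs for groups that took part in the match."""
    result = []
    for i in range(1, match.re.groups + 1):
        # span() is (-1, -1) for a group that did not participate,
        # and (n, n) for a group that matched an empty string.
        if match.span(i) != (-1, -1):
            result.append((i, match.group(i)))
    return result

# (a){0} never participates; (a?) always participates but may match "".
pattern = re.compile(r"(a){0}(a?)(bc)")
for subject in ("bc", "abc"):
    m = pattern.search(subject)
    print(subject, [m.span(i) for i in range(1, 4)], participating_groups(m))
```

Filtering on the (-1, -1) pair rather than on an empty string keeps genuine empty captures, such as group 2 when the subject is "bc", in the result.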

See http://www.autoitscript.com/forum/topic/160802-stringsplit-multiple-whole-words-autoit/#entry1167381 for an example.

PM jchd if needed.

Attachments (0)

Change History (9)

comment:1 by jchd18, on Apr 23, 2014 at 10:40:59 PM

BTW, the same effect occurs with groups that have a repetition count of {0}, for instance:
Pattern:

(?x) (a){0} (a?) (bc) (?(DEFINE) (?<head> x)) (?(?=$) z?)

Subject:

bc
abc

In this case RegexBuddy clearly shows the non-participating groups when using full detail mode:
Result:

                        Start   Length
Match 1:        bc      0       2
Group 1 did not participate in the match
Group 2:                0       0
Group 3:        bc      0       2
Group "head" did not participate in the match
Match 2:        abc     3       3
Group 1 did not participate in the match
Group 2:        a       3       1
Group 3:        bc      4       2
Group "head" did not participate in the match

comment:2 by jchd18, on Apr 25, 2014 at 9:36:34 AM

As a clue to help locate the failing logic, here is a demonstration that non-participating groups placed after all capturing groups don't cause the ghost capture bug. Only those appearing before or in between capturing groups trigger the issue.

#include <Array.au3>

Local $patterns = [ _
	["(?x)                          (\b\w+)", "No bug: simple capturing group"], _
	["(?x) (?(DEFINE)(?<head> xxx)) (\b\w+)", "Bug: non-participating group before capturing group"], _
	["(?x) (?(DEFINE)(?<head> xxx)) (\b\w+) (?(DEFINE)(?<tail> yyy))", "Bug: adding a non-participating group after the capturing group doesn't matter"], _
	["(?x) (?(DEFINE)(?<head> xxx))  \b\w+  (?(DEFINE)(?<tail> yyy))", "No bug: no capturing group"], _
	["(?x)                          (\b\w+) (?(DEFINE)(?<tail> yyy))", "No bug: non-participating group last"], _
	["(?x) (x){0} (y){0} (z){0}     (\b\w+)", "Bug: more non-participating groups first"] _
]
For $i = 0 To UBound($patterns) - 1
	$res = StringRegExp("There is a bug somewhere.", $patterns[$i][0], 3)
	_ArrayAdd($res, $patterns[$i][1])
	_ArrayDisplay($res, $patterns[$i][1])
Next
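One possible reading of this observation, not confirmed anywhere in the ticket: pcre_exec() returns one more than the highest-numbered group that was set, so unset groups at the end of the pattern fall outside the range a wrapper would normally loop over, while unset groups in front of a set group fall inside it. The Python sketch below simulates that loop. The ovector values are hand-written for the first match ("There") rather than produced by PCRE, `collect_captures` is a hypothetical name, and the empty-string placeholder is an assumption about what the ghost entry looks like.

```python
def collect_captures(subject, ovector, rc, skip_unset):
    """Build a capture list from a pcre_exec-style ovector.

    rc is assumed to be pcre_exec's return value: one more than the
    highest-numbered group that was set.
    """
    captures = []
    for i in range(1, rc):
        start, end = ovector[2 * i], ovector[2 * i + 1]
        if (start, end) == (-1, -1):
            if not skip_unset:
                captures.append("")  # assumed shape of the "ghost" entry
            continue
        captures.append(subject[start:end])
    return captures

subject = "There is a bug somewhere."
# Hand-written (ovector, rc) pairs for the first match, not PCRE output.
head_first = ([0, 5, -1, -1, 0, 5], 3)  # (?<head>...) is group 1, (\b\w+) is group 2
tail_last = ([0, 5, 0, 5, -1, -1], 2)   # (\b\w+) is group 1, (?<tail>...) is group 2

print(collect_captures(subject, *head_first, skip_unset=False))
print(collect_captures(subject, *tail_last, skip_unset=False))
print(collect_captures(subject, *head_first, skip_unset=True))
```

Under these assumptions, only the pattern with the unset group in front yields an extra entry, and skipping (-1, -1) pairs removes it.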

comment:3 by Jpm, on Oct 5, 2020 at 6:25:59 PM

Owner: set to Jpm
Status: new → assigned

Thanks, jchd, for testing the complete fix.
The fix has been sent to Jon.

comment:4 by Jon, on Feb 27, 2022 at 2:37:49 PM

Milestone: 3.3.15.5
Owner: changed from Jpm to Jon
Resolution: Fixed
Status: assigned → closed

Fixed by revision [12632] in version: 3.3.15.5

comment:5 by Jos, on Mar 3, 2022 at 7:04:07 PM

Resolution: Fixed
Status: closed → reopened
Last edited on Mar 3, 2022 at 7:05:15 PM by Jos

comment:6 by TicketCleanup, on Mar 3, 2022 at 8:00:02 PM

Milestone: 3.3.15.5

Automatic ticket cleanup.

comment:7 by Jon, on Mar 5, 2022 at 6:11:27 PM

Change reverted.

comment:8 by Jpm, on May 9, 2022 at 3:26:47 PM

Owner: changed from Jon to Jpm
Status: reopened → assigned

I re-sent a fix to Jon;
I hope he will integrate it.

comment:9 by Jpm, on Mar 16, 2024 at 10:25:24 AM

Owner: changed from Jpm to Jon
