Modify

Opened 6 years ago

Last modified 6 years ago

#2696 new Bug

StringRegExp* return non-participating groups

Reported by: jchd18 Owned by:
Milestone: Component: AutoIt
Version: 3.3.11.4 Severity: None
Keywords: Cc:

Description

Non-participating groups appear as part of the result of our PCRE wrappers, but they should ignore them.

Long version: pattern subroutines created by (?(DEFINE) ...) are named groups for PCRE, thus get numbered as usual and do appear in the ovector parameter of pcre_exec() internal function.

Of course they never actually hold any result since the corresponding ovector tuple is (-1, -1). This is different from when actual capturing groups match an empty string: their ovector tuple has length = 0. We sould test for (-1, -1) and ignore these meaningless entries.

See http://www.autoitscript.com/forum/topic/160802-stringsplit-multiple-whole-words-autoit/#entry1167381 for an example.

PM jchd if needed.

Attachments (0)

Change History (2)

comment:1 Changed 6 years ago by jchd18

BTW the same effect occurs with groups having repetition count {0} for instance:
Pattern:

(?x) (a){0} (a?) (bc) (?(DEFINE) (?<head> x)) (?(?=$) z?)

Subject:

bc
abc

In this case RegexBuddy clearly shows the non-participating groups when using full detail mode:
Result:

                             Start   Length
Match 1:	bc	     0	     2
Group 1 did not participate in the match
Group 2:		     0	     0
Group 3:	bc	     0	     2
Group "head" did not participate in the match
Match 2:	abc	     3	     3
Group 1 did not participate in the match
Group 2:	a	     3	     1
Group 3:	bc	     4	     2
Group "head" did not participate in the match

comment:2 Changed 6 years ago by jchd18

As a clue to help locate the failling logic, here's the demonstration that non-participating groups placed after all capturing groups don't cause the ghost capture bug. Only those appearing before or in between capturing groups trigger the issue.

#include <Array.au3>

Local $patterns = [ _
	["(?x)                          (\b\w+)", "No bug: simple capturing group"], _
	["(?x) (?(DEFINE)(?<head> xxx)) (\b\w+)", "Bug: non-participating group before capturing group"], _
	["(?x) (?(DEFINE)(?<head> xxx)) (\b\w+) (?(DEFINE)(?<tail> yyy))", "Bug: adding a non-participating group after the capturing group doesn't matter"], _
	["(?x) (?(DEFINE)(?<head> xxx))  \b\w+  (?(DEFINE)(?<tail> yyy))", "No bug: no capturing group"], _
	["(?x)                          (\b\w+) (?(DEFINE)(?<tail> yyy))", "No bug: non-participating group last"], _
	["(?x) (x){0} (y){0} (z){0}     (\b\w+)", "Bug: more non-participating groups first"] _
]
For $i = 0 To UBound($patterns) - 1
	$res = StringRegExp("There is a bug somewhere.", $patterns[$i][0], 3)
	_ArrayAdd($res, $patterns[$i][1])
	_ArrayDisplay($res, $patterns[$i][1])
Next

Guidelines for posting comments:

  • You cannot re-open a ticket but you may still leave a comment if you have additional information to add.
  • In-depth discussions should take place on the forum.

For more information see the full version of the ticket guidelines here.

Add Comment

Modify Ticket

Action
as new The ticket will remain with no owner.
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.