
Opened 10 years ago

Last modified 3 days ago

#2696 assigned Bug

StringRegExp* return non-participating groups

Reported by: jchd18 Owned by: Jon
Milestone: Component: AutoIt
Version: 3.3.11.4 Severity: None
Keywords: Cc:

Description

Non-participating groups appear in the results returned by our PCRE wrappers, but the wrappers should ignore them.

Long version: pattern subroutines created by (?(DEFINE) ...) are named groups as far as PCRE is concerned, so they are numbered as usual and appear in the ovector parameter of the internal pcre_exec() call.

Of course, they never actually hold a result, since the corresponding ovector pair is (-1, -1). This differs from a real capturing group that matches an empty string: its ovector pair has length 0 (start equals end). We should test for (-1, -1) and ignore these meaningless entries.
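
The test described above could look roughly like the following C sketch. It is not AutoIt's actual source: the names group_state, classify_group and count_participating are hypothetical, and the only thing assumed about the real wrapper is that it reads (start, end) offset pairs from the ovector filled in by pcre_exec().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not AutoIt's actual source.
 * ovector holds (start, end) offset pairs as filled in by pcre_exec();
 * pair 0 is the whole match, pair n is capturing group n. */
enum group_state { GROUP_UNSET, GROUP_EMPTY, GROUP_TEXT };

static enum group_state classify_group(const int *ovector, int n)
{
    assert(ovector != NULL && n >= 0);
    int start = ovector[2 * n];
    int end = ovector[2 * n + 1];
    if (start == -1 && end == -1)
        return GROUP_UNSET;   /* did not participate: skip it */
    if (start == end)
        return GROUP_EMPTY;   /* participated and matched "": keep it */
    return GROUP_TEXT;        /* participated and matched some text */
}

/* Count the groups 1..pair_count-1 that should be returned to the caller. */
static int count_participating(const int *ovector, int pair_count)
{
    int kept = 0;
    for (int n = 1; n < pair_count; n++)
        if (classify_group(ovector, n) != GROUP_UNSET)
            kept++;
    return kept;
}
```

For an ovector of {0, 2, -1, -1, 0, 0, 0, 2} with four pairs, group 1 would be skipped while groups 2 (empty match) and 3 would be kept.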

See http://www.autoitscript.com/forum/topic/160802-stringsplit-multiple-whole-words-autoit/#entry1167381 for an example.

PM jchd if needed.

Attachments (0)

Change History (9)

comment:1 Changed 10 years ago by jchd18

BTW, the same effect occurs with groups that have a repetition count of {0}, for instance:
Pattern:

(?x) (a){0} (a?) (bc) (?(DEFINE) (?<head> x)) (?(?=$) z?)

Subject:

bc
abc

In this case RegexBuddy clearly shows the non-participating groups when using full detail mode:
Result:

                  Start   Length
Match 1:   bc         0        2
Group 1 did not participate in the match
Group 2:              0        0
Group 3:   bc         0        2
Group "head" did not participate in the match
Match 2:   abc        3        3
Group 1 did not participate in the match
Group 2:   a          3        1
Group 3:   bc         4        2
Group "head" did not participate in the match

comment:2 Changed 10 years ago by jchd18

As a clue to help locate the failing logic, here is a demonstration that non-participating groups placed after all capturing groups don't cause the ghost capture bug. Only those appearing before or in between capturing groups trigger the issue.

#include <Array.au3>

Local $patterns = [ _
	["(?x)                          (\b\w+)", "No bug: simple capturing group"], _
	["(?x) (?(DEFINE)(?<head> xxx)) (\b\w+)", "Bug: non-participating group before capturing group"], _
	["(?x) (?(DEFINE)(?<head> xxx)) (\b\w+) (?(DEFINE)(?<tail> yyy))", "Bug: adding a non-participating group after the capturing group doesn't matter"], _
	["(?x) (?(DEFINE)(?<head> xxx))  \b\w+  (?(DEFINE)(?<tail> yyy))", "No bug: no capturing group"], _
	["(?x)                          (\b\w+) (?(DEFINE)(?<tail> yyy))", "No bug: non-participating group last"], _
	["(?x) (x){0} (y){0} (z){0}     (\b\w+)", "Bug: more non-participating groups first"] _
]
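For $i = 0 To UBound($patterns) - 1
	; Flag 3: return an array of all global matches
	Local $res = StringRegExp("There is a bug somewhere.", $patterns[$i][0], 3)
	; Append the description as the last row so it shows up in the display
	_ArrayAdd($res, $patterns[$i][1])
	_ArrayDisplay($res, $patterns[$i][1])
Next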
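
One possible reading of this observation, offered as a sketch rather than a diagnosis: a successful pcre_exec() call is documented to return one more than the highest-numbered pair that has been set, so unset groups at the end fall outside that count, while unset groups before a set one stay inside it. The C code below emulates that return value on hand-built ovectors. The names emulated_rc and ghost_entries are hypothetical, and the idea that AutoIt's wrapper copies every pair below the return value without a (-1, -1) test is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Emulates the documented return value of a successful pcre_exec() call:
 * one more than the highest-numbered pair that has been set. */
static int emulated_rc(const int *ovector, int total_pairs)
{
    assert(ovector != NULL && total_pairs >= 1);
    int rc = 1;  /* pair 0 (the whole match) is always set on success */
    for (int n = 1; n < total_pairs; n++)
        if (ovector[2 * n] != -1)
            rc = n + 1;
    return rc;
}

/* Number of "ghost" entries a naive wrapper would emit if it copied every
 * pair from 1 to rc-1 without testing for (-1, -1). ASSUMPTION: this is
 * only a guess at how the ghost captures could arise. */
static int ghost_entries(const int *ovector, int total_pairs)
{
    int rc = emulated_rc(ovector, total_pairs);
    int ghosts = 0;
    for (int n = 1; n < rc; n++)
        if (ovector[2 * n] == -1 && ovector[2 * n + 1] == -1)
            ghosts++;
    return ghosts;
}
```

Working the first match ("There", offsets 0 to 5) by hand: a leading unset group gives {0, 5, -1, -1, 0, 5} and one ghost entry, a trailing unset group gives {0, 5, 0, 5, -1, -1} and none, and three leading {0} groups give three ghosts. That lines up with the "Bug" and "No bug" labels in the script, but the offsets were derived by hand, not captured from a run.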

comment:3 Changed 3 years ago by Jpm

  • Owner set to Jpm
  • Status changed from new to assigned

Thanks to jchd for testing the complete fix.
Fix sent to Jon.

comment:4 Changed 2 years ago by Jon

  • Milestone set to 3.3.15.5
  • Owner changed from Jpm to Jon
  • Resolution set to Fixed
  • Status changed from assigned to closed

Fixed by revision [12632] in version: 3.3.15.5

comment:5 Changed 2 years ago by Jos

  • Resolution Fixed deleted
  • Status changed from closed to reopened
Last edited 2 years ago by Jos

comment:6 Changed 2 years ago by TicketCleanup

  • Milestone 3.3.15.5 deleted

Automatic ticket cleanup.

comment:7 Changed 2 years ago by Jon

Change reverted.

comment:8 Changed 23 months ago by Jpm

  • Owner changed from Jon to Jpm
  • Status changed from reopened to assigned

I have re-sent a fix to Jon.
I hope he will integrate it.

comment:9 Changed 3 days ago by Jpm

  • Owner changed from Jpm to Jon
