Opened 15 years ago

Closed 15 years ago

#1464 closed Bug (Fixed)

StringRegExp, Single Char. Pattern with '*' Quantifier.

Reported by: anonymous Owned by: Jon
Milestone: 3.3.5.4 Component: AutoIt
Version: 3.3.4.0 Severity: None
Keywords: StringRegExpReplace StringRegExp Cc:

Description

Well, it took me a while to be sure.
But I think the following data shows that AutoIt is outputting a wrong (incomplete) RE result.

Used string 'abbabb'
Used RegExp 'a?' or 'a*' (Problem seems limited to a single character capture pattern with a '*' or '?' quantifier)
Used Replace '*'

Compared against:
TRC=The Regex Coach.exe
RT=www.regextester.com

pattern           setting:  (Global)   Global On       Global On       Global Off  Global Off
                  source:   AutoIt     RT-Repl         TRC-Repl        RT-Repl     TRC-Repl
'a?' or 'a*'      result:   **bbabb    **b*b**b*b*     **b*b**b*b*     *bbabb      *bbabb
'a??' or 'a*?'    result:   ***bbabb   ***b*b***b*b*   *a*b*b*a*b*b*   *abbabb     *abbabb

I think the problem shows in that:
1) AutoIt seems to be using globalReplace mode, indicated by the leading '*' replacements.

  • They match up with the others when they are in GlobalReplace mode.

2) AutoIt is not showing the other '*' replacements that are present in the others.
Both StringRegExpReplace() and StringRegExp() are consistent and identical in their behavior.
In this case AutoIt seems to consistently stop when a failed match is encountered (so using 'b*' or '[a]*' will fail at the start of the string).

ConsoleWrite('Out1 = ' & StringRegExpReplace('abbabb','a*','*') & @CRLF) ; fail: '**bbabb'
ConsoleWrite('Out2 = ' & StringRegExpReplace('abbabb','b*a','*') & @CRLF) ; ok: '**bb'
ConsoleWrite('Out3 = ' & StringRegExpReplace('abbabb','b*','*') & @CRLF) ; fail: '*abbabb'
ConsoleWrite('Out4 = ' & StringRegExpReplace('abbabb','b*b','*') & @CRLF) ; ok: 'a*a*'
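
For comparison, here is a minimal Python sketch of a global replace that keeps scanning after an empty match instead of stopping. It is not AutoIt's or PCRE's actual code: the function name `global_replace` is hypothetical, Python's `re` module stands in for the real engine, and the loop is simplified (after an empty match it steps forward one character, which is sufficient for the greedy patterns in this report but does not reproduce the lazy 'a*?' case).

```python
import re

def global_replace(pattern, repl, subject):
    """Replace every match, including empty ones, scanning left to right."""
    regex = re.compile(pattern)
    out = []
    pos = 0
    while pos <= len(subject):
        m = regex.match(subject, pos)  # anchored attempt at the current position
        if m is not None and m.end() > pos:
            # Non-empty match: emit the replacement and resume right after it.
            out.append(repl)
            pos = m.end()
            continue
        if m is not None:
            # Empty match: emit the replacement, but keep scanning.
            out.append(repl)
        # Copy the current character (if any) and step forward by one.
        out.append(subject[pos:pos + 1])
        pos += 1
    return "".join(out)

for pat in ("a*", "b*a", "b*", "b*b"):
    print(pat, "->", global_replace(pat, "*", "abbabb"))
```

With this loop 'a*' yields '**b*b**b*b*', the same string the RT and TRC columns report with Global On, while 'b*a' and 'b*b' still yield '**bb' and 'a*a*'.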

(3.3.4.0)&(3.3.5.3),Environment(Language:0409 Keyboard:00000409 OS:WIN_XP/Service Pack 3 CPU:X86 OS:X86)

Change History (7)

comment:1 Changed 15 years ago by Jon

Does pcretest.exe give the same results? What was entered in pcretest?

Changed 15 years ago by anonymous

pcretest.zip (bat,input,output)

comment:2 Changed 15 years ago by anonymous

pcretest gives different output than AutoIt itself.
It's producing the expected, correct output.

Based on the pcretest.zip test run.
PCRE version 8.00 2009-10-19
string='abbabb'

RE	/a*/	/a*b/	/b*/	/b*a/	-NonGlobal-
match	a	ab	*	a

RE	/a*/G	/a*b/G	/b*/G	/b*a/G	-Global-
match	a**a***	abbabb	*bb*bb*	abba
(* = empty match)
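
The global row above can be cross-checked with another engine. The sketch below uses Python's `re.finditer` (assuming Python 3.7 or later, where an empty match may directly follow a non-empty one); it is not pcretest, and the helper name `global_matches` is hypothetical. It prints each match in sequence, writing '*' for an empty match as in the table.

```python
import re

def global_matches(pattern, subject):
    """Concatenate all matches in order, writing '*' for an empty match."""
    return "".join(m.group(0) or "*" for m in re.finditer(pattern, subject))

for pat in ("a*", "a*b", "b*", "b*a"):
    print(f"/{pat}/G", global_matches(pat, "abbabb"))
```

Under that assumption the four patterns give the same sequences as the pcretest global row.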

comment:3 Changed 15 years ago by anonymous

PS: I have never used UTF, so I did not test that with this pcretest.

Yes, all AutoIt strings are UTF16-LE internally and we have to convert back and forth to UTF8 for the pcre engine. It gets real interesting when trying to deal with character positions. By interesting I mean ball breaking.
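
To illustrate why the positions are awkward: an engine working on UTF-8 reports byte offsets, which have to be mapped back to UTF-16 code-unit indexes. The following Python sketch shows one way to do that mapping. It is only an illustration, not AutoIt's implementation; the function name `utf8_offset_to_utf16_index` is hypothetical, and it assumes the byte offset falls on a character boundary (otherwise the decode step raises an error).

```python
def utf8_offset_to_utf16_index(text: str, byte_offset: int) -> int:
    """Map a UTF-8 byte offset into `text` to a UTF-16 code-unit index."""
    # Decode the UTF-8 prefix that ends at the given byte offset.
    prefix = text.encode("utf-8")[:byte_offset].decode("utf-8")
    # Each UTF-16 code unit is two bytes; astral characters take two units.
    return len(prefix.encode("utf-16-le")) // 2

sample = "a\u00e9\U0001F600b"  # 'a', 'e' with acute accent, an emoji, 'b'
for offset in (0, 1, 3, 7, 8):
    print(offset, "->", utf8_offset_to_utf16_index(sample, offset))
```

For pure ASCII input such as 'abbabb' the two offsets coincide, so the conversion only matters once multi-byte characters are involved.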

comment:5 Changed 15 years ago by anonymous

I have checked a number of additional regular expressions, and so far I have only got positive results. (And sorry again for my mix-up. I hate it when I make that kind of mistake.)

comment:6 Changed 15 years ago by Jon

  • Milestone set to 3.3.5.4
  • Owner set to Jon
  • Resolution set to Fixed
  • Status changed from new to closed

Fixed by revision [5698] in version: 3.3.5.4
