Opened 14 years ago

Closed 14 years ago

#1464 closed Bug (Fixed)

StringRegExp, Single Char. Pattern with '*' Quantifier.

Reported by: anonymous Owned by: Jon
Milestone: Component: AutoIt
Version: Severity: None
Keywords: StringRegExpReplace StringRegExp Cc:


Well, it took me a while to be sure.
But I think the following data shows that AutoIt is outputting a wrong (incomplete) RE result.

Used string 'abbabb'
Used RegExp 'a?' or 'a*' (Problem seems limited to a single character capture pattern with a '*' or '?' quantifier)
Used Replace '*'

Compared against:
TRC=The Regex Coach.exe

setting:	(Global)	Global On	Global On	Global Off	Global Off
source:		AutoIt		RT-Repl		TRC-Repl	RT-Repl		TRC-Repl
result:		**bbabb		**b*b**b*b*	**b*b**b*b*	*bbabb		*bbabb
'a??' or 'a*?'					
result:		***bbabb	***b*b***b*b*	*a*b*b*a*b*b*	*abbabb		*abbabb

I think the problem is shown in that:
1) AutoIt seems to be using globalReplace mode, indicated by the leading '*' replacements.

  • They match up with the others when they are in GlobalReplace mode.

2) AutoIt is not showing the other '*' replacements that are present in the others.
Both StringRegExpReplace() and StringRegExp() are consistent and identical in their behavior.
In this case AutoIt seems to consistently stop when a failed match is encountered (so using 'b*' or '[a]*' will fail at the start of the string).

ConsoleWrite('Out1 = ' & StringRegExpReplace('abbabb','a*','*') & @CRLF) ; fail. '**bbabb'
ConsoleWrite('Out2 = ' & StringRegExpReplace('abbabb','b*a','*') & @CRLF) ;; ok. '**bb'
ConsoleWrite('Out3 = ' & StringRegExpReplace('abbabb','b*','*') & @CRLF) ;; fail. '*abbabb'
ConsoleWrite('Out4 = ' & StringRegExpReplace('abbabb','b*b','*') & @CRLF) ;; ok. 'a*a*'
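As a further cross-check outside AutoIt, the same replacements can be run through another regex engine. The sketch below uses Python's `re` module, which is not PCRE and is only an assumption of this example. It assumes Python 3.7 or later, where empty matches adjacent to a previous non-empty match are also replaced. Under that assumption the global results agree with the "Global On" columns in the table above and the `count=1` result agrees with "Global Off".

```python
import re

subject = "abbabb"

# Global replace: every match, including the empty ones, becomes '*'.
out1_global = re.sub(r"a*", "*", subject)
# Non-global replace: only the first match is replaced.
out1_first = re.sub(r"a*", "*", subject, count=1)

# The other three patterns from the ConsoleWrite lines, replaced globally.
out2 = re.sub(r"b*a", "*", subject)
out3 = re.sub(r"b*", "*", subject)
out4 = re.sub(r"b*b", "*", subject)

print("Out1 (global) =", out1_global)  # **b*b**b*b*
print("Out1 (first)  =", out1_first)   # *bbabb
print("Out2 =", out2)                  # **bb
print("Out3 =", out3)                  # *a**a**
print("Out4 =", out4)                  # a*a*
```

Out2 and Out4 agree with the AutoIt output quoted above. Out1 and Out3 are the cases where AutoIt appears to stop after the first empty match.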

Environment (Language:0409 Keyboard:00000409 OS:WIN_XP/Service Pack 3 CPU:X86 OS:X86)

Attachments (1) (924 bytes) - added by anonymous 14 years ago. (bat,input,output)


Change History (7)

comment:1 Changed 14 years ago by Jon

Does pcretest.exe give the same results? What was entered in pcretest?

Changed 14 years ago by anonymous (bat,input,output)

comment:2 Changed 14 years ago by anonymous

Pcretest is giving a different output than AutoIt itself.
It's producing the expected, correct output.

Based on a test run.
PCRE version 8.00 2009-10-19

RE	/a*/	/a*b/	/b*/	/b*a/	-NoneGlobal-
match	a	ab	*	a

RE	/a*/G	/a*b/G	/b*/G	/b*a/G	-Global-
match	a**a***	abbabb	*bb*bb*	abba
(* = empty match)
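The match lists above can be reproduced in the same notation with a short script. This is a minimal sketch using Python's `re` module (Python 3.7 or later assumed) rather than pcretest itself. The helper name `render_matches` is hypothetical and exists only for this illustration.

```python
import re

def render_matches(pattern, subject):
    """Return (first_match, all_matches) as strings, writing '*' for an empty match."""
    def show(text):
        return text if text else "*"

    first = re.search(pattern, subject)
    first_text = show(first.group(0)) if first else "(no match)"
    # finditer walks the subject the way a global (/G) run does.
    all_text = "".join(show(m.group(0)) for m in re.finditer(pattern, subject))
    return first_text, all_text

subject = "abbabb"
for pattern in ["a*", "a*b", "b*", "b*a"]:
    first_text, all_text = render_matches(pattern, subject)
    print(f"/{pattern}/\tfirst: {first_text}\tglobal: {all_text}")
```

For these four patterns the script prints the same first-match and global strings as the pcretest rows above.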

comment:3 Changed 14 years ago by anonymous

PS: I have never used UTF, so I did not test that with this pcretest.

Yes, all AutoIt strings are UTF16-LE internally and we have to convert back and forth to UTF8 for the pcre engine. It gets real interesting when trying to deal with character positions. By interesting I mean ball breaking.

comment:5 Changed 14 years ago by anonymous

I have checked an additional number of regular expressions, and so far I have only gotten positive results. (And sorry again for my mix-up. I hate it when I make that kind of mistake.)

comment:6 Changed 14 years ago by Jon

  • Milestone set to
  • Owner set to Jon
  • Resolution set to Fixed
  • Status changed from new to closed

Fixed by revision [5698] in version:
