MvGulik

RE Trouble ...

7 posts in this topic

#1 ·  Posted (edited)

Not making any progress with some Regular-Expression (RE) patterns.

I think I have been staring at them too long and hope someone can shine some light on them.

Case 1: (solved)

$sPattern = (?imx) \#cs (?<rec> [^#] | \#cs \g<rec>* \#ce )* \#ce

$sText: aaa\n#cs bbb\n#ce bbb\naaa\n

$result: aaa\n[0:#cs bbb\n#ce][1:\n] bbb\naaa\n

or

$sText: aaa\n#cs bbb\n#cs bbb\n#ce bbb\n#ce bbb\naaa\n

$result: aaa\n[0:#cs bbb\n#cs bbb\n#ce bbb\n#ce][1:\n] bbb\naaa\n

- "[N:(whatever)]" being a RE capture at level N.

- "\n" being a LF linefeed.

What I don't get here is where that "[1:\n]" capture instance is coming from, or which part of the RE code is responsible for it.

- I still don't fully understand how this recursive capture works. -> my current skill level.

Single file RE testing setup

(Code moved to "(personal code dump): RE_Debug")

Edited by MvGulik

"Straight_and_Crooked_Thinking" : A "classic guide to ferreting out untruths, half-truths, and other distortions of facts in political and social discussions."
"The Secrets of Quantum Physics" : New and excellent 2 part documentary on Quantum Physics by Jim Al-Khalili. (Dec 2014)

"Believing what you know ain't so" ...

Knock Knock ...
 




#2 ·  Posted (edited)

Proof that writing down the problem is a good thing.

Looking at it, the penny just dropped.

[^#] -> [^#]*

or

$1 = ' \#cs (?<rec> [^#]* | \#cs \g<rec>* \#ce )* \#ce '
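For anyone else puzzled by the "[1:\n]" in Case 1: the named group (?<rec> ...) is also numbered group 1, and a group that sits under a * quantifier only keeps the text of its last iteration. With the single-character [^#] alternative, the last iteration before the closing #ce is the linefeed. A minimal sketch to check this, using the first sample text from post #1 (the variable names are only for illustration); it should show the same [0] and [1] values as quoted there:

```autoit
; Sketch only: shows that a repeated capture group reports its last iteration.
Local $sText = 'aaa' & @LF & '#cs bbb' & @LF & '#ce bbb' & @LF & 'aaa' & @LF
Local $sPattern = '(?imx) \#cs (?<rec> [^#] | \#cs \g<rec>* \#ce )* \#ce'

; Flag 2 returns the full match in [0] and the capture groups after it.
Local $aMatch = StringRegExp($sText, $sPattern, 2)
If Not @error Then
    ; Linefeeds are shown as \n to keep the console output on one line.
    ConsoleWrite('[0]: ' & StringReplace($aMatch[0], @LF, '\n') & @CRLF)
    ConsoleWrite('[1]: ' & StringReplace($aMatch[1], @LF, '\n') & @CRLF)
EndIf
```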

... will reuse topic for next RE problem I run into ...


#3 ·  Posted (edited)

Took me a while to get an RE that does what I had in mind.

Looking for comments or suggestions on possible improvements or alternatives.

Testers to flush out hidden problems are welcome too.

The target here was to locate:

- valid and active #include directives

- while skipping comment blocks (including nested comment blocks).

'(?im) ( ^ \h* #c(?:s|omments-start) (?: (?>[^se] | (?<!#c|#comments-)[se]) | (?R) )* ^ \h* #c(?:e|omments-end).* ) | (^\h*)(#)(include.*)'

PS: the pattern above is spaced out for readability only. Under plain (?im) the spaces are literal, so either remove them (as in the code below) or use the RE (?x)/(?imx) option, in which case every '#' character needs to be escaped ('\#').

;; adjust valid/active '^\h*#include'(s) -> '^\h*###include...' (ignoring those in block comments, including nested blocks)
;; - not speed tested.
$sCode = StringRegExpReplace($sCode, '(?im)(^\h*#c(?:s|omments-start)(?:(?>[^se]|(?<!#c|#comments-)[se])|(?R))*^\h*#c(?:e|omments-end).*)|(^\h*)(#)(include.*)', '$1$2$3$3$3$4') ;; '<<<$1|$2|$3|$3$4>>>'

#4 ·  Posted (edited)

Oh dear. I think I just hit a (probably hardcore) RE limitation. :graduated:

On one of my include scripts the above RE made AutoIt exit with:

!>23:05:26 AutoIT3.exe ended.rc:-1073741819

Back to reading the docs.


I hit a limitation with AutoIt's implementation of PCRE at one time. I can't remember how many chars it took to exhaust the memory, but it kind of smacked me in the face.

I believe my solution at the time was to read my data in chunks. It's been quite some time, so I can't remember the char limit or exactly what I was doing, but I could replicate it every time.

Yeah I know, really helpful huh lol. Guess I just wanted you to know you weren't alone.


[center]Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.[/center]


Thanks for the hug. :graduated:

It seems that the size of the comment block is the trigger in this case.

On a single (non-nested) block the limit seems to be 5934 characters.

;; measured from the start of the initial #CS to the closing "omments-": no error with 5934 characters, error with 5935 characters.

Seems I need to start considering splitting up comment blocks ... I just don't want to do so yet.


#7 ·  Posted (edited)

Maybe try using StringRegExp() to find the comments.

Then do a StringReplace/Case sensitive and replace them with null data.

Then use your StringRegExpReplace().

It's three steps I know, but maybe a solution.
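A related way to take the comment blocks out of the regex altogether is a plain line-by-line scan with a nesting counter. This is a different technique from both the recursive pattern and the three steps above, and only a sketch: the function name is hypothetical, it assumes #cs/#ce directives always start their own line, and it has not been run against the script that crashed. Since no single match has to span a whole comment block, the block size should no longer matter.

```autoit
; Sketch only: hypothetical helper, not code from this thread.
; Turns active "#include" lines into "###include" and leaves lines inside
; #cs/#ce blocks (including nested blocks) untouched.
Func _MarkActiveIncludes_Sketch($sCode)
    ; Normalise line endings to LF; flag 2 gives a zero-based array without a count.
    Local $aLines = StringSplit(StringReplace($sCode, @CRLF, @LF), @LF, 2)
    Local $iDepth = 0, $sOut = '', $sLine
    For $i = 0 To UBound($aLines) - 1
        $sLine = $aLines[$i]
        If StringRegExp($sLine, '(?i)^\h*#c(?:s|omments-start)\b') Then
            $iDepth += 1 ; entering a (possibly nested) comment block
        ElseIf StringRegExp($sLine, '(?i)^\h*#c(?:e|omments-end)\b') Then
            If $iDepth > 0 Then $iDepth -= 1 ; leaving one nesting level
        ElseIf $iDepth = 0 And StringRegExp($sLine, '(?i)^\h*#include') Then
            $sLine = StringRegExpReplace($sLine, '^(\h*)#', '$1###')
        EndIf
        If $i > 0 Then $sOut &= @LF
        $sOut &= $sLine
    Next
    Return $sOut ; note: the result uses LF line endings
EndFunc

; Usage: only the first include is outside a comment block.
Local $sDemo = '#include <Array.au3>' & @LF & '#cs' & @LF & '#include <Old.au3>' & @LF & '#ce' & @LF
ConsoleWrite(_MarkActiveIncludes_Sketch($sDemo))
```

In the demo only the first line should come back changed; the include between #cs and #ce stays as it was.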

....

On another point, one thing I've come to terms with: saving 10 ms of processing time isn't necessarily worth 10 hours of my time. It's fun to be the hero, but it's better to get it deployed and fix the "minor" issues later (if you need to at all).
