MvGulik

[Solved] StringRegExp() recursion.

20 posts in this topic

#1 ·  Posted (edited)

Just wondering if it's possible to split a string like "[1,0],[[2,0],[[0,0]]],[[3,0]],etc" with StringRegExp() into:

array[0]="[1,0]"

array[1]="[[2,0],[[0,0]]]"

array[2]="[[3,0]]"

---

[edit] small title change.

Edited by MvGulik

"Straight_and_Crooked_Thinking" : A "classic guide to ferreting out untruths, half-truths, and other distortions of facts in political and social discussions."
"The Secrets of Quantum Physics" : New and excellent 2 part documentary on Quantum Physics by Jim Al-Khalili. (Dec 2014)

"Believing what you know ain't so" ...

Knock Knock ...
 




If the number is consistent (always follows the [[0,0]]] rule for 2 dimensions; of course, we're really left to assume what it is you're trying to do), then yes... it's possible.


Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.


No fixed dimension depth.

No fixed dimension size.

The only fixed thing is that every [..] part is a one-dimensional array.

So [[2,0],[[1,2,3]]] is equivalent to: array[ array[2,0], array[ array[1,2,3] ] ]

I'm trying to see if StringRegExp() can unwrap this while keeping the right [..] balance.


It is possible, but a bit complex, since it needs things like lookahead and recursion, and the escaped square brackets \[, \] make the pattern look confusing. I don't think someone who doesn't have a similar pattern ready 'on his shelf' can write it in a few minutes, even with some experience.


(The signature is placed on the back of this page to not disturb the flow of the thread.)


If the number is consistent (always follows the [[0,0]]] rule for 2 dimensions; of course, we're really left to assume what it is you're trying to do), then yes... it's possible.

I don't think it's possible, at least not with one statement. You'll have to use a limited number of submatches. Indeed, we don't know what your aim is; if you can change the notation, you could do it with StringSplit().

- Heron -


#6 ·  Posted (edited)

Complex was kinda expected.

But if it's possible, then I could give it at least another attempt.

---

I'm just interested in how far StringRegExp() can be pushed in this case.

I already have an alternative function that is doing the job.

Edited by MvGulik

Regular expressions are primarily set up to find linear patterns, not nested ones. I will do a little research to see if I can figure this out.


David Nuttall
Nuttall Computer Consulting

An Aquarius born during the Age of Aquarius

AutoIt allows me to re-invent the wheel so much faster.

I'm off to write a wizard, a wonderful wizard of odd...


Oww, nice one.

Can almost read how it works without additional doc.

Think it's perfect.

Actually, it is.

Time to take a closer look at it.

Thanks for the help and info.


#10 ·  Posted (edited)

Hope it's correct:

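#include <Array.au3>

Local $sString = "[1,0],[[2,0],[[0,0]]],[[3,0]]"
; (?x) lets the pattern contain whitespace; (?R) recurses into the whole pattern,
; so each match is one balanced [...] group. Flag 3 returns an array of all matches.
Local $avMatches = StringRegExp($sString, '(?x)(\[ (?: [^[\]]++ | (?R) )* \])', 3)

If IsArray($avMatches) Then _ArrayDisplay($avMatches)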

You know, now that I see your string in a code box, I see the pattern (without looking at your code). Then when I look at your pattern, I am thankful I was actually working before I attempted it (very nice).

Edited by SmOke_N
ebonics


#11 ·  Posted (edited)

This pattern is almost always the one shown wherever pattern recursion is explained. It's publicly available on PCRE's man page. It's the same problem as matching a function call's nested arguments; the only difference is that here you're trying to match nested array elements, to an arbitrary depth.

Edit: I guess a simple change would make it visible:

(1,0),((2,0),((0,0))),((3,0))

Another, non-recursive way is to increment a counter at each opening parenthesis and decrement it at each closing parenthesis. Take a calculator expression like 1+(2+3*(4-5)): at the first opening ( the counter goes to 1, then to 2, then back down to 1 and finally to 0, so it's a valid expression. The rest is tokenizing the language elements, I guess.
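The counter idea can be sketched in AutoIt roughly as follows. This is only an illustration, not code from the thread: _SplitBalanced() is a hypothetical helper name, and the sketch assumes an AutoIt version recent enough to allow zero-length arrays. It walks the string once, tracks the nesting depth, and cuts out a group every time the depth returns to zero.

```autoit
; Hypothetical helper (not from the thread): returns the top-level bracketed
; groups of $sText by counting nesting depth instead of using a regex.
; Assumes an AutoIt version that permits zero-length arrays.
Func _SplitBalanced($sText, $sOpen = "[", $sClose = "]")
    Local $aParts[0]
    Local $iDepth = 0, $iStart = 0, $sChar
    For $i = 1 To StringLen($sText)
        $sChar = StringMid($sText, $i, 1)
        If $sChar = $sOpen Then
            If $iDepth = 0 Then $iStart = $i ; a new top-level group starts here
            $iDepth += 1
        ElseIf $sChar = $sClose Then
            If $iDepth = 0 Then Return SetError(1, $i, 0) ; closer without an opener
            $iDepth -= 1
            If $iDepth = 0 Then
                ; depth is back at zero: the group that began at $iStart is complete
                ReDim $aParts[UBound($aParts) + 1]
                $aParts[UBound($aParts) - 1] = StringMid($sText, $iStart, $i - $iStart + 1)
            EndIf
        EndIf
    Next
    If $iDepth <> 0 Then Return SetError(2, 0, 0) ; an opener was never closed
    Return $aParts
EndFunc

Local $aGroups = _SplitBalanced("[1,0],[[2,0],[[0,0]]],[[3,0]]")
If Not @error Then
    For $i = 0 To UBound($aGroups) - 1
        ConsoleWrite($i & ": " & $aGroups[$i] & @CRLF)
    Next
EndIf
```

Unlike the regex, this version can also report unbalanced input through @error, which may matter if the string comes from an untrusted source.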

Edited by Authenticity


mm, I need to read better.

RECURSIVE PATTERNS

...

This PCRE pattern solves the nested parentheses problem (assume the PCRE_EXTENDED option is set so that white space is ignored):

\( ( [^()]++ | (?R) )* \)
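As a rough sketch of how the man-page pattern could be used from AutoIt, here it is applied to the parenthesised variant of the string from earlier in the thread. One adjustment is assumed to be needed: with flag 3, StringRegExp() returns the captured groups when the pattern has any, so the inner group is made non-capturing here and each array element is then a whole match. The (?x) prefix stands in for PCRE_EXTENDED.

```autoit
Local $sString = "(1,0),((2,0),((0,0))),((3,0))"

; No capturing groups in the pattern, so flag 3 returns every whole match.
; (?R) re-enters the entire pattern whenever a nested "(" is met.
Local $aMatches = StringRegExp($sString, '(?x) \( (?: [^()]++ | (?R) )* \)', 3)

If @error Then
    ConsoleWrite("No balanced group found" & @CRLF)
Else
    For $i = 0 To UBound($aMatches) - 1
        ConsoleWrite($i & ": " & $aMatches[$i] & @CRLF)
    Next
EndIf
```

For this input the matches should be (1,0), ((2,0),((0,0))) and ((3,0)).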


Trying to get recursion to work on something longer than one character.

Tried to substitute (?>[^U]+) with different things, like (?!REM+), but none of them did any capturing so far.

$sSource = 'aa X bb X cc Z bb Z aa X bb X cc Z bb Z aa'
$sPattern = "(?x)  (?'pn'  X  (?:  (?>[^XZ]+) | (?&pn)  )*  Z  )" ;; YES

$sSource = 'aa UV bb UV cc ZY bb ZY aa UV bb UV cc ZY bb ZY aa'
$sPattern = "(?x)  (?'pn'  UV  (?:  (?>[^UZ]+) | (?&pn)  )*  ZY  )" ;; YES

;~ $sSource = 'aa UV bb UV cc UY bb UY aa UV bb UV cc UY bb UY aa'
;~ $sPattern = "(?x)  (?'pn'  UV  (?:  (?>[^U]+) | (?&pn)  )*  UY  )" ;; YES: ???

$sSource = 'aa REMstart bb REMstart cc REMend bb REMend aa REMstart bb REMstart cc REMend bb REMend aa'
$sPattern = "(?x)  (?'pn'  REMstart  (?:  (?>[^R]+) | (?&pn)  )*  REMend  )" ;; YES: but unsafe.

;~ $sSource = 'aa REMstart bb REMstart cc REMend bb REMend aa REMstart bb REMstart cc REMend bb REMend aa'
;~ $sPattern = "(?x)  (?'pn'  REMstart  (?:  (?...???... ) | (?&pn)  )*  REMend  )" ;; ...

Target is of course #cs...#ce.


No wonder I can never get (\R) to work. It should be (?R)


George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


#include <Array.au3>

$sSource = 'aa REMstart bb REMstart cc REMend bb REMend ee REMstart ff REMstart gg REMend hh REMend ii'
$sPattern = "(?x)  (REMstart  (?: (?>(?s).*?(?=REMstart|REMend)) | (?1)  )*  REMend  )" ;; ...

$aMatches = StringRegExp($sSource, $sPattern, 3)
_ArrayDisplay($aMatches)

One thing I'm wondering now is why a #cs followed by another #cs requires two #ce. Shouldn't everything between #cs and #ce be commented out?
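For comparison, another way to handle multi-character delimiters is to consume one character at a time, but only while neither delimiter starts at the current position (a negative lookahead in front of a dot). The sketch below is an alternative under that assumption, not the pattern posted above; it reuses the same REMstart/REMend test string and relies on flag 3 returning whole matches because the pattern has no capturing groups.

```autoit
Local $sSource = 'aa REMstart bb REMstart cc REMend bb REMend ee REMstart ff REMstart gg REMend hh REMend ii'

; (?x) ignores pattern whitespace, (?s) lets the dot match line breaks.
; (?!REMstart|REMend) .   consumes one character that does not begin a delimiter
; (?R)                    recurses into the whole pattern for a nested block
Local $sPattern = '(?xs) REMstart (?: (?: (?!REMstart|REMend) . )++ | (?R) )* REMend'

Local $aBlocks = StringRegExp($sSource, $sPattern, 3)
If Not @error Then
    For $i = 0 To UBound($aBlocks) - 1
        ConsoleWrite($i & ": " & $aBlocks[$i] & @CRLF)
    Next
EndIf
```

For this input it should yield the two outermost blocks, each with its nested block still inside.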


#16 ·  Posted (edited)

One thing I'm wondering now is why a #cs followed by another #cs requires two #ce.

Not sure if you're asking me, as it sounds more like a question for the devs, as in "why do AutoIt block comments support nesting?".

I'm just following the behavior of AutoIt in this case. (Help: The #comments-start and #comments-end directives can be nested.)

Shouldn't everything between #cs and #ce be commented out?

?, It is. But only between matching #cs and #ce pairs.

Edit: actually from #cs to the next matching #ce OR end of document.

Edited by MvGulik

Unless it's been changed again, end of document isn't valid for #cs. I used to do that, then I started getting errors about it and had to add all the #ce's back in at the end of the document. I haven't tried it again since, and it's probably not good practice anyway.


George


End of document isn't valid for #cs.

Yep.

It was more an observation about syntax highlighting behavior.

- - -

On the recursive regular expression pattern part.

Code-wise I worked my way around it, but I'm still interested in whether it's possible, or not, to do direct recursive pattern matching for words.


So lookahead does not fit the situation?


So lookahead does not fit the situation?

Oops. I mistook the code in your previous message for a quote or something related to your question, not a possible solution. Sorry about that.

Your code does what I asked for. So it fits.

Thanks.
