steve8tch

RegExp help needed

18 posts in this topic

This really should be simple.

$str = "abcdefghijklmnopqrstuvwxyz"

I would like to pick out everything from the 'def' to the 'tuv' inclusive.

I have tried very hard to create a pattern, but the help file does not give a single example of how to create one. Sure, it gives you all the switches; it just doesn't tell you how to use them.

This is all very frustrating. This is a really powerful function, but it seems to me that unless you have used RegExp elsewhere, you have no chance of using it within Aut3. It's a shame.




I know you want to utilize RegExp... but you can use

$str = "abcdefghijklmnopqrstuvwxyz"
$str = StringTrimLeft($str, 3) ; drops "abc"
$str = StringTrimRight($str, 4) ; drops "wxyz"
; $str is now "defghijklmnopqrstuv"

8)




I am not an expert but the pattern is "(abc.*tuv)" :)


Thank you for your kind replies.

Do you think it would be helpful to have a few simple examples like this put in the help file?

I can't see from the help file how to use brackets, how to use different sets of bracket types together, or how to put different parts of a pattern together. Perhaps I'm just a little slow.....

Thanks again



That's a great idea. I have tried, and I really don't understand it either.

"perhaps I'm just a little slow....."

Not at all... you are trying to understand a difficult expression.

8)




I will try to get another example added. The one included only lets you check whether a pattern matches, without showing even one working pattern.

If you download the .zip installation file, you will find under bin\tests\regexp the scripts that Nutster uses for testing. Try "regexp test 4.au3" :)


I am maintaining the copy of Test RegExp 4.au3 in http://www.autoitscript.com/fileman/users/Nutster, so it is going to have my latest updates. Take a look at my other scripts in there as well.

The best pattern for this question using StringRegExp is "(def.*?tuv)", setting the flag to 1, to store the results in the returned array.
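
For anyone reading along, here is a minimal sketch of that suggestion. It assumes a current AutoIt release, where StringRegExp is PCRE-based and flag 1 returns an array of matches. The variable names $str, $res and $err are only illustrative.

```autoit
; Sketch: pull everything from 'def' through 'tuv' out of the string.
Local $str = "abcdefghijklmnopqrstuvwxyz"

; Flag 1 returns an array of matches; the parentheses form capture group 1.
; .*? is the lazy form of .* (match as few characters as possible).
Local $res = StringRegExp($str, "(def.*?tuv)", 1)
Local $err = @error ; save @error before calling anything else

If $err Then
    MsgBox(0, "StringRegExp", "No match, @error = " & $err)
Else
    MsgBox(0, "StringRegExp", $res[0]) ; the captured text: defghijklmnopqrstuv
EndIf
```

With only one 'tuv' in the string, a greedy .* would capture the same text here. The lazy form matters once 'tuv' can occur more than once and you want to stop at the first one.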

If someone in the documentation team wants to update/fix my regular expression docs, please do so. Drop me a line for any clarification necessary.


David Nuttall
Nuttall Computer Consulting

An Aquarius born during the Age of Aquarius

AutoIt allows me to re-invent the wheel so much faster.

I'm off to write a wizard, a wonderful wizard of odd...


Thank you for your replies.

I have been experimenting..

$str = "a b(cd)efghijklmnopqrstuv()wxyz"
$pattern = "(b\(cd\).*uv[()])"
$res = StringRegExp($str,$pattern,3)
MsgBox(0,"","@error = " & @error & @CRLF & "@extended = " & @extended)
For $r = 0 to UBound($res) - 1
    MsgBox(0,"",$res[$r])
Next

From what I can work out, to pick up a '(' or a ')' you need to place a '\' in front of it, i.e. '\(' or '\)'.

However, I can't join them together to find a '()'. If I use '\(\)', I get @error = 2.

If I use [()], I get no error and a pattern match, only it isn't quite right: the ')' is missing at the end.

Is there a better way of looking for a '()'?

Thanks



You see, I am not an expert: my pattern does not match "deftuv" the way Nutster's does.

I will push him to include a RegExp primer in his docs, not just a tool to test whether a match succeeds or not.



I have done some more experimenting. I can catch the required string by using the following pattern:

$str = "a b(cd)efghijklmnopqrstuv()wxyz"
$pattern = "(b\(cd\).*uv\(.+?)"
$res = StringRegExp($str,$pattern,3)
MsgBox(0,"","@error = " & @error & @CRLF & "@extended = " & @extended)
For $r = 0 to UBound($res) - 1
    MsgBox(0,"",$res[$r])
Next

This is cheating a bit. The pattern is saying: "catch everything up to the 'open bracket' part of the '()', then add one more character to catch the whole string". This would work for the string above, but not if I changed the string to

$str = "a b(cd)efghijklmnopqrsuv(tuv()wxyz"

So my question now is:

What pattern would we use to catch '()' (without the quotes)?

Thanks for your help



[()] tries to match with either '(' or ')', not both. '\(\)' should match '()'. I will investigate.
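
The difference between the two forms can be sketched as follows. This assumes the PCRE-based StringRegExp of current AutoIt releases, not the engine discussed in this thread (which returned @error = 2 for '\(\)'). The names $one and $both are made up for the illustration.

```autoit
Local $str = "a b(cd)efghijklmnopqrstuv()wxyz"

; [()] is a character class: it matches ONE character, either '(' or ')'.
Local $one = StringRegExp($str, "(uv[()])", 1)

; \(\) is two escaped literals in sequence: '(' immediately followed by ')'.
Local $both = StringRegExp($str, "(uv\(\))", 1)

If IsArray($one) And IsArray($both) Then
    ; $one[0] holds "uv(" and $both[0] holds "uv()"
    MsgBox(0, "Brackets", "[()] gives: " & $one[0] & @CRLF & "\(\) gives: " & $both[0])
EndIf
```

This is why the [()] version appeared to lose the closing ')': the class consumes exactly one bracket character and then the pattern ends.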


@Nutster - thank you for your help, much appreciated.

By the way, the speed at which this function appears to pull the data out of a string is astonishing. Very impressed. :)


You're welcome. I have spent a bunch of time (and a few false starts) optimizing it and still have a couple to go, so long as I don't break what is working in the process.


[()] tries to match with either '(' or ')', not both. '\(\)' should match '()'. I will investigate.

Has anyone figured it out yet?

I have got a string like this:

<TD class=body_text>&nbsp;FC Utrecht (+0)</TD>
<TD class=body_text>&nbsp;0</TD>
<TD class=body_text>&nbsp;V</TD>
<TD class=body_text>&nbsp;FC Porto</TD>
<TD class=body_text>&nbsp;0</TD></TR>
<TR></TR>
<TR>
<TD class=body_text>&nbsp;FC St.Pauli (+0)</TD>
<TD class=body_text>&nbsp;0</TD>
<TD class=body_text>&nbsp;V</TD>
<TD class=body_text>&nbsp;AGF Aarhus</TD>
<TD class=body_text>&nbsp;0</TD></TR>
<TR></TR>
<TR>
<TD class=body_text>&nbsp;AS St-Etienne (+0)</TD>
<TD class=body_text>&nbsp;0</TD>
<TD class=body_text>&nbsp;V</TD>
<TD class=body_text>&nbsp;AC Milan</TD>
<TD class=body_text>&nbsp;0</TD></TR>
<TR></TR>
...

As you can see there is "(+0)" in it.

I tried this pattern

(?i)<TD class=body_text>&nbsp;([- .a-z]*) \(+0\)</TD>\r\n<TD class=body_text>&nbsp;0</TD>\r\n<TD class=body_text>&nbsp;V</TD>\r\n<TD class=body_text>&nbsp;([- .a-z]*)</TD>\r\n<TD class=body_text>&nbsp;0</TD></TR>

with the "\" to escape the special characters "(" and ")" but somehow it is not working.

Does anybody have an idea?

Edited by kurtkafka


The problem is that + is considered a metacharacter (quantifier) so it's trying to match \( at least 1 time, then the pattern is trying to match 0\) immediately but fails. Try \(\+0\) instead.
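
A small sketch of the difference, tested against a single line of the sample rather than the full multi-line pattern. It assumes a current, PCRE-based StringRegExp where flag 0 returns 1 for a match and 0 for no match. The names $line, $bad and $good are illustrative.

```autoit
Local $line = "<TD class=body_text>&nbsp;FC Utrecht (+0)</TD>"

; Unescaped: \(+ means "one or more '('", so the literal '+' in the text
; is never consumed and the following 0\) cannot match.
Local $bad = StringRegExp($line, "\(+0\)", 0)

; Escaped: \+ is a literal plus sign, so the pattern reads '(' '+' '0' ')'.
Local $good = StringRegExp($line, "\(\+0\)", 0)

; $bad is 0 (no match), $good is 1 (match)
MsgBox(0, "Escaping +", "\(+0\) matched: " & $bad & @CRLF & "\(\+0\) matched: " & $good)
```

The same rule applies to the other quantifiers and metacharacters (*, ?, ., [, {, |, ^, $): put a '\' in front of one whenever it should be taken literally.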

Edited by Authenticity


\(\+0\) works perfectly.

Thumbs up!
