sohfeyr

RegExp - has anyone seen this library before?

137 posts in this topic

#121 ·  Posted

For those wanting to learn more about regular expressions, I have found an online copy of Mastering Regular Expressions, 2nd Edition here. I am reading it myself in an attempt to better understand regular expressions.

Thanks 'this-is-me' !

Great find. I'm surprised it's available online. O'Reilly normally doesn't do that.

I have their CD "The Perl CD Bookshelf", which has very good info.

ViM




#122 ·  Posted

No named capturing groups then? Too bad; maybe some day... I'll just be thrilled to finally see RegExps well-supported. :lmao:

Try this:

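$s = "Why not test this for yourself?"
; (?P<named>...) defines a named group; (?P=named) is a back-reference to it.
$p = "((?P<named>.\s.).+(?P=named))"
; Flag 3 returns an array of all (global) matches.
$b = StringRegExp($s, $p, 3)
For $i = 0 To UBound($b) - 1
    ConsoleWrite("!" & $b[$i] & "!" & @CRLF)
Next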

However, named groups don't seem, as yet, to work in StringRegExpReplace(). If you send Jon a bottle of champagne maybe he'll consider implementing that.
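For anyone without the beta at hand, the same pattern can be tried in another Perl-style engine. The following is only a sketch in Python (not AutoIt; the choice of language is my assumption), whose re module happens to accept the same (?P<name>...) and (?P=name) syntax. Results from StringRegExp() may differ in detail.

```python
import re

subject = "Why not test this for yourself?"

# (?P<named>...) captures "any char, whitespace, any char";
# (?P=named) requires the same text to appear again later in the subject.
pattern = re.compile(r"((?P<named>.\s.).+(?P=named))")

match = pattern.search(subject)
whole = match.group(1)        # text captured by the outer, numbered group
named = match.group("named")  # inner capture, fetched by name instead of number

# Python also allows named groups in replacement strings via \g<name>.
replaced = pattern.sub(r"[\g<named>]", subject)

print(whole)
print(named)
print(replaced)
```

Fetching a capture by name rather than by number is the main convenience: the pattern can gain or lose groups without the calling code having to renumber anything.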


#123 ·  Posted (edited)

I think I must be missing something in this implementation. (Thought about posting in the support forum, but since there were already similar posts in this thread... If it belongs there, fine, if it belongs here, fine.)

$ln = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'
$x = StringRegExp($ln,'\|"([^"]+)"\|',1)
$y = StringRegExp($ln,'\|([^|]+)\|',1)
For $n in $x
ConsoleWrite($n & @crlf)
Next
ConsoleWrite("----" & @crlf)
For $n in $y
ConsoleWrite($n & @crlf)
Next

The output:

»_1
----
33

I was expecting:

»_1
»_2
----
33
"»_1"
"»_2"

Any idea what's wrong? $x[1] and $y[1] both result in errors.

Modes 1 and 3 work as above, but 2 and 4 give me this:

Variable must be of type "Object".: 
For $n in $x 
For $n in $x^ ERROR
Edited by sohfeyr


#124 ·  Posted


#125 ·  Posted

$ln = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'
with the form of your data, StringSplit() with "|" as delimiter, would be much easier than RegExp...

Cheers

Kurt


__________________________________________________________
(l)user: Hey admin slave, how can I recover my deleted files?
admin: No problem, there is a nice tool. It's called rm, like recovery method. Make sure to call it with the "recover fast" option like this: rm -rf *


#126 ·  Posted

with the form of your data, StringSplit() with "|" as delimiter, would be much easier than RegExp...

Oh, I'm sorry - I forgot to say what I was trying to accomplish.

I want to be able to do things like:

CMD»FNC|Nparam|"Qparam|Rparam"|SParam

so that the returned groups are:

Nparam

Qparam|Rparam

SParam

With StringSplit, I'd get:

Nparam

"Qparam

Rparam"

SParam


#127 ·  Posted (edited)

I think I must be missing something in this implementation. (Thought about posting in the support forum, but since there were already similar posts in this thread... If it belongs there, fine, if it belongs here, fine.)

$ln = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'
$x = StringRegExp($ln,'\|"([^"]+)"\|',1)
$y = StringRegExp($ln,'\|([^|]+)\|',1)
For $n in $x
ConsoleWrite($n & @crlf)
Next
ConsoleWrite("----" & @crlf)
For $n in $y
ConsoleWrite($n & @crlf)
Next

Now for the bad news. PHP preg_match_all() returns the correct values (>>_1 and >>_2); however, StringRegExp() still returns just >>_1.

I guess there is still a problem with the global search of StringRegExp()!

@Jon. Could you please check that? BTW: Do you know PCRE Workbench? It helps a lot to test patterns!

http://www.renatomancuso.com/software/pcre...reworkbench.htm

Cheers

Kurt

Edited by /dev/null


#128 ·  Posted (edited)

Oh, I'm sorry - I forgot to say what I was trying to accomplish.

I want to be able to do things like:

CMD»FNC|Nparam|"Qparam|Rparam"|SParam

so that the returned groups are:

Nparam

Qparam|Rparam

SParam

O.K. in this case you can use this little tokenizer...

$ln = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'
$ln = 'CMD»FNC|Nparam|"Qparam|Rparam"|SParam'

Global $field_delimiter = '|'
Global $string_delimiter = '"'

$retarray = _tokenizer($ln)

For $n in $retarray
    MsgBox(0, "", $n)
Next

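Func _tokenizer($string)
    ; Split the line into single characters; element [0] holds the count.
    Local $chars = StringSplit($string, "")
    Local $fieldnr = 1
    Local $instring = False ; True while inside a quoted string
    Local $token = ""
    Local $token_array[2]

    For $i = 1 To UBound($chars) - 1
        Switch $chars[$i]
            Case $field_delimiter
                If Not $instring Then
                    ; An unquoted delimiter ends the current token.
                    $token_array[$fieldnr] = $token
                    $fieldnr = $fieldnr + 1
                    ReDim $token_array[$fieldnr + 1]
                    $token = ""
                Else
                    ; A delimiter inside quotes belongs to the token.
                    $token = $token & $chars[$i]
                EndIf

            Case $string_delimiter
                ; Toggle the quoted state; the quote character itself is kept.
                $instring = Not $instring
                $token = $token & $chars[$i]

            Case Else
                $token = $token & $chars[$i]
        EndSwitch
    Next

    ; Store the last token and put the token count in element [0].
    $token_array[$fieldnr] = $token
    $token_array[0] = $fieldnr
    Return $token_array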
EndFunc   ;==>tokenizer
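
If a regular expression is still preferred for this, one possible approach is to treat every field as "a pipe followed by either a quoted string or a run of non-pipe characters". The sketch below is in Python rather than AutoIt (my assumption, chosen only because it is easy to run), and the names FIELD and split_params are made up for illustration. Unlike the tokenizer above, it skips the command name before the first pipe and strips the surrounding quotes.

```python
import re

# Hypothetical pattern: each field starts with "|" and is either
#   "..."   a double-quoted string, which may itself contain "|", or
#   ...     a plain run of characters up to the next "|".
FIELD = re.compile(r'\|(?:"([^"]*)"|([^|]*))')


def split_params(line):
    """Return the parameters that follow the command, without surrounding quotes."""
    params = []
    for m in FIELD.finditer(line):
        quoted, plain = m.group(1), m.group(2)
        # Exactly one of the two alternatives took part in each match.
        params.append(quoted if quoted is not None else plain)
    return params


print(split_params('CMD»FNC|Nparam|"Qparam|Rparam"|SParam'))
```

The quoted alternative is listed first so that a field beginning with a quote is consumed as a whole, including any pipes inside it. A line that ends with a trailing pipe yields one extra empty string at the end of the list.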

Cheers

Kurt

Edited by /dev/null


#129 ·  Posted


#131 ·  Posted

I got just a single match in pcretest.exe too, so unless someone can understand why, I can't fix it. :lmao:

where can I download pcretest.exe?

I thought PCRE itself has no "global" option, which was the reason why you implemented StringRegExp() like php preg_match_all(). Isn't that correct?

BTW: PCRE Workbench returns ">>_1" for the simple Search and ">>_1" + ">>_2" for the Grep tool, at least I interpret it like that. Grep should be equal to the global search of StringRegExp.

Cheers

Kurt



#132 ·  Posted

where can I download pcretest.exe?

I thought PCRE itself has no "global" option, which was the reason why you implemented StringRegExp() like php preg_match_all(). Isn't that correct?

BTW: PCRE Workbench returns ">>_1" for the simple Search and ">>_1" + ">>_2" for the Grep tool, at least I interpret it like that. Grep should be equal to the global search of StringRegExp.

Cheers

Kurt

http://www.autoitscript.com/autoit3/files/beta/autoit/

Do global matches like perl:

/pattern/g


Uber promo code for money off the first ride: uberautoit

https://www.flickr.com/photos/jonathanbennett/ 


#133 ·  Posted (edited)

Do global matches like perl:

/pattern/g

Ah, O.K. then it works as it should. His first pattern was wrong...

NON global match:

re> /\|"([^"]+)"/
data> CallCommand»_EditLineReplace|33|"»_1"|"»_2"|
 0: |"\xaf_1"
 1: \xaf_1

global match:

re> /\|"([^"]+)"/g
data> CallCommand»_EditLineReplace|33|"»_1"|"»_2"|
 0: |"\xaf_1"
 1: \xaf_1
 0: |"\xaf_2"
 1: \xaf_2

EDIT: Strange, now it also works with AutoIt !???! It seems I somehow messed up the regexp pattern. Can somebody please check that?

$ln = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'
$x = StringRegExp($ln, '\|"([^"]+)"', 3)
$y = StringRegExp($ln, '\|([^|]+)\|', 1)
For $n in $x
    MsgBox(0, "", $n & @CRLF)
Next
MsgBox(0, "", "----" & @CRLF)
For $n in $y
    MsgBox(0, "", $n & @CRLF)
Next
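
One plausible reading of why the first pattern found only a single match even in a global search: the trailing \| consumes the pipe that the next field needs as its leading delimiter. This can be checked in any Perl-style engine. The sketch below uses Python's re.findall as a stand-in for a global match (an assumption, not AutoIt); the variable names are made up for illustration.

```python
import re

line = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'

# The trailing \| is consumed by the first match, so the second quoted
# field no longer has a leading "|" available to match against.
with_trailing_pipe = re.findall(r'\|"([^"]+)"\|', line)

# Dropping the trailing \| leaves every "|" free to start the next match.
without_trailing_pipe = re.findall(r'\|"([^"]+)"', line)

# A lookahead still checks for the closing "|" but does not consume it.
with_lookahead = re.findall(r'\|"([^"]+)"(?=\|)', line)

# The second pattern has the same problem: every other field is skipped.
plain_consuming = re.findall(r'\|([^|]+)\|', line)
plain_lookahead = re.findall(r'\|([^|]+)(?=\|)', line)

print(with_trailing_pipe)
print(without_trailing_pipe)
print(with_lookahead)
print(plain_consuming)
print(plain_lookahead)
```

The lookahead form is the one to reach for when the closing delimiter must be present but is also the opening delimiter of the following field.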

Cheers

Kurt

Edited by /dev/null


#134 ·  Posted


#135 ·  Posted

Ah, O.K. then it works as it should. His first pattern was wrong...

Thank you all for your help :ph34r:

Pride dictates I mention I tried several variations on both of those expressions before posting. I respect you guys waaay too much to waste your time on something I haven't already pounded on for a while.

I admit, though, that I didn't pay much attention to Mode 3 because I didn't immediately understand how it was different from Mode 1.

As long as the door is open for feedback on the help file - it would be nice if there was some explanation of the difference between a match and a global match. Perhaps comments could be added to the script in the help file to show the output, and using ConsoleWrite with tabs instead of MsgBox? Just a thought. :lmao:
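
To make the distinction concrete, here is the kind of commented example I have in mind, sketched in Python instead of AutoIt (my assumption, only because it is easy to run anywhere). A plain match stops at the first hit, a global match collects every hit, and stepping through with an offset gets the same result as the global match one hit at a time. The test string is the one from the help file sample; the case-insensitive option is passed as a flag here rather than inline.

```python
import re

text = "<test>a</test> <test>b</test> <test>c</Test>"
pattern = re.compile(r"<test>(.*?)</test>", re.IGNORECASE)

# Plain match: the search stops after the first hit.
first = pattern.search(text).group(1)

# Global match: every non-overlapping hit is collected.
all_hits = pattern.findall(text)

# The same result, one hit at a time, by resuming from the end of the last hit.
stepped = []
offset = 0
while True:
    m = pattern.search(text, offset)
    if m is None:
        break
    stepped.append(m.group(1))
    offset = m.end()  # continue searching right after this hit

print(first)
print(all_hits)
print(stepped)
```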

Oh, I almost forgot: when I run the sample from StringRegExp in the help file, I get:

>"C:\Program Files\AutoIt3\SciTE\AutoIt3Wrapper\AutoIt3Wrapper.exe" /run /beta /ErrorStdOut /in "C:\Program Files\AutoIt3\beta\Examples\Helpfile\StringRegExp.au3" /autoit3dir "C:\Program Files\AutoIt3\beta" /UserParams  
>Running AU3Check (1.54.4.0)  params:  from:C:\Program Files\AutoIt3\beta
C:\Program Files\AutoIt3\beta\Examples\Helpfile\StringRegExp.au3(4,113) : ERROR: StringRegExp() [built-in] called with wrong number of args.
$array = StringRegExp('<test>a</test> <test>b</test> <test>c</Test>', <(?i)test>(.*?)</(?i)test>', 1, $nOffset)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~^
C:\Program Files\AutoIt3\beta\Examples\Helpfile\StringRegExp.au3 - 1 error(s), 0 warning(s)
!>AU3Check ended.rc:2
>Running:(3.2.1.8):C:\Program Files\AutoIt3\beta\autoit3.exe "C:\Program Files\AutoIt3\beta\Examples\Helpfile\StringRegExp.au3" 
+>AutoIT3.exe ended.rc:0
>Exit code: 0   Time: 49.255

The message boxes do appear, and they (casually, at 11:30 at night) appear to be correct, so this looks like an AU3Check issue to me.


#136 ·  Posted

@Jon,

In @Nutster's RE code there was an option for

\# Position. Record the current character location in the test string into the returned content array.

I think this would just add the "offset value" to the returned array. At the moment we do not see the offset value when doing a global search, but it is used internally by AutoIt when building the array.

Would it be possible to add this back in? (Or is there another switch to get this info? I can't find it :lmao: )

eg

$str = 'abccabccabcc'

$ptn = '(cc)'

return

cc

cc

cc

old StringRegExp

$str = 'abccabccabcc'

$ptn = '(cc)\#'

return

cc

4

cc

8

cc

12 <-- bug in old version - this value was not returned
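
For comparison, this is the behaviour I am after, sketched in Python (an assumption, not AutoIt; it only illustrates the idea of returning each match together with the position just past it). The 0-based end index that Python reports happens to coincide with the numbers listed above.

```python
import re

subject = "abccabccabcc"

# Collect each captured group followed by the position just past the match.
results = []
for m in re.finditer(r"(cc)", subject):
    results.append(m.group(1))
    results.append(m.end())

print(results)
```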

Thanks again for your help

Background: I have a couple of scripts in production that use this figure to determine the order in which other parts of the script are processed. I now need to add further functionality to these scripts, and I am keen to use the newer RegExp engine; it is on average 10x as quick at processing regular expressions...


#137 ·  Posted

Great find. I'm surprised it's available online. O'Reilly normally doesn't do that.

Hmmmm... lemme see:

1) O'Reilly doesn't do that

2) it's not on O'Reilly's servers but on some obscure Asian server

3) "brought to you by TeamLib" which seems to be some warez group

If I were you, I wouldn't let O'Reilly see this link, could be their lawyers would not like it very much. :lmao:

Best regards

Marc


It's my job to comfort the disturbed and to disturb the comfortable.

