
# RegExp - has anyone seen this library before?

136 replies to this topic

### #121 vim

vim

Polymath

• Active Members
• 218 posts

Posted 06 October 2006 - 05:05 PM

For those wanting to learn more about regular expressions, I have found an online copy of Mastering Regular Expressions, 2nd Edition here. I am reading it myself in an attempt to better understand regular expressions.

Thanks 'this-is-me' !

Great find. I'm surprised it's available online. O'Reilly normally doesn't do that.
I have their CD "The Perl CD Bookshelf", which has very good info.

ViM

### #122 thomasl

thomasl

Wayfarer

• Active Members
• 63 posts

Posted 06 October 2006 - 08:14 PM

No named capturing groups then? Too bad; maybe some day... I'll just be thrilled to finally see RegExps well-supported.

Try this:

$s = "Why not test this for yourself?"
$p = "((?P<named>.\s.).+(?P=named))"
$b = StringRegExp($s, $p, 3)
For $i = 0 To UBound($b) - 1
    ConsoleWrite("!" & $b[$i] & "!" & @CRLF)
Next

However, named groups don't seem, as yet, to work in StringRegExpReplace(). If you send Jon a bottle of champagne maybe he'll consider implementing that.

### #123 sohfeyr

sohfeyr

Prodigy

• Active Members
• 194 posts

Posted 08 October 2006 - 07:09 PM

I think I must be missing something in this implementation. (Thought about posting in the support forum, but since there were already similar posts in this thread... If it belongs there, fine, if it belongs here, fine.)

$ln = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'
$x = StringRegExp($ln,'\|"([^"]+)"\|',1)
$y = StringRegExp($ln,'\|([^|]+)\|',1)
For $n in $x
    ConsoleWrite($n & @crlf)
Next
ConsoleWrite("----" & @crlf)
For $n in $y
    ConsoleWrite($n & @crlf)
Next

The output:
»_1
----
33

I was expecting:
»_1
»_2
----
33
"»_1"
"»_2"

Any idea what's wrong? $x[1] and $y[1] both result in errors.
Modes 1 and 3 work as above, but 2 and 4 give me this:
Variable must be of type "Object".:
For $n in $x
For $n in $x^ ERROR

Edited by sohfeyr, 08 October 2006 - 07:25 PM.

### #124 sohfeyr

sohfeyr

Prodigy

• Active Members
• 194 posts

Posted 08 October 2006 - 07:41 PM

Okay... I got
\|"([^"]+)"|\|([^|]+)

to work in RegExBuddy, but AutoIt is still only returning one capture, the 33
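
To illustrate what that alternation is expected to return under a global search, here is a minimal sketch using Python's `re` module rather than AutoIt. That is an assumption made for the sake of a runnable example: Python's engine is not PCRE, although this pattern only uses syntax the two share. The helper name `capture_fields` is hypothetical.

```python
import re

# Pattern from the post: a quoted field, or else an unquoted field.
PATTERN = re.compile(r'\|"([^"]+)"|\|([^|]+)')

def capture_fields(line):
    """Return the capture of every match, scanning the whole line."""
    fields = []
    for m in PATTERN.finditer(line):
        # Exactly one of the two groups participates in each match.
        fields.append(m.group(1) if m.group(1) is not None else m.group(2))
    return fields

print(capture_fields('CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'))
print(capture_fields('CMD»FNC|Nparam|"Qparam|Rparam"|SParam'))
```

Scanning for every match, the first line yields 33, »_1 and »_2, and a quoted field keeps any delimiter embedded in it.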

### #125 /dev/null

/dev/null

Universalist

• MVPs
• 2,946 posts

Posted 08 October 2006 - 07:52 PM

$ln = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'

with the form of your data, StringSplit() with "|" as delimiter, would be much easier than RegExp...

Cheers
Kurt

__________________________________________________________
(l)user: Hey admin slave, how can I recover my deleted files?
admin: No problem, there is a nice tool. It's called rm, like recovery method. Make sure to call it with the "recover fast" option like this: rm -rf *

### #126 sohfeyr

sohfeyr

Prodigy

• Active Members
• 194 posts

Posted 08 October 2006 - 08:09 PM

with the form of your data, StringSplit() with "|" as delimiter, would be much easier than RegExp...

Oh, I'm sorry - I forgot to say what I was trying to accomplish.
I want to be able to do things like:
CMD»FNC|Nparam|"Qparam|Rparam"|SParam

so that the returned groups are:
Nparam
Qparam|Rparam
SParam

With StringSplit, I'd get:
Nparam
"Qparam
Rparam"
SParam

### #127 /dev/null

/dev/null

Universalist

• MVPs
• 2,946 posts

Posted 08 October 2006 - 08:10 PM

sohfeyr, on Oct 8 2006, 09:09 PM, said:

I think I must be missing something in this implementation. (Thought about posting in the support forum, but since there were already similar posts in this thread... If it belongs there, fine, if it belongs here, fine.)

$ln = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'
$x = StringRegExp($ln,'\|"([^"]+)"\|',1)
$y = StringRegExp($ln,'\|([^|]+)\|',1)
For $n in $x
    ConsoleWrite($n & @crlf)
Next
ConsoleWrite("----" & @crlf)
For $n in $y
    ConsoleWrite($n & @crlf)
Next

Now for the bad news. PHP preg_match_all() returns the correct values (>>_1 and >>_2); however, StringRegExp() still returns just >>_1.

I guess there is still a problem with the global search of StringRegExp()!

@Jon. Could you please check that? BTW: Do you know PCRE Workbench? It helps a lot to test patterns!

http://www.renatomancuso.com/software/pcre...reworkbench.htm

Cheers
Kurt

Edited by /dev/null, 08 October 2006 - 10:31 PM.


### #128 /dev/null

/dev/null

Universalist

• MVPs
• 2,946 posts

Posted 08 October 2006 - 08:50 PM

Oh, I'm sorry - I forgot to say what I was trying to accomplish.
I want to be able to do things like:
CMD»FNC|Nparam|"Qparam|Rparam"|SParam

so that the returned groups are:
Nparam
Qparam|Rparam
SParam

O.K. in this case you can use this little tokenizer...

AutoIt
$ln = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'
$ln = 'CMD»FNC|Nparam|"Qparam|Rparam"|SParam'

Global $field_delimiter = '|'
Global $string_delimiter = '"'

$retarray = _tokenizer($ln)

For $n in $retarray
    MsgBox(0, "", $n)
Next

Func _tokenizer($string)
    Local $chars = StringSplit($string, "")
    Local $fieldnr = 1
    Local $instring = 0
    Local $token = ""
    Dim $token_array[2]

    For $i = 1 To UBound($chars) - 1
        Switch $chars[$i]
            Case $field_delimiter
                If Not $instring Then
                    $token_array[$fieldnr] = $token
                    $fieldnr = $fieldnr + 1
                    ReDim $token_array[$fieldnr + 1]
                    $token = ""
                Else
                    $token = $token & $chars[$i]
                EndIf

            Case $string_delimiter
                $instring = Not $instring
                $token = $token & $chars[$i]
            Case Else
                $token = $token & $chars[$i]
        EndSwitch
    Next
    $token_array[$fieldnr] = $token
    $token_array[0] = $fieldnr
    Return $token_array
EndFunc ;==>tokenizer

Cheers
Kurt

Edited by /dev/null, 08 October 2006 - 09:23 PM.

### #129 sohfeyr

sohfeyr

Prodigy

• Active Members
• 194 posts

Posted 08 October 2006 - 09:14 PM

O.K. in this case you can use this little tokenizer...

Thanks Kurt! I was hoping to do it using reg exps, but that'll do the job nicely for now.

### #130 Jon

Jon

Up all night to get lucky

• Administrators
• 10,630 posts

Posted 08 October 2006 - 09:30 PM

I got just a single match in pcretest.exe too, so unless someone can understand why, I can't fix it.

### #131 /dev/null

/dev/null

Universalist

• MVPs
• 2,946 posts

Posted 08 October 2006 - 09:39 PM

I got just a single match in pcretest.exe too, so unless someone can understand why I can't fix it.

where can I download pcretest.exe? I thought PCRE itself has no "global" option, which was the reason why you implemented StringRegExp() like php preg_match_all(). Isn't that correct?

BTW: PCRE Workbench returns ">>_1" for the simple Search and ">>_1" + ">>_2" for the Grep tool, at least I interpret it like that. Grep should be equal to the global search of StringRegExp.

Cheers
Kurt

### #132 Jon

Jon

Up all night to get lucky

• Administrators
• 10,630 posts

Posted 08 October 2006 - 09:44 PM

where can I download pcretest.exe? I thought PCRE itself has no "global" option, which was the reason why you implemented StringRegExp() like php preg_match_all(). Isn't that correct?
BTW: PCRE Workbench returns ">>_1" for the simple Search and ">>_1" + ">>_2" for the Grep tool, at least I interpret it like that. Grep should be equal to the global search of StringRegExp.

Cheers
Kurt

http://www.autoitscript.com/autoit3/files/beta/autoit/

Do global matches like perl: /pattern/g

### #133 /dev/null

/dev/null

Universalist

• MVPs
• 2,946 posts

Posted 08 October 2006 - 09:52 PM

Do global matches like perl: /pattern/g

Ah, O.K. then it works as it should. His first pattern was wrong...

NON global match:

re> /\|"([^"]+)"/
data> CallCommand»_EditLineReplace|33|"»_1"|"»_2"|
0: |"\xaf_1"
1: \xaf_1

global match:

re> /\|"([^"]+)"/g
data> CallCommand»_EditLineReplace|33|"»_1"|"»_2"|
0: |"\xaf_1"
1: \xaf_1
0: |"\xaf_2"
1: \xaf_2

EDIT: Strange, now it also works with AutoIT !???! It seems I somehow messed up the regexp pattern. Can somebody please check that?

$ln = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'
$x = StringRegExp($ln, '\|"([^"]+)"', 3)
$y = StringRegExp($ln, '\|([^|]+)\|', 1)
For $n in $x
    MsgBox(0, "", $n & @CRLF)
Next
MsgBox(0, "", "----" & @CRLF)
For $n in $y
    MsgBox(0, "", $n & @CRLF)
Next

Cheers
Kurt

Edited by /dev/null, 08 October 2006 - 09:59 PM.
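
One likely reason the first pattern found only a single quoted field even in a global search: its trailing `\|` consumes the delimiter that the next field needs as its leading `\|`. A minimal sketch of that effect, using Python's `re` module as a stand-in (an assumption; neither AutoIt's StringRegExp() nor pcretest is involved here):

```python
import re

line = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'

# Single (non-global) match: only the first hit is reported.
first = re.search(r'\|"([^"]+)"', line)
print(first.group(1))

# Global match with the original pattern: the trailing \| swallows the
# delimiter, so the second quoted field has no leading | left to match.
with_trailing = re.findall(r'\|"([^"]+)"\|', line)
print(with_trailing)

# Global match without the trailing \|: both quoted fields are found.
without_trailing = re.findall(r'\|"([^"]+)"', line)
print(without_trailing)

# A lookahead checks for the closing delimiter without consuming it.
with_lookahead = re.findall(r'\|"([^"]+)"(?=\|)', line)
print(with_lookahead)
```

The same consumption explains why `\|([^|]+)\|` skips every other field when applied globally.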


AutoIt lover

• Active Members
• 2,383 posts

Posted 08 October 2006 - 10:16 PM

/dev/null, on Oct 8 2006, 11:52 PM, said:

Ah, O.K. then it works as it should. His first pattern was wrong...

NON global match:

re> /\|"([^"]+)"/
data> CallCommand»_EditLineReplace|33|"»_1"|"»_2"|
0: |"\xaf_1"
1: \xaf_1

global match:

re> /\|"([^"]+)"/g
data> CallCommand»_EditLineReplace|33|"»_1"|"»_2"|
0: |"\xaf_1"
1: \xaf_1
0: |"\xaf_2"
1: \xaf_2

EDIT: Strange, now it also works with AutoIT !???! It seems I somehow messed up the regexp pattern. Can somebody please check that?

$ln = 'CallCommand»_EditLineReplace|33|"»_1"|"»_2"|'
$x = StringRegExp($ln, '\|"([^"]+)"', 3)
$y = StringRegExp($ln, '\|([^|]+)\|', 1)
For $n in $x
    MsgBox(0, "", $n & @CRLF)
Next
MsgBox(0, "", "----" & @CRLF)
For $n in $y
    MsgBox(0, "", $n & @CRLF)
Next

### #135 sohfeyr

sohfeyr

Prodigy

• Active Members
• 194 posts

Posted 09 October 2006 - 06:15 AM

Ah, O.K. then it works as it should. His first pattern was wrong...

Thank you all for your help. Pride dictates I mention I tried several variations on both of those expressions before posting. I respect you guys waaay too much to waste your time on something I haven't already pounded on for a while. I admit, though, that I didn't pay much attention to Mode 3 because I didn't immediately understand how it was different from Mode 1.

As long as the door is open for feedback on the help file - it would be nice if there was some explanation of the difference between a match and a global match. Perhaps comments could be added to the script in the help file to show the output, and using ConsoleWrite with tabs instead of MsgBox? Just a thought.

Oh, I almost forgot: when I run the sample from StringRegExp in the help file, I get:

>"C:\Program Files\AutoIt3\SciTE\AutoIt3Wrapper\AutoIt3Wrapper.exe" /run /beta /ErrorStdOut /in "C:\Program Files\AutoIt3\beta\Examples\Helpfile\StringRegExp.au3" /autoit3dir "C:\Program Files\AutoIt3\beta" /UserParams
>Running AU3Check (1.54.4.0) params: from:C:\Program Files\AutoIt3\beta
C:\Program Files\AutoIt3\beta\Examples\Helpfile\StringRegExp.au3(4,113) : ERROR: StringRegExp() [built-in] called with wrong number of args.
$array = StringRegExp('<test>a</test> <test>b</test> <test>c</Test>', <(?i)test>(.*?)</(?i)test>', 1, $nOffset)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
C:\Program Files\AutoIt3\beta\Examples\Helpfile\StringRegExp.au3 - 1 error(s), 0 warning(s)
!>AU3Check ended.rc:2
>Running:(3.2.1.8):C:\Program Files\AutoIt3\beta\autoit3.exe "C:\Program Files\AutoIt3\beta\Examples\Helpfile\StringRegExp.au3"
+>AutoIT3.exe ended.rc:0
>Exit code: 0 Time: 49.255

The message boxes do appear, and they (casually, at 11:30 at night) appear to be correct,
so this looks like an AU3Check issue to me.

### #136 steve8tch

steve8tch

Universalist

• Active Members
• 291 posts

Posted 09 October 2006 - 10:03 AM

@Jon,
In @Nutsters RE code there was an option for

\# Position. Record the current character location in the test string into the returned content array.

I think this would just add the "offset value" to the returned array. At the moment we would not see the offset value when doing a global search, but it is used internally by Autoit when building the array. Would it be possible to add this back in. (or is there another switch to get this info - I can't find it )

eg
$str = 'abccabccabcc'
$ptn = '(cc)'
return
cc
cc
cc

old StringRegExp
$str = 'abccabccabcc'
$ptn = '(cc)\#'
return
cc
4
cc
8
cc
12 <-- bug in old version - this value was not returned

Background. I have a couple of scripts in production that use this figure to determine the order in which other parts of the script are processed. I now need to add further functionality to these scripts and I am keen to use the newer RegExp engine - it's on average 10x as quick at processing regular expressions...
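
For comparison, the kind of output described above (each capture followed by the position just past it) can be sketched with Python's `re` module. This is only an illustration under assumptions: it does not show how AutoIt's old `\#` option worked internally, and the helper name `captures_with_offsets` is hypothetical.

```python
import re

def captures_with_offsets(pattern, text):
    """Return each group-1 capture followed by the offset just past it."""
    result = []
    for m in re.finditer(pattern, text):
        result.append(m.group(1))
        result.append(m.end(1))  # 0-based index just past the capture
    return result

print(captures_with_offsets(r'(cc)', 'abccabccabcc'))
```

For the sample string this gives cc, 4, cc, 8, cc, 12, the same numbers as in the listing above.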

### #137 Marc

Marc

Prodigy

• Active Members
• 188 posts

Posted 09 October 2006 - 01:11 PM

Great find. I'm surprised it's available online. O'Reilly normally doesn't do that.

Hmmmm... lemme see:

1) O'Reilly doesn't do that
2) it's not on O'Reilly's servers but on some obscure Asian server
3) "brought to you by TeamLib", which seems to be some warez group

If I were you, I wouldn't let O'Reilly see this link, could be their lawyers would not like it very much.

Best regards
Marc
It's my job to comfort the disturbed and to disturb the comfortable.
