Regular Expression Testing


138 replies to this topic

#21 Matt @ MPCS

Matt @ MPCS

    Just another AutoIt user trying to help out! :)

  • Active Members
  • 700 posts

Posted 03 November 2004 - 04:45 PM

Alright, it was a long shot. Thanks for all your work Nutster!

*** Matt @ MPCS







#22 trids

trids

    Hmmm .. and what have we here?

  • Active Members
  • 1,004 posts

Posted 03 November 2004 - 05:52 PM

> [..]
> Should not need the brackets.
>
> $sTarget = "jan|feb"   ; finds "jan" or "feb"

Without grouping, wouldn't that find "janeb" and "jafeb" .. ? :)

#23 Nutster

Nutster

    Developer at Large

  • Developers
  • 1,450 posts

Posted 03 November 2004 - 06:16 PM

> Without grouping, wouldn't that find "janeb" and "jafeb" .. ? :)

From what I read on the regexp page that was posted earlier, | tries to match the largest groups, not just single characters. I will try to do it without the brackets; if I cannot, then the brackets will be needed to enforce the locality.

David Nuttall

Nuttall Computer Consulting

An Aquarius born during the Age of Aquarius
AutoIt allows me to re-invent the wheel so much faster.

I'm off to write a wizard, a wonderful wizard of odd...
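
For reference, here is a minimal sketch of the precedence question raised above. It uses Python's `re` module purely as an illustration, so it is an assumption that the AutoIt RegExp build under discussion behaves the same way. In most regex flavors alternation binds more loosely than concatenation, so "jan|feb" means "jan" or "feb", and grouping is what narrows the choice to a single position. The helper name `full` is made up for this example.

```python
import re

# Alternation has the lowest precedence: this is (jan) or (feb).
ungrouped = re.compile(r"jan|feb")
# Grouping restricts the choice to one position: ja + (n or f) + eb.
grouped = re.compile(r"ja(n|f)eb")


def full(pattern, text):
    """Return True when the whole of text matches pattern."""
    return pattern.fullmatch(text) is not None


for word in ("jan", "feb", "janeb", "jafeb"):
    print(word, full(ungrouped, word), full(grouped, word))
```
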


#24 trids


Posted 03 November 2004 - 06:22 PM

Aaaah! :) .. so to find "jafeb" and "janeb", it would be something like "ja(n|f)eb" .. ?

Hmm - that makes sense too. Anyway, as you say, later for that :)

#25 Nutster


Posted 03 November 2004 - 07:50 PM

> Aaaah! :) .. so to find "jafeb" and "janeb", it would be something like "ja(n|f)eb" .. ?
>
> Hmm - that makes sense too. Anyway, as you say, later for that :)

I would use "ja[fn]eb", which is working! ;)

Edited by Nutster, 03 November 2004 - 07:52 PM.



#26 trids


Posted 04 November 2004 - 06:08 AM

:)
Brilliant!
:)

#27 sugi

sugi

    Universalist

  • Active Members
  • 441 posts

Posted 04 November 2004 - 10:39 AM

> I would use "ja[fn]eb", which is working! :)

It's still different from "jan|feb".
"jan|feb" will match:
jan
feb
xjanuary
xfebruary
but "ja[nf]eb" will not match the above examples.

> As an alternative method to implementing the OR that way, you could always invert the code you use for exclusion sets ("[^a-zA-Z]") to make an inclusion set. Just a thought. I know I have seen it implemented like that before somewhere.

Yes, it's sometimes implemented like this, but that's a bug in the expression.
"[jf][ae][nb]" will match:
jan <- that's wanted
feb <- that's wanted
jeb <- that's *NOT* wanted
jab <- that's *NOT* wanted
fan <- that's *NOT* wanted
and there's not really a way to exclude the last three matches without using the pipe.

Edited by sugi, 04 November 2004 - 10:43 AM.
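
The three patterns compared in this post can be checked side by side. This is only a sketch in Python's `re` module, used as a stand-in on the assumption that unanchored searching works the same way in the AutoIt build. The sample list and the helper `hits` are made up for illustration.

```python
import re

samples = ["jan", "feb", "xjanuary", "xfebruary", "jeb", "jab", "fan", "janeb"]

patterns = {
    "jan|feb": re.compile(r"jan|feb"),
    "ja[nf]eb": re.compile(r"ja[nf]eb"),
    "[jf][ae][nb]": re.compile(r"[jf][ae][nb]"),
}


def hits(pattern):
    """Return the samples in which pattern is found anywhere (unanchored)."""
    return [s for s in samples if pattern.search(s)]


results = {name: hits(p) for name, p in patterns.items()}
for name, matched in results.items():
    print(name, "->", matched)
```

Only the pipe version accepts "jan" and "feb" while rejecting "jeb", "jab" and "fan". The per-character sets accept every sample, and "ja[nf]eb" accepts only "janeb".
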


#28 Nutster


Posted 09 November 2004 - 07:01 PM

I have added \x for hexadecimal digits (tested OK), and {x,y} for a specific range of repeats. It actually simplified the code from what I had. * is defined as {0,}, + is {1,} and ? is {0,1}.

Testing continues tomorrow after work.
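
The quantifier definitions above can be sanity-checked with a short sketch. It uses Python's `re` module, which happens to support the same {x,y} notation, and makes no claim about how the AutoIt code implements it. The sample strings and the helper `matches` are made up for this example.

```python
import re

# Each shorthand quantifier paired with its spelled-out {min,max} form.
EQUIVALENTS = [
    ("ab*c", "ab{0,}c"),
    ("ab+c", "ab{1,}c"),
    ("ab?c", "ab{0,1}c"),
]
SAMPLES = ["ac", "abc", "abbc", "abbbc", "xyz"]


def matches(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in SAMPLES if re.fullmatch(pattern, s)]


for shorthand, spelled_out in EQUIVALENTS:
    print(shorthand, spelled_out, matches(shorthand) == matches(spelled_out))
```
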



#29 trids


Posted 10 November 2004 - 08:56 AM

Sounds good :) .. shout if you want any help testing pre-release versions.

#30 Nutster


Posted 24 November 2004 - 03:00 PM

Finally! The testing is complete and I have submitted to Jon.



#31 trids


Posted 24 November 2004 - 03:15 PM

:) .. Magic! Can't wait! :)

#32 Josbe

Josbe

    Infrequent ghost ☺

  • Active Members
  • 1,585 posts

Posted 24 November 2004 - 04:37 PM

> :) .. Magic! Can't wait! :)

Me too! ;)

#33 SlimShady

SlimShady

    AutoIt lover

  • Active Members
  • 2,383 posts

Posted 25 November 2004 - 10:38 AM

I haven't used regular expressions much in my life.
But I will use it if I understand the syntax.
Anyway. In Crimson Editor I used reg exp to replace a string.
Crimson Editor has an operator \0.
Which stands for "everything matched".
I was wondering if you added that.
On second thought... I'm not sure if it makes sense to add it.

#34 Nutster


Posted 25 November 2004 - 06:36 PM

> I haven't used regular expression much in my life.
> But I will use it if I understand the syntax.
> Anyway. In Crimson Editor I used reg exp to replace a string.
> Crimson Editor has an operator \0.
> Which stands for "everything matched".
> I was wondering if you added that.
> On second thought... I'm not sure if it makes sense to add it.

You can store groups in an array, and I have also added \#, which stores the current cursor position. I have some plans as to what goes in next, and that can be added to the RegExp TODO list.
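
To illustrate the difference between "everything matched" and the stored groups, here is a small sketch in Python's `re` module. It is an analogy only: Python exposes the whole match as group 0 and the captured groups separately, and nothing here is the AutoIt RegExp() interface. The sample strings are made up.

```python
import re

match = re.search(r"(\d{2}) (\w+) (\d{4})", "Posted 25 November 2004 - 06:36 PM")

# Group 0 is the whole matched text, comparable to Crimson Editor's \0.
whole = match.group(0)
# The parenthesised groups come back as a separate sequence.
parts = list(match.groups())

# In a replacement string, \g<0> refers back to the whole match.
wrapped = re.sub(r"\d{4}", r"[\g<0>]", "November 2004")

print(whole)
print(parts)
print(wrapped)
```
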



#35 Jon

Jon

    Up all night to get lucky

  • Administrators
  • 10,292 posts

Posted 25 November 2004 - 09:51 PM

After adding the code I must admit I thought it was going to be more like PHP (my only real exposure to it). All the PHP code I've seen seems to rely completely on a couple of functions:

preg_match() (Which seems to do what RegExp() does - just match)

and
preg_replace() which seems to be used in nearly every line of php like this:

// Ensure that spacing is preserved
$txt = preg_replace("/\t/", "&nbsp;&nbsp;&nbsp;&nbsp;", $txt);
$txt = preg_replace( "#\s{2}#", " &nbsp;", $txt );

Is this possible? Seems very useful.

I wasn't keen on the Set/Close functions either - is there really enough of a performance hit to require these? If it's not a massive diff then I'll probably remove.
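
For comparison, the two preg_replace() calls quoted above would look roughly like this as a match-and-replace function. This is a sketch in Python using `re.sub`, not a proposal for the AutoIt syntax. The function name `preserve_spacing` is made up, and the PHP pattern delimiters ("/" and "#") are dropped because Python does not use them.

```python
import re


def preserve_spacing(txt):
    """Mirror the two PHP preg_replace() calls: keep tabs and double spaces visible in HTML."""
    # Each tab becomes four non-breaking spaces.
    txt = re.sub(r"\t", "&nbsp;" * 4, txt)
    # Each run of exactly two whitespace characters becomes a space plus one &nbsp;.
    txt = re.sub(r"\s{2}", " &nbsp;", txt)
    return txt


print(preserve_spacing("a\tb"))
print(preserve_spacing("a  b"))
```
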

#36 Nutster


Posted 26 November 2004 - 06:28 PM

> After adding the code I must admit I thought it was going to be more like php. (my only real exposure to it). All the php code I've seen seems to rely completely on a couple of functions:
>
> preg_match() (Which seems to do what RegExp() does - just match)
>
> and
> preg_replace() which seems to be used in nearly every line of php like this:
>
> // Ensure that spacing is preserved
> $txt = preg_replace("/\t/", "&nbsp;&nbsp;&nbsp;&nbsp;", $txt);
> $txt = preg_replace( "#\s{2}#", " &nbsp;", $txt );
>
> Is this possible? Seems very useful.
>
> I wasn't keen on the Set/Close functions either - is there really enough of a performance hit to require these? If it's not a massive diff then I'll probably remove.

RegExpReplace() goes on the RegExp TO DO list. As far as the set and close functions, I will benchmark them this weekend. I could adjust it to cache the last few regular expressions, instead.



#37 trids


Posted 28 November 2004 - 03:25 PM

Thanks for the shiny new toys, Nutster & Jon :)

I downloaded the RegExp version on Sunday, and here's some initial feedback..
  • The help file says that the 3rd parameter in RegExp(), which identifies a variable that will receive hits, will be created in the same scope as DIM if it does not exist already: I couldn't get it to create such a variable if it didn't already exist, and always had to define it up-front.
  • Suggestion: could the array of hits that is returned by RegExp() please have an element with index=0, which indicates the highest index (like StringSplit() does)? Makes it easier to process the array.
  • The RegExp topic in the helpfile has hyperlinks that don't work: to RegExpSet and DIM topics. The only ones that do work are those under the "Related" paragraph.
  • There is no tab expression ("\t")..?
  • The Example for RegExpClose() is wrong.
HTH :)

#38 Nutster


Posted 29 November 2004 - 05:26 PM

> As far as the set and close functions, I will benchmark them this weekend. I could adjust it to cache the last few regular expressions, instead.

I have done the benchmarking. The time savings are under 5%. I would say, then, to scrap the RegExpSet and RegExpClose functions and have F_RegExp cache the last few expressions instead.
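
The caching idea can be sketched as a small least-recently-used store of compiled patterns. This is written in Python for illustration only and says nothing about how F_RegExp is actually implemented. The class name `PatternCache`, the function `regexp` and the capacity value are all assumptions made for this example.

```python
import re
from collections import OrderedDict


class PatternCache:
    """Keep the most recently used compiled patterns, evicting the oldest."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self._compiled = OrderedDict()

    def get(self, pattern):
        if pattern in self._compiled:
            # Cache hit: mark the pattern as most recently used.
            self._compiled.move_to_end(pattern)
            return self._compiled[pattern]
        # Cache miss: compile, store, and evict the oldest entry if over capacity.
        compiled = re.compile(pattern)
        self._compiled[pattern] = compiled
        if len(self._compiled) > self.capacity:
            self._compiled.popitem(last=False)
        return compiled

    def __len__(self):
        return len(self._compiled)


_cache = PatternCache(capacity=2)


def regexp(text, pattern):
    """Match-only helper: no explicit set or close step is needed by the caller."""
    return _cache.get(pattern).search(text) is not None


print(regexp("xjanuary", "jan|feb"))
```

The caller passes the pattern on every call, and repeated patterns skip the compile step without any explicit set or close call.
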



#39 Nutster


Posted 29 November 2004 - 05:31 PM

> Thanks for the shiny new toys, Nutster & Jon :)
>
> I downloaded the RegExp version on Sunday, and here's some initial feedback..
>
>   • The help file says that the 3rd parameter in RegExp(), which identifies a variable that will receive hits, will be created in the same scope as DIM if it does not exist already: I couldn't get it to create such a variable if it didn't already exist, and always had to define it up-front.
>   • Suggestion: could the array of hits that is returned by RegExp() please have an element with index=0, which indicates the highest index (like StringSplit() does)? Makes it easier to process the array.
>   • The RegExp topic in the helpfile has hyperlinks that don't work: to RegExpSet and DIM topics. The only ones that do work are those under the "Related" paragraph.
>   • There is no tab expression ("\t")..?
>   • The Example for RegExpClose() is wrong.
> HTH :)

Um let's see.
  • I will check. That's what I get for always turning on "MustDeclareVars". ;)
  • That is what UBound is for. I personally dislike that feature of StringSplit.
  • RegExpSet is coming out. When I rewrite the docs, I will try to build the correct links to the Keywords.
  • Didn't think of tab. Should be easy enough to add.
  • RegExpClose is coming out.



#40 Lazycat

Lazycat

    Coding cat

  • MVPs
  • 1,174 posts

Posted 30 November 2004 - 07:07 AM

I began testing regexp yesterday, and I can say this is a very nice thing! But I'm no regexp guru, so I have one problem. It should be possible to explicitly match the "." character. I suppose that should be "\.". But all my attempts fail :) Here is the code (I'm trying to implement a simple file mask equivalent):

$line = "C:\Documents and Settings\User\NTUSER.DAT"
If RegExp($line, '[.]*\.DAT$') Then            ; *.dat
    Msgbox(0, "RegExp", "Pattern found")
Else
    Msgbox(0, "RegExp", "Pattern NOT found")
Endif


But this example works:

$line = "C:\Documents and Settings\User\NTUSER\DAT"
If RegExp($line, '[.]*\\DAT$') Then            ; *.dat
    Msgbox(0, "RegExp", "Pattern found")
Else
    Msgbox(0, "RegExp", "Pattern NOT found")
Endif


So escaping "." is not working?
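
As a point of comparison, the same patterns can be tried in another regex engine. The sketch below uses Python's `re` module, where "\." and "[.]" both mean a literal dot, so the first pattern does find ".DAT" there. That only shows what the conventional semantics are; it is an assumption that the AutoIt build is meant to behave identically. Note that in conventional syntax "[.]*" means "zero or more literal dots", so a file mask like *.dat would normally be written ".*\.DAT$". The variable names are made up for this example.

```python
import re

line = r"C:\Documents and Settings\User\NTUSER.DAT"

# The pattern from the post: "[.]*" is zero or more literal dots, "\." one literal dot.
literal_dot = re.search(r"[.]*\.DAT$", line)

# Conventional spelling of the *.dat mask: any characters, a literal dot, then DAT.
mask = re.search(r".*\.DAT$", line)

# A path ending in "\DAT" has no literal dot before DAT, so "\.DAT$" must not match.
other = re.search(r"\.DAT$", r"C:\Documents and Settings\User\NTUSER\DAT")

print(literal_dot.group(0))
print(mask.group(0))
print(other)
```
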
Koda homepage (http://www.autoitscript.com/fileman/users/lookfar/formdesign.html) (Bug Tracker)
My Autoit script page (http://www.autoitscript.com/fileman/users/Lazycat/)



