Thomymaster

Help on regexp for _ArrayFindAll

24 posts in this topic

#1 ·  Posted (edited)

Hi

I have some code where I want to search an array for specific patterns:

#include <Array.au3>

Local $aTest[6] = ["0001-1.jpg", "0001-2.jpg", "0001-3.jpg", "0001-5.jpg", "alt 0001-11.jpg", "0001-6.jpg"]
_ArrayDisplay($aTest)
Local $aResult = _ArrayFindAll($aTest, "^[\d].*$", 0, 0, 0, 3)
_ArrayDisplay($aResult)

The results should not include "alt 0001-11.jpg", which they currently do.

The "0001" comes from a variable $sPrefix

To make the search more failure-tolerant, I only want the results that match:

<$sPrefix: unlimited number of numeric characters from 0 to 9>-<only one numeric character from 0 to 9>.jpg

I cannot figure out how to construct the regexp; my problem is the "-", which needs to sit between the 2 numeric values, as well as the ".jpg" at the end.

Additionally, I don't know how to integrate the variable into the regexp.

 

Best,

Thomas

Edited by Thomymaster




"(?i)^" & $sPrefix & "\-\d\.jpg"

Matches any lower- or uppercase jpg combinations as well.
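
As a minimal sketch of how that pattern could be plugged into the original call: the `\Q...\E` wrapper and the trailing `$` are additions that are not in the post above. The wrapper is an assumption for the case where `$sPrefix` might contain regex metacharacters, and the `$` rejects names with extra text after ".jpg".

```autoit
#include <Array.au3>

Local $aTest[6] = ["0001-1.jpg", "0001-2.jpg", "0001-3.jpg", "0001-5.jpg", "alt 0001-11.jpg", "0001-6.jpg"]
Local $sPrefix = "0001"

; \Q...\E makes PCRE treat the prefix as literal text (an addition, see above).
; The trailing $ is also an addition: nothing may follow ".jpg".
Local $sPattern = "(?i)^\Q" & $sPrefix & "\E-\d\.jpg$"

; The last argument (3) tells _ArrayFindAll to treat $sPattern as a regular expression
Local $aResult = _ArrayFindAll($aTest, $sPattern, 0, 0, 0, 3)
If @error Then
    ConsoleWrite("No element matched" & @CRLF)
Else
    ; $aResult holds the indexes of the matching elements, not the strings themselves
    _ArrayDisplay($aResult)
EndIf
```

On the sample array this should leave out the index of "alt 0001-11.jpg", because `^` requires the prefix to start the string.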


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


Something like this should work, try it:
 

Local $aTest[7] = ["0001-1.jpg", "0001-2.jpg","0001-3.jpg","0001-5.jpg", "alt 0001-11.jpg", "0001-6.jpg", "0001-25.JPG"]
_ArrayDisplay($aTest)
Local $aResult=_ArrayFindAll($aTest,"^[\d].*\-(\d)?\.jpg$",0,0,0,3)
_ArrayDisplay($aResult)

Regards
Alien.


#4 ·  Posted (edited)

alien4u,
A little trouble using your regex  :)

#Include <Array.au3>

Local $aTest[5] = ["test", _ 
            "0cheers-0.jpg-0.jpg", _ 
            "0001 " & @tab & " -2.jpg", _ 
            "0 oh my god -3.jpg", _ 
            "4 : where is the number ? -.jpg"]
Local $aResult=_ArrayFindAll($aTest,"^[\d].*\-(\d)?\.jpg$",0,0,0,3)
_ArrayDisplay($aResult)

 

Edited by mikell

1 hour ago, mikell said:

alien4u,
A little trouble using your regex  :)

#Include <Array.au3>

Local $aTest[4] = ["test", "0cheers-0.jpg-1.jpg", "0001 " & @tab & " -2.jpg", "0 oh my god -3.jpg"]
Local $aResult=_ArrayFindAll($aTest,"^[\d].*\-(\d)?\.jpg$",0,0,0,3)
_ArrayDisplay($aResult)

 

You are right, I just built the regex based on the information provided by @Thomymaster

- <$sPrefix: unlimited number of numeric characters from 0 to 9>-<only one numeric character from 0 to 9>.jpg

Regards
Alien.


#6 ·  Posted (edited)

Anyway, this one is better, also based on what the user requested:
 

Local $aResult=_ArrayFindAll($aTest,"^[\d].*\-(\d)?(\.jpg)$|^[\d].*\-(\d)?(\.JPG)$",0,0,0,3)

Or
 

Local $aResult=_ArrayFindAll($aTest,"^[\d]*\-(\d)?(\.jpg)$|^[\d]*\-(\d)?(\.JPG)$",0,0,0,3)

Regards
Alien.

Edited by alien4u


You can avoid the alternation by using (?i) like I did. Also, all of you seem to forget the purpose of the variable $sPrefix.



alien4u,
A little more effort and you will arrive at jchd's expression 
:rolleyes:

Quote

also based on what the user requested:

And what is the user's request?

Quote

<$sPrefix: unlimited number of numeric characters from 0 to 9>-<only one numeric character from 0 to 9>.jpg

So
<$sPrefix: unlimited number of numeric characters from 0 to 9> => \d+  or  $sPrefix
- => \-
<only one numeric character from 0 to 9> => \d
.jpg => \.jpg

that means : "^\d+\-\d\.jpg"  or  "^" & $sPrefix & "\-\d\.jpg" which is exactly the regex from jchd, if you add the needed (?i) for case insensitivity

:)
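
A small sketch of that breakdown, assembling the pattern from the four pieces and trying it on a few of the strings used earlier in the thread. The variable names and the trailing `$` are assumptions added for illustration; they are not part of the posts above.

```autoit
; Build the pattern from the four pieces listed above
Local $sDigits = "\d+"   ; prefix: one or more digits (or substitute $sPrefix)
Local $sDash = "-"       ; literal dash (escaping it as \- is optional here)
Local $sOne = "\d"       ; exactly one digit
Local $sExt = "\.jpg"    ; literal ".jpg"

; (?i) makes the match case-insensitive; the trailing $ is an added end anchor
Local $sPattern = "(?i)^" & $sDigits & $sDash & $sOne & $sExt & "$"

Local $aSamples[5] = ["0001-1.jpg", "0001-25.JPG", "0cheers-0.jpg-0.jpg", "0 oh my god -3.jpg", "4 : where is the number ? -.jpg"]

; With the default flag, StringRegExp returns 1 for a match and 0 otherwise
For $i = 0 To UBound($aSamples) - 1
    ConsoleWrite($aSamples[$i] & " => " & StringRegExp($aSamples[$i], $sPattern) & @CRLF)
Next
```

Only the first sample should match: without `.*` in the pattern, nothing but digits may appear before the dash, and exactly one digit may follow it.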


Those rules are also doable without regexp :)

#include <Array.au3>

Local $aTest[6] = ["0001-1.jpg", "0001-2.jpg", "0001-300.jpg", "0001-5.jpg", "alt 000111.jpg", "0001-6.jpg"]

Local $sPrefix = "0001"

; Walk backwards so that deleting an element does not shift the indexes still to be checked
For $i = UBound($aTest) - 1 To 0 Step -1
    If StringLeft($aTest[$i], 4) <> $sPrefix Or StringMid($aTest[$i], 5, 1) <> "-" Or StringRight($aTest[$i], 4) <> ".jpg" Or StringLen($aTest[$i]) <> 10 Then _ArrayDelete($aTest, $i)
Next

_ArrayDisplay($aTest)

 


,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

1 hour ago, mikell said:

alien4u,
A little more effort and you will arrive at jchd's expression 
:rolleyes:

And what is the user's request?

So
<$sPrefix: unlimited number of numeric characters from 0 to 9> => \d+  or  $sPrefix
- => \-
<only one numeric character from 0 to 9> => \d
.jpg => \.jpg

that means : "^\d+\-\d\.jpg"  or  "^" & $sPrefix & "\-\d\.jpg" which is exactly the regex from jchd, if you add the needed (?i) for case insensitivity

:)

Completely right. I learned more about regex today, thanks to you.
:)

Regards
Alien.

1 hour ago, iamtheky said:

Those rules are also doable without regexp :)

I agree... if you add

OR (Not StringIsDigit(StringMid($aTest[$i], 6, 1)))

so "0001-a.jpg" will not match  :D

... but that means 6 func calls against one !
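
For comparison, a sketch of the non-regexp rule with the digit check included, wrapped in a helper. The function name `_MatchesRule` is hypothetical. `StringIsDigit` is used for the digit test because `StringMid` returns a string, and `IsNumber` reports the variable's type rather than its content.

```autoit
ConsoleWrite(_MatchesRule("0001-1.jpg", "0001") & @CRLF)
ConsoleWrite(_MatchesRule("0001-a.jpg", "0001") & @CRLF)

; Hypothetical helper: True if $sName is "<prefix>-<one digit>.jpg"
Func _MatchesRule($sName, $sPrefix)
    Local $iLen = StringLen($sPrefix)
    ; Total length must be prefix + "-" + one digit + ".jpg"
    If StringLen($sName) <> $iLen + 6 Then Return False
    If StringLeft($sName, $iLen) <> $sPrefix Then Return False
    If StringMid($sName, $iLen + 1, 1) <> "-" Then Return False
    If Not StringIsDigit(StringMid($sName, $iLen + 2, 1)) Then Return False
    ; The = operator compares strings case-insensitively, so ".JPG" passes too
    Return StringRight($sName, 4) = ".jpg"
EndFunc   ;==>_MatchesRule
```

Deriving the positions from the prefix length, rather than hard-coding 4, 5 and 10, is a design choice of this sketch so that prefixes of other lengths also work.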


Damn, missed that. So you learn 6 things instead of 1 :-)


1 hour ago, iamtheky said:

Damn missed that, So you learn 6 things instead of 1 :-)

Your example is good for learning; as you said, you could learn 6 things.
But for efficiency, brevity, and less writing, one func call beats six, like mikell said.
Besides, regular expressions rock :D

Regards
Alien.

1 hour ago, alien4u said:

Besides Regular Expressions Rocks :D

Totally agreed  :D


Efficiency how? In speed of authorship, or speed of execution, or how quickly the person who has to support that line can understand and modify it?

Shorter for you to write <> faster to run. And the number of functions does not directly equate to speed of execution. Take a look at any of the threads where we race a bunch of different solutions: the one-line regexp does not always win. It's super fun, but the first step in reaching for a weapon is understanding why you are reaching for that weapon.
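
One rough way to check this on your own machine is to time both variants. This is only a sketch: the repeat count is arbitrary, the rule tested is the one from the first post, and the numbers will differ between machines and AutoIt versions.

```autoit
Local $sName = "0001-1.jpg", $sPrefix = "0001"
Local $iRuns = 100000 ; arbitrary repeat count, for illustration only
Local $bOk, $hTimer

; Variant 1: one StringRegExp call per check
$hTimer = TimerInit()
For $i = 1 To $iRuns
    $bOk = StringRegExp($sName, "(?i)^" & $sPrefix & "-\d\.jpg$")
Next
ConsoleWrite("regexp: " & Round(TimerDiff($hTimer)) & " ms" & @CRLF)

; Variant 2: the same rule with native string functions
$hTimer = TimerInit()
For $i = 1 To $iRuns
    $bOk = (StringLen($sName) = 10) And (StringLeft($sName, 4) = $sPrefix) And _
            (StringMid($sName, 5, 1) = "-") And StringIsDigit(StringMid($sName, 6, 1)) And _
            (StringRight($sName, 4) = ".jpg")
Next
ConsoleWrite("native: " & Round(TimerDiff($hTimer)) & " ms" & @CRLF)
```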



"The first thing to learn about regex is when not to use it"  ?   ;)

1 hour ago, iamtheky said:

Efficiency how? In speed of authorship, or speed of execution, or how quickly the person who has to support that line can understand and modify it?

Shorter for you to write <> faster to run. And the number of functions does not directly equate to speed of execution. Take a look at any of the threads where we race a bunch of different solutions: the one-line regexp does not always win. It's super fun, but the first step in reaching for a weapon is understanding why you are reaching for that weapon.

In my particular case, efficiency means anything that helps solve the problem.
I'm not a programmer by profession, nor a hardware technician, a sysadmin, or a network manager.
Maybe a mix of all of these: from Batch, Bash, C, C++, Java, PHP, Python, and AutoIt, across Windows, OSX, and Linux, and from medium-scale networking to cell phone repairs.
So for me it is only about solving a problem, and as far as I know regular expressions are standard (with just a few differences) from language to language, right?
So yes, regular expressions rock.

About someone else supporting that line, it depends on how many lines we are talking about:
On one hand: someone wrote 300 lines of JavaScript code and someone else has to support that code.
On the other hand: someone wrote the same code, but this time in only 160 lines using regex, and someone else has to support that code.
Which is going to be easier to support? 300 lines vs 160 lines, where the person supporting it only needs to understand regex?
I think it depends on the point of view. Of course there are multiple ways to skin a cat, and it is important to know when and why to use regex, and when it will be better or worse.
Reading this should be enough to realize when you should not use regex:
http://www.regular-expressions.info/catastrophic.html

Regards
Alien.


If your goal is mere functionality, with no regard for optimal functionality, then by all means lean on regexps. They will serve you well.

However, native string operations are syntactically similar in most of those languages as well, while PCRE is elective, and the differences between it and re and re2 seem to be more than a few. So interoperability is a tough argument to accept.

And the number of lines is of no regard; there is ridiculousness by @mikell and @jguinch all over this forum that would confuse even the most seasoned veteran.



Thank you @iamtheky for your suggestions and your teaching. It is also good to know there is no shame in using native string operations if regex is not going to give you obviously better efficiency or less work.

So both ways of doing things are valid unless the difference between them in efficiency or amount of work is huge.

Thanks to everyone, each one of you taught me something different today, and it was a nice debate.

Regards
Alien.


I left @jchd off that list, but his regex-fu is also insane. The recent IP validation is an example, post #5 here:

 

 

