4Eyes

How do I use classes with StringRegExp()


Folks,

I'm trying to test whether a string representing a surname (i.e. family name) is valid. I want to allow the chars a-z, A-Z, comma, full stop, apostrophe and hyphen. I'm trying to use StringRegExp() to at least verify that the string is alpha by using the class alpha, but clearly I'm not interpreting the help file correctly. As shown below, I've attempted many different ways but I just can't see what I'm doing wrong. Could somebody please advise how to use classes?

; Nogo, returns 0
;ConsoleWrite(StringRegExp("test", "[:alpha:]") & @CRLF)

; Nogo, errors at [
;ConsoleWrite(StringRegExp("test", [:alpha:]) & @CRLF)

; Nogo, returns 0
;ConsoleWrite(StringRegExp("test", ":alpha:") & @CRLF)

; Nogo, returns 0
;ConsoleWrite(StringRegExp("test", "[alpha]") & @CRLF)

; Nogo, unable to parse line
;ConsoleWrite(StringRegExp("test", :alpha:) & @CRLF)

; Nogo, returns 0
;ConsoleWrite(StringRegExp("test", ":alpha:") & @CRLF)

; Nogo, returns 0
;ConsoleWrite(StringRegExp("test", ":a-z:") & @CRLF)

; Nogo, returns 0
;ConsoleWrite(StringRegExp("test", "a-z") & @CRLF)

; Returns 1
;ConsoleWrite(StringRegExp("test", "[a-z]") & @CRLF)

; Returns 1, but shouldn't
;ConsoleWrite(StringRegExp("test", "[a-e]") & @CRLF)

; Returns 0
;ConsoleWrite(StringRegExp("test", "[a-d]") & @CRLF)

; Returns 1, at least I understand this one
;ConsoleWrite("Return = " & StringRegExp("test", "tes", 0) & @CRLF)




ConsoleWrite(StringRegExp("test", "[[:alpha:]]") & @CRLF)


George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


GEOsoft,

Ok, thanks. I don't understand that at all, considering it's not shown in the help file that way, but now tempting fate...

ConsoleWrite(StringRegExp("test1", "[[:alpha:]]") & @CRLF)

this also returns 1.

Surely I'd read that line as "Are all of the chars in test1 alpha only?". If so then it should now return 0 with the number 1 appended.

I'm very confused.

Regards,

4Eyes


#4 ·  Posted (edited)

No. With no flag, which is the same as using the 0 flag, StringRegExp() only tests whether the pattern matches somewhere in the string. "test1" contains at least one alpha character, so it returned 1.

If you want the matched text back from the RegEx then you have to return it as an array. Use flag 1 or 3, depending on how many matches you expect. Now for the bad news. Even when returning the array, your test regex would only have returned "t", the first letter, because you didn't quantify it. Secondly, comma, full stop, hyphen and apostrophe are NOT alpha characters.

What you want is

$aRtn = StringRegExp("test1", "(?i)[a-z,.\-']+", 1)
; ALWAYS error check after the regex or you will be banging your head on the pavement for hours.
If Not @error Then ConsoleWrite($aRtn[0] & @CRLF)

To explain that

(?i) = case insensitive so it matches a-z OR A-Z

Any character listed inside the square brackets will match, and most punctuation, such as a full stop or question mark, does not need escaping as long as it is inside the square brackets.

the + is a quantifier which says match at least once and continue until there is a non-matching character.

The 1 (or 3) will return an array and arrays in SRE's are always 0 based.
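The points above can be sketched as follows. This is only an illustration using the thread's own test string plus one made-up string ("ab1cd"); the variable names are arbitrary.

```autoit
; Flag 0 (the default): 1 if the pattern matches anywhere in the string, else 0.
ConsoleWrite(StringRegExp("test1", "[[:alpha:]]") & @CRLF)   ; 1

; Flag 1: array of matches. Without a quantifier only one character is matched.
Local $aOne = StringRegExp("test1", "[[:alpha:]]", 1)
If Not @error Then ConsoleWrite($aOne[0] & @CRLF)            ; t

; With + the match runs on until the first non-matching character.
Local $aRun = StringRegExp("test1", "[[:alpha:]]+", 1)
If Not @error Then ConsoleWrite($aRun[0] & @CRLF)            ; test

; Flag 3: array of all (global) matches, again 0 based.
Local $aAll = StringRegExp("ab1cd", "[[:alpha:]]+", 3)
If Not @error Then
    For $i = 0 To UBound($aAll) - 1
        ConsoleWrite($aAll[$i] & @CRLF)                      ; ab, then cd
    Next
EndIf
```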

EDIT: I should add that the regex given is only for the example you gave. Given a more complete and accurate example, the regexp may require drastic changes. If you are just starting with SREs then make sure to ask when you need help; they're not the easiest code to learn.

Edited by GEOSoft

George


#5 ·  Posted (edited)

whatever Edited by MvGulik

"Straight_and_Crooked_Thinking" : A "classic guide to ferreting out untruths, half-truths, and other distortions of facts in political and social discussions."
"The Secrets of Quantum Physics" : New and excellent 2 part documentary on Quantum Physics by Jim Al-Khalili. (Dec 2014)

"Believing what you know ain't so" ...

Knock Knock ...
 


Aha. So that's how you use them.

* Checks doc's again, "So [0-9] is equivalent to [[:digit:]].", Right. :(

... or,

ConsoleWrite(StringRegExp("test1", "[[^:digit:]]") & @CRLF)

[0-9] is the same as [[:digit:]], and it's also the same as \d, which of course is the opposite of \D. Okay, so the \D was only in for its confusion factor. For most of these shorthand classes the upper-case form means NOT, so \D matches non-digits, just as \S matches non-whitespace and \H matches anything that is not horizontal whitespace. I also often use \V to match any character that is not vertical whitespace.
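A quick way to see these equivalences, as a small sketch with made-up test strings:

```autoit
; Three ways of asking "does the string contain a digit?"
ConsoleWrite(StringRegExp("test1", "[0-9]") & @CRLF)        ; 1
ConsoleWrite(StringRegExp("test1", "[[:digit:]]") & @CRLF)  ; 1
ConsoleWrite(StringRegExp("test1", "\d") & @CRLF)           ; 1

; The upper-case form is the negation: "does it contain a non-digit?"
ConsoleWrite(StringRegExp("test1", "\D") & @CRLF)           ; 1 (the letters)
ConsoleWrite(StringRegExp("12345", "\D") & @CRLF)           ; 0 (digits only)
```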

George


#7 ·  Posted (edited)

 

... or,

ConsoleWrite(StringRegExp("test1", "[[^:digit:]]") & @CRLF)

That should be 

ConsoleWrite(StringRegExp("test1", "[^[:digit:]]+") & @CRLF) 
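To see what the corrected pattern actually captures, it helps to return the match as an array. This is a small sketch; `\D+` is shown only as the shorthand equivalent mentioned earlier in the thread.

```autoit
; The ^ goes directly after the outer [ to negate the whole bracket expression.
Local $aNonDigits = StringRegExp("test1", "[^[:digit:]]+", 1)
If Not @error Then ConsoleWrite($aNonDigits[0] & @CRLF)   ; test

; \D+ is the shorthand for the same thing.
Local $aShort = StringRegExp("test1", "\D+", 1)
If Not @error Then ConsoleWrite($aShort[0] & @CRLF)       ; test
```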

Edited by Bowmore

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to build bigger and better idiots. So far, the universe is winning."- Rick Cook


4Eyes,

ConsoleWrite(StringRegExp("test1", "[[:alpha:]]") & @CRLF)

You should have read the above line as "Match the test string with one alpha character" which is true, which returns 1.

If you wish to match all five characters in test string then the regular expression pattern would need to be "[[:alpha:]]{5}" to check for 5 alpha characters.

The regular expression patterns "(?i)[a-z,.\-']+" and "[[:alpha:],.\-']+" are equivalent.

Local $sTestStr = "testR,-.'"

; This StringRegExp() returns 1 (true) if all the characters in the test string are alpha characters (letters).
ConsoleWrite(StringRegExp($sTestStr, "[[:alpha:]]{" & StringLen($sTestStr) & "}") & @CRLF)

; The next two regular expression patterns are equivalent
ConsoleWrite(StringRegExp($sTestStr, "(?i)[a-z,.\-']{" & StringLen($sTestStr) & "}") & @CRLF)
ConsoleWrite(StringRegExp($sTestStr, "[[:alpha:],.\-']{" & StringLen($sTestStr) & "}") & @CRLF)
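An alternative to building the quantifier from StringLen() is to anchor the pattern to both ends of the string. The sketch below assumes that approach; `_IsValidSurname` is a hypothetical helper name, not something from the AutoIt library.

```autoit
; Hypothetical helper: True if $sName consists only of letters, comma,
; full stop, apostrophe and hyphen (and is not empty).
Func _IsValidSurname($sName)
    ; ^ anchors at the start and \z at the very end of the string;
    ; + requires at least one allowed character.
    Return StringRegExp($sName, "^[[:alpha:],.'\-]+\z") = 1
EndFunc

ConsoleWrite(_IsValidSurname("O'Brien-Smith") & @CRLF)  ; True
ConsoleWrite(_IsValidSurname("test1") & @CRLF)          ; False
ConsoleWrite(_IsValidSurname("?") & @CRLF)              ; False
```

Because of the anchors, a single disallowed character anywhere in the string makes the whole test fail, which is the "are all of the chars valid?" reading the original question was after.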


#9 ·  Posted (edited)

Guys,

Thanks for your advice. RegEx's really seem to be a bit of a black art. I read and reread the help section, and looked at several code examples which were unfortunately quite basic so I gleaned little from them. I'll see if I can work things out from here. Malkey, I actually understand most of the example you gave. Thanks.

BTW, you may be interested to know why I asked about this. I'm writing a prog to be used to check patient data for a medical practice. 4 fields from the data are used to then create a directory structure for each patient. The fields are a card number relative to each patient, title (Mr Mrs etc), given name and family name. I found one record where the given name was entered as ?. That would not have been pretty.

I'm seriously wondering if it might be easier to test the strings char by char without using StringRegExp(). I get the feeling StringRegExp() is rather slow. I've got 3000 records with 2 fields each to test, and that will only grow. Hmmmm? I'll see if I can do some benchmarking.

Thanks again to all,

4Eyes

Edited by 4Eyes


You'd rather use regexps than char-by-char loops. With some experience you'll find them extremely powerful (sometimes too powerful!), very concise and efficient.

The counterpart of this power is that you have to think twice before coding a pattern: first express, precisely and in plain language, what you need and what you don't want, then translate this to regexp subpatterns. That requires a bit of experience but is worth it in the long run. Don't forget to visit the PCRE site highlighted in the StringRegExp documentation, since that's where you'll find the complete and definitive details of the available PCRE matching wildcards, classes and conditional patterns that couldn't be covered in full in the abridged version presented in the AutoIt documentation.

I've found Regex Coach (free) to be a tool of choice for debugging or dissection of a PCRE regexp (match or replace) without having to write a single line of code.

If you're serious about developing applications where validating inputs is necessary, then good use of regexps is a must.
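For anyone who wants to measure the difference rather than guess, a rough benchmark could look like the sketch below. The sample name, the run count (3000 records x 2 fields, taken from the figures mentioned earlier in the thread) and the helper `_IsValidByLoop` are all assumptions for illustration; timings will vary by machine, so none are shown.

```autoit
Local $sName = "O'Brien-Smith"   ; made-up sample value
Local $iRuns = 3000 * 2          ; records x fields

; Time the regexp approach.
Local $hTimer = TimerInit()
For $i = 1 To $iRuns
    StringRegExp($sName, "^[[:alpha:],.'\-]+\z")
Next
ConsoleWrite("Regexp: " & Round(TimerDiff($hTimer), 1) & " ms" & @CRLF)

; Time the char-by-char approach.
$hTimer = TimerInit()
For $i = 1 To $iRuns
    _IsValidByLoop($sName)
Next
ConsoleWrite("Loop:   " & Round(TimerDiff($hTimer), 1) & " ms" & @CRLF)

; Hypothetical char-by-char check of the same rule.
Func _IsValidByLoop($sText)
    If StringLen($sText) = 0 Then Return False
    For $j = 1 To StringLen($sText)
        Local $sChar = StringMid($sText, $j, 1)
        ; Allowed: a letter, or one of comma, full stop, apostrophe, hyphen.
        If Not (StringIsAlpha($sChar) Or StringInStr(",.'-", $sChar)) Then Return False
    Next
    Return True
EndFunc
```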


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)
