readmedottxt

Regular expression help

7 posts in this topic

I need to match exactly 3 numbers then a hyphen.

This always fails:

$strInput = "000-"
$varRegExp = "^\b\d{3}\-\b"

ConsoleWrite(@CRLF & StringRegExp($strInput, $varRegExp) & @CRLF)

So:

I'm matching from the start of the string and word boundary

I'm matching exactly three numbers

I've escaped the hyphen

I'm at the end of the word boundary

What have I done wrong?


Definition of a word boundary:

Before the first character in the string, if the first character is a word character.

After the last character in the string, if the last character is a word character.

Between two characters in the string, where one is a word character and the other is not a word character
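To make the three cases above concrete, here is a minimal sketch that probes each position in the original test string "000-" separately. The variable name $sInput is only illustrative; StringRegExp with its default flag returns 1 for a match and 0 for no match.

```autoit
; Probe each candidate word-boundary position in "000-" on its own.
Local $sInput = "000-"

; Start of string, first character is a digit (a word character): boundary exists.
ConsoleWrite("^\b\d : " & StringRegExp($sInput, "^\b\d") & @CRLF) ; 1

; Between the last digit (word character) and the hyphen (non-word character): boundary exists.
ConsoleWrite("\d\b- : " & StringRegExp($sInput, "\d\b-") & @CRLF) ; 1

; After the hyphen at the end of the string: the last character is not a word character,
; so there is no boundary here.
ConsoleWrite("-\b   : " & StringRegExp($sInput, "-\b") & @CRLF) ; 0
```

Only the last probe fails, which is the position where the original pattern placed its second \b.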

Although it works without it, I suggest adding (?m) if you want "^" to match at the start of each line and "$" at the end of each line.

This should work:

$varRegExp = "(?m)^\d{3}-"
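As a rough illustration of what (?m) changes, the sketch below runs the same pattern with and without it against a two-line string. The test string and the name $sLines are made up for this example and are not from the original post.

```autoit
; Two lines separated by a line feed; the digits are on the second line.
Local $sLines = "abc" & @LF & "123-"

; Without (?m), "^" only matches at the very start of the string, so this fails.
ConsoleWrite(StringRegExp($sLines, "^\d{3}-") & @CRLF) ; 0

; With (?m), "^" also matches right after a line break, so the second line is found.
ConsoleWrite(StringRegExp($sLines, "(?m)^\d{3}-") & @CRLF) ; 1
```

For a single-line input such as "000-" both forms behave the same, which is why the pattern also works without (?m).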

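EDIT: I may not have been very clear with the definition. In your original pattern the second \b comes after the "-". Since "-" is not a word character and nothing follows it, there is no word boundary at that position, so the match fails. Moving the \b in front of the hyphen will also work: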

$varRegExp = "(?m)^\b\d{3}\b-"
Edited by Varian
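For anyone who wants to check this side by side, here is a small sketch that runs the original pattern and the two suggested patterns against the same input. The array name $aPatterns is just for illustration.

```autoit
; Compare the OP's pattern with the two suggested fixes on the same input.
Local $sInput = "000-"
Local $aPatterns[3] = ["^\b\d{3}\-\b", "(?m)^\d{3}-", "(?m)^\b\d{3}\b-"]

For $i = 0 To UBound($aPatterns) - 1
    ; Default flag: StringRegExp returns 1 on a match, 0 otherwise.
    ConsoleWrite($aPatterns[$i] & " -> " & StringRegExp($sInput, $aPatterns[$i]) & @CRLF)
Next
; Expected: the first pattern prints 0, the other two print 1.
```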


Try this.

#include <Array.au3>

$strInput = "before000- after123+"
$varRegExp = "\d{3}-"

$aArray = StringRegExp($strInput, $varRegExp, 3)

_ArrayDisplay($aArray)

Think the OP was pretty specific about his target.

I'm matching from the start of the string and word boundary


The OP is asking the impossible with the supplied test string.

A word boundary manifests when there is a non-word character adjacent to a word character.

"-" is a non-word character. (See "\w", Matching Characters under StringRegExp in AutoIt Help file)

So for "\b" to match after "-", a word character must follow "-" to create a word boundary.

$strInput = "000-a"
$varRegExp = "^\d{3}-\b"

ConsoleWrite(@CRLF & StringRegExp($strInput, $varRegExp) & @CRLF)


Think the OP was pretty specific about his target.

To reinforce the point that both Malkey and I made: the problem was that the trailing \b in the OP's pattern came after the "-". The hyphen is not a word character, and with nothing after it there is no word boundary at that position, so the expression failed. My edit showed how to keep the word boundary by moving the hyphen outside of it, in case the OP wanted to keep \b in the pattern.


Thank you, that's what it was: I was treating the hyphen as a word character, which it isn't. In the past I've just used \b to match an exact sequence, and this is the first time it has caused an issue. The ^ and $ anchors are what I should have used. I haven't used (?m) before, and it's good to know about it.

Cheers
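A minimal sketch of the ^ and $ approach mentioned above, assuming the goal is for the whole string to be exactly three digits followed by a hyphen. The test strings other than "000-" are invented for illustration.

```autoit
; Anchor both ends so the entire string must be three digits plus a hyphen.
Local $sPattern = "^\d{3}-$"
Local $aTests[3] = ["000-", "0000-", "000-a"]

For $i = 0 To UBound($aTests) - 1
    ; Default flag: 1 on a match, 0 otherwise.
    ConsoleWrite($aTests[$i] & " -> " & StringRegExp($aTests[$i], $sPattern) & @CRLF)
Next
; Expected: "000-" prints 1; "0000-" and "000-a" print 0.
```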
