
Regular expression help



I need to match exactly three digits followed by a hyphen.

This always fails:

$strInput = "000-"
$varRegExp = "^\b\d{3}\-\b"

ConsoleWrite(@CRLF & StringRegExp($strInput, $varRegExp) & @CRLF)

So:

I'm matching from the start of the string and word boundary

I'm matching exactly three numbers

I've escaped the hyphen

I'm at the end of the word boundary

What have I done wrong?


Definition of a word boundary:

Before the first character in the string, if the first character is a word character.

After the last character in the string, if the last character is a word character.

Between two characters in the string, where one is a word character and the other is not a word character.
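
The three cases above can be checked one at a time. This is a minimal sketch, assuming StringRegExp's default mode (it returns 1 on a match and 0 otherwise); the test strings are made up for illustration:

```autoit
; Case 1: \b matches before the first character only if it is a word character
ConsoleWrite(StringRegExp("000-", "^\b") & @CRLF)  ; 1, the string starts with a digit
ConsoleWrite(StringRegExp("-000", "^\b") & @CRLF)  ; 0, the string starts with "-"

; Case 2: \b matches after the last character only if it is a word character
ConsoleWrite(StringRegExp("000", "\b$") & @CRLF)   ; 1, the string ends with a digit
ConsoleWrite(StringRegExp("000-", "\b$") & @CRLF)  ; 0, the string ends with "-"

; Case 3: \b matches between a word character and a non-word character
ConsoleWrite(StringRegExp("000-", "0\b-") & @CRLF) ; 1, boundary between "0" and "-"
```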

Although it works without it, I suggest adding (?m) if you want "^" to match at the start of each line and "$" at the end of each line. Without (?m), they anchor to the whole string rather than to individual lines.

This should work:

$varRegExp = "(?m)^\d{3}-"
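
To see what (?m) changes, here is a small sketch with a hypothetical two-line input (not from the original question), where the target sits on the second line:

```autoit
; Hypothetical two-line input: the three digits and hyphen start the second line
$strInput = "abc" & @CRLF & "000-"

; Without (?m), ^ only matches at the very start of the string
ConsoleWrite(StringRegExp($strInput, "^\d{3}-") & @CRLF)     ; 0

; With (?m), ^ also matches right after a line break
ConsoleWrite(StringRegExp($strInput, "(?m)^\d{3}-") & @CRLF) ; 1
```

For the single-line test string "000-" both patterns behave the same, so (?m) only matters once the input can span several lines.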

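EDIT: I may not have been very clear with the definition. The trailing \b in your original pattern is what makes it fail: it comes after the "-", which is not a word character, and nothing follows it, so there is no word boundary at that position. Moving the \b in front of the hyphen will also work: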

$varRegExp = "(?m)^\b\d{3}\b-"
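
A quick side-by-side check of the original pattern and the two suggestions, run against the OP's test string (StringRegExp in its default mode returns 1 for a match, 0 for no match):

```autoit
$strInput = "000-"

; Original pattern: the trailing \b cannot match after "-" at the end of the string
ConsoleWrite(StringRegExp($strInput, "^\b\d{3}\-\b") & @CRLF)    ; 0

; No word boundaries at all
ConsoleWrite(StringRegExp($strInput, "(?m)^\d{3}-") & @CRLF)     ; 1

; Word boundary moved in front of the hyphen
ConsoleWrite(StringRegExp($strInput, "(?m)^\b\d{3}\b-") & @CRLF) ; 1
```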
Edited by Varian

Try this.

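#include <Array.au3>

; Test string: one three-digit run followed by "-" and one followed by "+"
$strInput = "before000- after123+"
; Unanchored pattern: three digits followed by a hyphen, anywhere in the string
$varRegExp = "\d{3}-"

; Flag 3 returns an array of all matches found in the string
$aArray = StringRegExp($strInput, $varRegExp, 3)

; Show the matches in a list view window
_ArrayDisplay($aArray)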

Think the OP was pretty specific about his target.

I'm matching from the start of the string and word boundary


The OP is asking the impossible with the supplied test string.

A word boundary manifests when there is a non-word character adjacent to a word character.

"-" is a non-word character. (See "\w", Matching Characters under StringRegExp in AutoIt Help file)

So for "\b" to match after "-", a word character must follow "-" to create a word boundary.

$strInput = "000-a"
$varRegExp = "^\d{3}-\b"

ConsoleWrite(@CRLF & StringRegExp($strInput, $varRegExp) & @CRLF)
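
To make the contrast explicit, the same pattern can be run against a few variations of the test string. The extra inputs here are only illustrative:

```autoit
$varRegExp = "^\d{3}-\b"

; Nothing follows the hyphen, so there is no word boundary after it
ConsoleWrite(StringRegExp("000-", $varRegExp) & @CRLF)   ; 0

; A word character follows the hyphen, which creates the boundary
ConsoleWrite(StringRegExp("000-a", $varRegExp) & @CRLF)  ; 1

; A space is not a word character, so there is still no boundary after "-"
ConsoleWrite(StringRegExp("000- a", $varRegExp) & @CRLF) ; 0
```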

Think the OP was pretty specific about his target.

To reinforce the point that both Malkey and I made: the problem was that the trailing word boundary in the OP's pattern came after the "-". Since the hyphen is not a word character and nothing follows it in the test string, \b cannot match there and the expression fails. My edit showed how to keep the word boundary by moving it in front of the hyphen, if the OP wanted to keep \b in the pattern.

Thank you, that's what it was: I was treating the hyphen as a word character, which it isn't. In the past I've just used \b to match an exact sequence, and it never caused an issue until now. The ^ and $ anchors are what I should have used. I haven't used (?m) before, and it's good to know about it.
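
For anyone landing here later, one way the anchored version could look. This is only a sketch: it assumes the whole string should be exactly three digits and a hyphen, which is a guess at the final requirement, and the extra test strings are made up:

```autoit
; ^ and $ pin the match to the whole string instead of relying on \b
$varRegExp = "^\d{3}-$"

ConsoleWrite(StringRegExp("000-", $varRegExp) & @CRLF)  ; 1, exactly three digits and a hyphen
ConsoleWrite(StringRegExp("0000-", $varRegExp) & @CRLF) ; 0, four digits
ConsoleWrite(StringRegExp("000-a", $varRegExp) & @CRLF) ; 0, extra character after the hyphen
```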

Cheers

