Check if a string fits a given regular expression pattern.
StringRegExp ( "test", "pattern" [, flag [, offset ]] )
Parameters
| test | The string to check |
| pattern | The regular expression to compare. |
| flag | [optional] A number to indicate how the function behaves. See below for details. The default is 0. |
| offset | [optional] The string position at which to start the match (starts at 1). The default is 1. |
| Flag | Values |
| 0 | Returns 1 (matched) or 0 (no match) |
| 1 | Return array of matches. |
| 2 | Return array of matches including the full match (Perl / PHP style). |
| 3 | Return array of global matches. |
| 4 | Return an array of arrays containing global matches including the full match (Perl / PHP style). |
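As a minimal sketch of how the flag changes the return type, the snippet below runs one pattern with flag 0 and again with flag 1. The sample string, the pattern and the variable names are assumptions made for illustration only.

```autoit
; Illustrative input and pattern (assumed for this sketch)
Local $sText = "width=640"
Local $sPattern = "(\w+)=(\d+)"

; Flag 0 (default): returns 1 on a match, 0 otherwise
Local $iFound = StringRegExp($sText, $sPattern)
ConsoleWrite("Flag 0: " & $iFound & @CRLF)

; Flag 1: returns an array holding the text captured by each group
Local $aGroups = StringRegExp($sText, $sPattern, 1)
If @error = 0 Then
    For $i = 0 To UBound($aGroups) - 1
        ConsoleWrite("Flag 1, element " & $i & ": " & $aGroups[$i] & @CRLF)
    Next
EndIf
```

Flag 0 is usually enough for validation; the array flags are worth the extra handling only when the captured text itself is needed.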
Return Value
Flag = 0 :
| @Error | Meaning |
| 2 | Bad pattern. @Extended = offset of error in pattern. |
Flag = 1 or 2 :
| @Error | Meaning |
| 0 | Array is valid. Check @Extended for next offset |
| 1 | Array is invalid. No matches. |
| 2 | Bad pattern, array is invalid. @Extended = offset of error in pattern. |
Flag = 3 or 4 :
| @Error | Meaning |
| 0 | Array is valid. |
| 1 | Array is invalid. No matches. |
| 2 | Bad pattern, array is invalid. @Extended = offset of error in pattern. |
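The error codes above might be handled along the following lines. This is only a sketch: `_DescribeMatch` is a hypothetical helper written for this illustration, and the sample strings and patterns are assumptions.

```autoit
; Hypothetical helper for this sketch: reports the outcome of a flag-1 call
Func _DescribeMatch($sText, $sPattern)
    Local $aResult = StringRegExp($sText, $sPattern, 1)
    ; Copy the macros immediately; later function calls reset them
    Local $iError = @error, $iExtended = @extended
    Switch $iError
        Case 0
            Return "match: " & $aResult[0] & " (next offset " & $iExtended & ")"
        Case 1
            Return "no match"
        Case 2
            Return "bad pattern at offset " & $iExtended
    EndSwitch
    Return "unexpected @error " & $iError
EndFunc

ConsoleWrite(_DescribeMatch("abc123", "(\d+)") & @CRLF) ; valid pattern, matches
ConsoleWrite(_DescribeMatch("abc", "(\d+)") & @CRLF)    ; valid pattern, no match
ConsoleWrite(_DescribeMatch("abc", "(\d+") & @CRLF)     ; unbalanced parenthesis
```

Checking @error before touching the return value avoids indexing a result that is not an array.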
Remarks
Regular expression notation is a compact way of specifying a pattern for strings that can be searched. Regular expressions are character strings in which plain text characters indicate what text should exist in the target string, and some characters are given special meanings to indicate what variability is allowed in the target string. AutoIt regular expressions are normally case-sensitive.
| [ ... ] | Match any character in the set. e.g. [aeiou] matches any lower-case vowel. A contiguous set can be defined using a dash between the starting and ending characters. e.g. [a-z] matches any lower-case character. To include a dash (-) in a set, use it as the first or last character of the set. To include a closing bracket in a set, use it as the first character of the set. e.g. [][] will match either [ or ]. Note that special characters do not retain their special meanings inside a set, except that \\, \^, \-, \[ and \] match the escaped character inside a set. |
| [^ ... ] | Match any character not in the set. e.g. [^0-9] matches any non-digit. To include a caret (^) in a set, put it after the beginning of the set or escape it (\^). |
| [:class:] | Match a character in the given class of characters. Valid classes are: alpha (any alphabetic character), alnum (any alphanumeric character), lower (any lower-case letter), upper (any upper-case letter), digit (any decimal digit 0-9), xdigit (any hexadecimal digit, 0-9, A-F, a-f), space (any whitespace character), blank (only a space or tab), print (any printable character), graph (any printable character except spaces), cntrl (any control character [ascii 127 or <32]) or punct (any punctuation character). So [0-9] is equivalent to [[:digit:]]. |
| [^:class:] | Match any character not in the class, but only if the first character. |
| ( ... ) | Group. The elements in the group are treated in order and can be repeated together. e.g. (ab)+ will match "ab" or "abab", but not "aba". A group will also store the text matched for use in back-references and in the array returned by the function, depending on flag value. |
| (?i) | Case-insensitivity flag. This does not operate as a group. It tells the regular expression engine to do case-insensitive matching from that point on. |
| (?-i) | (default) Case-sensitivity flag. This does not operate as a group. It tells the regular expression engine to do case-sensitive matching from that point on. |
| (?i ... ) | Case-insensitive group. Behaves just like a normal group, but performs case-insensitive matches within the group. |
| (?-i ... ) | Case-sensitive group. Behaves just like a normal group, but performs case-sensitive matches within the group. Primarily for use after the (?i) flag or inside a case-insensitive group. |
| (?: ... ) | Non-capturing group. Behaves just like a normal group, but does not record the matching characters in the array nor can the matched text be used for back-referencing. |
| (?i: ... ) | Case-insensitive non-capturing group. Behaves just like a non-capturing group, but performs case-insensitive matches within the group. |
| (?-i: ... ) | Case-sensitive non-capturing group. Behaves just like a non-capturing group, but performs case-sensitive matches within the group. |
| (?m) | ^ and $ match newlines within data. |
| (?s) | . matches anything including newline. (By default "." does not match newline.) |
| (?x) | Ignore whitespace and # comments. |
| (?U) | Invert greediness of quantifiers. |
| . | Match any single character (except newline). |
| | | Or. The expression on one side or the other can be matched. |
| \ | Escape a special character (have it match the actual character) or introduce a special character type (see below). |
| \\ | Match an actual backslash (\). |
| \a | Alarm, that is, the BEL character (chr(7)). |
| \A | Match only at beginning of string. |
| \b | Matches at a word boundary. |
| \B | Matches when not at a word boundary. |
| \c | Match a control character, based on the next character. For example, \cM matches ctrl-M. |
| \d | Match any digit (0-9). |
| \D | Match any non-digit. |
| \e | Match an escape character (chr(27)). |
| \E | End case modification. |
| \f | Match a formfeed character (chr(12)). |
| \h | Match any horizontal whitespace character. |
| \H | Match any character that is not a horizontal whitespace character. |
| \l | Match lowercase next char. |
| \L | Match lowercase till \E. |
| \n | Match a linefeed (@LF, chr(10)). |
| \Q | Quote (disable) pattern metacharacters till \E. |
| \r | Match a carriage return (@CR, chr(13)). |
| \s | Match any whitespace character: Chr(9) through Chr(13) which are Horizontal Tab, Line Feed, Vertical Tab, Form Feed, and Carriage Return, and the standard space ( Chr(32) ). |
| \S | Match any non-whitespace character. |
| \t | Match a tab character (chr(9)). |
| \u | Match uppercase next char. |
| \U | Match uppercase till \E. |
| \v | Match any vertical whitespace character. |
| \V | Match any character that is not a vertical whitespace character. |
| \w | Match any "word" character: a-z, A-Z, 0-9 or underscore (_). |
| \W | Match any non-word character. |
| \### | Match the ASCII character whose code is given in octal (up to 3 octal digits), or match a back-reference if a group with that number exists. A back-reference matches exactly the text matched by the prior group with the given number. For example, ([[:alpha:]])\1 would match a double letter. |
| \x## | Match the ascii character whose code is given in hexadecimal. Can be up to 2 digits. |
| \z | Match only at end of string. |
| \Z | Match only at end of string, or before newline at the end. |
| {x} | Repeat the previous character, set or group exactly x times. |
| {x,} | Repeat the previous character, set or group at least x times. |
| {0,x} | Repeat the previous character, set or group at most x times. |
| {x,y} | Repeat the previous character, set or group between x and y times, inclusive. |
| * | Repeat the previous character, set or group 0 or more times. Equivalent to {0,}. |
| + | Repeat the previous character, set or group 1 or more times. Equivalent to {1,}. |
| ? | The previous character, set or group may or may not appear. Equivalent to {0,1}. |
| ? (after a repeating character) | Find the smallest match instead of the largest. |
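The difference between the largest and smallest match, combined with the (?i) flag, can be illustrated with a short sketch. The input string and variable names are assumptions chosen for this example.

```autoit
; Illustrative input (assumed): two tags that differ only in case
Local $sText = "<B>one</B> <b>two</b>"

; Greedy: .* runs to the last closing tag, so one long capture is returned
Local $aGreedy = StringRegExp($sText, "(?i)<b>(.*)</b>", 1)

; Lazy: .*? stops at the first closing tag
Local $aLazy = StringRegExp($sText, "(?i)<b>(.*?)</b>", 1)

If IsArray($aGreedy) Then ConsoleWrite("Greedy: " & $aGreedy[0] & @CRLF)
If IsArray($aLazy) Then ConsoleWrite("Lazy:   " & $aLazy[0] & @CRLF)
```

Here the greedy form captures everything between the first opening tag and the last closing tag, while the lazy form captures only the first tag's content.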
| [:alnum:] | letters and digits |
| [:alpha:] | letters |
| [:ascii:] | character codes 0 - 127 |
| [:blank:] | space or tab only |
| [:cntrl:] | control characters |
| [:digit:] | decimal digits (same as \d) |
| [:graph:] | printing characters, excluding space |
| [:lower:] | lower case letters |
| [:print:] | printing characters, including space |
| [:punct:] | printing characters, excluding letters and digits |
| [:space:] | white space (not quite the same as \s: it includes VT, chr(11)) |
| [:upper:] | upper case letters |
| [:word:] | "word" characters (same as \w) |
| [:xdigit:] | hexadecimal digits |
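A character class is written inside a set, which is why the brackets appear doubled in a pattern. The sketch below shows a class used for validation and a class combined with a back-reference; the sample strings and variable names are assumptions for illustration.

```autoit
; POSIX classes are written inside a set, so the brackets are doubled
Local $iHex = StringRegExp("DEADbeef", "^[[:xdigit:]]+$")
Local $iNotHex = StringRegExp("DEADbeeg", "^[[:xdigit:]]+$")
ConsoleWrite("DEADbeef is hex: " & $iHex & @CRLF)
ConsoleWrite("DEADbeeg is hex: " & $iNotHex & @CRLF)

; \1 repeats whatever group 1 matched, so this finds doubled letters
Local $aDouble = StringRegExp("balloon", "([[:alpha:]])\1", 3)
If @error = 0 Then
    For $i = 0 To UBound($aDouble) - 1
        ConsoleWrite("Doubled letter: " & $aDouble[$i] & @CRLF)
    Next
EndIf
```

With flag 3 the array holds the group captured by each global match, so each doubled letter appears once.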
Related
StringInStr, StringRegExpReplace
Example
; Option 1, using offset
$nOffset = 1
While 1
    $array = StringRegExp('<test>a</test> <test>b</test> <test>c</Test>', '<(?i)test>(.*?)</(?i)test>', 1, $nOffset)
    ; On success @extended holds the offset at which to resume the search
    If @error = 0 Then
        $nOffset = @extended
    Else
        ExitLoop
    EndIf
    For $i = 0 To UBound($array) - 1
        MsgBox(0, "RegExp Test with Option 1 - " & $i, $array[$i])
    Next
WEnd

; Option 2, single return, php/preg_match() style
$array = StringRegExp('<test>a</test> <test>b</test> <test>c</Test>', '<(?i)test>(.*?)</(?i)test>', 2)
For $i = 0 To UBound($array) - 1
    MsgBox(0, "RegExp Test with Option 2 - " & $i, $array[$i])
Next

; Option 3, global return, old AutoIt style
$array = StringRegExp('<test>a</test> <test>b</test> <test>c</Test>', '<(?i)test>(.*?)</(?i)test>', 3)
For $i = 0 To UBound($array) - 1
    MsgBox(0, "RegExp Test with Option 3 - " & $i, $array[$i])
Next

; Option 4, global return, php/preg_match_all() style
$array = StringRegExp('F1oF2oF3o', '(F.o)*?', 4)
For $i = 0 To UBound($array) - 1
    ; Each element is itself an array: the full match followed by its groups
    $match = $array[$i]
    For $j = 0 To UBound($match) - 1
        MsgBox(0, "RegExp Test with Option 4 - " & $i & ',' & $j, $match[$j])
    Next
Next