RegExp - Match correct amount of brackets


Hi there,

I am getting better with regular expressions, but certain things still trip me up. The "new" thing I am trying to match is certain parts of code.

#include <Array.au3>
Dim $sString
$sString &= "(g_sStruct_1.sAck.boStart OR "
$sString &= "(NOT g_sStruct_1.sAck.boState AND "
$sString &= "(g_iState.i32StateCurrent[12] <> 11))) AND "
$sString &= "(g_iState.i32StateCurrent <> 12) AND "
$sString &= "(g_iState.i32StateCurrent <> 17) AND "
$sString &= "(g_iState.i32StateCurrent <> 8) AND "
$sString &= "(g_iState.i32StateCurrent <> 9) AND "
$sString &= "(g_iState.i32StateCurrent <> 2) AND "
$sString &= "(g_iState.i32StateCurrent = 1) AND "
$sString &= "(g_sAxis_3 =< (g_sAxis_1 + 10)) OR (g_sAxis_2 < g_sAxis_3 + 10);"
$aString = StringRegExp($sString, "(?:NOT )*(\(\(*[\w.\[\] +]*\)* [<>|=<|>=|=|<|>]+ \(?[\w.\[\] +]*\)?\))", 3)

_ArrayDisplay($aString)

;~ [0] = (g_iState.i32StateCurrent[12] <> 11))   the last bracket is not needed
;~ [1] = (g_iState.i32StateCurrent <> 12)        perfect
;~ [2] = (g_iState.i32StateCurrent <> 17)        perfect
;~ [3] = (g_iState.i32StateCurrent <> 8)         perfect
;~ [4] = (g_iState.i32StateCurrent <> 9)         perfect
;~ [5] = (g_iState.i32StateCurrent <> 2)         perfect
;~ [6] = (g_iState.i32StateCurrent = 1)          perfect
;~ [7] = (g_sAxis_3 =< (g_sAxis_1 + 10))         perfect (last bracket wanted)
;~ [8] = (g_sAxis_2 < g_sAxis_3 + 10)            perfect

The second closing bracket in the first result is one too many, and I don't really know how to change my pattern without also losing the closing bracket that is wanted in [7]. I know I could use two patterns (one with "inner brackets" to match the result in [7] and one for the rest), but I guess it should be possible to get it done in one pattern.

Any ideas?

Edited by HurleyShanabarger

I don't really know the logic behind this, so the only information I have to deduce what is wanted and what is not is your desired result set.
Based on that, this is my suggestion:

#include <Array.au3>

Global $sString = "(g_sStruct_1.sAck.boStart OR " & _
"(NOT g_sStruct_1.sAck.boState AND " & _
"(g_iState.i32StateCurrent[12] <> 11))) AND " & _
"(g_iState.i32StateCurrent <> 12) AND " & _
"(g_iState.i32StateCurrent <> 17) AND " & _
"(g_iState.i32StateCurrent <> 8) AND " & _
"(g_iState.i32StateCurrent <> 9) AND " & _
"(g_iState.i32StateCurrent <> 2) AND " & _
"(g_iState.i32StateCurrent = 1) AND " & _
"(g_sAxis_3 =< (g_sAxis_1 + 10)) OR (g_sAxis_2 < g_sAxis_3 + 10);"


Global $s_Pattern = '(?x)' & @CRLF & _
'(\(' & @CRLF & _
'   [^\(]+' & @CRLF & _
'   [<=>]{1,2}\s*' & @CRLF & _
'   (?>' & @CRLF & _
'      [^\(\)]+' & @CRLF & _
'      |' & @CRLF & _
'      (?>\((?>[^\(\)]*+|(?R))*\)) )# balanced brackets' & @CRLF & _
'\))'

$aString = StringRegExp($sString, $s_Pattern, 3)
_ArrayDisplay($aString)
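
For readers unfamiliar with the (?R) construct used above: it re-enters the whole pattern at that point, which is what lets a regular expression follow nested brackets. Below is a minimal sketch of the classic balanced-bracket idiom on its own, separate from the comparison pattern. The sample string and the variable names are made up for illustration.

```autoit
#include <Array.au3>

; "(" followed by any mix of non-bracket runs or nested balanced groups, then ")"
Global $sBalanced = '\((?:[^()]++|(?R))*\)'

; The pattern has no capturing groups, so flag 3 returns the full text of every match
Global $aGroups = StringRegExp('(a(b)c) x (d)', $sBalanced, 3)
_ArrayDisplay($aGroups) ; shows (a(b)c) and (d)
```

The possessive [^()]++ keeps the engine from retrying shorter runs of non-bracket characters once a run has been consumed.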

 

Edited by AspirinJunkie

Hi AspirinJunkie,

I will try to understand your solution and learn from it. I also appreciate your feedback regarding the question.

What I want is to extract comparisons from the code that are enclosed in brackets.

  1. variable name can consist of
    • \w.\[\]
  2. variables to the left/right of the comparison characters from point 3 might be multiplied, divided, subtracted or added with an integer, a float or another variable. The operation may be nested in brackets
    • +, *, -, /
  3. comparison characters
    • <>, <, >, =, =<, >=
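
As a side note on the comparison characters listed above: inside square brackets, | is a literal character, so a class such as [<>|=<|>=|=|<|>]+ accepts any run of <, >, | and =. The following is a hedged sketch of an alternation that accepts only the six listed operators. The name $sCompare and the test strings are made up for illustration.

```autoit
; Alternation of the six operators; the two-character ones come first
Global Const $sCompare = '(?:<>|=<|>=|<|>|=)'

; StringRegExp with the default flag 0 returns 1 on a match and 0 otherwise
ConsoleWrite(StringRegExp('a <> b', '^\w+ ' & $sCompare & ' \w+$') & @CRLF) ; 1
ConsoleWrite(StringRegExp('a >< b', '^\w+ ' & $sCompare & ' \w+$') & @CRLF) ; 0

; The character class also accepts operator-like runs that are not in the list
ConsoleWrite(StringRegExp('a >< b', '^\w+ [<>|=]+ \w+$') & @CRLF) ; 1
```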

Some examples of what should be matched:

(xyz > zyx)
(xyz <> zyx)
(xyz + 10 > zyx)
((xyz + 10) > zyx)
(xyz * abc <> zyx - 10)
((xyz * abc) <> (zyx - 10))

If the regular expression could also match multiple operations, that would be perfect, e.g.:

((xyz * abc) - abc <> ((zyx - 10) * 95) / cba)

 

Is this a valid/good description for a help request?


2 hours ago, HurleyShanabarger said:

variable name can consist of

  • \w.\[\]

You need to be more precise about this. Is the dot allowed anywhere in a varname? Several dots?

If square brackets are used for indexing, is an arbitrarily complex expression allowed there, or something else? E.g. is the term xyz[abc + 4 * def - 5 * (ghi + 5) / 2] - jkl[6] / 4 allowed?
I'd guess that neither
xy]z
x.y[.z
[x
... would be OK by themselves.

Also, you seem to imply that every arithmetic operation has to be enclosed in parentheses if it isn't the left- or right-hand side of a comparison. Is that always the case? E.g. is the example I just typed invalid, or is it valid?

Then which forms of float do you intend to handle?
-.4
.2e-5
.2 e - 5
+.4Exyz        with xyz a variable (possibly indexed?)
1.E
...

Regexps are well suited to describing a grammar, but the rules have to be very strict, or else you are going to get meaningless results one day or another.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


Very good - I get how important it is not only to describe the rules but to write them down.

Variable

  • Starts with a letter or underscore
  • after every underscore a letter or digit must follow (meaning the variable also cannot end with an underscore).
  • dots are allowed, but after every dot a letter or underscore has to follow
  • a pair of square brackets (used for array access) with embedded digits is allowed at the end of a variable or in front of a dot
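
The variable rules above can be sketched as a pattern roughly like this. It is one possible reading of the rules, not a definitive one: it assumes at most one [digits] index per name segment, and the names $sSegment, $sVariable and $aNames are made up for illustration.

```autoit
; One name segment: starts with a letter (or an underscore followed by a letter/digit),
; every later underscore is followed by a letter/digit, optional [digits] index at the end
Global Const $sSegment = '(?:[A-Za-z]|_[A-Za-z0-9])(?:[A-Za-z0-9]|_[A-Za-z0-9])*(?:\[\d+\])?'

; Segments are joined by single dots; ^ and $ force the whole string to match
Global Const $sVariable = '^' & $sSegment & '(?:\.' & $sSegment & ')*$'

Global $aNames[6] = ['g_sStruct_1.sAck.boStart', 'g_iState.i32StateCurrent[12]', 'xy]z', 'x.y[.z', '[x', 'abc_']
For $sName In $aNames
    ; StringRegExp with the default flag 0 returns 1 on a match and 0 otherwise
    ConsoleWrite($sName & ' -> ' & StringRegExp($sName, $sVariable) & @CRLF)
Next
```

With these samples, the first two names should print 1 and the remaining four 0.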

Equations

  • only handle basic floats (e.g. 2.45, 10.0 or -125.34); a leading zero is mandatory
  • multiple pairs of round brackets are used to create the required order of operations
  • a pair of round brackets might "group" the arithmetic operation left/right from the comparison
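
The number rule above could be checked with something like the following sketch. It assumes that "a leading zero is mandatory" means at least one digit must precede the decimal point, and that plain integers are allowed as well. The names $sNumber and $aNumbers are made up for illustration.

```autoit
; Optional minus sign, at least one digit, optional fraction with at least one digit
Global Const $sNumber = '^-?\d+(?:\.\d+)?$'

Global $aNumbers[5] = ['10', '2.45', '-125.34', '.4', '1.E']
For $sNum In $aNumbers
    ; StringRegExp with the default flag 0 returns 1 on a match and 0 otherwise
    ConsoleWrite($sNum & ' -> ' & StringRegExp($sNum, $sNumber) & @CRLF)
Next
```

The first three samples should print 1; .4 and 1.E should print 0.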

So, that is "the best" I can think of at the moment. Feel free to ask more questions; that will help me in the future as well!


It's a little bit tricky to create a pattern for matching the equations.
The main idea behind it is not the issue; that part is quite logical.
The main problem is avoiding infinite loops inside the pattern.
So I made a slightly dirty pattern that covers at least most cases.
I know it's not perfect at this point.
Maybe someone else has a better idea for handling the nested equations while avoiding infinite loops:

#include <Array.au3>

Global $sString = '(g_sStruct_1.sAck.boStart OR (NOT g_sStruct_1.sAck.boState AND (g_iState.i32StateCurrent[12] <> 11))) AND (g_iState.i32StateCurrent <> 12) AND (g_iState.i32StateCurrent <> 17) AND (g_iState.i32StateCurrent <> 8) AND (g_iState.i32StateCurrent <> 9) AND (g_iState.i32StateCurrent <> 2) AND (g_iState.i32StateCurrent = 1) AND (g_sAxis_3 =< (g_sAxis_1 + 10)) OR (g_sAxis_2 < g_sAxis_3 + 10);' & @CRLF & _
'(xyz > zyx)' & @CRLF & _
'(xyz <> zyx)' & @CRLF & _
'(xyz + 10 > zyx)' & @CRLF & _
'((xyz + 10) > zyx)' & @CRLF & _
'(xyz * abc <> zyx - 10)' & @CRLF & _
'((xyz * abc) <> (zyx - 10))' & @CRLF & _
'(95 * (12 + abc))' & @CRLF & _
'((xyz * abc) - abc <> ((zyx - 10) * 95) / cba)'

Global $s_Pattern = '(?x)(?(DEFINE)' & @CRLF & _
'   (?<number>   -? (?= [1-9]|0(?!\d) ) \d+ (\.\d+)? ([eE] [+-]? \d+)? )' & @CRLF & _
'   (?<variable>  [[:alpha:]_](?>_[[:alnum:]]|\.[[:alpha:]_]|[[:alnum:]])*(?>\[\d+\])?        )' & @CRLF & _
'   (?<operator>  \*|\+|-|/        )' & @CRLF & _
'   (?<operand>  (?&number) | (?&variable) )' & @CRLF & _
'   (?<term>    (?&operand) | (?&eqx)        )' & @CRLF & _
'   (?<leftoperand>  (?&term))' & @CRLF & _
'   (?<rightoperand>  (?&operand) | (?&equation) )' & @CRLF & _
'   (?<eqx> (?&eqxinner) | \( (?&eqxinner) \))' & @CRLF & _
'   (?<eqxinner> (?&operand)\s*(?&operator)\s*(?&term))' & @CRLF & _
'   (?<rightpart> \s*(?&operator)\s*(?&rightoperand))' & @CRLF & _
'   (?<equation>(?>(?&innerequation)|\((?&innerequation)\)) (?&rightpart)*)' & @CRLF & _
'   (?<innerequation>\s*(?&leftoperand)(?&rightpart))' & @CRLF & _
')' & @CRLF & _
'(?&equation)'

$aString = StringRegExp($sString, $s_Pattern, 3)
_ArrayDisplay($aString)
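
The infinite-loop risk mentioned above typically comes from left recursion, i.e. a group that calls itself before it has consumed any character. One common way around it is a grammar in which a term is either an operand or a bracketed expression, so an opening bracket is always consumed before recursing. The following is only a sketch of that idea, not a replacement for the pattern above: the group names (num, var, term, expr) are hypothetical, the variable rule is simplified, and it validates one whole string at a time instead of extracting matches.

```autoit
; term never recurses without first consuming "(", so the recursion always makes progress
Global Const $sGrammar = '(?x)(?(DEFINE)' & @CRLF & _
        '  (?<num>  -? \d+ (?: \. \d+ )? )' & @CRLF & _
        '  (?<var>  [A-Za-z_] [\w.]* (?: \[ \d+ \] )? )' & @CRLF & _
        '  (?<term> (?&num) | (?&var) | \( \s* (?&expr) \s* \) )' & @CRLF & _
        '  (?<expr> (?&term) (?: \s* [-+*/] \s* (?&term) )* )' & @CRLF & _
        ')' & @CRLF & _
        '^ \( \s* (?&expr) \s* (?: <> | =< | >= | [<>=] ) \s* (?&expr) \s* \) $'

Global $aSamples[4] = ['((xyz * abc) - abc <> ((zyx - 10) * 95) / cba)', _
        '(g_sAxis_3 =< (g_sAxis_1 + 10))', _
        '(95 * (12 + abc))', _
        '(xyz > zyx']
For $sSample In $aSamples
    ; Default flag 0: 1 if the whole sample is a bracketed comparison, 0 otherwise
    ConsoleWrite($sSample & ' -> ' & StringRegExp($sSample, $sGrammar) & @CRLF)
Next
```

The first two samples should print 1. The third has no comparison operator and the fourth lacks its closing bracket, so both should print 0.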

 


There is no problem with recursion inside a PCRE pattern. I've started something similar but I'm too busy to finish it right now. I hope I have some time later today to complete it.
