
Need fresh perspective on StringRegExp



Hi

I implement hospital laboratory information systems. The system we implement has a scripting language that is something of a black art, because the vendor provides no tools that make it easy to write scripts and verify that they are correct. The compiler even misses some errors, and you have to upload a script and compile it just to check it. The language is underutilized, and while I do write scripts in it, advanced ones are hard to get right without tooling and take a lot of trial and error.

...so I've been writing an editor. I took Scintilla and made a new lexer, and I've written a nice AutoIt program that scans the files for basic syntax errors, which I've been using. Now I want to drill down into the details of how each individual script is constructed.

I need some fresh input on how to check that a function call is properly formed. I got a lot of ideas from this topic:

With that, I can pull the functions out of the script, and I believe I can then send each one to a different StringRegExp pattern depending on how many arguments that function accepts.

I wrote a regex that will pull all the comments out so I'm not messing with them

$array = StringRegExp($string, "(?:[\/]{2}.+[\n])", 3)

;_ArrayDisplay($array)

and I have one that will pull out the items within quotation marks

$array = StringRegExp($string, "\x22[^\x22]*\x22", 3)

_ArrayDisplay($array)

When all of this is done, I can pull out the remaining functions with a variation of the pattern from the link above, which I feel I can script into a loop:

Local $aReturn = StringRegExp($string, '(?s)(?i)abs\s*\(\s*((?:.*?|\w+)\s*)\)', 3)

_ArrayDisplay($aReturn)

for instance will let me display the ABS() function contents.

one short example from one of the scripts of code that I'm parsing

RULE

(

CBCND.FINDTEST AND CBCD.FINDTEST AND

ABS(NUMBER(CBCND.EXPECTED - CBCD.EXPECTED)) <=TIMELIMIT AND

NOT MATCH(CBCD.STATUS,"V","D","X") AND

NOT MATCH(CBCND.STATUS,"L","A","T","G","I","V","D","X") AND

NOT MATCH(CBCND.PLCODE,"S") AND

NOT CBCND.ANYTESTCOM("ORDER_CBCND_S")

)

{

CBCND.CANCELTEST(CONDITION="DUPS", COMMENT="1: AUTOCANCEL CBCND ORDERED WITH CBCD");

}

which would be assigned to variable $string

Local $array= StringRegExp($string, '(?s)(?i)match\s*\(\s*((?:.*?|\w+)\s*)\)', 3)

_ArrayDisplay($array)

gives me all the contents of the match functions.

Once all of this is done, I'm a bit lost on what to do with what is left. I need to verify that things are properly joined where needed (with AND, OR, NOT, &, -, +, =, >=, <=, *): when & is used, it joins strings; AND, NOT and OR are used between functions; and the remaining operators go between numbers, unless the operator is =.

I'm still a little confused about how to handle what I get from that. As in the code above, functions can be nested within functions, which complicates things.
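One way to cope with the nesting is to match only the innermost calls, that is, calls whose argument list contains no parentheses. The following is a minimal sketch, not a tested part of the editor; it assumes that comments and quoted strings containing parentheses have already been removed, and the sample line and the variable names are for illustration only.

```autoit
#include <Array.au3>

; Sample line taken from the rule above.
Local $sLine = 'ABS(NUMBER(CBCND.EXPECTED - CBCD.EXPECTED)) <=TIMELIMIT'

; [^()]* cannot cross a parenthesis, so only innermost calls match.
; Flag 3 returns the captures of every match as one flat array:
; name, arguments, name, arguments, ...
Local $aInner = StringRegExp($sLine, '(?i)([\w.]+)\s*\(([^()]*)\)', 3)
If @error Then
    ConsoleWrite('No innermost call found' & @CRLF)
Else
    _ArrayDisplay($aInner) ; here: NUMBER and its argument text
EndIf
```

On this line the pattern skips ABS, because its argument list contains a parenthesis, and picks up NUMBER with everything between its parentheses. ABS becomes matchable once the NUMBER call has been checked and replaced by a placeholder.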

I'm hoping someone has suggestions or experience with validating how a function is constructed, much like a compiler would, but before compiling.

or maybe a function called "match" like this:

Func match($condition, $item)
    Switch $condition
        Case ""
            ; do something
        Case Else
            ; do something
    EndSwitch

    Switch $item
        Case ""
            ; do something
        Case Else
            ; do something
    EndSwitch
EndFunc

that means writing a function for each possible function

or maybe one function which has the function name in the variables that I pass.
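The single-function idea could be sketched as a lookup table keyed by function name. Everything in this example is hypothetical: the table format, the argument limits, and the name _CheckCall are made up for illustration, and the real limits would have to come from the vendor's language reference. It only counts arguments and assumes quoted strings were replaced by a placeholder beforehand, so that commas inside strings are not miscounted.

```autoit
; Hypothetical table in the form "NAME:min:max" (argument counts).
Global $g_aSignatures[3] = ['ABS:1:1', 'NUMBER:1:1', 'MATCH:2:99']

; Returns True when $sName is in the table and $sArgs holds an
; allowed number of comma-separated arguments.
Func _CheckCall($sName, $sArgs)
    Local $aSig, $aArgs, $iCount
    For $i = 0 To UBound($g_aSignatures) - 1
        $aSig = StringSplit($g_aSignatures[$i], ':')
        ; "<>" compares strings case-insensitively in AutoIt.
        If $aSig[1] <> $sName Then ContinueLoop
        $iCount = 0
        If StringStripWS($sArgs, 8) <> '' Then
            $aArgs = StringSplit($sArgs, ',')
            $iCount = $aArgs[0] ; element 0 holds the number of pieces
        EndIf
        Return ($iCount >= Number($aSig[2]) And $iCount <= Number($aSig[3]))
    Next
    Return False ; unknown function name
EndFunc

ConsoleWrite(_CheckCall('MATCH', 'CBCD.STATUS,<str>,<str>,<str>') & @CRLF)
ConsoleWrite(_CheckCall('ABS', 'A,B') & @CRLF)
```

The first call should pass (four arguments, within the assumed 2 to 99 for MATCH) and the second should fail (two arguments where ABS is assumed to take one). Adding a function then means adding a table row rather than writing another Func.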

anyhow, I think I'm making it too complicated and overthinking it.

a fresh idea or perspective or direction would be appreciated.

TIA

BT

Link to comment
Share on other sites

Making a lexer from regexp only is probably going to be a headache and essentially unmaintainable.

You would have more success with bison, COCO/R, lemon, whatever tool you can manage to run.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

That's the job of a full-fledged lexical analysis tool. I can't guess if what you have can perform that.

Link to comment
Share on other sites

I'm actually getting very close to a completed version, all in AutoIt. I did some searching and research on compilers and read a bunch of articles.

While I don't need the depth that a compiler does, it pointed me in the right direction.

I first strip all comments from the script.

Each function call found in the script is passed to a checking function, along with the function's name and the contents found between its parentheses. It starts with the innermost calls and works outward. It checks that each call has the proper number of items and that they are of the right kind (a string where needed, a number where needed, etc.). If a call is OK, I replace it with <var> and search again.

I do this recursively until there is nothing left but <var>, NOT, AND and OR, and then make sure that no ANDAND, OROR or NOTOR appears.
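The reduction described above might look roughly like the sketch below. It is not the actual code from the editor: the function names are hypothetical, the per-call validation step is left out, and whitespace is kept, so it tests for "AND AND" rather than "ANDAND". It also assumes comments have already been stripped.

```autoit
; Replace innermost calls with <var> until nothing changes.
Func _ReduceCalls($sExpr)
    Local $sPrev
    Do
        $sPrev = $sExpr
        ; A call whose argument list has no parentheses is innermost.
        ; The lookahead keeps AND/OR/NOT followed by a bracketed group
        ; from being mistaken for a function name.
        $sExpr = StringRegExpReplace($sExpr, _
                '(?i)\b(?!(?:AND|OR|NOT)\b)[\w.]+\s*\([^()]*\)', '<var>')
    Until $sExpr == $sPrev
    Return $sExpr
EndFunc

; True when two joining operators sit next to each other.
; "AND NOT" and "OR NOT" are left alone.
Func _HasDoubledOperator($sReduced)
    Return StringRegExp($sReduced, '(?i)\b(?:AND|OR|NOT)\s+(?:AND|OR)\b') = 1
EndFunc

; Sample expression with a deliberate "AND AND" error.
Local $sExpr = 'ABS(NUMBER(A.EXPECTED - B.EXPECTED)) <=TIMELIMIT AND ' & _
        'NOT MATCH(B.STATUS,"V","D") AND AND NOT B.ANYTESTCOM("X")'
Local $sReduced = _ReduceCalls($sExpr)
ConsoleWrite($sReduced & @CRLF)
ConsoleWrite('Doubled operator: ' & _HasDoubledOperator($sReduced) & @CRLF)
```

Each pass is a single StringRegExpReplace over the whole expression, so the number of passes equals the deepest nesting level plus one final pass that finds nothing to replace.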

I just ran it on over 1000 scripts and it caught most of the errors.

After a little more tweaking, I'll have to work on speeding it up, because this part is slower than I was counting on. It's still under 10 seconds, but that's too long IMHO for short scripts under 250 lines.

I may change it all to binary...

thanx
