
Experimental (Academic) AutoIT Script Interpreter [C++]


23 posts in this topic

#1 ·  Posted (edited)

For a long time now, I have been curious about the workings of interpreters and compilers, and this curiosity often manifests as experimentation.

My interpreter (nicknamed PARADIME) is an attempt at interpreting the AutoIt syntax, both to gain a better understanding of how AutoIt 'ticks' and to satisfy my curiosity about whether I can write an interpreter for an existing language.

I am quite happy with the current (UNFINISHED) result. A great many of AutoIt's syntactical features are implemented, and most of the rest are intended to be.

The Following functionality operates correctly in my Interpreter:

Global Declarations
'=' Assignments
Function calls (including recursive)
Variant Datatype (Implementing Arrays, INT32, INT64, double, string)
Operators: + - / * > < <> >= <= = ==
Singleline IF
Multiline IF
WHILE statements
About 20 macros
About 12 built-in functions:
    ConsoleWrite
    FileRead
    FileOpen
    FileClose
    MsgBox (non-optional params only)
    StringLen
    StringLeft/Right
    StringTrimLeft/Right
    TimerInit
    (TimerDiff() is bugged, however)
Arrays

For example, the following code will execute correctly.

$t = 1
$t2 = 2

if $t = $t2 then MsgBox(48, "TEST", "EQUALITY")
if $t <> $t2 then MsgBox(48, "TEST", "NOT EQUALITY")

Global $mate = 89, $eee, $f = 55, $arraydestroytest[65000]


MsgBox(48, "TEST", "This Code is running in Paradime: " & $eee)


While $mate < 4000
    $arraydestroytest[$mate] = $mate
    $mate = $mate + 1
    $r = "FFFF" & "00043"
WEnd
MsgBox(48, "TEST", "This Code is running in Paradime: " & $mate & " " & $arraydestroytest[$mate-1])

COnsoleWrite(Stringlen("LOL RECURSION"))
ConsoleWrite("Macro Test:" & @LF)
ConsoleWrite("Program files: " & @PROGRAMFILESDIR & @LF)
ConsoleWrite("Common files: " & @CommonFilesDir & @CR)
ConsoleWrite("My Documents: " & @MyDocumentsDir & @CR)
ConsoleWrite("AppDataC files: " & @AppDataCommonDir & @CR)
ConsoleWrite("DesktopC files: " & @DesktopCommonDir & @CR)
ConsoleWrite("DocumentsC files: " & @DocumentsCommonDir & @CR)
ConsoleWrite("FavouritesC files: " & @FavoritesCommonDir & @CR)
ConsoleWrite("ProgramsC files: " & @ProgramsCommonDir & @CR)
ConsoleWrite("StartMC files: " & @StartMenuCommonDir & @CR)
ConsoleWrite("Startup files: " & @StartupCommonDir & @CR)
ConsoleWrite("AppData files: " & @AppDataDir & @CR)
ConsoleWrite("Desktop files: " & @DesktopDir & @CR)
ConsoleWrite("Favs files: " & @FavoritesDir & @CR)
ConsoleWrite("Program files: " & @ProgramsDir & @CR)
ConsoleWrite("Start Menu files: " & @StartMenuDir & @CR)
ConsoleWrite("Startup files: " & @StartupDir & @CR)

ConsoleWrite(@CRLF & "Computer: " & @ComputerName & @CR)
ConsoleWrite("WIN: " & @WindowsDir & @CR)
ConsoleWrite("Working: " & @WorkingDir & @CR)
ConsoleWrite("System: " & @SystemDir & @CR)
ConsoleWrite("IP1: " & @IPAddress1 & @CR)
ConsoleWrite("IP2: " & @IPAddress2 & @CR)
ConsoleWrite("IP3: " & @IPAddress3 & @CR)
ConsoleWrite("IP4: " & @IPAddress4 & @CR)
ConsoleWrite("TempDir: " & @TempDir & @CR)
ConsoleWrite("Username: " & @UserName & @CR)
ConsoleWrite("HomeDrive: " & @HomeDrive & @CR)
ConsoleWrite("HomePath: " & @HomePath & @CR)
ConsoleWrite("HomeShare: " & @HomeShare & @CR)
ConsoleWrite("LogonServer: " & @LogonServer & @CR)
ConsoleWrite("LogonDomain: " & @LogonDomain & @CR)
ConsoleWrite("LogonDNSDomain: " & @LogonDNSDomain & @CR)

Academic Discourse:

The biggest thing that surprised me was how well written/optimized AutoIT was (or how inefficient a C++ coder I am, having 6 months of experience ^^')

My interpreter runs approximately 5.4x slower than the AutoIt interpreter, despite the data structures being similar. My guess is that this speed difference is due to two things:

-Pointer passing: nearly everything large in the public AutoIt source is passed around by pointer rather than by value. Substantial portions of my code pass by value instead, and the resulting copies cost speed. My inexperience and rush in writing this probably compound the problem with inferior code (relative to AutoIt).

-Operator evaluation: I originally thought that AutoIt's decision to treat every operand as a VARIANT class would incur a noticeable overhead, so I thought I could sidestep it by using my original TOKEN data structure from the lexing stage. Now I realise that this overhead is unavoidable, as I'm still doing type checking and conversions with the token datatype. The only difference is that Jonathan is doing it in a pretty little variant class, and my parser_eval.cpp is littered with switch statements for every operand possibility for every operator. (Please don't look at the source, you will cry.)
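To make the pointer-passing point concrete, here is a minimal sketch. It is not taken from Paradime or from the AutoIt source: the `Operand` type, the `g_copies` counter and both function names are hypothetical, chosen only to show how many copies a by-value parameter triggers compared with a const reference.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical counter, incremented every time an Operand is copied.
int g_copies = 0;

// Hypothetical operand type (not Paradime's TOKEN or AutoIt's Variant).
struct Operand {
    std::string text;
    explicit Operand(std::string s) : text(std::move(s)) {}
    Operand(const Operand& other) : text(other.text) { ++g_copies; }
};

// By value: the vector and every Operand in it are copied on each call.
std::size_t totalLengthByValue(std::vector<Operand> ops) {
    std::size_t total = 0;
    for (const Operand& op : ops) total += op.text.size();
    return total;
}

// By const reference: the callee reads the caller's data, no copies are made.
std::size_t totalLengthByRef(const std::vector<Operand>& ops) {
    std::size_t total = 0;
    for (const Operand& op : ops) total += op.text.size();
    return total;
}
```

Both functions return the same result; the only difference is that the by-value version copies every element first, which is the kind of cost that adds up inside an interpreter's inner loop.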

PARADIME implements, from scratch:

-Custom lexer/tokeniser

-Stateful recursive descent parser

-Shunting-yard algorithm for expression evaluation

-Uses std::map for variable and built-in function pointer lookup

-Uses std::vector for token storage
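As an illustration of the std::map lookup mentioned in the list above, here is a minimal sketch of a name-to-function-pointer table. The `Builtin` signature, the helper names and the choice to store lower-cased keys are all assumptions made for this example; Paradime's real table may look quite different.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Hypothetical signature: every built-in takes string arguments, returns a string.
using Builtin = std::string (*)(const std::vector<std::string>&);

std::string builtinStringLen(const std::vector<std::string>& args) {
    return std::to_string(args.at(0).size());
}

std::string builtinStringLeft(const std::vector<std::string>& args) {
    // StringLeft(text, count): the first `count` characters of `text`.
    return args.at(0).substr(0, std::stoul(args.at(1)));
}

// AutoIt function names are case-insensitive, so keys are kept in lower case.
std::string toLower(std::string s) {
    for (char& c : s) c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

const std::map<std::string, Builtin>& builtinTable() {
    static const std::map<std::string, Builtin> table = {
        {"stringlen", &builtinStringLen},
        {"stringleft", &builtinStringLeft},
    };
    return table;
}

// Returns nullptr when the name is not a known built-in.
Builtin findBuiltin(const std::string& name) {
    const std::map<std::string, Builtin>& table = builtinTable();
    auto it = table.find(toLower(name));
    return it == table.end() ? nullptr : it->second;
}
```

With this shape, a call such as `Stringlen("LOL RECURSION")` becomes one logarithmic-time map lookup followed by an indirect call.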

Parser

Evidently, I have attempted to deviate from Jonathan's chosen parsing approach to test the validity of other algorithms, and initial results indicate that my parsing model is workable.

Both of our interpreters use the recursive descent model for traversing nested structures. Paradime has various parsing states that are transparent to the parsing of the tokens themselves. The two main states are EXEC and IGNORE: EXEC executes the code up to the corresponding end of the code block (ENDIF, WEND etc.), whereas IGNORE skips over the contained code. I did not quite understand how Jon traversed nested structures, so I cannot comment further on his methods here.
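The EXEC/IGNORE idea can be sketched roughly as follows. This is a simplified illustration, not Paradime's code: the `Line` structure, the pre-evaluated `condition` flag and the function names are hypothetical, and only If/EndIf blocks are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class State { Exec, Ignore };

// Hypothetical pre-lexed line: a block opener whose condition is already
// evaluated, a block closer, or a plain statement.
struct Line {
    enum Kind { If, EndIf, Statement } kind;
    bool condition;     // only meaningful for If
    std::string text;   // only meaningful for Statement
};

// Processes lines from `pos` up to the matching EndIf (or end of input),
// recursing once per nested If. `pos` is advanced past the consumed lines.
void runBlock(const std::vector<Line>& program, std::size_t& pos, State state,
              std::vector<std::string>& executed) {
    while (pos < program.size()) {
        const Line& line = program[pos++];
        if (line.kind == Line::EndIf) return;   // end of this block
        if (line.kind == Line::If) {
            // A block nested in an ignored block stays ignored, whatever its condition.
            bool active = state == State::Exec && line.condition;
            runBlock(program, pos, active ? State::Exec : State::Ignore, executed);
        } else if (state == State::Exec) {
            executed.push_back(line.text);
        }
    }
}

// Returns the statements that would run, in order.
std::vector<std::string> run(const std::vector<Line>& program) {
    std::vector<std::string> executed;
    std::size_t pos = 0;
    runBlock(program, pos, State::Exec, executed);
    assert(pos == program.size() && "EndIf without matching If");
    return executed;
}
```

The point of the state is that the IGNORE path still has to walk the tokens to find the matching ENDIF, so nesting stays correct even for code that never runs.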

Handling of expressions is done entirely differently in the two interpreters. Jonathan uses an LALR shift/reduce algorithm, whereas I use Dijkstra's shunting-yard algorithm. Thus far, both approaches seem entirely workable.
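For readers who have not met the shunting-yard algorithm, here is a minimal sketch of its core: converting infix tokens to postfix (RPN). It is an assumption-laden simplification, not Paradime's parser_eval.cpp: tokens are plain strings, only the left-associative operators + - * / and parentheses are handled, and the name `toRpn` is made up for this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Converts infix tokens to postfix (RPN). Handles left-associative binary
// operators and parentheses only; anything else is treated as an operand.
std::vector<std::string> toRpn(const std::vector<std::string>& infix) {
    static const std::map<std::string, int> precedence = {
        {"+", 1}, {"-", 1}, {"*", 2}, {"/", 2}};
    std::vector<std::string> output;
    std::vector<std::string> ops;   // operator stack
    for (const std::string& tok : infix) {
        if (precedence.count(tok)) {
            // Pop operators of higher or equal precedence (left associativity).
            while (!ops.empty() && ops.back() != "(" &&
                   precedence.at(ops.back()) >= precedence.at(tok)) {
                output.push_back(ops.back());
                ops.pop_back();
            }
            ops.push_back(tok);
        } else if (tok == "(") {
            ops.push_back(tok);
        } else if (tok == ")") {
            while (!ops.empty() && ops.back() != "(") {
                output.push_back(ops.back());
                ops.pop_back();
            }
            assert(!ops.empty() && "unbalanced ')'");
            ops.pop_back();   // discard the "("
        } else {
            output.push_back(tok);   // operand goes straight to the output
        }
    }
    while (!ops.empty()) {
        assert(ops.back() != "(" && "unbalanced '('");
        output.push_back(ops.back());
        ops.pop_back();
    }
    return output;
}
```

For example, the tokens `1 + 2 * 3` come out as `1 2 3 * +`, so the multiplication is applied before the addition without any grammar tables.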

Variant Storage:

Done the same way in both interpreters. The array handling code is practically copied; it was better than anything I could ever make.

Lookup Speed:

One other thing I noticed is that Macros and Builtin functions have no optimal lookup table (in the public autoIT source). Perhaps, to improve speed, these things could be stored in a red/black binary tree to increase efficiency?

Conclusion:

All in all, the parsing and interpreting backbone is a magnificent piece of work, and all my attempts to replicate and deconstruct it (from the public source) have only increased my sense of awe. I express my most sincere thanks to the AutoIt developers for it, and I hope that development of AutoIt never stops. One day, when I get out of high school, I would like to develop AutoIt, who knows.

Paradime Sourcecode:

As previously mentioned, the vast majority of the source code is written from scratch. However, there was no point reinventing the wheel when implementing some macros and some built-ins, the array handling code in variants, and one or two syntactical expressions. These elements of the source code are clearly labelled at the top and have the GNU license attached (code from before AutoIt went closed source). Credit is clearly given.

Please don't look at it. It is poorly written, undercommented, and due to my bad choice to use the token structure as the operand structure, a good deal of parsing logic is scattered across hundreds of lines of switch statements. (eww)

http://code.google.com/p/paradime-interpreter/source/browse/#hg%2FParadime%2Fcore

Please, don't judge me.

SciTE integration:

Thanks to LaCastiglione:

command.38.*.au3="C:\Paradime.exe" "$(FilePath)"
command.name.38.*.au3=Paradime
command.save.before.38.*.au3=1
command.shortcut.38.*.au3=Ctrl+F7

Drop Paradime.exe into your C: drive.

Future of Paradime:

I will implement NOT, AND, OR, FOR-NEXT, SWITCH-CASE-ENDSWITCH, and user-defined functions. Then I will deviate from AutoIt, exploring new, custom language constructs, but that's another academic project entirely.

-hyperzap

Edited by twitchyliquid64

ongoing projects:-firestorm: Largescale P2P Social NetworkCompleted Autoit Programs/Scripts: Variable Pickler | Networked Streaming Audio (in pure autoIT) | firenet p2p web messenger | Proxy Checker | Dynamic Execute() Code Generator | P2P UDF | Graph Theory Proof of Concept - Breadth First search




#2 ·  Posted (edited)

The internals have progressed quite a bit since that public source code as well - some parts from scratch. There's lots of weird optimizations. Quite a lot of Copy-on-write activity which really sped things up a lot as well. The main slow down the last time I checked was the eval code which still creates a lot of copies of data as it works the values out using stacks - it's one of the scarier areas to contemplate rewriting though...
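For anyone unfamiliar with copy-on-write, the general idea can be sketched like this. It is only an illustration of the technique, not AutoIt's implementation (which is not public): the `CowString` class and its members are hypothetical, and the sketch assumes single-threaded use.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical copy-on-write string value (single-threaded sketch).
class CowString {
public:
    explicit CowString(std::string s)
        : data_(std::make_shared<std::string>(std::move(s))) {}

    // Copying a CowString uses the compiler-generated copy, which only
    // shares the pointer and bumps a reference count.
    const std::string& get() const { return *data_; }

    // Writers detach first, so other holders never see the change.
    void append(const std::string& suffix) {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);   // the deferred copy
        }
        *data_ += suffix;
    }

    bool sharesStorageWith(const CowString& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<std::string> data_;
};
```

Values that are passed around and only read never pay for a deep copy; the copy happens once, at the first write to a shared value.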

Edited by Jon

Uber promo code for money off the first ride: uberautoit


#3 ·  Posted (edited)

Lookup Speed:

One other thing I noticed is that Macros and Builtin functions have no optimal lookup table (in the public autoIT source). Perhaps, to improve speed, these things could be stored in a red/black binary tree to increase efficiency?

I'd need to check but I think I changed macro/built-in function lookup to be resolved during the lexing/token stage so it didn't do a runtime lookup (the token contains an index to the function).

For user functions, the names of all the functions are stored in a sorted list - which then uses a binary search for lookup.

Variable lookup is done with splay trees.
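A rough sketch of the first two lookup strategies described above: resolving a built-in name to a table index once, at lexing time, and finding user functions by binary search in a sorted list. All names here (`builtinNames`, `resolveBuiltin`, `findUserFunction`) are hypothetical and the tables are toy-sized; the splay tree used for variables is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical built-in table; a token would store the index into it.
const std::vector<std::string>& builtinNames() {
    static const std::vector<std::string> names = {"consolewrite", "msgbox", "stringlen"};
    return names;
}

// Done once per token while lexing: returns the table index, or -1 if unknown.
// At run time the stored index is used directly, so no name lookup is needed.
int resolveBuiltin(const std::string& lowerCaseName) {
    const std::vector<std::string>& names = builtinNames();
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (names[i] == lowerCaseName) return static_cast<int>(i);
    }
    return -1;
}

// User functions: names are kept sorted and looked up by binary search.
// Returns the position in the list, or -1 if the name is absent.
int findUserFunction(const std::vector<std::string>& sortedNames, const std::string& name) {
    auto it = std::lower_bound(sortedNames.begin(), sortedNames.end(), name);
    if (it == sortedNames.end() || *it != name) return -1;
    return static_cast<int>(it - sortedNames.begin());
}
```

The lex-time search can afford to be slow because it runs once per token in the script, not once per execution of that token.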

Edited by Jon


#4 ·  Posted

I have the feeling that there isn't much of the current code that even resembles the last public released source code. Anyone looking at that old source shouldn't be under the impression that it still looks like that.


GeorgeQuestion about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.*** The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else."Old age and treachery will always overcome youth and skill!"


#5 ·  Posted

It doesn't look anything like that, really. Some parts have been re-written multiple times.


#6 ·  Posted (edited)

I have the feeling that there isn't much of the current code that even resembles the last public released source code. Anyone looking at that old source shouldn't be under the impression that it still looks like that.

I am/was aware that there were differences in the source code, and thus I have modelled my studies based on the following underlying assumption:

The available AutoIt source is an implementation of behaviour. Any revision is based on the fundamental elements of this implementation. (For instance, I would expect the token structure to be mostly the same, and the variant structure to be similar save the addition of binary and Boolean types. Furthermore, the shift/reduce algorithm and recursive descent are unlikely to be changed much, save optimization.)
Edited by twitchyliquid64


#7 ·  Posted

The internals have progressed quite a bit since that public source code as well - some parts from scratch. There's lots of weird optimizations. Quite a lot of Copy-on-write activity which really sped things up a lot as well. The main slow down the last time I checked was the eval code which still creates a lot of copies of data as it works the values out using stacks - it's one of the scarier areas to contemplate rewriting though...

I don't believe I know of a method of expression parsing that does not use stacks. (perhaps manadar /mat can jump in on this one).

The most optimal thing I can think of would be to convert all expressions to RPN form at compile time, then all you would need is one simple token/variant pointer stack to evaluate the expression.
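To show what evaluating a pre-converted RPN expression with one stack might look like, here is a minimal sketch. It assumes the expression has already been converted to RPN, that tokens are plain strings, and that every operand is a double; the name `evalRpn` is made up, and a real interpreter would push variant values or pointers instead.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Evaluates RPN tokens containing numbers and + - * / with one value stack.
double evalRpn(const std::vector<std::string>& rpn) {
    std::vector<double> stack;
    for (const std::string& tok : rpn) {
        if (tok == "+" || tok == "-" || tok == "*" || tok == "/") {
            assert(stack.size() >= 2 && "operator needs two operands");
            double rhs = stack.back(); stack.pop_back();
            double lhs = stack.back(); stack.pop_back();
            switch (tok[0]) {
            case '+': stack.push_back(lhs + rhs); break;
            case '-': stack.push_back(lhs - rhs); break;
            case '*': stack.push_back(lhs * rhs); break;
            default:  stack.push_back(lhs / rhs); break;   // '/'
            }
        } else {
            stack.push_back(std::stod(tok));   // operand
        }
    }
    assert(stack.size() == 1 && "malformed expression");
    return stack.back();
}
```

Because operator precedence was already resolved during conversion, the run-time loop needs no operator stack and no precedence checks.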



#8 ·  Posted

It doesn't look anything like that, really. Some parts have been re-written multiple times.

How much different?

I don't understand the point of having a public version available so that upcoming developers can understand how to integrate functionality into the interpreter, only to let that public version go un-updated to the point where it cannot be used as an introduction for developers.

Is this the case? Or are the internals roughly the same (save optimizations)?



#9 ·  Posted

I don't believe I know of a method of expression parsing that does not use stacks. (perhaps manadar /mat can jump in on this one).

The point is not to not use stacks.

Also, inb4 pratt parser.


#10 ·  Posted

The public release version of the source really isn't very relevant any more.

That was back in the days when AutoIt was open source which no longer applies and that is why it's never updated.



#11 ·  Posted

George hit it. The source is available because that version of AutoIt is open source. No other reason really.

Your assumptions are incorrect. The Variant class - for example - has been re-written 2 or 3 times. Much of the core of AutoIt is different. Some of the functions are maybe the same give or take a bug-fix.


#12 ·  Posted

At this point the published code (which I have no intention to ever look at) is probably more a "how not to do it" example than anything else.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


#13 ·  Posted

At this point the published code (which I have no intention to ever look at) is probably more a "how not to do it" example than anything else.

I disagree entirely.

It's well written; it's just not optimal and, as Valik said, elements have been rewritten as better methods have become known.

(I'm only talking about the interpreter core here, btw)



#14 ·  Posted

George hit it. The source is available because that version of AutoIt is open source. No other reason really.

Your assumptions are incorrect. The Variant class - for example - has been re-written 2 or 3 times. Much of the core of AutoIt is different. Some of the functions are maybe the same give or take a bug-fix.

Wow...2/3 times??? Really??? What was wrong with it to sanction those re-writes! It seemed quite fine to me.



#15 ·  Posted

Quote: At this point the published code (which I have no intention to ever look at) is probably more a "how not to do it" example than anything else.

This.

Quote: It's well written

No. No it's not.

Quote: Wow...2/3 times??? Really??? What was wrong with it to sanction those re-writes! It seemed quite fine to me.

Everything. It's still hopelessly bad but it's massive and it works. The same can be said for quite a number of parts of AutoIt.


#16 ·  Posted

Do you know if any of the original code is still there?


#17 ·  Posted

I'm sure there's lots of original code still there. What that may be, though, I do not know.


#18 ·  Posted

Everything. It's still hopelessly bad but it's massive and it works. The same can be said for quite a number of parts of AutoIt.

Can you be more specific? WHY is it hopelessly bad? What design goals does it not achieve, how does it underperform the expectations you would have for an 'ideal' Variant Class?



#19 ·  Posted

I imagine those are the parts we're not allowed to see. :)


#20 ·  Posted

It's a bloated mess of inter-connected pieces that should be separate. It uses about a billion switch statements to do what C++ can do for you if you know what abstract classes are.
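One possible reading of the abstract-class remark, sketched under heavy assumptions: each value type implements its own conversions behind a common interface, so an operator is written once instead of switching on every pair of operand types. The `Value`, `NumberValue` and `StringValue` classes and the `add` function are hypothetical and say nothing about how AutoIt's Variant is or should be structured.

```cpp
#include <cassert>
#include <exception>
#include <memory>
#include <string>
#include <utility>

// Hypothetical value hierarchy: each type knows how to convert itself.
class Value {
public:
    virtual ~Value() = default;
    virtual double toNumber() const = 0;
    virtual std::string toText() const = 0;
};

class NumberValue : public Value {
public:
    explicit NumberValue(double v) : v_(v) {}
    double toNumber() const override { return v_; }
    std::string toText() const override { return std::to_string(v_); }
private:
    double v_;
};

class StringValue : public Value {
public:
    explicit StringValue(std::string s) : s_(std::move(s)) {}
    double toNumber() const override {
        // In this sketch, text that does not parse as a number counts as 0.
        try { return std::stod(s_); } catch (const std::exception&) { return 0.0; }
    }
    std::string toText() const override { return s_; }
private:
    std::string s_;
};

// "+" written once for every operand combination: virtual dispatch picks the
// right conversion, so there is no switch on type tags here.
std::unique_ptr<Value> add(const Value& lhs, const Value& rhs) {
    return std::unique_ptr<Value>(new NumberValue(lhs.toNumber() + rhs.toNumber()));
}
```

Adding a new value type then means writing one new class rather than touching every operator's switch statement, at the price of a virtual call per conversion.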

