Experimental (Academic) AutoIT Script Interpreter [C++]

AutoIT C++ script

22 replies to this topic

#1 twitchyliquid64

twitchyliquid64

    Peace. Always.

  • Active Members
  • 527 posts

Posted 03 January 2012 - 01:48 PM

For a long time now, I have been curious about the workings of interpreters and compilers, and this curiosity often turns into experimentation.

My interpreter (nicknamed PARADIME) is an attempt at interpreting the AutoIT syntax, both to gain a better understanding of how AutoIT 'ticks' and to satisfy my curiosity about whether I can write an interpreter for an existing language.

I am quite happy with the current (UNFINISHED) result. A great deal of AutoIT's syntactical features are implemented, and most of the rest are planned.

The following functionality operates correctly in my interpreter:
-Global declarations
-'=' assignments
-Function calls (including recursive)
-Variant datatype (implementing arrays, INT32, INT64, double, string)
-Operators: + - / * > < <> >= <= = ==
-Single-line IF
-Multi-line IF
-WHILE statements
-About 20 macros
-About 12 builtin functions: ConsoleWrite, FileRead, FileOpen, FileClose, MsgBox (non-optional params only), StringLen, StringLeft/Right, StringTrimLeft/Right, TimerInit (TimerDiff() is bugged, however)
-Arrays


For example, the following code will execute correctly.
AutoIt
$t = 1
$t2 = 2
if $t = $t2 then MsgBox(48, "TEST", "EQUALITY")
if $t <> $t2 then MsgBox(48, "TEST", "NOT EQUALITY")
Global $mate = 89, $eee, $f = 55, $arraydestroytest[65000]
MsgBox(48, "TEST", "This Code is running in Paradime: " & $eee)
While $mate < 4000
    $arraydestroytest[$mate] = $mate
    $mate = $mate + 1
    $r = "FFFF" & "00043"
WEnd
MsgBox(48, "TEST", "This Code is running in Paradime: " & $mate & " " & $arraydestroytest[$mate-1])
COnsoleWrite(Stringlen("LOL RECURSION"))
ConsoleWrite("Macro Test:" & @LF)
ConsoleWrite("Program files: " & @PROGRAMFILESDIR & @LF)
ConsoleWrite("Common files: " & @CommonFilesDir & @CR)
ConsoleWrite("My Documents: " & @MyDocumentsDir & @CR)
ConsoleWrite("AppDataC files: " & @AppDataCommonDir & @CR)
ConsoleWrite("DesktopC files: " & @DesktopCommonDir & @CR)
ConsoleWrite("DocumentsC files: " & @DocumentsCommonDir & @CR)
ConsoleWrite("FavouritesC files: " & @FavoritesCommonDir & @CR)
ConsoleWrite("ProgramsC files: " & @ProgramsCommonDir & @CR)
ConsoleWrite("StartMC files: " & @StartMenuCommonDir & @CR)
ConsoleWrite("Startup files: " & @StartupCommonDir & @CR)
ConsoleWrite("AppData files: " & @AppDataDir & @CR)
ConsoleWrite("Desktop files: " & @DesktopDir & @CR)
ConsoleWrite("Favs files: " & @FavoritesDir & @CR)
ConsoleWrite("Program files: " & @ProgramsDir & @CR)
ConsoleWrite("Start Menu files: " & @StartMenuDir & @CR)
ConsoleWrite("Startup files: " & @StartupDir & @CR)
ConsoleWrite(@CRLF & "Computer: " & @ComputerName & @CR)
ConsoleWrite("WIN: " & @WindowsDir & @CR)
ConsoleWrite("Working: " & @WorkingDir & @CR)
ConsoleWrite("System: " & @SystemDir & @CR)
ConsoleWrite("IP1: " & @IPAddress1 & @CR)
ConsoleWrite("IP2: " & @IPAddress2 & @CR)
ConsoleWrite("IP3: " & @IPAddress3 & @CR)
ConsoleWrite("IP4: " & @IPAddress4 & @CR)
ConsoleWrite("TempDir: " & @TempDir & @CR)
ConsoleWrite("Username: " & @UserName & @CR)
ConsoleWrite("HomeDrive: " & @HomeDrive & @CR)
ConsoleWrite("HomePath: " & @HomePath & @CR)
ConsoleWrite("HomeShare: " & @HomeShare & @CR)
ConsoleWrite("LogonServer: " & @LogonServer & @CR)
ConsoleWrite("LogonDomain: " & @LogonDomain & @CR)
ConsoleWrite("LogonDNSDomain: " & @LogonDNSDomain & @CR)



Academic Discourse:
The biggest thing that surprised me was how well written/optimized AutoIT was (or how inefficient a C++ coder I am, having 6 months of experience ^^').
My interpreter runs approximately 5.4X slower than the AutoIT interpreter, despite the data structures being similar. My guess is that the speed difference is due to two things:
-Pointer passing: nearly everything large in the public AutoIT source is passed around by pointer rather than by value. Substantial portions of my code do not pass pointers, which costs speed. My inexperience and rush in writing this probably make matters worse, with potentially inferior code (relative to AutoIT).
-Operator evaluation: I originally thought that AutoIT's decision to treat every operand as a VARIANT class would incur a noticeable overhead, so I thought I could sidestep it by using my original TOKEN data structure from the lexing stage. Now I realise that this overhead is unavoidable, as I'm still doing type checking and conversions with the token datatype. The only difference is that Jonathan is doing it in a pretty little variant class, while my parser_eval.cpp is littered with switch statements for every operand possibility for every operator. (Please don't look at the source, you will cry.)
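To make the pointer-passing point concrete, here is a minimal sketch that counts copies when a large operand is handed to a function by value versus by reference. `BigValue`, `sizeByValue` and `sizeByReference` are hypothetical names invented for this illustration; they are not taken from Paradime or from the AutoIT source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a large operand such as a 65000-element array.
struct BigValue {
    std::vector<int> payload;
    static int copies;  // counts copy constructions, for illustration only

    explicit BigValue(std::size_t n) : payload(n, 0) {}
    BigValue(const BigValue& other) : payload(other.payload) { ++copies; }
};
int BigValue::copies = 0;

// Pass by value: the whole payload is duplicated on every call.
std::size_t sizeByValue(BigValue v) { return v.payload.size(); }

// Pass by const reference (or pointer): only an address is handed over.
std::size_t sizeByReference(const BigValue& v) { return v.payload.size(); }
```

Calling `sizeByValue` on an existing object bumps `BigValue::copies` once per call, while `sizeByReference` leaves it untouched. In an interpreter loop that touches operands thousands of times, that difference can add up.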

PARADIME implements, from scratch:
-Custom lexer/tokeniser
-Stateful recursive descent parser
-Shunting-yard algorithm for expression evaluation
-std::map for variable and builtin function pointer lookup
-std::vector for token storage
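As a rough illustration of the std::map lookup mentioned in the list above, the sketch below maps builtin names to function pointers. The `Builtin` signature (string arguments in, string out), the helper names and the upper-casing of keys are assumptions made for this example, not Paradime's actual design.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Hypothetical builtin signature: string arguments in, string result out.
using Builtin = std::string (*)(const std::vector<std::string>&);

// Illustrative stand-ins for two of the builtins listed in the post.
std::string builtinStringLen(const std::vector<std::string>& args) {
    return std::to_string(args.at(0).size());
}
std::string builtinStringLeft(const std::vector<std::string>& args) {
    return args.at(0).substr(0, std::stoul(args.at(1)));
}

// Function names are matched case-insensitively, so keys are upper-cased.
std::string upper(std::string s) {
    for (char& c : s)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return s;
}

const std::map<std::string, Builtin>& builtinTable() {
    static const std::map<std::string, Builtin> table = {
        {"STRINGLEN", builtinStringLen},
        {"STRINGLEFT", builtinStringLeft},
    };
    return table;
}

// Returns nullptr when the name is not a known builtin.
Builtin findBuiltin(const std::string& name) {
    auto it = builtinTable().find(upper(name));
    return it == builtinTable().end() ? nullptr : it->second;
}
```

With this table, `findBuiltin("Stringlen")` returns a pointer that can be called directly with the evaluated arguments. Each lookup costs O(log n) string comparisons.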

Parser
Evidently, I have deliberately deviated from Jonathan's chosen parsing approach to test the validity of other algorithms, and initial results indicate that my parsing model is workable.
Both of our interpreters use the recursive descent model for traversing nested structures. Paradime has various parsing states that are transparent to the parsing of the tokens themselves. The two main states are EXEC and IGNORE: EXEC executes the code up to the corresponding end of the code block (ENDIF, WEND etc.), whereas IGNORE skips over the contained code. I did not quite understand how Jon traversed nested structures, so I cannot comment further on his methods here.
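The EXEC/IGNORE idea can be sketched with a toy statement stream. Everything here is an assumption made for illustration: statements are plain strings, conditions are the literals `IF 1` and `IF 0`, and "executing" a statement just records it. It is not Paradime's real parser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Mode { Exec, Ignore };

// Toy program: "IF 1" / "IF 0" open a block, "ENDIF" closes it, anything
// else is an ordinary statement. Real conditions would be expressions.
using Program = std::vector<std::string>;

// Walks statements from pos until the matching ENDIF (or end of program).
// In Exec mode statements are "run" (appended to out); in Ignore mode they
// are skipped, but nested IF/ENDIF pairs are still matched by recursion.
void runBlock(const Program& prog, std::size_t& pos, Mode mode,
              std::vector<std::string>& out) {
    while (pos < prog.size()) {
        const std::string& stmt = prog[pos++];
        if (stmt == "ENDIF") return;             // end of the current block
        if (stmt.rfind("IF ", 0) == 0) {         // statement starts with "IF "
            bool cond = (stmt == "IF 1");
            Mode inner = (mode == Mode::Exec && cond) ? Mode::Exec
                                                      : Mode::Ignore;
            runBlock(prog, pos, inner, out);     // recurse into nested block
        } else if (mode == Mode::Exec) {
            out.push_back(stmt);
        }
    }
}

std::vector<std::string> run(const Program& prog) {
    std::vector<std::string> out;
    std::size_t pos = 0;
    runBlock(prog, pos, Mode::Exec, out);
    return out;
}
```

The key property is that an IGNORE block forces every block nested inside it into IGNORE as well, so a true condition inside a skipped block never runs.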

Expression handling is done entirely differently in the two interpreters. Jonathan uses an LALR shift/reduce algorithm, whereas I use Dijkstra's shunting-yard algorithm. Thus far, both approaches seem entirely workable.
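For readers unfamiliar with the shunting-yard algorithm, here is a minimal sketch of its infix-to-RPN conversion. It assumes the input is already tokenised into strings and handles only the four left-associative arithmetic operators and parentheses; `toRpn` is a name made up for this example, and Paradime's real implementation covers more operators.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Converts already-tokenised infix (operands, + - * /, parentheses) to RPN.
std::vector<std::string> toRpn(const std::vector<std::string>& infix) {
    static const std::map<std::string, int> prec = {
        {"+", 1}, {"-", 1}, {"*", 2}, {"/", 2}};
    std::vector<std::string> output;
    std::vector<std::string> ops;  // operator stack
    for (const std::string& tok : infix) {
        if (prec.count(tok)) {
            // Pop operators of greater or equal precedence (left-associative).
            while (!ops.empty() && ops.back() != "(" &&
                   prec.at(ops.back()) >= prec.at(tok)) {
                output.push_back(ops.back());
                ops.pop_back();
            }
            ops.push_back(tok);
        } else if (tok == "(") {
            ops.push_back(tok);
        } else if (tok == ")") {
            while (!ops.empty() && ops.back() != "(") {
                output.push_back(ops.back());
                ops.pop_back();
            }
            assert(!ops.empty());  // otherwise the ")" has no matching "("
            ops.pop_back();        // discard the "("
        } else {
            output.push_back(tok);  // operand goes straight to the output
        }
    }
    while (!ops.empty()) {
        assert(ops.back() != "(");  // otherwise a "(" was never closed
        output.push_back(ops.back());
        ops.pop_back();
    }
    return output;
}
```

For example, the tokens `1 + 2 * 3` come out as `1 2 3 * +`, and `( 1 + 2 ) * 3` comes out as `1 2 + 3 *`.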

Variant Storage:
Done the same way in both interpreters. The array handling code is practically copied; it was better than anything I could ever make.

Lookup Speed:
One other thing I noticed is that Macros and Builtin functions have no optimal lookup table (in the public autoIT source). Perhaps, to improve speed, these things could be stored in a red/black binary tree to increase efficiency?


Conclusion:
All in all, the parsing and interpreting backbone is a magnificent piece of work, and all my attempts to replicate and deconstruct it (from the public source) have only increased my sense of awe. I express my most sincere thanks to the AutoIT developers, and I hope that development of AutoIT never stops. One day, when I get out of high school, I would like to develop AutoIT, who knows.

Paradime Sourcecode:
As previously mentioned, the vast majority of the source code is written from scratch. However, there was no point re-inventing the wheel when implementing some macros and some builtins, the code for array handling in variants, and one or two syntactical expressions. These elements of the source code are clearly labelled at the top and have the GNU license attached (code from before AutoIT went closed source). Credit is clearly given.
Please don't look at it. It is poorly written, undercommented, and due to my bad choice to use the token structure as the operand structure, a good deal of parsing logic is scattered across hundreds of lines of switch statements. (eww)
http://code.google.com/p/paradime-interpreter/source/browse/#hg%2FParadime%2Fcore
Please, don't judge me.

SciTE integration:
Thanks to LaCastiglione:
command.38.*.au3="C:\Paradime.exe" "$(FilePath)"
command.name.38.*.au3=Paradime
command.save.before.38.*.au3=1
command.shortcut.38.*.au3=Ctrl+F7

Drop Paradime.exe into your C: drive.

Future of Paradime:
I will implement NOT, AND, OR, FOR-NEXT, SWITCH-CASE-ENDSWITCH, and user-defined functions. Then I will deviate from AutoIT, exploring new, custom language constructs, but that's another academic project entirely.

-hyperzap

Edited by twitchyliquid64, 03 January 2012 - 01:58 PM.

Ongoing projects: firestorm: Largescale P2P Social Network
Completed Autoit Programs/Scripts: Variable Pickler | Networked Streaming Audio (in pure autoIT) | firenet p2p web messenger | Proxy Checker | Dynamic Execute() Code Generator | P2P UDF | Graph Theory Proof of Concept - Breadth First search







#2 Jon

Jon

    Up all night to get lucky

  • Administrators
  • 10,630 posts

Posted 03 January 2012 - 03:33 PM

The internals have progressed quite a bit since that public source code as well - some parts from scratch. There's lots of weird optimizations. Quite a lot of Copy-on-write activity which really sped things up a lot as well. The main slow down the last time I checked was the eval code which still creates a lot of copies of data as it works the values out using stacks - it's one of the scarier areas to contemplate rewriting though...
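For readers who have not met copy-on-write before, here is a minimal single-threaded sketch of the general technique. `CowString` is a hypothetical class written for this illustration; nothing here is known about how the current AutoIt internals actually implement it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Copies share one buffer until somebody writes; only then is it duplicated.
// Assumes single-threaded use (use_count() is not a safe test across threads).
class CowString {
public:
    explicit CowString(std::string s)
        : data_(std::make_shared<std::string>(std::move(s))) {}

    // The implicit copy operations just copy the shared_ptr, which bumps
    // a reference count instead of duplicating the characters.

    const std::string& get() const { return *data_; }

    void append(const std::string& suffix) {
        detach();               // make sure we own the buffer before writing
        data_->append(suffix);
    }

    bool sharesStorageWith(const CowString& other) const {
        return data_ == other.data_;
    }

private:
    void detach() {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::string>(*data_);
    }
    std::shared_ptr<std::string> data_;
};
```

Copying a `CowString` is cheap regardless of its length; the real copy happens only on the first `append` to a value that is still shared. That helps code that copies values far more often than it modifies them.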

Edited by Jon, 03 January 2012 - 03:33 PM.


#3 Jon

Jon

    Up all night to get lucky

  • Administrators
  • 10,630 posts

Posted 03 January 2012 - 03:47 PM

twitchyliquid64 said:
"Lookup Speed:
One other thing I noticed is that Macros and Builtin functions have no optimal lookup table (in the public autoIT source). Perhaps, to improve speed, these things could be stored in a red/black binary tree to increase efficiency?"


I'd need to check but I think I changed macro/built-in function lookup to be resolved during the lexing/token stage so it didn't do a runtime lookup (the token contains an index to the function).

For user functions, the names of all the functions are stored in a sorted list - which then uses a binary search for lookup.

Variable lookup is done with splay trees.
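A sorted list with binary search, plus resolving a name to an index once so later calls skip the lookup, can be sketched as follows. `UserFunction`, `FunctionTable` and the `firstLine` field are hypothetical names for illustration only; names are assumed to be normalised to upper case before they reach the table.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct UserFunction {
    std::string name;  // assumed already normalised to upper case
    int firstLine;     // hypothetical: where the function body starts
};

class FunctionTable {
public:
    explicit FunctionTable(std::vector<UserFunction> funcs)
        : funcs_(std::move(funcs)) {
        std::sort(funcs_.begin(), funcs_.end(),
                  [](const UserFunction& a, const UserFunction& b) {
                      return a.name < b.name;
                  });
    }

    // Binary search by name; returns the index, or -1 if not found.
    // A lexer could store this index in the token so that the lookup
    // happens once rather than on every call.
    int indexOf(const std::string& name) const {
        auto it = std::lower_bound(
            funcs_.begin(), funcs_.end(), name,
            [](const UserFunction& f, const std::string& n) {
                return f.name < n;
            });
        if (it == funcs_.end() || it->name != name) return -1;
        return static_cast<int>(it - funcs_.begin());
    }

    // Constant-time access once the index is known.
    const UserFunction& at(int index) const {
        return funcs_.at(static_cast<std::size_t>(index));
    }

private:
    std::vector<UserFunction> funcs_;
};
```

The split between `indexOf` (O(log n), done once) and `at` (O(1), done on every call) is the point of the sketch.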

Edited by Jon, 03 January 2012 - 03:48 PM.


#4 GEOSoft

GEOSoft

    Sure I'm senile. What's your excuse?

  • MVPs
  • 10,573 posts

Posted 03 January 2012 - 04:09 PM

I have the feeling that there isn't much of the current code that even resembles the last public released source code. Anyone looking at that old source shouldn't be under the impression that it still looks like that.
George
Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.
Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.
*** The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.
Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.
"Old age and treachery will always overcome youth and skill!"

#5 Valik

Valik

    Former developer.

  • Active Members
  • 18,879 posts

Posted 03 January 2012 - 07:15 PM

It doesn't look anything like that, really. Some parts have been re-written multiple times.

#6 twitchyliquid64

twitchyliquid64

    Peace. Always.

  • Active Members
  • 527 posts

Posted 03 January 2012 - 10:09 PM

GEOSoft said:
"I have the feeling that there isn't much of the current code that even resembles the last public released source code. Anyone looking at that old source shouldn't be under the impression that it still looks like that."

I am/was aware that there were differences in the source code, and thus I have based my studies on the following underlying assumption:
The available AutoIT source is an implementation of behaviour. Any revision is based on the fundamental elements of this implementation (for instance, I would expect the token structure to be mostly the same, and the variant structure to be similar save the addition of binary and Boolean types. Furthermore, the shift/reduce algorithm and recursive descent are unlikely to have changed much, save optimization).

Edited by twitchyliquid64, 03 January 2012 - 10:10 PM.


#7 twitchyliquid64

twitchyliquid64

    Peace. Always.

  • Active Members
  • 527 posts

Posted 03 January 2012 - 10:13 PM

Jon said:
"The internals have progressed quite a bit since that public source code as well - some parts from scratch. There's lots of weird optimizations. Quite a lot of Copy-on-write activity which really sped things up a lot as well. The main slow down the last time I checked was the eval code which still creates a lot of copies of data as it works the values out using stacks - it's one of the scarier areas to contemplate rewriting though..."


I don't believe I know of a method of expression parsing that does not use stacks. (Perhaps Manadar/Mat can jump in on this one.)

The most optimal thing I can think of would be to convert all expressions to RPN form at compile time, then all you would need is one simple token/variant pointer stack to evaluate the expression.
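A minimal sketch of that idea: once an expression is already in RPN, evaluation needs only one value stack. To keep the example short it assumes operands are numeric strings and uses a stack of `double` rather than token or variant pointers; `evalRpn` is a name invented for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Evaluates an RPN token sequence such as {"1", "2", "3", "*", "+"}.
double evalRpn(const std::vector<std::string>& rpn) {
    std::vector<double> stack;  // the single evaluation stack
    for (const std::string& tok : rpn) {
        if (tok == "+" || tok == "-" || tok == "*" || tok == "/") {
            assert(stack.size() >= 2);  // binary operators need two operands
            double rhs = stack.back(); stack.pop_back();
            double lhs = stack.back(); stack.pop_back();
            switch (tok[0]) {
                case '+': stack.push_back(lhs + rhs); break;
                case '-': stack.push_back(lhs - rhs); break;
                case '*': stack.push_back(lhs * rhs); break;
                default:  stack.push_back(lhs / rhs); break;
            }
        } else {
            stack.push_back(std::stod(tok));  // operand: push its value
        }
    }
    assert(stack.size() == 1);  // a well-formed expression leaves one value
    return stack.back();
}
```

Because the operator ordering was settled when the RPN was produced, the run-time loop never consults precedence. A loop body would convert its expressions once and then evaluate them many times.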

#8 twitchyliquid64

twitchyliquid64

    Peace. Always.

  • Active Members
  • 527 posts

Posted 03 January 2012 - 10:23 PM

Valik said:
"It doesn't look anything like that, really. Some parts have been re-written multiple times."


How much different?

I don't understand the point of making a public version available so that upcoming developers can learn how to integrate functionality into the interpreter, only to let that version go un-updated to the point that it cannot be used as an introduction for developers.

Is this the case? Or are the internals roughly the same (save optimizations)?

#9 Manadar

Manadar

         

  • MVPs
  • 10,865 posts

Posted 03 January 2012 - 10:29 PM

twitchyliquid64 said:
"I don't believe I know of a method of expression parsing that does not use stacks. (perhaps manadar /mat can jump in on this one)."

The point is not to not use stacks.


Also, inb4 pratt parser.

#10 GEOSoft

GEOSoft

    Sure I'm senile. What's your excuse?

  • MVPs
  • 10,573 posts

Posted 03 January 2012 - 11:38 PM

The public release version of the source really isn't very relevant any more.
That was back in the days when AutoIt was open source which no longer applies and that is why it's never updated.

#11 Valik

Valik

    Former developer.

  • Active Members
  • 18,879 posts

Posted 04 January 2012 - 01:00 AM

George hit it. The source is available because that version of AutoIt is open source. No other reason really.

Your assumptions are incorrect. The Variant class - for example - has been re-written 2 or 3 times. Much of the core of AutoIt is different. Some of the functions are maybe the same give or take a bug-fix.

#12 jchd

jchd

    Whatever your capacity, resistance is futile.

  • MVPs
  • 5,317 posts

Posted 04 January 2012 - 01:28 AM

At this point the published code (which I have no intention to ever look at) is probably more a "how not to do it" example than anything else.



#13 twitchyliquid64

twitchyliquid64

    Peace. Always.

  • Active Members
  • 527 posts

Posted 04 January 2012 - 02:07 AM

jchd said:
"At this point the published code (which I have no intention to ever look at) is probably more a "how not to do it" example than anything else."


I disagree entirely.

It's well written; it's just not optimal and, as Valik said, elements have been rewritten as better methods have become known.

(I'm only talking about the interpreter core here, btw)

#14 twitchyliquid64

twitchyliquid64

    Peace. Always.

  • Active Members
  • 527 posts

Posted 04 January 2012 - 02:08 AM

Valik said:
"George hit it. The source is available because that version of AutoIt is open source. No other reason really.

Your assumptions are incorrect. The Variant class - for example - has been re-written 2 or 3 times. Much of the core of AutoIt is different. Some of the functions are maybe the same give or take a bug-fix."


Wow...2/3 times??? Really??? What was wrong with it to warrant those re-writes? It seemed quite fine to me.

#15 Valik

Valik

    Former developer.

  • Active Members
  • 18,879 posts

Posted 04 January 2012 - 02:18 AM

jchd said:
"At this point the published code (which I have no intention to ever look at) is probably more a "how not to do it" example than anything else."

This.

twitchyliquid64 said:
"It's well written"

No. No it's not.

twitchyliquid64 said:
"Wow...2/3 times??? Really??? What was wrong with it to warrant those re-writes? It seemed quite fine to me."

Everything. It's still hopelessly bad but it's massive and it works. The same can be said for quite a number of parts of AutoIt.

#16 Richard Robertson

Richard Robertson

    Universalist

  • Active Members
  • 10,324 posts

Posted 04 January 2012 - 02:26 AM

Do you know if any of the original code is still there?

#17 Valik

Valik

    Former developer.

  • Active Members
  • 18,879 posts

Posted 04 January 2012 - 02:47 AM

I'm sure there's lots of original code still there. What that may be, though, I do not know.

#18 twitchyliquid64

twitchyliquid64

    Peace. Always.

  • Active Members
  • 527 posts

Posted 04 January 2012 - 03:01 AM

Valik said:
"Everything. It's still hopelessly bad but it's massive and it works. The same can be said for quite a number of parts of AutoIt."


Can you be more specific? WHY is it hopelessly bad? What design goals does it not achieve, how does it underperform the expectations you would have for an 'ideal' Variant Class?

#19 czardas

czardas

  • MVPs
  • 7,049 posts

Posted 04 January 2012 - 03:13 AM

I imagine those are the parts we're not allowed to see. :)

#20 Valik

Valik

    Former developer.

  • Active Members
  • 18,879 posts

Posted 04 January 2012 - 03:13 AM

It's a bloated mess of inter-connected pieces that should be separate. It uses about a billion switch statements to do what C++ can do for you if you know what abstract classes are.
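To illustrate the general point about abstract classes replacing type switches, here is a small sketch in which each value type supplies its own conversions and the operators are written once. The class names and the rule that non-numeric strings convert to 0 are assumptions made for this example; this is not the AutoIt Variant class.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>
#include <string>
#include <utility>

// Abstract interface: every value type knows how to convert itself.
class Value {
public:
    virtual ~Value() = default;
    virtual double toNumber() const = 0;
    virtual std::string toString() const = 0;
};

class NumberValue : public Value {
public:
    explicit NumberValue(double v) : v_(v) {}
    double toNumber() const override { return v_; }
    std::string toString() const override { return std::to_string(v_); }
private:
    double v_;
};

class StringValue : public Value {
public:
    explicit StringValue(std::string s) : s_(std::move(s)) {}
    // strtod yields 0 when no number can be parsed (assumed rule here).
    double toNumber() const override {
        return std::strtod(s_.c_str(), nullptr);
    }
    std::string toString() const override { return s_; }
private:
    std::string s_;
};

// One implementation per operator; virtual dispatch picks the conversion,
// so there is no switch over every pair of operand types.
std::unique_ptr<Value> add(const Value& a, const Value& b) {
    return std::make_unique<NumberValue>(a.toNumber() + b.toNumber());
}
std::unique_ptr<Value> concat(const Value& a, const Value& b) {
    return std::make_unique<StringValue>(a.toString() + b.toString());
}
```

Adding a new value type then means writing one new subclass, rather than touching a switch statement in every operator.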




