I am back with another (almost) bi-weekly update. As usual I have been busy with life, so progress is slow, but I have jumped another small yet somewhat significant hurdle: I have mostly figured out the basic skeleton of the syntax. To achieve this I had to spend 100% of my brain power for a couple of minutes almost every day for two weeks.
Here is a sneak peek of the code:
enum Operation {
    /* Addition, Subtraction, Multiplication, Division, Exponentiation */
    OP_ADD, OP_SUB, OP_MUL, OP_DIV, OP_EXP,
    /* Concatenation */
    OP_CAT,
    /* Equal to, Strictly equal to, Less than, Less than or equal to, Greater than, Greater than or equal to */
    OP_EQU, OP_SEQU, OP_LT, OP_LTE, OP_GT, OP_GTE,
    /* Conditional */
    OP_CON,
};
struct Token {
    enum TokenType type;
    char *data;
    size_t data_len;
    void *info;
};
struct TokenList {
    size_t length;
    struct TokenListNode *head;
    struct TokenListNode *tail;
    bool dirty;
};
struct TokenListNode {
    struct Token *token;
    struct TokenListNode *prev;
    struct TokenListNode *next;
};
struct Primitive {
    enum {
        PRI_NUMBER,
        PRI_STRING,
        PRI_BOOLEAN,
        // ...
    } type;
    union {
        //int number;
        //char *string;
        //bool boolean;
    };
};
struct Operand {
    enum {
        OPR_PRIMITIVE,
        //OPR_VARIABLE,
        //OPR_MACRO,
    } type;
    union {
        struct Primitive *value;
        //struct Variable *variable;
        //struct Macro *macro;
    };
};
struct Expression {
    enum Operation op;
    struct Operand operands[];
};
struct Declaration {
    enum {SCO_LOCAL, SCO_GLOBAL} scope;
    // ...Const, Static
    char *name;
    bool is_function;
    union {
        // Variable or constant
        struct Expression *initializer;
        // Function
        struct {
            struct Statement *block;
            size_t size;
        } code;
    };
};
struct Statement {
    enum StatementType {
        SMT_DECLARATION,
        SMT_EXPRESSION,
    } type;
    union {
        struct Declaration *declaration;
        struct Expression *expression;
    };
};
struct Unit {
    enum UnitType {
        UNT_COMMENT,
        UNT_DIRECTIVE,
        UNT_STATEMENT,
    } type;
    union {
        struct Token *token;
        struct Statement *statement;
    };
};
The code above is basically the set of struct declarations that I extracted from my WIP code. These structures reference each other to form a single "code unit", which roughly corresponds to a single line of code in the AutoIt syntax. I have tried to arrange the declarations in ascending order: the top-most structures represent the most basic elements of the syntax (the token in this case), each of which is contained in a more informative and complex structure, all the way down to the bottom, where the "Unit" structure has a fully formed meaning.
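To make the nesting a bit more concrete, here is a minimal sketch of how the structures could be wired together by hand for the AutoIt line `Local $x = 1 + 2`. This is only an illustration under a few assumptions: the `int number` member of `Primitive` is enabled (it is still commented out in the real declarations), the helper names `number_operand` and `make_example_unit` are made up for this example, and the declarations are trimmed to the parts that are used. It needs C11 for the anonymous unions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

enum Operation { OP_ADD, OP_SUB, OP_MUL, OP_DIV, OP_EXP };

struct Primitive {
    enum { PRI_NUMBER, PRI_STRING, PRI_BOOLEAN } type;
    union {
        int number; /* assumption: still commented out in the real code */
    };
};

struct Operand {
    enum { OPR_PRIMITIVE } type;
    union {
        struct Primitive *value;
    };
};

struct Expression {
    enum Operation op;
    struct Operand operands[]; /* flexible array member, sized at allocation */
};

struct Declaration {
    enum { SCO_LOCAL, SCO_GLOBAL } scope;
    char *name;
    bool is_function;
    union {
        struct Expression *initializer;
        struct { struct Statement *block; size_t size; } code;
    };
};

struct Statement {
    enum StatementType { SMT_DECLARATION, SMT_EXPRESSION } type;
    union {
        struct Declaration *declaration;
        struct Expression *expression;
    };
};

struct Unit {
    enum UnitType { UNT_COMMENT, UNT_DIRECTIVE, UNT_STATEMENT } type;
    union {
        struct Token *token;
        struct Statement *statement;
    };
};

/* Hypothetical helper: wraps a number in a Primitive and an Operand. */
struct Operand number_operand(int n) {
    struct Primitive *p = malloc(sizeof *p);
    assert(p != NULL);
    p->type = PRI_NUMBER;
    p->number = n;
    struct Operand o;
    o.type = OPR_PRIMITIVE;
    o.value = p;
    return o;
}

/* Hypothetical helper: builds the tree for `Local $x = 1 + 2`.
   Nothing is freed here; this is a sketch, not production code. */
struct Unit *make_example_unit(void) {
    static char name[] = "x";

    /* Expression: 1 + 2 (room for two operands after the header) */
    struct Expression *e = malloc(sizeof *e + 2 * sizeof(struct Operand));
    assert(e != NULL);
    e->op = OP_ADD;
    e->operands[0] = number_operand(1);
    e->operands[1] = number_operand(2);

    /* Declaration: Local $x = <expression> */
    struct Declaration *d = malloc(sizeof *d);
    assert(d != NULL);
    d->scope = SCO_LOCAL;
    d->name = name;
    d->is_function = false;
    d->initializer = e;

    /* Statement and Unit wrap the declaration */
    struct Statement *s = malloc(sizeof *s);
    assert(s != NULL);
    s->type = SMT_DECLARATION;
    s->declaration = d;

    struct Unit *u = malloc(sizeof *u);
    assert(u != NULL);
    u->type = UNT_STATEMENT;
    u->statement = s;
    return u;
}
```

Walking back down from the result, `u->statement->declaration->initializer->op` is `OP_ADD`. One thing the sketch makes visible is that `Expression` does not store how many operands it has, so the code reading the tree has to infer that from the operation.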
This should be enough for me to start writing code to construct the syntax tree, but there are some other things that I would like to work on first. Right now I am looking into incorporating debug data into the final binary, so that we can have a nice debugger to debug our scripts.
I don't strictly need to look into this right now, but it will come in handy to study and understand the basic concepts of how a debugger maps the final instructions back to strings in the source code. That way I can adjust the syntax tree's design now rather than later, and avoid the inconvenience of rewriting related code. Basically, I want to leave gaps in the tree which I can fill in later when actually implementing the debugging functionality.
A good example to study, in my opinion, is the format used by C debuggers (gdb and co.), and I found out that it is called DWARF (Name FAQ). I tried to read the technical specification, but it is too thick for me right now... Luckily I found an article called "Introduction to the DWARF debugging format", written by the chairman of the standardization committee. Lucky me.
So I am reading that right now. Hopefully I will have another update for you in 2 weeks. See you then!
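As a rough idea of what such a "gap" and the instruction-to-source mapping could look like, here is a small sketch. Everything in it is hypothetical and not part of the actual code: `SourceLocation` is the kind of field that could be reserved in the tree nodes, and `LineEntry` with `find_line` is a simplified stand-in for a table that maps instruction offsets back to source positions, loosely in the spirit of the line number information that debug formats carry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: where a node came from in the script source. */
struct SourceLocation {
    size_t line;   /* 1-based line number */
    size_t column; /* 1-based column number */
};

/* Hypothetical: one row of a table mapping emitted instructions to source. */
struct LineEntry {
    size_t address;            /* offset of the first instruction for this location */
    struct SourceLocation loc; /* source position that produced it */
};

/* Returns the row covering `address`: the last row whose address is
   less than or equal to it. The table must be sorted by ascending address.
   Returns NULL if the table is empty or `address` precedes the first row. */
const struct LineEntry *find_line(const struct LineEntry *table, size_t count,
                                  size_t address) {
    assert(table != NULL || count == 0);
    const struct LineEntry *found = NULL;
    size_t lo = 0, hi = count;
    while (lo < hi) { /* binary search over [lo, hi) */
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].address <= address) {
            found = &table[mid]; /* candidate; look for a later one */
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return found;
}
```

With a table like `{0 -> line 1, 4 -> line 2, 10 -> line 5}`, looking up instruction offset 7 lands on the row for line 2, which is the sort of answer a debugger needs when it stops in the middle of a statement.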