Sign in to follow this  
Followers 0
tylo

Rewrite of variabletable.

17 posts in this topic

This is a rewrite of variabletable

Here's what you get: B)

- dynamic table size (unlimited)

- faster lookup (Assign, GetRef)

- faster ScopeDecrease()

- smaller file size and .exe footprint.

It must be noted that you only get the speed benefits for (large) scripts with many variables. Also other modules should be optimized to get faster execution.

It appears to work correctly, but it's not extensively tested. Only Jon can verify the correctness of this code (as I don't know the parser in detail), but here's what I did:

- Added a simplistic Vector class (templated inline - not bloated).

- Reorganised: added Lookup() and AddEntry() functions -> smaller code.

- Splitted: one vector for global and one for local variables. The variable tables acts as pure stacks (no gaps).

- Lookup(): search from top of local stack while variables have current scope. Then search through global variables.

- ScopeDecrease(): again, only pop off variables from top until scope != current.

- Assign(): added an optional parameter, so that a subsequent call to GetRef() is not needed. (both GetRef and Assign uses Lookup).

ps: I also attached the stack_int_datatype.h implemented with the simple vector class for the fun of it. :whistle:


blub

Share this post


Link to post
Share on other sites



Thanks.

When I look at stuff like vector_datatype.h it doesn't half make me feel crap at c++ - my books don't even have templates in them (pretty old) and I have no idea what the code does. I should renew my books I think.

I'll have a look at the vartable rewrite today, thanks :whistle:

Share this post


Link to post
Share on other sites

Would you be able to do a general stack and a general vector that would allow the replacement of all the stack_xxx and vector_xxx that I use? They are all identical apart from the datatype that is being used

So that I could do

Stack<int>

Stack<GenStatement>

etc

From the little I understand of templates as long as the items you are storing have correct constructors, copy cons and destructors this should be ok.

And from my attempt at using STL stacks/vectors initially (AutoIt was totally STL at one point...and twice the size ) even though a stack looks general, everytime you create a new stack of a different type more code is added, so you should keep the number of different datatypes used to a minimum?

This wouldn't be a problem here as I only use a couple of datatypes on stacks/vectors.

Share this post


Link to post
Share on other sites

#4 ·  Posted (edited)

heh, not quite so simple to replace Vector... with Vector<> - I forgot that the iterators are used completely differently to how I use my simple version...

Edited by Jon

Share this post


Link to post
Share on other sites

#5 ·  Posted (edited)

The vector has the [] operator so vector[0] returns what you expect. But your vector types has non-standard definitions of begin()/end(). You can simply use 0 and size() in place of your current begin()/end() functions, and let begin() and end() return iterators (you don't need to use them). It's a good idea to make home brewed containers as subsets, but compatible with the std containers.

Here's the Stack class you wanted:

template <typename T>
class Stack {
public:
   void push(const T& x) { vec.push_back(x); }
   void pop() { vec.pop_back(); }
   T& top() { return vec.back(); }
   bool empty() const { return vec.empty(); }
   unsigned int size() const { return vec.size(); }
private:
   Vector<T> vec;
};

The Vector implementation is very fast and compact, but there is one thing to remember when using it (compared to the single linked list you have now): References to items in the vector/stack are invalidated by push_back(). E.g.

Stack<int> S;

S.push(10);

int *p = &S.top(); // address of element in stack is only valid until next S.push();

int i = S.top; // no problem here.

Iterators, typical usage:

Vector<int> V;...

for (const_iterator i = V.begin(); i != V.end(); ++i) printf("%d\n", (*i))

even though a stack looks general, everytime you create a new stack of a different type more code is added, so you should keep the number of different datatypes used to a minimum?

This wouldn't be a problem here as I only use a couple of datatypes on stacks/vectors.

Actually, all this code is inlined, so it wouldn't matter how many datatypes you use. No functions are added - it's expanded directly where you do the calls (like macros). Because the functions are so short, the don't add more code than a regular function call would do (maybe less - think of all the code that is executed in autoit just to call an empty function :whistle: ). Edited by tylo

blub

Share this post


Link to post
Share on other sites

Thanks, yeah I tried to make my subset similar but at the time I was really confused about how iterators work. Seeing the implementation of your code has really cleared some things up. I'll have a play and see what happens.

Share this post


Link to post
Share on other sites

Can you explain how the push_back() works. It seems to call a realloc function that creates a new block from scratch and then copies all the old entries to the new.

Edit:

Actually it looks like it allocates double (*2) the memory it needs so that it only does a complete realloc if the existing memory is totally used. That makes sense and is sneaky :whistle: . But what is the significance of +4 in the realloc line?

Share this post


Link to post
Share on other sites

Yes, it's doubling the capacity of the vector whenever it's full. The capacity is 0 to begin with, so you need the +4 because 0*2 = 0. A small detail for realloc(): you can remove the second param and the 6th line. Not needed.


blub

Share this post


Link to post
Share on other sites

I got rid of all my datatypes and replaced with the two new ones.

On VC6 and 7 the uncompressed output is reduced by a couple of KB, and the upx'ed version is about the same as before. And the code has fewer files and that's always good.

On MinGW (gcc) the uncompressed output increased by 30KB. I think that's what happened with the STL stuff too. I wonder why gcc is so bad at templates - I thought it was supposed to be a good compiler?

I think I'll leave the new stuff though as I like playing with new stuff and learning more about c++. And it works fine on the intended compilers (VC). The MinGW support is just so that people without access to VC can compile/test and contribute.

Share this post


Link to post
Share on other sites

Funny, I was thinking about making a template for the linked lists I've added for the DLL support. I'll just wait and take a look and see if I can use these, though. I'm beginning to feel better so I was going to start back on it soon, but it sounds like I'll be looking at a completely different code. :whistle:

Jon, I've had problems with gcc before too (On Linux, I don't remember what version of gcc). I'd written my own char_traits to perform case insensitive stuff and I'd typedef'd my own string class with that. I'd also typedef'd an output stream to use my new type, the problem was, gcc never worked. It complained a lot at first about syntax errors (When VS said nothing, gcc seems too strict about a lot of stuff that didn't make sense to me), then when it compiled with no problems, the stream never worked at all. I don't know if it had to do with gcc or STL, but I suspect gcc. I wasn't doing anything the STL wasn't designed to do. Anyway, there's my template horror story with gcc.

Share this post


Link to post
Share on other sites

#11 ·  Posted (edited)

I've got gcc compiling with -Wall as well, so it's uber-strict. But then I like getting stuff to compile without any warnings at all as I know how nasty it can be to compile someone else's code and be hit with a page full of warnings. :whistle:

VC7 seems to bitch about certain things more than VC6, especially around 64/32 bit "issues". And it loves telling you that strlen() actually returns a "size_t" rather than the int I initially used.

The only noticable changes to the code is that 3 classes (stack_nnn, vector_nnn) have gone and whereever you saw VectorVariant/VectorToken you now have Vector<Token> and Vector<Variant>

Edited by Jon

Share this post


Link to post
Share on other sites

I've made a fatal assumption about Tylo's vector. I assumed that pop_back() would delete the last entry and call any destructors on it. D'oh. I think that is what is causing the problem reported in the Bugs forum with the guy with a large array.

I've just compiled the code with a less efficient form of the vector that stores items in a linked list and can call destructors during a pop to see if that works for him, if so I may have to rolly back all the vector/stack stuff. :whistle:

Share this post


Link to post
Share on other sites

This is just a guess, and I didn't try this and I've only looked at his code for a couple minutes, but what about something like this:

void pop_back()
{
    if (! empty())
    {
        --m.end;    // I'm not sure if this should go here or not...
        realloc((size()-1)*2 + 4);  
    }
}

Reducing the size() by 1 should copy all elements except the last one. I don't know if you even need the --m.end bit using realloc with a smaller size. Like I said, I didn't try this, but at the moment, I can't see why this wouldn't work.

Share this post


Link to post
Share on other sites

The items need destructing too.  I've rolled back the code now anyway - I want a stable version again before I do anything else.  Last two versions have been bugged to hell.

Oooh, I see. realloc also would need modified to delete anything left over after resizing. I should of looked at the code a couple minutes longer, I guess. :whistle: That shouldn't be too hard to fix if you later want to use the vector again.

Share this post


Link to post
Share on other sites

#16 ·  Posted (edited)

Arghh. I just used an hour answering this then doing a preview and I got 'page expired' - lost everything! This time I copy it to the clipboard and editor first! :whistle:

I feel a little responsible for this, but I warned about the simplicty of the vector class. variabletable should be OK, though - the vector used there only holds pointers, which knowingly has no constructor/destructor.

1) Do you know of other bug in variabletable, because the rollback?

2) Where in the code did pop_back() require destructor called (just curious)?

3) For me, 3.0.87 worked great. Is that bugged too?

Ok. Here's the correct way to do the vector class: Don't know if you all understand this code, but it does correct construct/destruct of elements.

#include <new.h>    // defines "placement" new
#include <string.h> // memcpy()

template <typename T>
class Vector
{
    struct Data { T *begin, *end, *final; };
public:
    typedef T value_type;
    typedef size_t size_type;
    typedef T* iterator;
    typedef const T* const_iterator;

    Vector()
    {
        m.begin = m.end = m.final = 0;
    }

    Vector(const Vector& other)
    {
        m.begin = m.end = allocate(other.size());
        m.final = m.begin + other.size();
        for (const T* i=other.m.begin; i!=other.m.end; construct(m.end++, *i++));
    }

    Vector& operator=(const Vector& other)
    {
        if (&other != this)
        {
            Vector<T> tmp(other);
            swap(tmp);
        }
        return *this;
    }

    ~Vector()
    {
        for (T* i=m.begin; i!=m.end; destruct(i++));
        deallocate(m.begin, size());
    }

    void clear()
    {
        Vector<T> tmp;
        swap(tmp);
    }

    void swap(Vector& other)
    {
        Data t = m;
        m = other.m;
        other.m = t;
    }

    bool empty() const { return m.begin == m.end; }
    int size() const { return m.end - m.begin; }
    int capacity() const { return m.final - m.begin; }

    iterator begin() { return m.begin; }
    iterator end() { return m.end; }
    const_iterator begin() const { return m.begin; }
    const_iterator end() const { return m.end; }

    T& back() { return *(m.end - 1); }
    const T& back() const { return *(m.end - 1); }

    T& operator[](size_type idx) { return m.begin[idx]; }
    const T& operator[](size_type idx) const { return m.begin[idx]; }

    void push_back(const T& x)
    {
        if (m.end == m.final)
            realloc(size()*2 + 4);
        construct(m.end++, x);
    }

    void pop_back()
    {
        if (! empty())
            destruct(--m.end);
        if (capacity() > size()*3)
            realloc(capacity()/2);
    }

private:
    static T* allocate(size_type n) { return (T *) operator new(n * sizeof(T)); }
    static void deallocate(T* p, size_type) { operator delete((void *)p); }
    static void construct(T* p, const T& x) { new ((void *)p) T(x); }
    static void destruct(T* p) { p->~T(); }

    void realloc(size_type cap)
    {
        Data tmp;
        tmp.begin = allocate(cap);
        tmp.end = tmp.begin + size();
        tmp.final = tmp.begin + cap;
        memcpy(tmp.begin, m.begin, size()*sizeof(T));
        deallocate(m.begin, size());
        m = tmp;
    }

    Data m;
};

/EDIT: minor optim. in pop_back(): realloc(size()) => realloc(capacity()/2)

Test

#include <stdio.h>
#include "vector_datatype.h"

class X {
public:
    X() { printf("construct\n"); }
    X(const X&) { printf("copy construct\n"); }
    X& operator=(const X&) { printf("assign\n"); return *this; }
    ~X() { printf("destruct\n"); }
};

int main()
{
    Vector<X> v;
    X x;
    v.push_back(x);
    v.push_back(x);
    v.push_back(x);
    v.push_back(x);
    printf("sz: %d  cap: %d\n", v.size(), v.capacity());
    v.push_back(x);
    v.push_back(x);
    v.push_back(x);
    printf("sz: %d  cap: %d\n", v.size(), v.capacity());
    v.pop_back();
    v.pop_back();
    printf("sz: %d  cap: %d\n", v.size(), v.capacity());
        v.pop_back();
    v.pop_back();
    printf("sz: %d  cap: %d\n", v.size(), v.capacity());
    return 0;
}

If you decide to try this out, you don't need to rename all the types back again, simply do a typedef Vector<Token> VectorToken; etc.

And again, make sure you don't give away references (or pointers) of vector items to the client code (except for short temporary usage). push() and pop() will invalidate those when doing realloc().

Edited by tylo

blub

Share this post


Link to post
Share on other sites

I feel a little responsible for this, but I warned about the simplicty of the vector class. variabletable should be OK, though - the vector used there only holds pointers, which knowingly has no constructor/destructor.

1) Do you know of other bug in variabletable, because the rollback?

2) Where in the code did pop_back() require destructor called (just curious)?

3) For me, 3.0.87 worked great. Is that bugged too?

Yeah you did warn :whistle:

I swapped the vector class for another one I knocked (less efficient) to test (src\unused\vector_datatype_jon.h) but the guy with the bug/big array said there was still a bug (although much reduced memory leak). I was past caring then and just rolled the lot back...

pop_back() was used in the stack as well and a StackVariant is used in the expression parser.

87/88/89 were bugged.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0