Is it possible to make an insert function for an array in O(log n)?

Mithrandir · January 20, 2011

I like Melba23's approach better than a long string. Remember with strings you'll run into string length limit and other problems.

(Edit: .. and other problems .. that's a simple statement to make that will always be valid)

Yes, I forgot to say that Melba23 method is a good ploy to use:

Mithrandir,

If I have no idea how many elements an array will contain, I limit the number of very expensive ReDim calls like this (taken from my RecFileListToArray UDF):
; Start by declaring the array with a suitable number of elements and a count in the [0] element
Local $asReturnList[100] = [0]

; Then add elements by using a function like this:
Func _RFLTA_AddToList(ByRef $asList, $sValue)

    ; Increase list count
    $asList[0] += 1
    ; Double list size if too small (fewer ReDim needed)
    If UBound($asList) <= $asList[0] Then ReDim $asList[UBound($asList) * 2]
    ; Add value
    $asList[$asList[0]] = $sValue

EndFunc   ;==>_RFLTA_AddToList

; And finally remove any unused elements with a last ReDim
ReDim $asReturnList[$asReturnList[0] + 1]
This way you get to a 6000 element array with only 7 ReDims (200, 400, 800, 1600, 3200, 6400, 6000). Of course, you can reduce this number further by increasing the expansion factore for each ReDim.

This is a fairly standard coding ploy in my experience.

M23

But as I said, if you know that each element you're going to store is no longer than 132 characters you can store the same amount than in an array of 16 million elements (the lenght limit for a string is around 2 billion characters) and honestly, you rarely reach these limits, so if you're not collecting large amounts of data and you want to optimize execution time it is a good alternative. I'm curious to know what are the other problems with strings? I mean if you use a well formed regular expresion (or just stringinstring) while 'surrounding' the elements with non printing characters (ascii code =< 30) there shouldn't be any confusion.

Obviously, this should be a good alternative if concatenating two strings is an O(1) (constant) operation. Is it?

George,

But if you have no idea of the final size of the array and want to add elements what else can you do than use ReDim? Declare a 16 million element array (AutoIt limit) initially and accept the memory overhead?

M23

I would also like to know the same.

MvGulik · January 20, 2011

Your post is empty Edited February 12, 2011 by iEvKI3gv9Wrkd41u

Mithrandir · January 20, 2011

127.99~ = 2147483647(max string) / 2^24(max array elements, 16777216)

Thanks for the correction

, I did my maths based on the info from the FAQ of Autoit Help File which says an array in autoit can store 16 million elements:

15. What are the current technical limits of AutoIt v3?
Here are details of the current technical limits of AutoIt. Please note that some of the limits are theoretical and you may run into performance or memory related problems before you reach the actual limit.

Maximum length of a single script line: 4,095
Maximum string length: 2,147,483,647 characters

Number range (floating point): 1.7E–308 to 1.7E+308 with 15-digit precision
Number range (integers): 64-bit signed integer
Hexadecimal numbers: 32-bit signed integer (0x80000000 to 0x7FFFFFFF)

Arrays: A maximum of 64 dimensions and/or a total of 16 million elements
Maximum depth of recursive function calls: 5100 levels

Maximum number of variables in use at one time: No limit
Maximum number of user defined functions: No limit

Maximum number of GUI windows: No limit
Maximum number of GUI controls: 65532

czardas · January 20, 2011

So is parsing strings faster than looping through an array?

I think this is a good question to which there may be no definitive answer. If you are searching just for a single character, then my guess is that there won't be much difference. However if both the array elements and the search pattern are strings of more than one character, then searching the array ought to be quicker. Maybe you can speed up the string search with regexp. It's an interesting thing to test. Edited January 20, 2011 by czardas

Mithrandir · January 20, 2011

I think this is a good question to which there may be no definitive answer. If you are searching just for a single character, then my guess is that there won't be much difference. However if both the array elements and the search pattern are strings of more than one character, then searching the array ought to be quicker. Maybe you can speed up the string search with regexp. It's an interesting thing to test.

I will do that test as soon as I can And what do you think of concatenating two strings? Is that an order 1 ( O(1) constant time) operation? If so it would be a good alternative to save time when using a stack or queue because if you use an array for those, as it has been pointed, each time you do an _ArrayAdd there's a call to the 'expensive' ReDim.

GEOSoft · January 20, 2011

A regEx would be faster than _ArraySearch() but it also depends on the possibility that you need to know what elements contain a given string. In that case you may be looking at _ArrayFindAll()

czardas · January 20, 2011

I guess different approaches should be used to suit the particular requirements. I sometimes use strings with delimiters instead of an array. For example if I wish to see if two arrays are identical, it is faster to just compair two strings, rather than testing each of the elements.

wraithdu · January 20, 2011

Random data for you. On my system (Win7 x64) it takes 125 ms and 140 MB of memory to create a maximum sized array, $a[2^24].

Valik · January 20, 2011

Thats because space is not pre-allocated. Fill that array and see what happens.

wraithdu · January 21, 2011

Heh, filling each element with "string" results in ~1.5 GB of memory

But my point was that creating an array that's big enough to accommodate the largest possible set of data you might use is not out of the question from a performance or memory standpoint. For example creating a 1,000,000 element array to hold the paths of all files on a drive... that's way more than you need, and even if you have 1,000,000 files, the resulting array only takes ~95 MB of memory.

Edited January 21, 2011 by wraithdu

Valik · January 21, 2011

This makes me wonder why ReDim is expensive to begin with.

Mithrandir · January 21, 2011

This makes me wonder why ReDim is expensive to begin with.

Maybe because it is like in a linked list that it has to go element by element throughout the pointers until the last one and there make a 'new(element)', 'element.following = NIL' ? I don't know if arrays in autoit behave internally as a linked list without a pointer to the last element or as a dynamic array

czardas · January 21, 2011

There are a few things that confuse me about how arrays work. I imagine the varied nature of arrays has some bearing on the speed of Redim. Perhaps one dimensional arrays should call a different function. I also don't understand why assigning one array equal to another should be any faster than declaring an empty array of the same size. To me that doesn't make any sense. There is some discussion of this here: The thread has some comical moments of revalation.

Edited January 21, 2011 by czardas

jvanegmond · January 21, 2011

Maybe because it is like in a linked list that it has to go element by element throughout the pointers until the last one and there make a 'new(element)', 'element.following = NIL' ? I don't know if arrays in autoit behave internally as a linked list without a pointer to the last element or as a dynamic array

I don't know either from the top of my head, but the relevant AutoIt source code has been made available here:

If you're into this sort of low level thing, and by the look of your original post I can see that you are, the source code is definitely worth checking out.

trancexx · January 21, 2011

About the price...

Shrinking the array causes every ditched variant of that array to be freed. This is a must.

I can (let me) assume that increasing the size calls initialization function for every new element. But this is done when that element is initially accessed. Therefore no initialization should be done for new elements if ReDim increases the size. Only bound should change.

Redim to increase the size should be, even much, faster than when done to decrease. If it's not than either my or somebody else's logic is wrong.

Edited January 21, 2011 by trancexx

jvanegmond · January 21, 2011

About the price...
Shrinking the array causes every ditched variant of that array to be freed. This is a must.
I can (let me) assume that increasing the size calls initialization function for every new element. But this is done when that element is initially accessed. Therefore no initialization should be done for new elements if ReDim increases the size. Only bound should change.
Redim to increase the size should be, even much, faster than when done to decrease. If it's not than either my or somebody else's logic is wrong.

Very good post. I had a look at the source code, what the AutoIt source is doing when it encounters a ReDim. Internally AutoIt uses a one dimensional array of pointers to vector types. Even when multidimensional arrays are used in AutoIt.

It creates a new array using the new size with all empty variant types in a similiar fashion to what Dim does. This means that an array is created and for each element in that array the constructor is called. The constructor is empty, so only variables are given default values.

Then it uses ArrayCopy to copy the old array into the new array. For each element in the array it gets a pointer to the element from the old array and puts it in the new array. Because of multi-dimensional arrays defined in AutoIt, and the one dimensional array that is used internally, after every copy element it needs to check if it now needs to copy elements from another dimension.

So, hardly a straight forward ReDim as we seem to think it works.

Disclaimer: All information in this post is taken from the AutoIt source code and falls under the AutoIt source code license. If you use this information for writing new code then it is considered a derivative work based on the AutoIt source code which is not allowed.

Edited January 21, 2011 by Manadar

Mithrandir · January 21, 2011

I don't know either from the top of my head, but the relevant AutoIt source code has been made available here:
If you're into this sort of low level thing, and by the look of your original post I can see that you are, the source code is definitely worth checking out.

Thanks for that link! but where's the actual autoit source code? I didn't find any file like 'autoit-v3.n.n-src.exe' so I downloaded 'autoit-docs-v3.3.6.1-src.exe' and uncompress it. Now I have the folder 'docs' with the folders 'autoit', 'autoitx' and 'build' but in none of them I found an .au3 file that is like to a source code.

jvanegmond · January 21, 2011

Thanks for that link! but where's the actual autoit source code? I didn't find any file like 'autoit-v3.n.n-src.exe' so I downloaded 'autoit-docs-v3.3.6.1-src.exe' and uncompress it. Now I have the folder 'docs' with the folders 'autoit', 'autoitx' and 'build' but in none of them I found an .au3 file that is like to a source code.

It's called autoit-v3.1.0-src.exe and it's right there.

Don't worry about it. Happened to me the first time as well.

GEOSoft · January 21, 2011

Remember that you will only have the older AutoIt source since the source is no longer open. There will be many functions and changes that are not in there but it will be good for a general reference for you.

Valik · January 21, 2011

Remember that you will only have the older AutoIt source since the source is no longer open. There will be many functions and changes that are not in there but it will be good for a general reference for you.

This. I honestly don't give a shit what you guys find in the posted source code. It was written a long time ago and many things - especially the Variant class - has seen many revisions since then. In all likelihood, though, arrays still just use arrays of Variants instead of a list. A list would provide faster growth since no data would need copied, only the bookkeeping is updated to know the bounds are larger. However random access to the array becomes slow. It's a trade-off. Random access is more important as once you populate an array you tend to access it more frequently than you continue to populate it.

Maybe because it is like in a linked list that it has to go element by element throughout the pointers until the last one and there make a 'new(element)', 'element.following = NIL' ? I don't know if arrays in autoit behave internally as a linked list without a pointer to the last element or as a dynamic array

I wrote the Linked List class AutoIt uses and it's not used by arrays. Any linked list class that does not store a "tail" reference for quick back insertions is fail. The same applies if the list supports front insertions.

Sign In

Is it possible to make an insert function for an array in O(log n)?

Recommended Posts

Mithrandir

Top Posters In This Topic

Top Posters In This Topic

MvGulik

Mithrandir

czardas

Mithrandir

GEOSoft

czardas

wraithdu

Valik

wraithdu

Valik

Mithrandir

czardas

jvanegmond

trancexx

jvanegmond

Mithrandir

jvanegmond

GEOSoft

Valik

Create an account or sign in to comment

Create an account

Sign in

Browse

AutoIt Resources

Release

Beta