Jump to content

Is it possible to make an insert function for an array in O(log n)?


Mithrandir
 Share

Recommended Posts

I have been thinking that since theres an _ArrayBinarySearch function in autoit that searches in O(log n), that it would be great to make an '_ArrayBinaryInsert' or '_ArrayOrderedInsert' that would insert elements in order. Basically the function would use _ArrayBinarySearch to compare in O(log n) the elements in a previously ordered array with the new value and then insert the new value. The problem is that when inserting it would have to push all the other values to the right of the array (if sorting ascending for instance) and that in the worst case is O(n) :/ But what if the 'array' would actually be a data structure which would consist in a 2D array or an array of arrays where the first element has the value and the second one has a pointer to the next array element? Here's a draw:

Posted Image

Now that I think that is similar to a linked list (actually 'arrays' in autoit behave much like a linked list because you can add new elements dinamically always, I mean their size is not determined in compile time) but the problem is that linked lists are not indexed thus you have to go element by element thus O(n) in the worst case again.

Another problem is that I don't understand how to create a pointer and make it to point to a datatype in autoit (well in autoit there aren't even datatypes :x ) . All I got is this instruction:

Ptr ( expression )

Converts an expression into a pointer variant.

But I don't know its employment.

Another thing I thought was using _ArrayConcatenate but I would need a function that splits the array at the position where I want to add the new value which would be an array of one element and concatenate them and then with the array that is 'at right' of the split position. Anyway, looking at _ArrayConcatenate function I saw this:

For $i = $iStart To $iUBoundSource - 1
        $avArrayTarget[$iUBoundTarget + $i] = $avArraySource[$i]
    Next

So O(n) again :P

Is there a way to do an insert function that inserts a value in order in O(log n) ?

PS: I know there's _ArraySort that sorts an array using the quicksort/insertionsort algorithms but reading wikipedia I saw that it is O(n log n)

Link to comment
Share on other sites

  • Replies 42
  • Created
  • Last Reply

Top Posters In This Topic

Top Posters In This Topic

If you are wanting to add just a few values to an already sorted list then I would

ReDim the array to accommodate the new values

Append the new values.

Use and Insertion Sort to put all the values back in order.

With a mostly already sorted list an Insertion Sort will be very close to O(n)

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to build bigger and better idiots. So far, the universe is winning."- Rick Cook

Link to comment
Share on other sites

Ok thanks for your answers! But I asked this because I have a script that I'm trying to optimize. Here is the idea of it:

While UBound($array) <> 0
        
      $element = $array[0]      
      _ArrayDelete($array,0)
      
      $array2 = _somefunction($element)
      
      
      If IsArray($array2) Then 
         For $i = 0 To Ubound($array2) - 1
             _ArraySearch($arrayofresults,$array2[$i]); search whether the element is already in $arrayofresults
             If @error = 6 Then ; if it isn't in $arrayofresults then add it to both arrays
                _ArrayAdd($arrayofresults,$array2[$i])
                _ArrayAdd($array,$array2[$i])
             EndIf
         Next
      EndIf
WEnd

Bowmore said to order the array so that I can use Binary Search so I would do that before the For loop:

While UBound($array) <> 0
        
      $element = $array[0]      
      _ArrayDelete($array,0)
      
      $array2 = _somefunction($element)
      
      
      If IsArray($array2) Then
         _ArraySort($arrayofresults) 
         For $i = 0 To Ubound($array2) - 1
             _ArrayBinarySearch($arrayofresults,$array2[$i]); search whether the element is already in $arrayofresults
             If @error = 3 Then ; if it isn't in $arrayofresults then add it to both arrays
                _ArrayAdd($arrayofresults,$array2[$i])
                _ArrayAdd($array,$array2[$i])
             EndIf
         Next
      EndIf
WEnd

Let's assume that all the arrays have n elements and I will analyze the execution time inside the while. -Also I'll take the execution time of the _somefunction($element) as O(1) . So in the first code I would have in the worst case:

For $i = 0 To Ubound($array2) - 1 --> O(n)

_ArraySearch($arrayofresults,$array2[$i])--> O(n)

_ArrayAdd($arrayofresults,$array2[$i])--> O(n)

_ArrayAdd($array,$array2[$i])--> O(n)

So adding all the instructions inside the For loop and multiplying by its execution time would be 3n2 ie O(n2)

And in the second code, assuming what Bowmore said that in an already ordered array the execution time order to order it would be O(n) then

_ArraySort($arrayofresults)--> O(n)

For $i = 0 To Ubound($array2) - 1 --> O(n)

_ArrayBinarySearch($arrayofresults,$array2[$i])--> O(log n)

_ArrayAdd($arrayofresults,$array2[$i])--> O(n)

_ArrayAdd($array,$array2[$i])--> O(n)

n + n(log n + n + n) = n + nlogn + 2n2 --> O(n2) So it wouldn't make any difference, or I'm reasoning something wrong?

Finally if it's like this, how can I store new values checking before that they are not already inserted in O(n) < n2 ?

It has to be possible, if not how can databases work? wouldn't their execution times rise dramatically when searching if a value is already present?

Thanks for your inputs!

EDIT: Maybe _ArrayAdd is O(1) (don't know if Redim has to go element by element or if its constant) but in that case it would be n2 against n + nlog(n) thus reducing the order considerably. Is it like this? :x

Edited by Mithrandir
Link to comment
Share on other sites

Not even close to what I said.

Oh I skimmed your message and I got only the 'don't use the Array library' part. So are you implying there's room for improvement in my code? If so I would appreciate to know what could be. Honestly, I don't see any error in the logic or a way of doing things shorter. As I posted, what I consider that is increasing the execution time of my code is the fact that the elements in the array are increasing and using a binary search is recommended in these cases to reduce execution times. But on the other hand, ordering those elements takes time also. But there's have to be an algorithm because if not databases wouldn't function efficiently would them?

Edited by Mithrandir
Link to comment
Share on other sites

Repeating what Valik said. Don't use the Array library: The functions that start with _Array. They are slow because they do extensive error checking and are written to work in all situations. _ArrayAdd resizes the array with ReDim: A very expensive operation. Especially expensive because you can know the size of your result array before you start. There are more cases like this. Once you get rid of the _Array functions you'll see room for improvement.

Making your code shorter is pretty far apart from making your code more optimized. See my signature.

Edited by Manadar
Link to comment
Share on other sites

Repeating what Valik said. Don't use the Array library: The functions that start with _Array. They are slow because they do extensive error checking and are written to work in all situations. _ArrayAdd resizes the array with ReDim: A very expensive operation. Especially expensive because you can know the size of your result array before you start. There are more cases like this. Once you get rid of the _Array functions you'll see room for improvement.

Making your code shorter is pretty far apart from making your code more optimized. See my signature.

Ok, I understand now. So getting rid of the _Array UDF I can make my code like this:

Global $array[3] = ['1','2','3']
$inputbox = InputBox("enter a number or 'end' if you want to end the script","")
While $inputbox <> 'end'
    $newbound = UBound($array) + 1
    ReDim $array[$newbound]
    $array[$newbound - 1] = $inputbox
    _ArrayDisplay($array)
    $inputbox = InputBox("enter a number or 'end' if you want to end the script","")
WEnd

But you say that ReDim is a very expensive operation but is there any other way to add one place in the array to input a new value?

Link to comment
Share on other sites

There is no other way. It is much less expensive to calculate the correct size of array you need before hand. Sometimes even if that involves doing the same thing twice.

But how can I know the correct size of the array I will need for all the execution of the program? Sometimes I will add 1, sometimes 11 and sometimes no value.

Link to comment
Share on other sites

  • Moderators

Mithrandir,

If I have no idea how many elements an array will contain, I limit the number of very expensive ReDim calls like this (taken from my RecFileListToArray UDF):

; Start by declaring the array with a suitable number of elements and a count in the [0] element
Local $asReturnList[100] = [0]

; Then add elements by using a function like this:
Func _RFLTA_AddToList(ByRef $asList, $sValue)

    ; Increase list count
    $asList[0] += 1
    ; Double list size if too small (fewer ReDim needed)
    If UBound($asList) <= $asList[0] Then ReDim $asList[UBound($asList) * 2]
    ; Add value
    $asList[$asList[0]] = $sValue

EndFunc   ;==>_RFLTA_AddToList

; And finally remove any unused elements with a last ReDim
ReDim $asReturnList[$asReturnList[0] + 1]

This way you get to a 6000 element array with only 7 ReDims (200, 400, 800, 1600, 3200, 6400, 6000). Of course, you can reduce this number further by increasing the expansion factore for each ReDim. :P

This is a fairly standard coding ploy in my experience. :x

M23

Public_Domain.png.2d871819fcb9957cf44f4514551a2935.png Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

Open spoiler to see my UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 

Link to comment
Share on other sites

You're still using Redim though and has already been pointed out, that is one of the slowest of all functions and can slow the script down considerably.

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"

Link to comment
Share on other sites

  • Moderators

George,

But if you have no idea of the final size of the array and want to add elements what else can you do than use ReDim? Declare a 16 million element array (AutoIt limit) initially and accept the memory overhead? :x

M23

Public_Domain.png.2d871819fcb9957cf44f4514551a2935.png Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

Open spoiler to see my UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 

Link to comment
Share on other sites

Your post is empty Edited by iEvKI3gv9Wrkd41u

"Straight_and_Crooked_Thinking" : A "classic guide to ferreting out untruths, half-truths, and other distortions of facts in political and social discussions."
"The Secrets of Quantum Physics" : New and excellent 2 part documentary on Quantum Physics by Jim Al-Khalili. (Dec 2014)

"Believing what you know ain't so" ...

Knock Knock ...
 

Link to comment
Share on other sites

Thanks to everyone for their inputs! The discussion that arised here is interesting.

For a single dimensional array, and generally known data content, and consecutive element processing/collection, and ... probably some other things ... a Stringlist could be a option. ("element1,element2,etc" + single final StringSplit(StringList,',')

Well ... maybe. :x

Yes! great idea! One could 'mimmick' an stack with a string and then doing StringRegExpReplace with a capturing group at the end, delete the last 'element' of the string. For a queue it would be the same but with a capturing group at the beginning or with a count of 1. The only possible disadvantage is that the maximum string length is of 2,147,483,647 characters while arrays can store a total of 16 million elements. Anyway, if each one of the elements you're storing in an array is no more than 132 characters long then you can use a string with separators instead of an array because when reaching the limit of strings it would also reach the max amount of array elements.

Now, I recall seeing that _ArrayUnique function stores the array values in a string and then it uses stringinstr to check whether the 'new value' to concatenate in the string isn't already present.

; #FUNCTION# ====================================================================================================================
; Name...........: _ArrayUnique
; Description ...: Returns the Unique Elements of a 1-dimensional array.
; Syntax.........: _ArrayUnique($aArray[, $iDimension = 1[, $iBase = 0[, $iCase = 0[, $vDelim = "|"]]]])
; Parameters ....: $aArray - The Array to use
;                  $iDimension  - [optional] The Dimension of the Array to use
;                  $iBase  - [optional] Is the Array 0-base or 1-base index.  0-base by default
;                  $iCase  - [optional] Flag to indicate if the operations should be case sensitive.
;                  0 = not case sensitive, using the user's locale (default)
;                  1 = case sensitive
;                  2 = not case sensitive, using a basic/faster comparison
;                  $vDelim  - [optional] One or more characters to use as delimiters.  However, cannot forsee its usefullness
; Return values .: Success - Returns a 1-dimensional array containing only the unique elements of that Dimension
;                  Failure - Returns 0 and Sets @Error:
;                  0 - No error.
;                  1 - Returns 0 if parameter is not an array.
;                  2 - _ArrayUnique failed for some other reason
;                  3 - Array dimension is invalid, should be an integer greater than 0
; Author ........: SmOke_N
; Modified.......: litlmike
; Remarks .......: Returns an array, the first element ($array[0]) contains the number of strings returned, the remaining elements ($array[1], $array[2], etc.) contain the unique strings.
; Related .......: _ArrayMax, _ArrayMin
; Link ..........:
; Example .......: Yes
; ===============================================================================================================================
Func _ArrayUnique($aArray, $iDimension = 1, $iBase = 0, $iCase = 0, $vDelim = "|")
    Local $iUboundDim
    ;$aArray used to be ByRef, but litlmike altered it to allow for the choosing of 1 Array Dimension, without altering the original array
    If $vDelim = "|" Then $vDelim = Chr(01) ; by SmOke_N, modified by litlmike
    If Not IsArray($aArray) Then Return SetError(1, 0, 0) ;Check to see if it is valid array

    ;Checks that the given Dimension is Valid
    If Not $iDimension > 0 Then
        Return SetError(3, 0, 0) ;Check to see if it is valid array dimension, Should be greater than 0
    Else
        ;If Dimension Exists, then get the number of "Rows"
        $iUboundDim = UBound($aArray, 1) ;Get Number of "Rows"
        If @error Then Return SetError(3, 0, 0) ;2 = Array dimension is invalid.

        ;If $iDimension Exists, And the number of "Rows" is Valid:
        If $iDimension > 1 Then ;Makes sure the Array dimension desired is more than 1-dimensional
            Local $aArrayTmp[1] ;Declare blank array, which will hold the dimension declared by user
            For $i = 0 To $iUboundDim - 1 ;Loop through "Rows"
                _ArrayAdd($aArrayTmp, $aArray[$i][$iDimension - 1]) ;$iDimension-1 to match Dimension
            Next
            _ArrayDelete($aArrayTmp, 0) ;Get rid of 1st-element which is blank
        Else ;Makes sure the Array dimension desired is 1-dimensional
            ;If Dimension Exists, And the number of "Rows" is Valid, and the Dimension desired is not > 1, then:
            ;For the Case that the array is 1-Dimensional
            If UBound($aArray, 0) = 1 Then ;Makes sure the Array is only 1-Dimensional
                Dim $aArrayTmp[1] ;Declare blank array, which will hold the dimension declared by user
                For $i = 0 To $iUboundDim - 1
                    _ArrayAdd($aArrayTmp, $aArray[$i])
                Next
                _ArrayDelete($aArrayTmp, 0) ;Get rid of 1st-element which is blank
            Else ;For the Case that the array is 2-Dimensional
                Dim $aArrayTmp[1] ;Declare blank array, which will hold the dimension declared by user
                For $i = 0 To $iUboundDim - 1
                    _ArrayAdd($aArrayTmp, $aArray[$i][$iDimension - 1]) ;$iDimension-1 to match Dimension
                Next
                _ArrayDelete($aArrayTmp, 0) ;Get rid of 1st-element which is blank
            EndIf
        EndIf
    EndIf

    Local $sHold ;String that holds the Unique array info
    For $iCC = $iBase To UBound($aArrayTmp) - 1 ;Loop Through array
        ;If Not the case that the element is already in $sHold, then add it
        If Not StringInStr($vDelim & $sHold, $vDelim & $aArrayTmp[$iCC] & $vDelim, $iCase) Then _
                $sHold &= $aArrayTmp[$iCC] & $vDelim
    Next
    If $sHold Then
        $aArrayTmp = StringSplit(StringTrimRight($sHold, StringLen($vDelim)), $vDelim, 1) ;Split the string into an array
        Return $aArrayTmp ;SmOke_N's version used to Return SetError(0, 0, 0)
    EndIf
    Return SetError(2, 0, 0) ;If the script gets this far, it has failed
EndFunc   ;==>_ArrayUnique

So is parsing strings faster than looping through an array? Anyone know how does StringInStr or StringRegExp function? I guess it goes char by char but is this faster than looping through an array? Because when comparing one value, an string for instance, against an array element there must be an implicit call to StringCompare or not? If so, both ways (parsing a string and looping through an array to check if the value is there) should take the same time.

But I am right to assume that concatenating two strings is O(1)?

Also if parsing strings is faster than looping through an array it would be great to create a '_StringBinarySearch' to search in a 'sorted string' (the strings you are storing with their separators following alphanumerical order) and, with string functions, maybe a '_StringOrderedInsert' to insert in O(log n)!

Edited by Mithrandir
Link to comment
Share on other sites

Your post is empty Edited by iEvKI3gv9Wrkd41u

"Straight_and_Crooked_Thinking" : A "classic guide to ferreting out untruths, half-truths, and other distortions of facts in political and social discussions."
"The Secrets of Quantum Physics" : New and excellent 2 part documentary on Quantum Physics by Jim Al-Khalili. (Dec 2014)

"Believing what you know ain't so" ...

Knock Knock ...
 

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...