MikeASD

_ArraySort() PivotSort algorithm bug?

8 posts in this topic

Unexpected sorting results (PivotSort algorithm) after ReDim array (increase array size and leave empty elements).

#include <Array.au3>

Local $aArray[4] = [3, 1, 0, 2]
ReDim $aArray[5]
$aArray2 = $aArray
_ArraySort($aArray, 0, 0, 0, 0, 0) ;    => [Empty, 0, 1, 2, 3]
_ArrayDisplay($aArray, 'QuickSort')
_ArraySort($aArray2, 0, 0, 0, 0, 1) ;   => [0, Empty, 1, 2, 3]
_ArrayDisplay($aArray2, 'PivotSort')

Local $aArray[100]
For $i = 0 To 99
    $aArray[$i] = $i
Next
ReDim $aArray[102]
$aArray2 = $aArray
_ArraySort($aArray, 0, 0, 0, 0, 0) ;    => [Empty, Empty, 0, 1, 2, ...]
_ArrayDisplay($aArray, 'QuickSort')
_ArraySort($aArray2, 0, 0, 0, 0, 1) ;   => [Empty, 0, Empty, 1, 2, ...]
_ArrayDisplay($aArray2, 'PivotSort')

 

Share this post


Link to post
Share on other sites



#2 ·  Posted (edited)

I would consider this a bug. A little investigation points to the most likely cause being the numeric conversion of an empty string.

Local $aArray[8] = ['','','a', 'b', 'c', 'd','','']
;Local $aArray[8] = ['','',0, 1, 2, 3,'','']
$aArray2 = $aArray
_ArraySort($aArray, 0, 0, 0, 0, 0) ;    => [Empty, Empty, Empty, Empty, 0, 1, 2, 3]
_ArrayDisplay($aArray, 'QuickSort')
_ArraySort($aArray2, 0, 0, 0, 0, 1) ;   => [Empty, Empty, 0, Empty, Empty, 1, 2, 3]
_ArrayDisplay($aArray2, 'PivotSort')

 

Edited by czardas

Share this post


Link to post
Share on other sites
9 hours ago, czardas said:

I would consider this a bug. A little investigation points to the most likely cause being the numeric conversion of an empty string.

Local $aArray[8] = ['','','a', 'b', 'c', 'd','','']
;Local $aArray[8] = ['','',0, 1, 2, 3,'','']
$aArray2 = $aArray
_ArraySort($aArray, 0, 0, 0, 0, 0) ;    => [Empty, Empty, Empty, Empty, 0, 1, 2, 3]
_ArrayDisplay($aArray, 'QuickSort')
_ArraySort($aArray2, 0, 0, 0, 0, 1) ;   => [Empty, Empty, 0, Empty, Empty, 1, 2, 3]
_ArrayDisplay($aArray2, 'PivotSort')

 

I think problem in the "0". It seems that the "0" considers as string.

Local $aArray1[] = ['A', 'B', 0, 1, 2, 3, 'X', 'Z']
Local $aArray2[] = ['A', 'B', 1, 2, 3, 4, 'X', 'Z']
_ArraySort($aArray1, 0, 0, 0, 0, 1) ;    => ['A', 'B', 0, 'X', 'Z', 1, 2, 3]
_ArrayDisplay($aArray1, 'PivotSort1')
_ArraySort($aArray2, 0, 0, 0, 0, 1) ;   => ['A', 'B', 'X', 'Z', 1, 2, 3, 4]
_ArrayDisplay($aArray2, 'PivotSort2')

 

Share this post


Link to post
Share on other sites

#4 ·  Posted (edited)

I think it's the other way around : empty string converted to a number. I'm not sure though: it's a complicated piece of code which I don't fully understand (200+ lines without comments). If only I had lots of time to study it, but I am overwhelmed with several of my own projects. I mentioned it to the last person working on the code, but he doesn't seem to be around right now. I'm sure someone will take a look at it shortly.

Edited by czardas
1 person likes this

Share this post


Link to post
Share on other sites

#5 ·  Posted (edited)

After looking more closely, I see that I'm right. Very little consideration is given to variable type. The whole function would have to be rewritten to fix this issue, and more detailed comparisons would slow the function down. It's another one of those inherited legacy issues, I suppose.

Edited by czardas

Share this post


Link to post
Share on other sites

The real issue with our UDF sort(s) is that the compare function is built-in. When the thing will be rewritten so that the user provides the compare function tailored for its own use case, the problems will definitely disappear, magically. C qsort() is the correct basic (so to speak) example.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

Hi,

Testing at the code makes it obvious that the problem is not the PivotSort algorithm as such - it is the different implementation of the InsertionSort algorithm used within the PivotSort function when the number of elements falls below a preset value. Both the 1DSort and PivotSort use minimum thresholds (1D : 15 & Pivot : 45) which were determined empirically during testing where the InsertionSort algorithm proved significantly faster for the smaller ranges.

If I replace the InsertionSort algorithm within the PivotSort function with the one from the 1DSort function, the results are identical - as would be expected as in both cases many of the recursive sorts are in fact carried out by the InsertionSort algorithm. This can be seen when running this modified example which colour codes the algorithm used for each call to the function:

#include <Array.au3>

Local $aArray[4] = [3, 1, 0, 2]
ReDim $aArray[5]
$aArray2 = $aArray
ConsoleWrite("Testing 1DSort" & @CRLF)
_ArraySort_Test($aArray, 0, 0, 0, 0, 0) ;    => [Empty, 0, 1, 2, 3]
_ArrayDisplay($aArray, 'QuickSort')
ConsoleWrite(@CRLF & "Testing PivotSort" & @CRLF)
_ArraySort_Test($aArray2, 0, 0, 0, 0, 1) ;   => [0, Empty, 1, 2, 3]
_ArrayDisplay($aArray2, 'PivotSort')

Local $aArray[100]
For $i = 0 To 99
    $aArray[$i] = $i
Next
ReDim $aArray[102]
$aArray2 = $aArray
ConsoleWrite(@CRLF & "Testing 1DSort" & @CRLF)
_ArraySort_Test($aArray, 0, 0, 0, 0, 0) ;    => [Empty, Empty, 0, 1, 2, ...]
_ArrayDisplay($aArray, 'QuickSort')
ConsoleWrite(@CRLF & "Testing PivotSort" & @CRLF)
_ArraySort_Test($aArray2, 0, 0, 0, 0, 1) ;   => [Empty, 0, Empty, 1, 2, ...]
_ArrayDisplay($aArray2, 'PivotSort')

; #FUNCTION# ====================================================================================================================
; Author ........: Jos
; Modified.......: LazyCoder - added $iSubItem option; Tylo - implemented stable QuickSort algo; Jos - changed logic to correctly Sort arrays with mixed Values and Strings; Melba23 - implemented stable pivot algo
; ===============================================================================================================================
Func _ArraySort_Test(ByRef $aArray, $iDescending = 0, $iStart = 0, $iEnd = 0, $iSubItem = 0, $iPivot = 0)

    If $iDescending = Default Then $iDescending = 0
    If $iStart = Default Then $iStart = 0
    If $iEnd = Default Then $iEnd = 0
    If $iSubItem = Default Then $iSubItem = 0
    If $iPivot = Default Then $iPivot = 0
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)

    Local $iUBound = UBound($aArray) - 1
    If $iUBound = -1 Then Return SetError(5, 0, 0)

    ; Bounds checking
    If $iEnd = Default Then $iEnd = 0
    If $iEnd < 1 Or $iEnd > $iUBound Or $iEnd = Default Then $iEnd = $iUBound
    If $iStart < 0 Or $iStart = Default Then $iStart = 0
    If $iStart > $iEnd Then Return SetError(2, 0, 0)

    If $iDescending = Default Then $iDescending = 0
    If $iPivot = Default Then $iPivot = 0
    If $iSubItem = Default Then $iSubItem = 0

    ; Sort
    Switch UBound($aArray, $UBOUND_DIMENSIONS)
        Case 1
            If $iPivot Then ; Switch algorithms as required
                __ArrayDualPivotSort_Test($aArray, $iStart, $iEnd)
            Else
                __ArrayQuickSort1D_Test($aArray, $iStart, $iEnd)
            EndIf
            If $iDescending Then _ArrayReverse($aArray, $iStart, $iEnd)
        Case 2
            If $iPivot Then Return SetError(6, 0, 0) ; Error if 2D array and $iPivot
            Local $iSubMax = UBound($aArray, $UBOUND_COLUMNS) - 1
            If $iSubItem > $iSubMax Then Return SetError(3, 0, 0)

            If $iDescending Then
                $iDescending = -1
            Else
                $iDescending = 1
            EndIf

            __ArrayQuickSort2D($aArray, $iDescending, $iStart, $iEnd, $iSubItem, $iSubMax)
        Case Else
            Return SetError(4, 0, 0)
    EndSwitch

    Return 1
EndFunc   ;==>_ArraySort_Test

Func __ArrayQuickSort1D_Test(ByRef $aArray, Const ByRef $iStart, Const ByRef $iEnd)
    If $iEnd <= $iStart Then Return

    Local $vTmp

    ; InsertionSort (faster for smaller segments)
    If ($iEnd - $iStart) < 15 Then

        ConsoleWrite("- InsertionSort not 1D > " & $iEnd - $iStart & @CRLF)

        Local $vCur
        For $i = $iStart + 1 To $iEnd
            $vTmp = $aArray[$i]

            If IsNumber($vTmp) Then
                For $j = $i - 1 To $iStart Step -1
                    $vCur = $aArray[$j]
                    ; If $vTmp >= $vCur Then ExitLoop
                    If ($vTmp >= $vCur And IsNumber($vCur)) Or (Not IsNumber($vCur) And StringCompare($vTmp, $vCur) >= 0) Then ExitLoop
                    $aArray[$j + 1] = $vCur
                Next
            Else
                For $j = $i - 1 To $iStart Step -1
                    If (StringCompare($vTmp, $aArray[$j]) >= 0) Then ExitLoop
                    $aArray[$j + 1] = $aArray[$j]
                Next
            EndIf

            $aArray[$j + 1] = $vTmp
        Next
        Return
    EndIf

    ConsoleWrite("> 1DSort" & @CRLF)

    ; QuickSort
    Local $L = $iStart, $R = $iEnd, $vPivot = $aArray[Int(($iStart + $iEnd) / 2)], $bNum = IsNumber($vPivot)
    Do
        If $bNum Then
            ; While $aArray[$L] < $vPivot
            While ($aArray[$L] < $vPivot And IsNumber($aArray[$L])) Or (Not IsNumber($aArray[$L]) And StringCompare($aArray[$L], $vPivot) < 0)
                $L += 1
            WEnd
            ; While $aArray[$R] > $vPivot
            While ($aArray[$R] > $vPivot And IsNumber($aArray[$R])) Or (Not IsNumber($aArray[$R]) And StringCompare($aArray[$R], $vPivot) > 0)
                $R -= 1
            WEnd
        Else
            While (StringCompare($aArray[$L], $vPivot) < 0)
                $L += 1
            WEnd
            While (StringCompare($aArray[$R], $vPivot) > 0)
                $R -= 1
            WEnd
        EndIf

        ; Swap
        If $L <= $R Then
            $vTmp = $aArray[$L]
            $aArray[$L] = $aArray[$R]
            $aArray[$R] = $vTmp
            $L += 1
            $R -= 1
        EndIf
    Until $L > $R

    __ArrayQuickSort1D_Test($aArray, $iStart, $R)
    __ArrayQuickSort1D_Test($aArray, $L, $iEnd)
EndFunc   ;==>__ArrayQuickSort1D_Test

Func __ArrayDualPivotSort_Test(ByRef $aArray, $iPivot_Left, $iPivot_Right, $bLeftMost = True)
    If $iPivot_Left > $iPivot_Right Then Return
    Local $iLength = $iPivot_Right - $iPivot_Left + 1
    Local $i, $j, $k, $iAi, $iAk, $iA1, $iA2, $iLast
    If $iLength < 45 Then ; Use insertion sort for small arrays - value chosen empirically

        ConsoleWrite("! InsertionSort not Pivot > " & $iPivot_Right - $iPivot_Left + 1 & @CRLF)

        ; InsertionSort algorithm taken from 1DSort function

        Local $vCur
        For $i = $iPivot_Left + 1 To $iPivot_Right
            $vTmp = $aArray[$i]

            If IsNumber($vTmp) Then
                For $j = $i - 1 To $iPivot_Left Step -1
                    $vCur = $aArray[$j]
                    ; If $vTmp >= $vCur Then ExitLoop
                    If ($vTmp >= $vCur And IsNumber($vCur)) Or (Not IsNumber($vCur) And StringCompare($vTmp, $vCur) >= 0) Then ExitLoop
                    $aArray[$j + 1] = $vCur
                Next
            Else
                For $j = $i - 1 To $iPivot_Left Step -1
                    If (StringCompare($vTmp, $aArray[$j]) >= 0) Then ExitLoop
                    $aArray[$j + 1] = $aArray[$j]
                Next
            EndIf

            $aArray[$j + 1] = $vTmp
        Next
        Return

        #cs ; Old InsertionSort algorithm

            If $bLeftMost Then
            $i = $iPivot_Left
            While $i < $iPivot_Right
            $j = $i
            $iAi = $aArray[$i + 1]
            While $iAi < $aArray[$j]
            $aArray[$j + 1] = $aArray[$j]
            $j -= 1
            If $j + 1 = $iPivot_Left Then ExitLoop
            WEnd
            $aArray[$j + 1] = $iAi
            $i += 1
            WEnd
            Else
            While 1
            If $iPivot_Left >= $iPivot_Right Then Return 1
            $iPivot_Left += 1
            If $aArray[$iPivot_Left] < $aArray[$iPivot_Left - 1] Then ExitLoop
            WEnd
            While 1
            $k = $iPivot_Left
            $iPivot_Left += 1
            If $iPivot_Left > $iPivot_Right Then ExitLoop
            $iA1 = $aArray[$k]
            $iA2 = $aArray[$iPivot_Left]
            If $iA1 < $iA2 Then
            $iA2 = $iA1
            $iA1 = $aArray[$iPivot_Left]
            EndIf
            $k -= 1
            While $iA1 < $aArray[$k]
            $aArray[$k + 2] = $aArray[$k]
            $k -= 1
            WEnd
            $aArray[$k + 2] = $iA1
            While $iA2 < $aArray[$k]
            $aArray[$k + 1] = $aArray[$k]
            $k -= 1
            WEnd
            $aArray[$k + 1] = $iA2
            $iPivot_Left += 1
            WEnd
            $iLast = $aArray[$iPivot_Right]
            $iPivot_Right -= 1
            While $iLast < $aArray[$iPivot_Right]
            $aArray[$iPivot_Right + 1] = $aArray[$iPivot_Right]
            $iPivot_Right -= 1
            WEnd
            $aArray[$iPivot_Right + 1] = $iLast
            EndIf
            Return 1

        #ce

    EndIf

    ConsoleWrite("+ PivotSort" & @CRLF)

    Local $iSeventh = BitShift($iLength, 3) + BitShift($iLength, 6) + 1
    Local $iE1, $iE2, $iE3, $iE4, $iE5, $t
    $iE3 = Ceiling(($iPivot_Left + $iPivot_Right) / 2)
    $iE2 = $iE3 - $iSeventh
    $iE1 = $iE2 - $iSeventh
    $iE4 = $iE3 + $iSeventh
    $iE5 = $iE4 + $iSeventh
    If $aArray[$iE2] < $aArray[$iE1] Then
        $t = $aArray[$iE2]
        $aArray[$iE2] = $aArray[$iE1]
        $aArray[$iE1] = $t
    EndIf
    If $aArray[$iE3] < $aArray[$iE2] Then
        $t = $aArray[$iE3]
        $aArray[$iE3] = $aArray[$iE2]
        $aArray[$iE2] = $t
        If $t < $aArray[$iE1] Then
            $aArray[$iE2] = $aArray[$iE1]
            $aArray[$iE1] = $t
        EndIf
    EndIf
    If $aArray[$iE4] < $aArray[$iE3] Then
        $t = $aArray[$iE4]
        $aArray[$iE4] = $aArray[$iE3]
        $aArray[$iE3] = $t
        If $t < $aArray[$iE2] Then
            $aArray[$iE3] = $aArray[$iE2]
            $aArray[$iE2] = $t
            If $t < $aArray[$iE1] Then
                $aArray[$iE2] = $aArray[$iE1]
                $aArray[$iE1] = $t
            EndIf
        EndIf
    EndIf
    If $aArray[$iE5] < $aArray[$iE4] Then
        $t = $aArray[$iE5]
        $aArray[$iE5] = $aArray[$iE4]
        $aArray[$iE4] = $t
        If $t < $aArray[$iE3] Then
            $aArray[$iE4] = $aArray[$iE3]
            $aArray[$iE3] = $t
            If $t < $aArray[$iE2] Then
                $aArray[$iE3] = $aArray[$iE2]
                $aArray[$iE2] = $t
                If $t < $aArray[$iE1] Then
                    $aArray[$iE2] = $aArray[$iE1]
                    $aArray[$iE1] = $t
                EndIf
            EndIf
        EndIf
    EndIf
    Local $iLess = $iPivot_Left
    Local $iGreater = $iPivot_Right
    If (($aArray[$iE1] <> $aArray[$iE2]) And ($aArray[$iE2] <> $aArray[$iE3]) And ($aArray[$iE3] <> $aArray[$iE4]) And ($aArray[$iE4] <> $aArray[$iE5])) Then
        Local $iPivot_1 = $aArray[$iE2]
        Local $iPivot_2 = $aArray[$iE4]
        $aArray[$iE2] = $aArray[$iPivot_Left]
        $aArray[$iE4] = $aArray[$iPivot_Right]
        Do
            $iLess += 1
        Until $aArray[$iLess] >= $iPivot_1
        Do
            $iGreater -= 1
        Until $aArray[$iGreater] <= $iPivot_2
        $k = $iLess
        While $k <= $iGreater
            $iAk = $aArray[$k]
            If $iAk < $iPivot_1 Then
                $aArray[$k] = $aArray[$iLess]
                $aArray[$iLess] = $iAk
                $iLess += 1
            ElseIf $iAk > $iPivot_2 Then
                While $aArray[$iGreater] > $iPivot_2
                    $iGreater -= 1
                    If $iGreater + 1 = $k Then ExitLoop 2
                WEnd
                If $aArray[$iGreater] < $iPivot_1 Then
                    $aArray[$k] = $aArray[$iLess]
                    $aArray[$iLess] = $aArray[$iGreater]
                    $iLess += 1
                Else
                    $aArray[$k] = $aArray[$iGreater]
                EndIf
                $aArray[$iGreater] = $iAk
                $iGreater -= 1
            EndIf
            $k += 1
        WEnd
        $aArray[$iPivot_Left] = $aArray[$iLess - 1]
        $aArray[$iLess - 1] = $iPivot_1
        $aArray[$iPivot_Right] = $aArray[$iGreater + 1]
        $aArray[$iGreater + 1] = $iPivot_2
        __ArrayDualPivotSort_Test($aArray, $iPivot_Left, $iLess - 2, True)
        __ArrayDualPivotSort_Test($aArray, $iGreater + 2, $iPivot_Right, False)
        If ($iLess < $iE1) And ($iE5 < $iGreater) Then
            While $aArray[$iLess] = $iPivot_1
                $iLess += 1
            WEnd
            While $aArray[$iGreater] = $iPivot_2
                $iGreater -= 1
            WEnd
            $k = $iLess
            While $k <= $iGreater
                $iAk = $aArray[$k]
                If $iAk = $iPivot_1 Then
                    $aArray[$k] = $aArray[$iLess]
                    $aArray[$iLess] = $iAk
                    $iLess += 1
                ElseIf $iAk = $iPivot_2 Then
                    While $aArray[$iGreater] = $iPivot_2
                        $iGreater -= 1
                        If $iGreater + 1 = $k Then ExitLoop 2
                    WEnd
                    If $aArray[$iGreater] = $iPivot_1 Then
                        $aArray[$k] = $aArray[$iLess]
                        $aArray[$iLess] = $iPivot_1
                        $iLess += 1
                    Else
                        $aArray[$k] = $aArray[$iGreater]
                    EndIf
                    $aArray[$iGreater] = $iAk
                    $iGreater -= 1
                EndIf
                $k += 1
            WEnd
        EndIf
        __ArrayDualPivotSort_Test($aArray, $iLess, $iGreater, False)
    Else
        Local $iPivot = $aArray[$iE3]
        $k = $iLess
        While $k <= $iGreater
            If $aArray[$k] = $iPivot Then
                $k += 1
                ContinueLoop
            EndIf
            $iAk = $aArray[$k]
            If $iAk < $iPivot Then
                $aArray[$k] = $aArray[$iLess]
                $aArray[$iLess] = $iAk
                $iLess += 1
            Else
                While $aArray[$iGreater] > $iPivot
                    $iGreater -= 1
                WEnd
                If $aArray[$iGreater] < $iPivot Then
                    $aArray[$k] = $aArray[$iLess]
                    $aArray[$iLess] = $aArray[$iGreater]
                    $iLess += 1
                Else
                    $aArray[$k] = $iPivot
                EndIf
                $aArray[$iGreater] = $iAk
                $iGreater -= 1
            EndIf
            $k += 1
        WEnd
        __ArrayDualPivotSort_Test($aArray, $iPivot_Left, $iLess - 1, True)
        __ArrayDualPivotSort_Test($aArray, $iGreater + 1, $iPivot_Right, False)
    EndIf
EndFunc   ;==>__ArrayDualPivotSort_Test

I find the resulting display very interesting: the 1DSort takes 15 passes to sort the large array, 8 of which use InsertionSort; the PivotSort takes only 4, but 3 of these use InsertionSort. Anyway, I will do some more testing and if I cannot find a problem with using the other InsertionSort algorithm in the PivotSort code, I will make the necessary changes to the function.

M23


Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind._______My UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 

Share this post


Link to post
Share on other sites

That's definitely an improvement. :) I looked at the code and thought 'my god, I'll leave that one alone for the time being'. That was before I fully understood the single pivot algorithm. There is still an issue with the numeric variant -1.#IND which I mentioned in the MVP forum. Perhaps it could be seeded out because there is no logical sort sequence associated with numbers that are neither positive nor negative [-1^.5].

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now