Jump to content

Optimized _ArraySort


Recommended Posts

Have you tried putting the array into an SQLite DB and then getting the sorted results out of it?

I know by default SQLite is limited to 2000 columns and I don't know if there's a command that can expand that during runtime, but then I can't think of any possible scenarios where having that many columns in an array would ever be encountered in the real world.

If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator

Link to comment
Share on other sites

No I haven't tried using an SQL DB, but it is unsuitable for the reason you mentioned. I expect the typical speed increase to be about double. It's generally faster once you have more than one column. At four or five columns it's about twice as fast. An earlier analysis with 10000 rows gave these results (_ArraySortNew is the same function without implementing Insertion Sort):

Cols = 2
1211.01825539496 ==> _ArraySort
1056.49102956991 ==> _ArraySort_NEW
816.237785383889 ==> _ArraySort_NEW2

Cols = 3
1416.67267340732 ==> _ArraySort
1073.97173701106 ==> _ArraySort_NEW
842.834904915116 ==> _ArraySort_NEW2

Cols = 4
1591.5667545471 ==> _ArraySort
1097.97794907303 ==> _ArraySort_NEW
844.329309181139 ==> _ArraySort_NEW2

Cols = 5
1823.40146069371 ==> _ArraySort
1079.70835635917 ==> _ArraySort_NEW
846.583111083929 ==> _ArraySort_NEW2

Cols = 6
2042.94018669678 ==> _ArraySort
1131.62989449251 ==> _ArraySort_NEW
896.919961818972 ==> _ArraySort_NEW2

Cols = 7
2238.30523998931 ==> _ArraySort
1218.27658237571 ==> _ArraySort_NEW
939.640265403285 ==> _ArraySort_NEW2

Cols = 8
2452.27553538261 ==> _ArraySort
1178.72783785732 ==> _ArraySort_NEW
954.884281051391 ==> _ArraySort_NEW2

Cols = 9
2720.15541804367 ==> _ArraySort
1219.95118886139 ==> _ArraySort_NEW
973.14804904704 ==> _ArraySort_NEW2

Cols = 10
2907.80199161677 ==> _ArraySort
1228.81094930529 ==> _ArraySort_NEW
1011.72042517531 ==> _ArraySort_NEW2

Cols = 11
3062.61280840973 ==> _ArraySort
1226.45485078892 ==> _ArraySort_NEW
993.404962805534 ==> _ArraySort_NEW2

Cols = 12
3274.85134226991 ==> _ArraySort
1243.37238078804 ==> _ArraySort_NEW
1020.76985300595 ==> _ArraySort_NEW2

Cols = 13
3568.7637108406 ==> _ArraySort
1384.39026485722 ==> _ArraySort_NEW
1144.89569021819 ==> _ArraySort_NEW2

Cols = 14
3746.39104100092 ==> _ArraySort
1310.74143566198 ==> _ArraySort_NEW
1106.9250802901 ==> _ArraySort_NEW2

Cols = 15
3953.02728807673 ==> _ArraySort
1355.33802660003 ==> _ArraySort_NEW
1116.65709228611 ==> _ArraySort_NEW2

Cols = 16
4185.71822780036 ==> _ArraySort
1367.293988818 ==> _ArraySort_NEW
1138.64758780217 ==> _ArraySort_NEW2

Cols = 17
4370.83323322099 ==> _ArraySort
1385.3003770777 ==> _ArraySort_NEW
1193.27725585876 ==> _ArraySort_NEW2

Cols = 18
4592.14594996421 ==> _ArraySort
1439.93150131384 ==> _ArraySort_NEW
1173.02507468381 ==> _ArraySort_NEW2

Cols = 19
4788.44878288873 ==> _ArraySort
1447.63105069908 ==> _ArraySort_NEW
1210.75759925502 ==> _ArraySort_NEW2

Cols = 20
5029.25027867636 ==> _ArraySort
1476.89589117096 ==> _ArraySort_NEW
1205.97004493042 ==> _ArraySort_NEW2

Cols = 21
5257.49113368675 ==> _ArraySort
1480.86507258691 ==> _ArraySort_NEW
1222.19661773175 ==> _ArraySort_NEW2

Cols = 22
5404.44841010676 ==> _ArraySort
1433.0612461839 ==> _ArraySort_NEW
1216.22264111654 ==> _ArraySort_NEW2

Cols = 23
5687.98185891513 ==> _ArraySort
1501.66259300437 ==> _ArraySort_NEW
1289.25004568763 ==> _ArraySort_NEW2

Cols = 24
5913.07736609155 ==> _ArraySort
1566.00607081256 ==> _ArraySort_NEW
1298.16768926876 ==> _ArraySort_NEW2

Cols = 25
6170.79711996808 ==> _ArraySort
1603.86455491508 ==> _ArraySort_NEW
1337.81545399674 ==> _ArraySort_NEW2

Cols = 26
6315.03024848976 ==> _ArraySort
1557.91044058897 ==> _ArraySort_NEW
1327.23303314192 ==> _ArraySort_NEW2

Cols = 27
6505.80578787687 ==> _ArraySort
1628.49437587052 ==> _ArraySort_NEW
1410.59166759498 ==> _ArraySort_NEW2

Cols = 28
6691.84510326861 ==> _ArraySort
1706.75419761958 ==> _ArraySort_NEW
1512.45615989434 ==> _ArraySort_NEW2

Cols = 29
7488.97708483047 ==> _ArraySort
1752.66025802045 ==> _ArraySort_NEW
1565.9121472314 ==> _ArraySort_NEW2

Cols = 30
7661.51761576809 ==> _ArraySort
1928.90822209942 ==> _ArraySort_NEW
1611.85861661486 ==> _ArraySort_NEW2

Cols = 31
7380.50153736156 ==> _ArraySort
1722.13254583143 ==> _ArraySort_NEW
1482.58300041428 ==> _ArraySort_NEW2

Cols = 32
7540.01362984061 ==> _ArraySort
1737.06748736946 ==> _ArraySort_NEW
1480.24510414232 ==> _ArraySort_NEW2

Cols = 33
7783.85999707308 ==> _ArraySort
1791.11868809872 ==> _ArraySort_NEW
1545.96831207675 ==> _ArraySort_NEW2

Cols = 34
8025.98625220884 ==> _ArraySort
1852.95680898638 ==> _ArraySort_NEW
1605.0600783279 ==> _ArraySort_NEW2

Cols = 35
8307.68709905006 ==> _ArraySort
1817.34156948488 ==> _ArraySort_NEW
1571.66041601594 ==> _ArraySort_NEW2

Cols = 36
8416.79717675908 ==> _ArraySort
1800.11278110636 ==> _ArraySort_NEW
1562.99723981166 ==> _ArraySort_NEW2

Cols = 37
8739.65366225517 ==> _ArraySort
1871.06148936588 ==> _ArraySort_NEW
1587.56444504633 ==> _ArraySort_NEW2

Cols = 38
8901.56044200874 ==> _ArraySort
1810.94348057493 ==> _ArraySort_NEW
1599.13415563793 ==> _ArraySort_NEW2

Cols = 39
9240.90342835633 ==> _ArraySort
1925.93907199133 ==> _ArraySort_NEW
1682.22157664929 ==> _ArraySort_NEW2

Cols = 40
9187.52534662534 ==> _ArraySort
1921.23488394613 ==> _ArraySort_NEW
1673.64249481418 ==> _ArraySort_NEW2

Cols = 41
9453.01418246075 ==> _ArraySort
1956.41472576135 ==> _ArraySort_NEW
1715.08864128983 ==> _ArraySort_NEW2

Cols = 42
9779.40991235983 ==> _ArraySort
1925.34094624004 ==> _ArraySort_NEW
1703.79342054393 ==> _ArraySort_NEW2

Cols = 43
9986.25584929124 ==> _ArraySort
1952.56422297895 ==> _ArraySort_NEW
1710.04479936394 ==> _ArraySort_NEW2

Cols = 44
10203.9659778209 ==> _ArraySort
1972.83897493697 ==> _ArraySort_NEW
1742.94171568531 ==> _ArraySort_NEW2

Cols = 45
10392.5139265372 ==> _ArraySort
1999.22167202905 ==> _ArraySort_NEW
1774.49202996526 ==> _ArraySort_NEW2

Cols = 46
10696.7378665659 ==> _ArraySort
2068.8008434192 ==> _ArraySort_NEW
1824.35234594166 ==> _ArraySort_NEW2

Cols = 47
10775.3464433178 ==> _ArraySort
2131.27058218787 ==> _ArraySort_NEW
1851.46495303457 ==> _ArraySort_NEW2

Cols = 48
11026.6509981747 ==> _ArraySort
2078.95805984461 ==> _ArraySort_NEW
1848.55368606371 ==> _ArraySort_NEW2

Cols = 49
11255.9290170715 ==> _ArraySort
2120.90877253529 ==> _ArraySort_NEW
1868.44546279934 ==> _ArraySort_NEW2

Cols = 50
11509.0796435564 ==> _ArraySort
2122.03549146424 ==> _ArraySort_NEW
1893.78589937654 ==> _ArraySort_NEW2

 

Edited by czardas
Link to comment
Share on other sites

Have you ever run into an array with more than 2000 columns other than for stress testing something?

If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator

Link to comment
Share on other sites

For office type data out in the wild, the most I've encountered is about 50 columns in a csv. Programmatically rows and columns are the same thing (sometimes transposed for easier human comprehension). Typical bookkeeping usage will involve up to about 20 columns, so I say we look at your questions from this perspective.

Edited by czardas
Link to comment
Share on other sites

So, the TL:DR answer would be, No you've never seen an array that required more than 2000 columns in the real world?:P

 

If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator

Link to comment
Share on other sites

@czardas
Whilst the standard _arraySort() in autoIt is a good choice for generic sorting, Like you have done I often find myself writing specific sorting functions for different scripts. When you as sorting the same type of data in a script and you understand the data e.g. is it mostly in order to start with, all the same data type, size of data etc You can then choose the most appropriate algorithm and remove any unnecessary error and data type checking within the main sorting loops. It's surprising how big an improvement can be  made to the sort speeds especially with some the the large file I deal with at work. I was interested to see you use of the double pivot method.

If you you were to submit this as a replacement for the standard arraySort, then I think that adding an extra parameter where the default behaviour was to replicate the stable sort of the current function. Users could the set the parameter to the go faster mode when speed was more important than having a stable sort.

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to build bigger and better idiots. So far, the universe is winning."- Rick Cook

Link to comment
Share on other sites

15 hours ago, Bowmore said:

If you you were to submit this as a replacement for the standard arraySort, then I think that adding an extra parameter where the default behaviour was to replicate the stable sort of the current function. Users could the set the parameter to the go faster mode when speed was more important than having a stable sort.

I was thinking it would make sense to do it this way: it's the easiest way to include it by far. The down side is that it means adding more bloat and a new parameter to an already quite long list. The up side is that it becomes the user's responsibility to select the most suitable method, so checking for available RAM won't be needed for extreme cases.

@BrewManNH, please stop going on about 2000 columns. It runs @ ~ 400% efficiency with just 20 columns. :P

In any event, the code is available to all now, and perhaps it can be improved further. After writing my own sort function, I realized that _ArraySort() could potentially benefit from the (swap sequence) method I employed in _ArraySortXD(): which I need to sort arrays with more than two dimensions anyway. The function here can just as well be an example script without being included in the UDF. It only effects me to a small degree either way (both outcomes being positive).

Edited by czardas
Link to comment
Share on other sites

  • Moderators

czardas,

Adding a parameter so that the user could choose is exactly what I had envisaged. We already do that for PivotSort on 1D arrays, so it seems logical to do the same sort of thing for 2D - in fact we could use the same parameter! And I do not see this as massive bloat - I make it 72 lines in total for the 2 functions, plus a few more to deal with the parameter.

What we need now is some details of when the user would be best advised to use your "new" algorithm - the 50+ element suggestion for PivotSort was made empirically after much testing, so I could I ask you to provide something similar along the lines of what you have been suggesting earlier in this thread.

M23

Public_Domain.png.2d871819fcb9957cf44f4514551a2935.png Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

Open spoiler to see my UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 

Link to comment
Share on other sites

On ‎05‎/‎05‎/‎2016 at 1:13 PM, czardas said:

@jpm Oops, I forgot to change the huge penalty value back to 0. It was 64 in the last test I ran. I have changed it back to 0 and ran the test again. It only completed the first test before silently failing. I don't even know if it's the same error as in the abrupt termination above.

>Running:(3.3.14.2):C:\Program Files\AutoIt3\autoit3.exe "C:\Users\czardas\Documents\SortLargeTests.au3"    
--> Press Ctrl+Alt+Break to Restart or Ctrl+Break to Stop
Test # 1 : Rows = 512 : Columns = 32768 : Penalty = 0
188742.3367892 _ArraySort($aArray2D, 0, 0, 0, 0)
73859.7321439298 _ArraySort_NEW2($aArray2D, 0, 0, 0, 1)

->12:07:37 AutoIt3.exe ended.rc:1
+>12:07:37 AutoIt3Wrapper Finished.

Edit: If you can reproduce the same error, it may be connected to ReDim. I just intended to delete the data and reset the bounds. The alternative method I chose works fine.

Edit 2: The tests are also running fine for me (with the above code) when I halve the size of the arrays, so memory is part of the issue that I'm seeing. I thought there used to be a message - 'out of memory' or something like that. Why it would work with larger arrays when not using ReDim[0][0], but fail when including that line of code, is beyond me.

I'm running Windows 7 Pro SP1.

I don't understand whatt's happening to you I get the following results

+>19:33:53 AU3Check ended.rc:0
>Running:(3.3.14.2):C:\Program Files (x86)\AutoIt3\autoit3_x64.exe "F:\AdmMesnage\_Data\Bureau\_ArraySort_New.au3"    
--> Press Ctrl+Alt+Break to Restart or Ctrl+Break to Stop
Init : WorkingSet = 13672 Kb

ReDim00 : WorkingSet = 13692 Kb
ReDim : WorkingSet = 144768 Kb
Test # 1 : Rows = 512 : Columns = 32768 : Penalty = 0 : WorkingSet = 2257164 Kb
96.78 Sec : _ArraySort($aArray2D, 0, 0, 0, 0) : WorkingSet = 2257164 Kb
22.295 Sec : _ArraySort_NEW2($aArray2D, 0, 0, 0, 1) : WorkingSet = 2261544 Kb
Ratio = 4.3
Reset : WorkingSet = 21184 Kb

ReDim00 : WorkingSet = 21184 Kb
ReDim : WorkingSet = 152260 Kb
Test # 2 : Rows = 1024 : Columns = 16384 : Penalty = 0 : WorkingSet = 2261628 Kb
108.123 Sec : _ArraySort($aArray2D, 0, 0, 0, 0) : WorkingSet = 2261700 Kb
22.276 Sec : _ArraySort_NEW2($aArray2D, 0, 0, 0, 1) : WorkingSet = 2263992 Kb
Ratio = 4.9
Reset : WorkingSet = 24156 Kb

ReDim00 : WorkingSet = 24156 Kb
ReDim : WorkingSet = 155232 Kb
Test # 3 : Rows = 2048 : Columns = 8192 : Penalty = 0 : WorkingSet = 2264152 Kb
118.694 Sec : _ArraySort($aArray2D, 0, 0, 0, 0) : WorkingSet = 2259560 Kb
22.362 Sec : _ArraySort_NEW2($aArray2D, 0, 0, 0, 1) : WorkingSet = 2260728 Kb
Ratio = 5.3
Reset : WorkingSet = 539600 Kb

+>19:42:34 AutoIt3.exe ended.rc:0
+>19:42:34 AutoIt3Wrapper Finished.
>Exit code: 0    Time: 521.9

with the following code where I added the display of memory used.

My machine has 8Gb  running Windows 10

this test was running in 64-bit but the 32-bit is  OK

I suspct you don't hve too much memory, so is not a COM error you are looking for but some kind of memory error

#AutoIt3Wrapper_UseX64=y
#cs ----------------------------------------------------------------------------

 AutoIt Version: 3.3.15.0 (Beta)
 Author:         myName

 Script Function:
    Template AutoIt script.

Test # 1 : Rows = 512 : Columns = 32768 : Penalty = 0
175234.748283886 _ArraySort($aArray2D, 0, 0, 0, 0)
49156.9694463294 _ArraySort_NEW2($aArray2D, 0, 0, 0, 1)

Test # 2 : Rows = 1024 : Columns = 16384 : Penalty = 0
196477.845967679 _ArraySort($aArray2D, 0, 0, 0, 0)
70024.0488928627 _ArraySort_NEW2($aArray2D, 0, 0, 0, 1)

Test # 4 : Rows = 4096 : Columns = 4096 : Penalty = 0
283877.033926195 _ArraySort($aArray2D, 0, 0, 0, 0)
72032.9163305307 _ArraySort_NEW2($aArray2D, 0, 0, 0, 1)

Test # 8 : Rows = 8388608 : Columns = 2 : Penalty = 0
1696033.42199072 _ArraySort($aArray2D, 0, 0, 0, 0)
1313561.93845872 _ArraySort_NEW2($aArray2D, 0, 0, 0, 1)

JPM 32-Bit

Test # 1 : Rows = 512 : Columns = 32768 : Penalty = 0
 100.607 Sec : _ArraySort($aArray2D, 0, 0, 0, 0)
 23.498 Sec : _ArraySort_NEW2($aArray2D, 0, 0, 0, 1)
 Ratio = 4.3

Test # 2 : Rows = 1024 : Columns = 16384 : Penalty = 0
 114.401 Sec : _ArraySort($aArray2D, 0, 0, 0, 0)
 23.45 Sec : _ArraySort_NEW2($aArray2D, 0, 0, 0, 1)
 Ratio = 4.9

Test # 4 : Rows = 4096 : Columns = 4096 : Penalty = 0
 137.269 Sec : _ArraySort($aArray2D, 0, 0, 0, 0)
 23.47 Sec : _ArraySort_NEW2($aArray2D, 0, 0, 0, 1)
 Ratio = 5.8

Test # 8 : Rows = 8388608 : Columns = 2 : Penalty = 0
 1072.845 Sec : _ArraySort($aArray2D, 0, 0, 0, 0)
 883.013 Sec : _ArraySort_NEW2($aArray2D, 0, 0, 0, 1)
 Ratio = 1.2

JPM  64-bit



#ce ----------------------------------------------------------------------------

; Script Start - Add your code below here

#include <Array.au3>
#include <WinAPIProc.au3>

Global $iPenalty = 0 ; Caution: 1 penalty point = between 8 and 16 megabytes depending on the test number
Global $iFirstTest = 1 ; lowest test number = 1
Global $iLastTest = 3 ; highest test number = 8

; the following sleep values can be modified depending on available RAM
Global $iNap = 00000 ; 30 secs - short sleep
Global $iSleep = 00000 ; 40 secs - long sleep to be multiplied by the test number because each test becomes more brutal

; Bounds for 2D arrays containing 16777216 elements [Rows, Columns]
Global $aArray[0][0], $aArray2D = $aArray, $aBounds = [['',''],[512,32768],[1024,16384],[2048,8192],[4096,4096],[8192,2048],[16384,1024],[32768,512],[8388608, 2]]
;~ Global $aArray[0][0], $aArray2D = $aArray, $aBounds = [['',''],[512,16384],[1024,8192],[2048,4096],[4096,2048],[8192,1024],[16384,512],[32768,2],[8388608, 1]]
Local $iTimer, $iTimer0, $iTimer1
Global $sBloat = StringFormat("%0" & $iPenalty & "s", "")

Local $aData = _WinAPI_GetProcessMemoryInfo()
    ConsoleWrite("Init : WorkingSet = " & Int($aData[2] / 1024) & " Kb" & @LF & @LF)
;~ Int($aData[2] / 1024)

For $iTestNum = $iFirstTest To $iLastTest
    ReDim $aArray2D[0][0] ; get rid of all previous data [COM ERROR HERE ON 1ST REPEAT ==> see line 36 below]
    $aData = _WinAPI_GetProcessMemoryInfo()
    ConsoleWrite("ReDim00 : WorkingSet = " & Int($aData[2] / 1024) & " Kb" & @LF)

    ReDim $aArray2D [$aBounds[$iTestNum][0]] [$aBounds[$iTestNum][1]] ; resize the array
    $aData = _WinAPI_GetProcessMemoryInfo()
    ConsoleWrite("ReDim : WorkingSet = " & Int($aData[2] / 1024) & " Kb" & @LF)
    GenerateHorribleData($aArray2D) ; the first two columns are filled with random strings and numbers
    AddMoreBloat($aArray2D) ; the rest array is filled with the same string duplicated again and again
    Sleep($iNap)

    $aData = _WinAPI_GetProcessMemoryInfo()
    ConsoleWrite("Test # " & $iTestNum & " : Rows = " & UBound($aArray2D) & " : Columns = " & UBound($aArray2D, 2) & " : Penalty = " & $iPenalty  & " : WorkingSet = " & Int($aData[2] / 1024) & " Kb" & @LF)

    ; now we are ready to run tests
    $iTimer = TimerInit()
    ; for the 1st test, sort on the first column
    _ArraySort($aArray2D, 0, 0, 0, 0)
    $iTimer = TimerDiff($iTimer)
    $iTimer0 = Round($iTimer/1000, 3)
    $aData = _WinAPI_GetProcessMemoryInfo()
    ConsoleWrite($iTimer0 & " Sec : _ArraySort($aArray2D, 0, 0, 0, 0)" & " : WorkingSet = " & Int($aData[2] / 1024) & " Kb" & @LF)
;~     Sleep($iSleep * $iTestNum) ; multiplied by the test number because each test becomes more brutal

    $iTimer = TimerInit()
    ; sort similar data on the second column
    _ArraySort_NEW2($aArray2D, 0, 0, 0, 1) ; the more rows, the more similar the starting conditions [lost after the 1st sort]
    $iTimer = TimerDiff($iTimer)
    $iTimer1 = Round($iTimer/1000, 3)
    $aData = _WinAPI_GetProcessMemoryInfo()
    ConsoleWrite($iTimer1 & " Sec : _ArraySort_NEW2($aArray2D, 0, 0, 0, 1)" & " : WorkingSet = " & Int($aData[2] / 1024) & " Kb" & @LF)

    ConsoleWrite("Ratio = " & Round($iTimer0 / $iTimer1, 1) & @LF)

    $aArray2D = $aArray ; patch to deal with the COM error above
;~     Sleep($iSleep * $iTestNum)
    Sleep($iNap)
    $aData = _WinAPI_GetProcessMemoryInfo()
    ConsoleWrite("Reset : WorkingSet = " & Int($aData[2] / 1024) & " Kb" & @LF & @LF)
;~  $iTestNum = 2 * $iTestNum - 1 ; for 1 2 4 8 only testing
Next

Func GenerateHorribleData(ByRef $aArray2D)
    If Not IsArray($aArray2D) Or UBound($aArray2D, 0) <> 2 Or UBound($aArray2D) < 2 Or UBound($aArray2D, 2) < 2 Then Return SetError(1)

    ; strings of 4 characters selected from 95 available characters would generate enough permutations to fill an array with 16777216 elements
    ; so I guess 3 characters will suffice
    Local $iCount = 0, $aCombi[1714750] ; 2 * 95 ^ 3

    ; create 857375 unique strings
    For $i = 32 To 126 ; use ascii range 126 - 31 = 95 characters
        For $j = 32 To 126
            For $k = 32 To 126
                $aCombi[$iCount] = $sBloat & Chr($i) & Chr($j) & Chr($k)
                $iCount += 1
            Next
        Next
    Next

    ; add another 857375 integers - exactly the same number as strings
    For $i = 857375 To 1714749
        $aCombi[$i] = $i
    Next

    _ArrayShuffle($aCombi) ; generate a random sample of 1714750 unique values
    For $i = 0 To UBound($aArray2D) - 1
        $aArray2D[$i][0] = $aCombi[Mod($i, 1714750)]
    Next

    _ArrayShuffle($aCombi) ; generate a similar random sample of 1714750 unique values
    For $i = 0 To UBound($aArray2D) - 1
        $aArray2D[$i][1] = $aCombi[Mod($i, 1714750)]
    Next
    ;_ArrayDisplay($aCombi, "Combi", '857000:857374')
EndFunc ; GenerateHorribleData

Func AddMoreBloat(ByRef $aArray2D)
    Local $sMoreBloat = $sBloat & "str"
    If Not IsArray($aArray2D) Or UBound($aArray2D, 0) <> 2 Or UBound($aArray2D) < 2 Or UBound($aArray2D, 2) < 2 Then Return SetError(1)
    For $i = 0 To UBound($aArray2D) -1
        For $j = 2 To UBound($aArray2D, 2) -1
            $aArray2D[$i][$j] = $sMoreBloat
        Next
    Next
EndFunc


; #FUNCTION# ====================================================================================================================
; Author ........: Jos
; Modified.......: LazyCoder - added $iSubItem option; Tylo - implemented stable QuickSort algo; Jos - changed logic to correctly Sort arrays with mixed Values and Strings; Melba23 - implemented stable pivot algo; czardas - speed optimization for 2D arrays
; ===============================================================================================================================
Func _ArraySort_NEW2(ByRef $aArray, $iDescending = 0, $iStart = 0, $iEnd = 0, $iSubItem = 0, $iPivot = 0)

    If $iDescending = Default Then $iDescending = 0
    If $iStart = Default Then $iStart = 0
    If $iEnd = Default Then $iEnd = 0
    If $iSubItem = Default Then $iSubItem = 0
    If $iPivot = Default Then $iPivot = 0
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)

    Local $iUBound = UBound($aArray) - 1
    If $iUBound = -1 Then Return SetError(5, 0, 0)

    ; Bounds checking
    If $iEnd = Default Then $iEnd = 0
    If $iEnd < 1 Or $iEnd > $iUBound Or $iEnd = Default Then $iEnd = $iUBound
    If $iStart < 0 Or $iStart = Default Then $iStart = 0
    If $iStart > $iEnd Then Return SetError(2, 0, 0)

    If $iDescending = Default Then $iDescending = 0
    If $iPivot = Default Then $iPivot = 0
    If $iSubItem = Default Then $iSubItem = 0

    ; Sort
    Switch UBound($aArray, $UBOUND_DIMENSIONS)
        Case 1
            If $iPivot Then ; Switch algorithms as required
                __ArrayDualPivotSort($aArray, $iStart, $iEnd)
            Else
                __ArrayQuickSort1D($aArray, $iStart, $iEnd)
            EndIf
            If $iDescending Then _ArrayReverse($aArray, $iStart, $iEnd)
        Case 2
            If $iPivot Then Return SetError(6, 0, 0) ; Error if 2D array and $iPivot
            Local $iSubMax = UBound($aArray, $UBOUND_COLUMNS) - 1
            If $iSubItem > $iSubMax Then Return SetError(3, 0, 0)

            If $iDescending Then
                $iDescending = -1
            Else
                $iDescending = 1
            EndIf

            Local $aTrac[$iUBound +1] ; to track migrating indeces
            For $i = $iStart To $iEnd
                $aTrac[$i] = $i
            Next

            __ArrayQuickSort2D_NEW2($aArray, $aTrac, $iDescending, $iStart, $iEnd, $iSubItem, $iSubMax)
            __SwapSequence2D($aArray, $aTrac, $iStart, $iEnd)
        Case Else
            Return SetError(4, 0, 0)
    EndSwitch

    Return 1
EndFunc   ;==>_ArraySort

; #INTERNAL_USE_ONLY# ===========================================================================================================
; Name...........: __ArrayQuickSort2D
; Description ...: Helper function for sorting 2D arrays
; Syntax.........: __ArrayQuickSort2D ( ByRef $aArray, ByRef $iStep, ByRef $iStart, ByRef $iEnd, ByRef $iSubItem, ByRef $iSubMax )
; Parameters ....: $aArray  - Array to sort
;                  $iStep    - Step size (should be 1 to sort ascending, -1 to sort descending!)
;                  $iStart   - Index of array to start sorting at
;                  $iEnd     - Index of array to stop sorting at
;                  $iSubItem - Sub-index to sort on in 2D arrays
;                  $iSubMax  - Maximum sub-index that array has
; Return values .: None
; Author ........: Jos van der Zande, LazyCoder, Tylo, Ultima, czardas
; Modified.......:
; Remarks .......: For Internal Use Only
; Related .......:
; Link ..........:
; Example .......:
; ===============================================================================================================================
Func __ArrayQuickSort2D_NEW2(Const ByRef $aArray, ByRef $aTrac, Const ByRef $iStep, Const ByRef $iStart, Const ByRef $iEnd, Const ByRef $iSubItem, Const ByRef $iSubMax)
    If $iEnd <= $iStart Then Return

    Local $iTmp

    ; InsertionSort (faster for smaller segments)
    If ($iEnd - $iStart) < 15 Then
        For $i = $iStart + 1 To $iEnd
            $iTmp = $aTrac[$i]

            If IsNumber($aArray[$iTmp][$iSubItem]) Then
                For $j = $i - 1 To $iStart Step -1
                    ; Follows the same logic as __ArrayQuickSort1D
                    If ((($aArray[$iTmp][$iSubItem] * $iStep) >= $aArray[$aTrac[$j]][$iSubItem] * $iStep) And IsNumber($aArray[$aTrac[$j]][$iSubItem])) _
                    Or (Not IsNumber($aArray[$aTrac[$j]][$iSubItem]) And StringCompare($aArray[$iTmp][$iSubItem], $aArray[$aTrac[$j]][$iSubItem]) * $iStep >= 0) Then ExitLoop
                    $aTrac[$j + 1] = $aTrac[$j]
                Next
            Else
                For $j = $i - 1 To $iStart Step -1
                    If (StringCompare($aArray[$iTmp][$iSubItem], $aArray[$aTrac[$j]][$iSubItem]) * $iStep >= 0) Then ExitLoop
                    $aTrac[$j + 1] = $aTrac[$j]
                Next
            EndIf

            $aTrac[$j + 1] = $iTmp
        Next
        Return
    EndIf

    ; QuickSort
    Local $L = $iStart, $R = $iEnd, $vPivot = $aArray[$aTrac[Int(($iStart + $iEnd) / 2)]][$iSubItem], $bNum = IsNumber($vPivot)
    Do
        If $bNum Then
            ; While ($iStep * ($aArray[$L][$iSubItem] - $vPivot) < 0 And IsNumber($aArray[$L][$iSubItem])) Or (Not IsNumber($aArray[$L][$iSubItem]) And $iStep * StringCompare($aArray[$L][$iSubItem], $vPivot) < 0)
            While ($iStep * ($aArray[$aTrac[$L]][$iSubItem] - $vPivot) < 0 And IsNumber($aArray[$aTrac[$L]][$iSubItem])) Or (Not IsNumber($aArray[$aTrac[$L]][$iSubItem]) And $iStep * StringCompare($aArray[$aTrac[$L]][$iSubItem], $vPivot) < 0)
                $L += 1
            WEnd
            ; While ($iStep * ($aArray[$R][$iSubItem] - $vPivot) > 0 And IsNumber($aArray[$R][$iSubItem])) Or (Not IsNumber($aArray[$R][$iSubItem]) And $iStep * StringCompare($aArray[$R][$iSubItem], $vPivot) > 0)
            While ($iStep * ($aArray[$aTrac[$R]][$iSubItem] - $vPivot) > 0 And IsNumber($aArray[$aTrac[$R]][$iSubItem])) Or (Not IsNumber($aArray[$aTrac[$R]][$iSubItem]) And $iStep * StringCompare($aArray[$aTrac[$R]][$iSubItem], $vPivot) > 0)
                $R -= 1
            WEnd
        Else
            While ($iStep * StringCompare($aArray[$aTrac[$L]][$iSubItem], $vPivot) < 0)
                $L += 1
            WEnd
            While ($iStep * StringCompare($aArray[$aTrac[$R]][$iSubItem], $vPivot) > 0)
                $R -= 1
            WEnd
        EndIf

        ; Swap
        If $L <= $R Then
            $iTmp = $aTrac[$L]
            $aTrac[$L] = $aTrac[$R]
            $aTrac[$R] = $iTmp
            $L += 1
            $R -= 1
        EndIf
    Until $L > $R

    __ArrayQuickSort2D_NEW2($aArray, $aTrac, $iStep, $iStart, $R, $iSubItem, $iSubMax)
    __ArrayQuickSort2D_NEW2($aArray, $aTrac, $iStep, $L, $iEnd, $iSubItem, $iSubMax)
EndFunc   ;==>__ArrayQuickSort2D_NEW2


; #INTERNAL_USE_ONLY# ===========================================================================================================
; Name...........: ___SwapSequence2D
; Description ...: Helper function for populating 2D arrays. [algorithm modelled on the knight's tour problem]
; Author ........: czardas
; ===============================================================================================================================
Func __SwapSequence2D(ByRef $aArray, ByRef $aTrac, $iStart, $iEnd)
    Local $iCols = UBound($aArray, 2), $aFirst[$iCols], $i, $iNext

    For $iInit = $iStart To $iEnd ; initialize each potential overwrite sequence [separate closed system]
        If $aTrac[$iInit] <> $iInit Then ; rows will now be overwritten in accordance with tracking information
            $i = $iInit ; set the current row as the start of the sequence

            For $j = 0 To $iCols -1
                $aFirst[$j] = $aArray[$i][$j] ; copy the first row [although we don't know where to put it yet]
            Next

            Do
                For $j = 0 To $iCols -1
                    $aArray[$i][$j] = $aArray[$aTrac[$i]][$j] ; overwrite each row [following the trail]
                Next
                $iNext = $aTrac[$i] ; get the index of the next row in the sequence
                $aTrac[$i] = $i ; set to ignore rows already processed [may be needed once, or not at all]
                $i = $iNext ; follow the trail as far as it goes [indices could be higher or lower]
            Until $aTrac[$i] = $iInit ; all tracking sequences end at this juncture

            For $j = 0 To $iCols -1
                $aArray[$i][$j] = $aFirst[$j] ; now we know where to put the initial row we copied earlier
            Next
            $aTrac[$i] = $i ; set to ignore rows already processed [as above]
        EndIf
    Next
EndFunc ;==> __SwapSequence2D

Edit:32-bit results

>Running:(3.3.14.2):C:\Program Files (x86)\AutoIt3\autoit3.exe "F:\AdmMesnage\_Data\Bureau\_ArraySort_New.au3"    
--> Press Ctrl+Alt+Break to Restart or Ctrl+Break to Stop
Init : WorkingSet = 12764 Kb

ReDim00 : WorkingSet = 12780 Kb
ReDim : WorkingSet = 78320 Kb
Test # 1 : Rows = 512 : Columns = 32768 : Penalty = 0 : WorkingSet = 1532456 Kb
114.48 Sec : _ArraySort($aArray2D, 0, 0, 0, 0) : WorkingSet = 1532472 Kb
26.124 Sec : _ArraySort_NEW2($aArray2D, 0, 0, 0, 1) : WorkingSet = 1535424 Kb
Ratio = 4.4
Reset : WorkingSet = 19264 Kb

ReDim00 : WorkingSet = 19264 Kb
ReDim : WorkingSet = 84804 Kb
Test # 2 : Rows = 1024 : Columns = 16384 : Penalty = 0 : WorkingSet = 1536376 Kb
122.14 Sec : _ArraySort($aArray2D, 0, 0, 0, 0) : WorkingSet = 1536444 Kb
26.357 Sec : _ArraySort_NEW2($aArray2D, 0, 0, 0, 1) : WorkingSet = 1537900 Kb
Ratio = 4.6
Reset : WorkingSet = 21084 Kb

ReDim00 : WorkingSet = 21084 Kb
ReDim : WorkingSet = 86624 Kb
Test # 3 : Rows = 2048 : Columns = 8192 : Penalty = 0 : WorkingSet = 1538824 Kb
135.667 Sec : _ArraySort($aArray2D, 0, 0, 0, 0) : WorkingSet = 1538696 Kb
26.306 Sec : _ArraySort_NEW2($aArray2D, 0, 0, 0, 1) : WorkingSet = 1539428 Kb
Ratio = 5.2
Reset : WorkingSet = 23112 Kb

+>20:11:58 AutoIt3.exe ended.rc:0
+>20:11:58 AutoIt3Wrapper Finished.
>Exit code: 0    Time: 585.8

 

Edited by jpm
add 32-bit results
Link to comment
Share on other sites

Thanks for taking the time to check this. It is indeed my mistake to call it a COM error.

21 hours ago, jpm said:

I suspct you don't hve too much memory, so is not a COM error you are looking for but some kind of memory error

Subsequent tests have shown low memory to be the issue (see my next post).

Link to comment
Share on other sites

I have spent some time on this today. Firstly I have added the option as part of the pivot parameter. The name $iPivot is now only applicable to 1D, and should probably be renamed to something like $iOption or $iOpt (optimal).

@BrewManNH Here are some tests with not too many columns for you. :) 

#include <Array.au3>

Local $iBound = 6000, $aTest[$iBound][1], $iTimer
For $c = 2 To 10
    ReDim $aTest[$iBound][$c]
    ConsoleWrite("rows = " & $iBound & " : columns = " & $c & @CRLF)
    For $r = 0 To UBound($aTest) -1
        $aTest[$r][$c -1] = "the quick brown fox jumped over the lazy dog"
    Next

    CompleteMess($aTest)
    $iTimer = TimerInit()
    _ArraySort_Beta($aTest, 0, 0, 0, 0, 1)
    ConsoleWrite(TimerDiff($iTimer) & ' - optimized' & @CRLF)
    Sleep(1000)

    CompleteMess($aTest)
    $iTimer = TimerInit()
    _ArraySort_Beta($aTest, 0, 0, 0, 0, 0)
    ConsoleWrite(TimerDiff($iTimer) & ' - original' & @CRLF & @CRLF)
    Sleep(1000)
Next

For $c = 20 To 100 Step 10
    ReDim $aTest[$iBound][$c]
    ConsoleWrite("rows = " & $iBound & " : columns = " & $c & @CRLF)
    For $r = 0 To UBound($aTest) -1
        For $j = $c -10 To $c -1
            $aTest[$r][$j] = "the quick brown fox jumped over the lazy dog"
        Next
    Next

    CompleteMess($aTest)
    $iTimer = TimerInit()
    _ArraySort_Beta($aTest, 0, 0, 0, 0, 1)
    ConsoleWrite(TimerDiff($iTimer) & ' - optimized' & @CRLF)
    Sleep(1000)

    CompleteMess($aTest)
    $iTimer = TimerInit()
    _ArraySort_Beta($aTest, 0, 0, 0, 0, 0)
    ConsoleWrite(TimerDiff($iTimer) & ' - original' & @CRLF & @CRLF)
    Sleep(1000)
Next

; generate a highly disorganized list of alphabetic strings and integers (to test alpha-numeric sorting algorithms)
Func CompleteMess(ByRef $aArray, $iCol = 0) ; code for this function is somewhat improvised (without too much thought)
    Local $iBound = UBound($aArray), $iCount = $iBound

    For $i = 0 To $iBound -3 Step 4 ; empty the elements to be used for strings
        $aArray[$i][$iCol] = ''
        $aArray[$i +2][$iCol] = ''
    Next
    For $iIterations = 1 To 12 ; Add 12 alphabetic characters
        For $i = 0 To $iBound -1 Step 4
            $aArray[$i][$iCol] &= Chr(Mod(Int(StringReverse($iCount)), 26) + 65)
            $iCount += 1
        Next
        $iCount = Int(StringReverse($iCount)) + $iCount ; helps to mix things up
        For $i = 2 To $iBound -1 Step 4
            $aArray[$i][$iCol] &= Chr(Mod(Int(StringReverse($iCount)), 26) + 65)
            $iCount += 1
        Next
        $iCount = Int(StringReverse($iCount)) + $iCount
    Next
    ; include some integers
    For $i = 1 To $iBound -1 Step 4
        $aArray[$i][$iCol] = $iCount + $i
        $iCount += 1
    Next
    ; include a few more integers
    $iCount = Int(StringReverse($iCount)) + $iCount
    For $i = 3 To $iBound -1 Step 4
        $aArray[$i][$iCol] = Int(StringReverse($iCount)) + $iCount
        $iCount += 1
    Next
EndFunc    ;==>CompleteMess

; #FUNCTION# ====================================================================================================================
; Author ........: Jos
; Modified.......: LazyCoder - added $iSubItem option; Tylo - implemented stable QuickSort algo; Jos - changed logic to correctly Sort arrays with mixed Values and Strings; Melba23 - implemented stable pivot algo
; ===============================================================================================================================
Func _ArraySort_Beta(ByRef $aArray, $iDescending = 0, $iStart = 0, $iEnd = 0, $iSubItem = 0, $iPivot = 0)

    If $iDescending = Default Then $iDescending = 0
    If $iStart = Default Then $iStart = 0
    If $iEnd = Default Then $iEnd = 0
    If $iSubItem = Default Then $iSubItem = 0
    If $iPivot = Default Then $iPivot = 0
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)

    Local $iUBound = UBound($aArray) - 1
    If $iUBound = -1 Then Return SetError(5, 0, 0)

    ; Bounds checking
    If $iEnd = Default Then $iEnd = 0
    If $iEnd < 1 Or $iEnd > $iUBound Or $iEnd = Default Then $iEnd = $iUBound
    If $iStart < 0 Or $iStart = Default Then $iStart = 0
    If $iStart > $iEnd Then Return SetError(2, 0, 0)

    ; Sort
    Switch UBound($aArray, $UBOUND_DIMENSIONS)
        Case 1
            If $iPivot Then ; Switch algorithms as required
                __ArrayDualPivotSort($aArray, $iStart, $iEnd)
            Else
                __ArrayQuickSort1D($aArray, $iStart, $iEnd)
            EndIf
            If $iDescending Then _ArrayReverse($aArray, $iStart, $iEnd)
        Case 2
            Local $iSubMax = UBound($aArray, $UBOUND_COLUMNS) - 1
            If $iSubItem > $iSubMax Then Return SetError(3, 0, 0)

            If $iDescending Then
                $iDescending = -1
            Else
                $iDescending = 1
            EndIf

            If $iPivot And $iSubMax Then ; ignore if only one column exists
                Local $aTrac[$iUBound +1] ; to track migrating indeces
                For $i = $iStart To $iEnd
                    $aTrac[$i] = $i
                Next
                __ArraySortIndices($aArray, $aTrac, $iDescending, $iStart, $iEnd, $iSubItem, $iSubMax)
                __SwapSequence2D($aArray, $aTrac, $iStart, $iEnd)
            Else
                __ArrayQuickSort2D($aArray, $iDescending, $iStart, $iEnd, $iSubItem, $iSubMax)
            EndIf
        Case Else
            Return SetError(4, 0, 0)
    EndSwitch

    Return 1
EndFunc   ;==>_ArraySort_Beta

; #INTERNAL_USE_ONLY# ===========================================================================================================
; Name...........: __ArraySortIndices
; Description ...: Helper function for sorting 2D arrays
; Syntax.........: __ArraySortIndices ( ByRef $aArray, ByRef $aTrac, ByRef $iStep, ByRef $iStart, ByRef $iEnd, ByRef $iSubItem, ByRef $iSubMax )
; Parameters ....: $aArray   - Array to use as reference when sorting indices
;                  $aTrac    - Array of indices to sort
;                  $iStep    - Step size (should be 1 to sort ascending, -1 to sort descending!)
;                  $iStart   - Index of array to start sorting at
;                  $iEnd     - Index of array to stop sorting at
;                  $iSubItem - Sub-index to use as reference when sorting indices
;                  $iSubMax  - Maximum sub-index that array has
; Return values .: None
; Author ........: Jos van der Zande, LazyCoder, Tylo, Ultima, czardas
; Modified.......:
; Remarks .......: For Internal Use Only
; Related .......:
; Link ..........:
; Example .......:
; ===============================================================================================================================
Func __ArraySortIndices(Const ByRef $aArray, ByRef $aTrac, Const ByRef $iStep, Const ByRef $iStart, Const ByRef $iEnd, Const ByRef $iSubItem, Const ByRef $iSubMax)
    If $iEnd <= $iStart Then Return

    Local $iTmp

    ; InsertionSort (faster for smaller segments)
    If ($iEnd - $iStart) < 15 Then
        For $i = $iStart + 1 To $iEnd
            $iTmp = $aTrac[$i]

            ; Follows the same logic as __ArrayQuickSort1D
            If IsNumber($aArray[$iTmp][$iSubItem]) Then
                For $j = $i - 1 To $iStart Step -1
                    If ((($aArray[$iTmp][$iSubItem] * $iStep) >= $aArray[$aTrac[$j]][$iSubItem] * $iStep) And IsNumber($aArray[$aTrac[$j]][$iSubItem])) _
                    Or (Not IsNumber($aArray[$aTrac[$j]][$iSubItem]) And StringCompare($aArray[$iTmp][$iSubItem], $aArray[$aTrac[$j]][$iSubItem]) * $iStep >= 0) Then ExitLoop
                    $aTrac[$j + 1] = $aTrac[$j]
                Next
            Else
                For $j = $i - 1 To $iStart Step -1
                    If (StringCompare($aArray[$iTmp][$iSubItem], $aArray[$aTrac[$j]][$iSubItem]) * $iStep >= 0) Then ExitLoop
                    $aTrac[$j + 1] = $aTrac[$j]
                Next
            EndIf

            $aTrac[$j + 1] = $iTmp
        Next
        Return
    EndIf

    ; QuickSort
    Local $L = $iStart, $R = $iEnd, $vPivot = $aArray[$aTrac[Int(($iStart + $iEnd) / 2)]][$iSubItem], $bNum = IsNumber($vPivot)
    Do
        If $bNum Then
            While ($iStep * ($aArray[$aTrac[$L]][$iSubItem] - $vPivot) < 0 And IsNumber($aArray[$aTrac[$L]][$iSubItem])) Or (Not IsNumber($aArray[$aTrac[$L]][$iSubItem]) And $iStep * StringCompare($aArray[$aTrac[$L]][$iSubItem], $vPivot) < 0)
                $L += 1
            WEnd
            While ($iStep * ($aArray[$aTrac[$R]][$iSubItem] - $vPivot) > 0 And IsNumber($aArray[$aTrac[$R]][$iSubItem])) Or (Not IsNumber($aArray[$aTrac[$R]][$iSubItem]) And $iStep * StringCompare($aArray[$aTrac[$R]][$iSubItem], $vPivot) > 0)
                $R -= 1
            WEnd
        Else
            While ($iStep * StringCompare($aArray[$aTrac[$L]][$iSubItem], $vPivot) < 0)
                $L += 1
            WEnd
            While ($iStep * StringCompare($aArray[$aTrac[$R]][$iSubItem], $vPivot) > 0)
                $R -= 1
            WEnd
        EndIf

        ; Swap
        If $L <= $R Then
            $iTmp = $aTrac[$L]
            $aTrac[$L] = $aTrac[$R]
            $aTrac[$R] = $iTmp
            $L += 1
            $R -= 1
        EndIf
    Until $L > $R

    __ArraySortIndices($aArray, $aTrac, $iStep, $iStart, $R, $iSubItem, $iSubMax)
    __ArraySortIndices($aArray, $aTrac, $iStep, $L, $iEnd, $iSubItem, $iSubMax)
EndFunc   ;==>__ArraySortIndices

; #INTERNAL_USE_ONLY# ===========================================================================================================
; Name...........: ___SwapSequence2D
; Description ...: Helper function for populating 2D arrays. [algorithm modelled on the knight's tour problem]
; Author ........: czardas
; ===============================================================================================================================
Func __SwapSequence2D(ByRef $aArray, ByRef $aTrac, $iStart, $iEnd)
    Local $iCols = UBound($aArray, 2), $aFirst[$iCols], $i, $iNext

    For $iInit = $iStart To $iEnd ; initialize each potential overwrite sequence [separate closed system]
        If $aTrac[$iInit] <> $iInit Then ; rows will now be overwritten in accordance with tracking information
            $i = $iInit ; set the current row as the start of the sequence

            For $j = 0 To $iCols -1
                $aFirst[$j] = $aArray[$i][$j] ; copy the first row [although we don't know where to put it yet]
            Next

            Do
                For $j = 0 To $iCols -1
                    $aArray[$i][$j] = $aArray[$aTrac[$i]][$j] ; overwrite each row [following the trail]
                Next
                $iNext = $aTrac[$i] ; get the index of the next row in the sequence
                $aTrac[$i] = $i ; set to ignore rows already processed [may be needed once, or not at all]
                $i = $iNext ; follow the trail as far as it goes [indices could be higher or lower]
            Until $aTrac[$i] = $iInit ; all tracking sequences end at this juncture

            For $j = 0 To $iCols -1
                $aArray[$i][$j] = $aFirst[$j] ; now we know where to put the initial row we copied earlier
            Next
            $aTrac[$i] = $i ; set to ignore rows already processed [as above]
        EndIf
    Next
EndFunc ;==> __SwapSequence2D

The only difference between tests are the number of rows. All tests start with exactly the same unsorted column - Random() is not used to generate anything. I first ran a test on arrays with 60000 rows and got the following results.

rows = 60000 : columns = 2
6688.33895839469 - optimized
7841.93644329876 - original

rows = 60000 : columns = 3
7058.07167484609 - optimized
8827.27645705228 - original

rows = 60000 : columns = 4
7041.3974421655 - optimized
9846.59108904367 - original

rows = 60000 : columns = 5
7232.11232308906 - optimized
10861.7479931254 - original

rows = 60000 : columns = 6
7330.42961004145 - optimized
11711.8757537952 - original

rows = 60000 : columns = 7
7577.11018069971 - optimized
12746.3836941054 - original

rows = 60000 : columns = 8
7665.39919248121 - optimized
13750.155901145 - original

rows = 60000 : columns = 9
7778.7459686858 - optimized
14646.356136098 - original

rows = 60000 : columns = 10
7974.09256068495 - optimized
15573.6885464625 - original

rows = 60000 : columns = 20
9124.66521753534 - optimized
24638.3847003185 - original

rows = 60000 : columns = 30
10499.2273200693 - optimized
33800.2543199998 - original

rows = 60000 : columns = 40
12268.1106528627 - optimized
43990.3008191317 - original

rows = 60000 : columns = 50
13968.142652187 - optimized
54514.0285547554 - original

rows = 60000 : columns = 60
15417.2697331897 - optimized
63363.1140621195 - original

rows = 60000 : columns = 70
17004.8570533027 - optimized
72884.9505976302 - original

rows = 60000 : columns = 80
18957.5453787915 - optimized
82537.8591323487 - original

rows = 60000 : columns = 90
47320.4118920993 - optimized
92081.8003205757 - original

rows = 60000 : columns = 100
96982.4626228597 - optimized
101614.366276862 - original

I then ran the test with just 6000 rows. The drop in performance at 100 columns no longer occurs, so lack of memory is surely the culprit.

rows = 6000 : columns = 2
546.861900162836 - optimized
652.33079947709 - original

rows = 6000 : columns = 3
563.589647071843 - optimized
727.460407661927 - original

rows = 6000 : columns = 4
577.025358827463 - optimized
797.621929621145 - original

rows = 6000 : columns = 5
586.71107012803 - optimized
872.846552864595 - original

rows = 6000 : columns = 6
603.739516034792 - optimized
961.373284313815 - original

rows = 6000 : columns = 7
622.520825953891 - optimized
1025.90234272186 - original

rows = 6000 : columns = 8
635.311454629581 - optimized
1104.57663146498 - original

rows = 6000 : columns = 9
644.231584825695 - optimized
1183.4802669013 - original

rows = 6000 : columns = 10
658.895211495876 - optimized
1258.08274173415 - original

rows = 6000 : columns = 20
790.026913652426 - optimized
2029.82307904852 - original

rows = 6000 : columns = 30
970.681483676522 - optimized
2796.02815212457 - original

rows = 6000 : columns = 40
1078.21996386515 - optimized
3586.638261654 - original

rows = 6000 : columns = 50
1210.28288640458 - optimized
4362.10894113544 - original

rows = 6000 : columns = 60
1379.41151094694 - optimized
5148.34726584158 - original

rows = 6000 : columns = 70
1524.30692703462 - optimized
5935.81896609783 - original

rows = 6000 : columns = 80
1756.72832209595 - optimized
7171.61976648138 - original

rows = 6000 : columns = 90
1840.17921077798 - optimized
7554.41323104714 - original

rows = 6000 : columns = 100
1998.99597114309 - optimized
8355.73710753047 - original

There are a number of variables at play here and no single test is likely to be able to define exactly when the new method will be faster. It seems that most of the time it is faster, providing you have enough RAM. Describing these findings needs some thought. @jpm thanks for the WinAPI code, I'll have a play with it.

Edited by czardas
Link to comment
Share on other sites

Hi I run the previous test with the new _ArraySort_Beta()

No regression but I Wonder if the default behavior can be , $iPivot = 0 can be equal to 1 whatever is the name of this parameter

Here are the regression testing

>Running:(3.3.14.2):C:\Program Files (x86)\AutoIt3\autoit3.exe "F:\AdmMesnage\_Data\Bureau\_ArraySort_New.au3"    
--> Press Ctrl+Alt+Break to Restart or Ctrl+Break to Stop
Init : WorkingSet = 12832 Kb

Test # 1 : Rows = 512 : Columns = 32768 : Penalty = 0 : WorkingSet = 1532440 Kb
104.571 Sec : _ArraySort($aArray2D, 0, 0, 0, 0, 0) : WorkingSet = 1532460 Kb
24.008 Sec : _ArraySort_Beta($aArray2D, 0, 0, 0, 1, 1) : WorkingSet = 1535444 Kb
Ratio = 4.4

Test # 2 : Rows = 1024 : Columns = 16384 : Penalty = 0 : WorkingSet = 1543268 Kb
124.482 Sec : _ArraySort($aArray2D, 0, 0, 0, 0, 0) : WorkingSet = 1543340 Kb
29.496 Sec : _ArraySort_Beta($aArray2D, 0, 0, 0, 1, 1) : WorkingSet = 1544820 Kb
Ratio = 4.2

Test # 4 : Rows = 4096 : Columns = 4096 : Penalty = 0 : WorkingSet = 1566736 Kb
140.798 Sec : _ArraySort($aArray2D, 0, 0, 0, 0, 0) : WorkingSet = 1566644 Kb
26.715 Sec : _ArraySort_Beta($aArray2D, 0, 0, 0, 1, 1) : WorkingSet = 1567060 Kb
Ratio = 5.3

+>08:13:05 AutoIt3.exe ended.rc:0
+>08:13:05 AutoIt3Wrapper Finished.
>Exit code: 0    Time: 631.2

 

Link to comment
Share on other sites

Do you mean setting the new method as the default method for 2D arrays? Setting the final (pivot) parameter to 1 would make pivot sort the default behaviour for 1D arrays, so unless that is the intention, I would be inclined to make a small change within the function instead. I think it is probably better to keep this new method as optional for the time being. See this earlier post : https://www.autoitscript.com/forum/topic/182168-optimized-_arraysort/?do=findComment&comment=1309021

Link to comment
Share on other sites

  • Moderators

czardas & jpm.

I would certainly feel happier if the Pivot for 1D and czardas' method for 2D were the optional choices - at least until we have a lot more confidence in their real-world performance.

M23

Public_Domain.png.2d871819fcb9957cf44f4514551a2935.png Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

Open spoiler to see my UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ----------Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 

Link to comment
Share on other sites

4 hours ago, czardas said:

Do you mean setting the new method as the default method for 2D arrays? Setting the final (pivot) parameter to 1 would make pivot sort the default behaviour for 1D arrays, so unless that is the intention, I would be inclined to make a small change within the function instead. I think it is probably better to keep this new method as optional for the time being. See this earlier post : https://www.autoitscript.com/forum/topic/182168-optimized-_arraysort/?do=findComment&comment=1309021

In fact i was referring to be 1 for 2D and perhaps 0 for 1D.

May be I don't fully understead what would be the change (I just look at the fantastic perf improvement >4)

Melba23 seems to understand better that the change can have real drawback

Do what you think is the best

Cheers

Link to comment
Share on other sites

With the suggestions from @Melba23 and @Bowmore (quoted below) there should be no real drawback.

On 5/5/2016 at 0:54 AM, Bowmore said:

If you you were to submit this as a replacement for the standard arraySort, then I think that adding an extra parameter where the default behaviour was to replicate the stable sort of the current function. Users could the set the parameter to the go faster mode when speed was more important than having a stable sort.

I will add a description for this shortly: I've been busily distracted today. The input in this thread has been very helpful. I am inclined to go with the above advice, along with a few more details in the description. The number of columns is an important factor.

Link to comment
Share on other sites

7 hours ago, jpm said:

In fact i was referring to be 1 for 2D and perhaps 0 for 1D.

The new method is for 2D only. The idea of sharing the final parameter is just to simplify the syntax. If that is likely to cause confusion, another parameter can be added instead. I personally prefer it as it is.

Edited by czardas
Link to comment
Share on other sites

I have just discovered the name for this procedure: it is a variant of 'Tag Sort'.

Quote

A sorting procedure in which the key fields are sorted first to create the correct order, and then the actual data records are placed into that order.

The term doesn't seem to appear so frequently and is often given as a side note within a larger text. It's a cool name! :)

Edited by czardas
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...