Jump to content

Speed difference when loop through array by row or by column


Recommended Posts

Hi All,

I wondered whether I could optimise my array manipulation functions any better and came across a discovery that's new to me.

If your array has more rows than columns, then it seems that it's quicker to loop through it column by column rather than row by row.

Here is a demonstration with some timing examples:

$iArrayRows = 400000
$iArrayCols = 2
$iLoopRows = $iArrayRows - 1
$iLoopColumns = $iArrayCols - 1

_ArrayCreateH()
_ArrayCreateH()
_ArrayCreateH()
_ArrayCreateH()
_ArrayCreateH()

ConsoleWrite("---------------" & @CRLF)

_ArrayCreateV()
_ArrayCreateV()
_ArrayCreateV()
_ArrayCreateV()
_ArrayCreateV()

Func _ArrayCreateH()

    Local $aArray[$iArrayRows][$iArrayCols]

    $hTimer = _Timer_Init()

    For $i = 1 To $iLoopRows
        For $ii = 0 To $iLoopColumns
            $aArray[$i][$ii] = "0000000000"
        Next
    Next

    $hTimer = _Timer_Diff($hTimer)

    ConsoleWrite(($hTimer / 1000) & @CRLF)
EndFunc

Func _ArrayCreateV()

    Local $aArray[$iArrayRows][$iArrayCols]

    $hTimer = _Timer_Init()

    For $ii = 0 To $iLoopColumns
        For $i = 1 To $iLoopRows

            $aArray[$i][$ii] = "0000000000"

        Next
    Next

    $hTimer = _Timer_Diff($hTimer)

    ConsoleWrite(($hTimer / 1000) & @CRLF)

EndFunc

Timings show a good saving with the above array size (has drastically more rows than columns):

1.7250778
1.755863
1.7677521
1.7441728
1.7371205
---------------
1.453319
1.4538587
1.4681471
1.4888286
1.4498426

I had no idea that there would be a difference but it seems so obvious now. Is this something that everybody has innately known forever and I'm only just finding out now?

Link to post
Share on other sites

Well, yes and no. You have to be aware that memory storage is linear, so when storing data with dimensionality higher than 1 (2 dims = array/matrix, 3+ = tensor), storage order becomes relevant, for an array or matrix this is either ColMajor (one cell in all rows per column before the next column, so rows iteration is the minor loop) or RowMajor (one cell in all columns per row before the next row, so cols iteration is the minor loop). In both cases, if you perform cell I/O in the other order it involves repeated jumping back and forth in linear memory space, which likely implies more memory paging swaps (esp. for large data sets). AutoIt arrays are more complicated because they contain pointers to data of any type instead of contiguous raw data of a single type, but the basic idea is the same. Different programming environments may opt for either storage order; efficiency depends on how the data are most commonly accessed. In a computational environment such as Eigen, the default is ColMajor, but RowMajor order can be imposed if necessary. More info is here.

Quote

Algorithms that traverse a matrix row by row will go faster when the matrix is stored in row-major order because of better data locality. Similarly, column-by-column traversal is faster for column-major matrices. It may be worthwhile to experiment a bit to find out what is faster for your particular application.

BTW, if you're manipulating large numerical data sets and are worried about speed, you might want to consider using matrices instead of arrays (AutoIt UDF here).

Edited by RTFC
Link to post
Share on other sites

??...  it seems that the difference in speed is not due to the different access mode of the elements of the array, but to the greater number of crossings of the different nesting levels of the two "for next" loops In fact, if you comment the two commands to access the array, you will see that the time difference persists.

$iArrayRows = 400000
$iArrayCols = 2
$iLoopRows = $iArrayRows - 1
$iLoopColumns = $iArrayCols - 1

_ArrayCreateH()
_ArrayCreateH()
_ArrayCreateH()
_ArrayCreateH()
_ArrayCreateH()

ConsoleWrite("---------------" & @CRLF)

_ArrayCreateV()
_ArrayCreateV()
_ArrayCreateV()
_ArrayCreateV()
_ArrayCreateV()

Func _ArrayCreateH()

    Local $aArray[$iArrayRows][$iArrayCols]

    $hTimer = TimerInit()

    For $i = 1 To $iLoopRows
        For $ii = 0 To $iLoopColumns
            ; $aArray[$i][$ii] = "0000000000"
        Next
    Next

    $hTimer = TimerDiff($hTimer)

    ConsoleWrite(($hTimer / 1000) & @CRLF)
EndFunc   ;==>_ArrayCreateH

Func _ArrayCreateV()

    Local $aArray[$iArrayRows][$iArrayCols]

    $hTimer = TimerInit()

    For $ii = 0 To $iLoopColumns
        For $i = 1 To $iLoopRows

            ; $aArray[$i][$ii] = "0000000000"

        Next
    Next

    $hTimer = TimerDiff($hTimer)

    ConsoleWrite(($hTimer / 1000) & @CRLF)

EndFunc   ;==>_ArrayCreateV

 

small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....

Link to post
Share on other sites

In the first example, you initialize 400,001 times a For loop.
In the second example you initialize a For loop 3 times.

I therefore suspect that the time difference is rather caused by the overhead of the for-loop initialization.

What RTFC says is true in principle, but since the pointer array in AutoIt requires jumping back and forth in memory each time, I think that no consecutive elements can be cast there.

Edited by AspirinJunkie
Link to post
Share on other sites

I think that speaks for itself :

$iArray1 = 400000
$iArray2 = 2

_ArrayCreate1()

ConsoleWrite("---------------" & @CRLF)

_ArrayCreate2()

ConsoleWrite("---------------" & @CRLF)

Func _ArrayCreate1()

  Local $aArray[$iArray1][$iArray2]

  $hTimer = TimerInit()
  For $i = 0 To $iArray1 - 1
    For $j = 0 To $iArray2 - 1
      ; $aArray[$i][$j] = "0000000000"
    Next
  Next
  $hTimer = TimerDiff($hTimer)
  ConsoleWrite(($hTimer / 1000) & @CRLF)

  $hTimer = TimerInit()
  For $j = 0 To $iArray2 - 1
    For $i = 0 To $iArray1 - 1
      ; $aArray[$i][$j] = "0000000000"
    Next
  Next
  $hTimer = TimerDiff($hTimer)
  ConsoleWrite(($hTimer / 1000) & @CRLF)

EndFunc   ;==>_ArrayCreateH

Func _ArrayCreate2()

  Local $aArray[$iArray2][$iArray1]

  $hTimer = TimerInit()
  For $i = 0 To $iArray2 - 1
    For $j = 0 To $iArray1 - 1
      ; $aArray[$i][j] = "0000000000"
    Next
  Next
  $hTimer = TimerDiff($hTimer)
  ConsoleWrite(($hTimer / 1000) & @CRLF)

  $hTimer = TimerInit()
  For $j = 0 To $iArray1 - 1
    For $i = 0 To $iArray2 - 1
      ; $aArray[$i][j] = "0000000000"
    Next
  Next
  $hTimer = TimerDiff($hTimer)
  ConsoleWrite(($hTimer / 1000) & @CRLF)
EndFunc   ;==>_ArrayCreateV

Results :

0.65700933651172
0.0324245133094954
---------------
0.0336796450801218
0.654968878294266
---------------

It is not the way the array is created, but the way the loop uses the array.  As you can imagine, there is more calculations involved when the short loop is second.  When the long loop is second there is less calculation and faster result.

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    No registered users viewing this page.

×
×
  • Create New...