czardas

Parallel Exponential Search Algorithm

4 posts in this topic

#1 ·  Posted (edited)

Haven't had much time to code recently. However the following thread inspired me.

The debate about linear, parallel and binary search methods was rather interesting and, in an attempt to be diplomatic, I decided to combine @jchd's suggestion with @LarsJ's binary search example. I decided that the binary search algorithm required modification to make it more linear. As usual, 'if you invent something, it probably already exists and if it already exists, it exists for a reason'. My first attempt was not all that good. The code worked but was really a mess. I blame peer pressure (to post an example of a parallel search method). :D I will delete that old code in due course.

With a little memory jogging and a glance at the help file, the solution turned out to be quite easy: I just needed a better understanding of Euler. Further modification will be needed to work with more complicated unicode strings. The output could be returned as an array or a delimitered string. I'm not so interested in those details. I'm just going to post the algorithm for now and anyone, who wants to, can modify it to suit their needs. Both arrays must contain at least 1 element.

Local $aFoo = [0,1,2,3,4,5,6,7,9,10,11,12,13,14,15,16,19,20,23,24,26,30,35,39,40,41]
Local $aBar = [0,1,5,6,7,8,9,10,11,12,13,14,17,18,19,21,24,25,26,27,34,35,38,40]

ParallelExponetialSearch($aFoo, $aBar)

; Compares two lists - returning positive matches. Each input array must be unique (individually) and in alphabetical order.

Func ParallelExponetialSearch($aFoo, $aBar)
    Local $sFind, _
    $iMin_F = -1, $iMax_F = UBound($aFoo) -1, $Lo_F = $iMin_F, $Hi_F, _
    $iMin_B = -1, $iMax_B = UBound($aBar) -1, $Lo_B = $iMin_B, $Hi_B

    While $iMin_F < $iMax_F And $iMin_B < $iMax_B
        ; Toggle Arrays - Which array has most untested elements? This is the one we want to search next,
        ; so we can bypass more comparisons because (in theory) mismatches have a greater chance of being skipped.

        If $iMax_F - $iMin_F >= $iMax_B - $iMin_B Then ; $aFoo has more (or an equal number of) untested elements

            $Hi_F = $iMax_F
            $iMin_B += 1
            $sFind = $aBar[$iMin_B]

            While $Lo_F < $Hi_F ; search $aFoo
                For $i = 0 To Floor(Log($Hi_F - $Lo_F) / Log(2))
                    $Lo_F = $iMin_F + 2^$i

                    If $aFoo[$Lo_F] = $sFind Then
                        $iMin_F = $Lo_F

                        ; each match should be added to the output [perhaps an array]
                        ConsoleWrite($sFind & " found at $aFoo[" & $Lo_F & "] = $aBar[" & $iMin_B & "]" & @LF)
                        ExitLoop 2

                    ElseIf $aFoo[$Lo_F] > $sFind Then
                        $Hi_F = $Lo_F -1
                        $iMin_F += Floor(2^($i -1))
                        $Lo_F = $iMin_F
                        ContinueLoop 2
                    EndIf
                Next
                $iMin_F = $Lo_F ; minimum increment is one
            WEnd

        Else ; $aBar has more untested elements

            $Hi_B = $iMax_B
            $iMin_F += 1
            $sFind = $aFoo[$iMin_F]

            While $Lo_B < $Hi_B ; search $aBar
                For $i = 0 To Floor(Log($Hi_B - $Lo_B) / Log(2))
                    $Lo_B = $iMin_B + 2^$i

                    If $aBar[$Lo_B] = $sFind Then
                        $iMin_B = $Lo_B

                        ; each match should be added to the output [perhaps an array]
                        ConsoleWrite($sFind & " found at $aFoo[" & $iMin_F & "] = $aBar[" & $Lo_B & "]" & @LF)
                        ExitLoop 2

                    ElseIf $aBar[$Lo_B] > $sFind Then
                        $Hi_B = $Lo_B -1
                        $iMin_B += Floor(2^($i -1))
                        $Lo_B = $iMin_B
                        ContinueLoop 2
                    EndIf
                Next
                $iMin_B = $Lo_B ; minimum increment is one
            WEnd
        EndIf

    WEnd
EndFunc ;==> ParallelExponetialSearch

I hope this will be useful to someone. I believe it deserved a thread of its own! :)

Edited by czardas
1 person likes this

Share this post


Link to post
Share on other sites

#2 ·  Posted (edited)

Time for a comparison. Searching within $aBar (for each element in $aFoo) using the standard binary search algorithm, is certain to involve many more comparisons when there are numerous positive matches. In the following test, 3/4 of the elements happen to occur in both arrays. The standard binary search method requires about eight times as many comparisons as the parallel exponential search. If these comparisons were more complex - such as StringComare() - then latency will become obvious.

#include <Array.au3>

Global $iComparisons = 0

ConsoleWrite("generating arrays" & @LF)

Local $aFoo[5000]
For $i = 0 To 4999
    $aFoo[$i] = Hex($i, 8)
Next
_ArrayShuffle($aFoo)

Local $aBar = $aFoo
_ArrayReverse($aBar)

ReDim $aFoo[4000]
ReDim $aBar[4000]

_ArraySort($aFoo)
_ArraySort($aBar)

ConsoleWrite("running tests" & @LF)

ParallelExponetialSearch($aFoo, $aBar)
ConsoleWrite("ParallelExponetialSearch ==> $iComparisons = " & $iComparisons & @LF)

$iComparisons = 0
StandardBinarySearch($aFoo, $aBar)
ConsoleWrite("StandardBinarySearch ==> $iComparisons = " & $iComparisons & @LF)

Func ParallelExponetialSearch($aFoo, $aBar)
    Local $sFind, _
    $iMin_F = -1, $iMax_F = UBound($aFoo) -1, $Lo_F = $iMin_F, $Hi_F, _
    $iMin_B = -1, $iMax_B = UBound($aBar) -1, $Lo_B = $iMin_B, $Hi_B

    While $iMin_F < $iMax_F And $iMin_B < $iMax_B
        ; Toggle Arrays - Which array has most untested elements? This is the one we want to search next,
        ; so we can bypass more comparisons because (in theory) mismatches have a greater chance of being skipped.

        If $iMax_F - $iMin_F >= $iMax_B - $iMin_B Then ; $aFoo has more (or an equal number of) untested elements

            $Hi_F = $iMax_F
            $iMin_B += 1
            $sFind = $aBar[$iMin_B]

            While $Lo_F < $Hi_F ; search $aFoo
                For $i = 0 To Floor(Log($Hi_F - $Lo_F) / Log(2))
                    $Lo_F = $iMin_F + 2^$i

                    $iComparisons += 1

                    If $aFoo[$Lo_F] = $sFind Then
                        $iMin_F = $Lo_F

                        ; each match should be added to the output [perhaps an array]
                        ;ConsoleWrite($sFind & " found at $aFoo[" & $Lo_F & "] = $aBar[" & $iMin_B & "]" & @LF)
                        ExitLoop 2

                    ElseIf $aFoo[$Lo_F] > $sFind Then
                        $Hi_F = $Lo_F -1
                        $iMin_F += Floor(2^($i -1))
                        $Lo_F = $iMin_F
                        ContinueLoop 2
                    EndIf
                Next
                $iMin_F = $Lo_F ; minimum increment is one
            WEnd

        Else ; $aBar has more untested elements

            $Hi_B = $iMax_B
            $iMin_F += 1
            $sFind = $aFoo[$iMin_F]

            While $Lo_B < $Hi_B ; search $aBar
                For $i = 0 To Floor(Log($Hi_B - $Lo_B) / Log(2))
                    $Lo_B = $iMin_B + 2^$i

                    $iComparisons += 1

                    If $aBar[$Lo_B] = $sFind Then
                        $iMin_B = $Lo_B

                        ; each match should be added to the output [perhaps an array]
                        ;ConsoleWrite($sFind & " found at $aFoo[" & $iMin_F & "] = $aBar[" & $Lo_B & "]" & @LF)
                        ExitLoop 2

                    ElseIf $aBar[$Lo_B] > $sFind Then
                        $Hi_B = $Lo_B -1
                        $iMin_B += Floor(2^($i -1))
                        $Lo_B = $iMin_B
                        ContinueLoop 2
                    EndIf
                Next
                $iMin_B = $Lo_B ; minimum increment is one
            WEnd
        EndIf

    WEnd
EndFunc ;==> ParallelExponetialSearch


Func StandardBinarySearch($aFoo, $aBar)
    Local $Lo = 0, $Hi, $iMax_F = UBound($aFoo) -1, $sFind, $iMax_B = UBound($aBar) -1, $iMid
    For $i = 0 To $iMax_F
        $Hi = $iMax_B
        $sFind = $aFoo[$i]

        $iMid = Int(($Hi + $Lo) / 2)

        $iComparisons += 1

        If $aBar[$Lo] > $sFind Or $aBar[$Hi] < $sFind Then ContinueLoop
        ; Search
        While $Lo <= $iMid And $sFind <> $aBar[$iMid]

            $iComparisons += 1

            If $sFind < $aBar[$iMid] Then
                $Hi = $iMid - 1
            Else
                $Lo = $iMid + 1
            EndIf
            $iMid = Int(($Hi + $Lo) / 2)
        WEnd
        If $Lo > $Hi Then ContinueLoop

        ;ConsoleWrite($sFind & " found at $aFoo[" & $i & "] = $aBar[" & $iMid & "]" & @LF)
    Next
EndFunc

The ParallelExponentialSearch() algorithm is frequently going to be a super-efficient method (regardless of language).

Results (may vary slightly on subsequent runs):

ParallelExponetialSearch ==> $iComparisons = 5109
StandardBinarySearch ==> $iComparisons = 39397

This test is a little rough and ready, but the results are as I would expect. A more accurate test would not make much difference to these results.

Edited by czardas

Share this post


Link to post
Share on other sites

#3 ·  Posted (edited)

Unfortunately there was a bug in the code (implementation), which I believe I have now fixed: using a While loop (instead of Do Until). Both the above examples have been modified. Please report if you encounter any problems running this code - thanks.

Edited by czardas

Share this post


Link to post
Share on other sites
On 9/12/2017 at 10:25 PM, czardas said:

I hope this will be useful to someone.

Definitely, many thanks for this.:D

On 9/12/2017 at 10:25 PM, czardas said:

I just needed a better understanding of Euler.

We all do.;)

1 person likes this

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now

  • Similar Content

    • Ascer
      By Ascer
      1. Description.
      Udf working with MSDN System.Collections.ArrayList. Allow you to make fast operations on huge arrays, speed is even x10 better than basic _ArrayAdd.  Not prefered for small arrays < 600 items. 2. Requirements
      .NET Framework 1.1 - 4.5 (on this version Microsoft destroy old rules) System Windows 3. Possibilities.
      ;=============================================================================================================== ; UDF Name: List.au3 ; ; Date: 2018-02-17, 10:52 ; Description: Simple udf to create System Collections as ArrayList and make multiple actions on them. ; ; Function(s): _ListCreate -> Creates a new list ; _ListCapacity -> Gets a list size in bytes ; _ListCount -> Gets items count in list ; _ListIsFixedSize -> Get bool if list if fixed size ; _ListIsReadOnly -> Get bool if list is read only ; _ListIsSynchronized -> Get bool if list is synchronized ; _ListGetItem -> Get item on index ; _ListSetItem -> Set item on index ; ; _ListAdd -> Add item at end of list ; _ListClear -> Remove all list items ; _ListClone -> Duplicate list in new var ; _ListContains -> Get bool if item is in list ; _ListGetHashCode -> Get hash code for list ; _ListGetRange -> Get list with items between indexs ; _ListIndexOf -> Get index of item ; _ListInsert -> Insert a new item on index ; _ListInsertRange -> Insert list into list on index ; _ListLastIndexOf -> Get index last of item ; _ListRemove -> Remove first found item ; _ListRemoveAt -> Remove item in index ; _ListRemoveRange -> Remove items between indexs ; _ListReverse -> Reverse all items in list ; _ListSetRange -> Set new value for items in range ; _ListSort -> Sort items in list (speed of reading) ; _ListToString -> Get list object name ; _ListTrimToSize -> Remove unused space in list ; ; Author(s): Ascer ;=============================================================================================================== 4. Downloads
      List.au3 5. Examples
      SpeedTest _ArrayAdd vs ListAdd SpeedTest ArraySearch vs ListIndexOf Basic usage - crating guild with members  
    • ronmage
      By ronmage
      So I have a loop that keeps reading data from an array and searching it for the same value. If the value is no there it does work then adds the value to the array to prevent it from doing the same work.
      If _ArraySearch($ID,$filearray[$i]) = -1 Then Work.... _ArrayAdd($ID,$filearray[$i]) EndIf This is in a for loop hence $i
      So what is happening is the code works great for several hours. After a period of time _ArraySearch($ID,$filearray[$i]) will result in -1 even if $ID = $filearray. So it ready as if there is no data in the array. Anyone have this problem? 
       
      Also I am just running in using F5 not compiling it and running it if that makes a difference.
       
    • FMS
      By FMS
      Hello,
      I'm trying to read a binary file to an array but couln't get it to work.
      Also I coul not find any help in the forum around this subject whish was helpfull.
      Is there any way it could be done?
      I tried a lot of ways but maybe somebody know's the right way?
      #AutoIt3Wrapper_Au3Check_Parameters=-d -w 1 -w 2 -w 3 -w 4 -w 5 -w 6 -w 7 #include <File.au3> #include <Array.au3> #include <AutoItConstants.au3> Local $in=FileOpen("TEST_labels.idx1-ubyte",16) ; 16+0=Read binary Local $data = FileRead($in) Local $FileArray = BinaryToString($data,4) ;~ $FileArray = StringSplit($BinarydData, @CRLF, 1+2) ;~ Local $FileArray = StringRegExp($BinarydData, "[^\r\n]+", 3) FileClose($in) _ArrayDisplay($FileArray,"$FileArray","",32) MsgBox($MB_SYSTEMMODAL, "", "$FileArray = " & $FileArray )  
      TEST_labels.idx1-ubyte
    • ur
      By ur
      I am reading a CSV file which is tab seperated as below.
      Local $aArray = FileReadToArray($file) And now, I am splitting this main array record wise so that Array contains internally another arrow to represent each row.
      For $i = 0 to (UBound($aArray) - 1) ;MsgBox(0,"",$aArray[$i]) $aArray[$i] = StringSplit(StringStripCR($aArray[$i]), Chr(9),2);Chr(9) for tab ;_ArrayDisplay($aArray[$i]) Next Afther that, _ArrayDIsplay is able to see the individual internal arrays.
      _ArrayDisplay($aArray[1]) But If I try to access the individual element of it as below.It is not showing any result.
      MsgBox(0,"",$aArray[1][1]) Any suggestion, below is the sample csv file.
      New Text Document.csv
    • Jibsbrown
      By Jibsbrown
      Need some help understanding why the ConsoleWrite works inside 2nd For loop but not out side. Between Audit Wiki, Help file , Forum searching (lots of code reading), and YouTube ( shout out to TutsTeach), I have not been able to find the reason why. 
      $sIniPath = "installLog.ini" ; - Get section name $iniSctionNames = IniReadSectionNames($sIniPath) ; - Get Keys and Vaules For $a = 1 to UBound($iniSctionNames) - 1 $keys = IniReadSection($sIniPath , $iniSctionNames[$a]) For $b = 1 to UBound($keys) - 1 $oldSysInfo = IniRead($sIniPath , $iniSctionNames[1], $keys[$b][0], "") $PntIPInfo = IniRead($sIniPath , $iniSctionNames[2], $keys[$b][0], "") $NewPCInfor = IniRead($sIniPath , $iniSctionNames[3], $keys[$b][0], "") ;ConsoleWrite($oldSysInfo & @LF) Next ;ConsoleWrite($oldSysInfo & @LF) Next ConsoleWrite($oldSysInfo) My intention is to use the variables later for Listboxes. Any explanation, forum post links or whatever would help. Sorry also very very new to Autoit.
      Also here's the ini file.
      [OldSysInfo] 4=192.168.0.4|DESKTOP-RDIU2SN|R90M05Q8 5=192.168.0.5|SD0123456789101|R9WGP9P 6=192.168.0.6|SD0123456789102|R9WGP9PT 3=192.168.0.3|DESKTOP-3RS4LKL|R9WGP9P 23=192.168.0.23|SD0123456789102|MXL1234P5I [PrinterIp] 50=192.168.0.50 48=192.168.0.48 47=192.168.0.47 [NewSysInfo] newPC = SD0123456789adfs|192.168.0.185|2UA1234FTR Thank you for your time.