
Parallel Exponential Search Algorithm

czardas

I haven't had much time to code recently. However, the following thread inspired me.

The debate about linear, parallel and binary search methods was rather interesting and, in an attempt to be diplomatic, I decided to combine @jchd's suggestion with @LarsJ's binary search example. To do so, the binary search algorithm needed modifying to make it more linear. As usual, 'if you invent something, it probably already exists and if it already exists, it exists for a reason'. My first attempt was not all that good: the code worked but was really a mess. I blame peer pressure (to post an example of a parallel search method). :D I will delete that old code in due course.

With a little memory jogging and a glance at the help file, the solution turned out to be quite easy: I just needed a better understanding of Euler. Further modification will be needed to work with more complicated Unicode strings. The output could be returned as an array or a delimited string, but I'm not so interested in those details. I'm just going to post the algorithm for now, and anyone who wants to can modify it to suit their needs. Both arrays must contain at least one element.

Local $aFoo = [0,1,2,3,4,5,6,7,9,10,11,12,13,14,15,16,19,20,23,24,26,30,35,39,40,41]
Local $aBar = [0,1,5,6,7,8,9,10,11,12,13,14,17,18,19,21,24,25,26,27,34,35,38,40]

ParallelExponetialSearch($aFoo, $aBar)

; Compares two lists and reports the elements found in both. Each input array must contain no duplicates and must be sorted in ascending order.

Func ParallelExponetialSearch($aFoo, $aBar)
    Local $sFind, _
    $iMin_F = -1, $iMax_F = UBound($aFoo) -1, $Lo_F = $iMin_F, $Hi_F, _
    $iMin_B = -1, $iMax_B = UBound($aBar) -1, $Lo_B = $iMin_B, $Hi_B

    While $iMin_F < $iMax_F And $iMin_B < $iMax_B
        ; Toggle Arrays - Which array has most untested elements? This is the one we want to search next,
        ; so we can bypass more comparisons because (in theory) mismatches have a greater chance of being skipped.

        If $iMax_F - $iMin_F >= $iMax_B - $iMin_B Then ; $aFoo has more (or an equal number of) untested elements

            $Hi_F = $iMax_F
            $iMin_B += 1
            $sFind = $aBar[$iMin_B]

            While $Lo_F < $Hi_F ; search $aFoo
                For $i = 0 To Floor(Log($Hi_F - $Lo_F) / Log(2))
                    $Lo_F = $iMin_F + 2^$i

                    If $aFoo[$Lo_F] = $sFind Then
                        $iMin_F = $Lo_F

                        ; each match should be added to the output [perhaps an array]
                        ConsoleWrite($sFind & " found at $aFoo[" & $Lo_F & "] = $aBar[" & $iMin_B & "]" & @LF)
                        ExitLoop 2

                    ElseIf $aFoo[$Lo_F] > $sFind Then
                        $Hi_F = $Lo_F -1
                        $iMin_F += Floor(2^($i -1))
                        $Lo_F = $iMin_F
                        ContinueLoop 2
                    EndIf
                Next
                $iMin_F = $Lo_F ; minimum increment is one
            WEnd

        Else ; $aBar has more untested elements

            $Hi_B = $iMax_B
            $iMin_F += 1
            $sFind = $aFoo[$iMin_F]

            While $Lo_B < $Hi_B ; search $aBar
                For $i = 0 To Floor(Log($Hi_B - $Lo_B) / Log(2))
                    $Lo_B = $iMin_B + 2^$i

                    If $aBar[$Lo_B] = $sFind Then
                        $iMin_B = $Lo_B

                        ; each match should be added to the output [perhaps an array]
                        ConsoleWrite($sFind & " found at $aFoo[" & $iMin_F & "] = $aBar[" & $Lo_B & "]" & @LF)
                        ExitLoop 2

                    ElseIf $aBar[$Lo_B] > $sFind Then
                        $Hi_B = $Lo_B -1
                        $iMin_B += Floor(2^($i -1))
                        $Lo_B = $iMin_B
                        ContinueLoop 2
                    EndIf
                Next
                $iMin_B = $Lo_B ; minimum increment is one
            WEnd
        EndIf

    WEnd
EndFunc ;==> ParallelExponetialSearch
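
The comments in the function above note that each match should be added to the output. One possible way to do that, shown here only as a minimal sketch, is to append each match to a delimited string and split it once at the end. The names $sMatches and $aMatches and the "|" delimiter are assumptions for illustration, not part of the original code, and the approach only works if no value contains the delimiter.

```autoit
Local $sMatches = "" ; accumulator (hypothetical name); assumes no value contains "|"

; Inside the search loops, in place of ConsoleWrite(), one could write:
;     $sMatches &= $sFind & "|"
; That step is simulated here with three sample values:
For $i = 5 To 7
    $sMatches &= $i & "|"
Next

; Drop the trailing delimiter, then split into a zero-based array
Local $aMatches = StringSplit(StringTrimRight($sMatches, 1), "|", 2) ; 2 = $STR_NOCOUNT
ConsoleWrite(UBound($aMatches) & " matches" & @LF)
```

If there are no matches at all, $sMatches stays empty and should be checked before calling StringSplit(), which would otherwise return an array holding a single empty element.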

I hope this will be useful to someone. I believe it deserved a thread of its own! :)
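
For readers who have not met the idea of probing at power-of-two offsets, here is a minimal sketch of a textbook exponential (galloping) search on a single sorted array. It is not the routine posted above: the function name _ExpSearchSketch and its parameters are hypothetical, and it finishes with an ordinary binary search inside the bracket it finds.

```autoit
; Hypothetical helper, not part of the original post.
; Returns the index of $vFind in the sorted array $aList at or after $iStart, or -1 if absent.
Func _ExpSearchSketch($aList, $vFind, $iStart = 0)
    Local $iLast = UBound($aList) - 1
    If $iStart > $iLast Then Return -1

    ; Gallop: move the probe forward by 1, 2, 4, 8, ... until it reaches a value >= $vFind
    Local $iLo = $iStart, $iHi = $iStart, $iStep = 1
    While $iHi < $iLast And $aList[$iHi] < $vFind
        $iLo = $iHi + 1
        $iHi += $iStep
        If $iHi > $iLast Then $iHi = $iLast
        $iStep *= 2
    WEnd

    ; Ordinary binary search inside the bracket [$iLo, $iHi]
    Local $iMid
    While $iLo <= $iHi
        $iMid = Int(($iLo + $iHi) / 2)
        If $aList[$iMid] = $vFind Then Return $iMid
        If $aList[$iMid] < $vFind Then
            $iLo = $iMid + 1
        Else
            $iHi = $iMid - 1
        EndIf
    WEnd
    Return -1
EndFunc

Local $aSample = [0, 1, 2, 3, 4, 5, 6, 7, 9, 10]
ConsoleWrite(_ExpSearchSketch($aSample, 9) & @LF)
```

The appeal of galloping is that a target close to the start position is found after only a few probes, while a distant one still costs a number of probes that grows with the logarithm of the distance.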

czardas

Time for a comparison. Searching within $aBar (for each element in $aFoo) using the standard binary search algorithm is certain to involve many more comparisons when there are numerous positive matches. In the following test, 3/4 of the elements happen to occur in both arrays. The standard binary search method requires about eight times as many comparisons as the parallel exponential search. If these comparisons were more complex, such as calls to StringCompare(), the extra latency would become obvious.

#include <Array.au3>

Global $iComparisons = 0

ConsoleWrite("generating arrays" & @LF)

Local $aFoo[5000]
For $i = 0 To 4999
    $aFoo[$i] = Hex($i, 8)
Next
_ArrayShuffle($aFoo)

Local $aBar = $aFoo
_ArrayReverse($aBar)

ReDim $aFoo[4000]
ReDim $aBar[4000]

_ArraySort($aFoo)
_ArraySort($aBar)

ConsoleWrite("running tests" & @LF)

ParallelExponetialSearch($aFoo, $aBar)
ConsoleWrite("ParallelExponetialSearch ==> $iComparisons = " & $iComparisons & @LF)

$iComparisons = 0
StandardBinarySearch($aFoo, $aBar)
ConsoleWrite("StandardBinarySearch ==> $iComparisons = " & $iComparisons & @LF)

Func ParallelExponetialSearch($aFoo, $aBar)
    Local $sFind, _
    $iMin_F = -1, $iMax_F = UBound($aFoo) -1, $Lo_F = $iMin_F, $Hi_F, _
    $iMin_B = -1, $iMax_B = UBound($aBar) -1, $Lo_B = $iMin_B, $Hi_B

    While $iMin_F < $iMax_F And $iMin_B < $iMax_B
        ; Toggle Arrays - Which array has most untested elements? This is the one we want to search next,
        ; so we can bypass more comparisons because (in theory) mismatches have a greater chance of being skipped.

        If $iMax_F - $iMin_F >= $iMax_B - $iMin_B Then ; $aFoo has more (or an equal number of) untested elements

            $Hi_F = $iMax_F
            $iMin_B += 1
            $sFind = $aBar[$iMin_B]

            While $Lo_F < $Hi_F ; search $aFoo
                For $i = 0 To Floor(Log($Hi_F - $Lo_F) / Log(2))
                    $Lo_F = $iMin_F + 2^$i

                    $iComparisons += 1

                    If $aFoo[$Lo_F] = $sFind Then
                        $iMin_F = $Lo_F

                        ; each match should be added to the output [perhaps an array]
                        ;ConsoleWrite($sFind & " found at $aFoo[" & $Lo_F & "] = $aBar[" & $iMin_B & "]" & @LF)
                        ExitLoop 2

                    ElseIf $aFoo[$Lo_F] > $sFind Then
                        $Hi_F = $Lo_F -1
                        $iMin_F += Floor(2^($i -1))
                        $Lo_F = $iMin_F
                        ContinueLoop 2
                    EndIf
                Next
                $iMin_F = $Lo_F ; minimum increment is one
            WEnd

        Else ; $aBar has more untested elements

            $Hi_B = $iMax_B
            $iMin_F += 1
            $sFind = $aFoo[$iMin_F]

            While $Lo_B < $Hi_B ; search $aBar
                For $i = 0 To Floor(Log($Hi_B - $Lo_B) / Log(2))
                    $Lo_B = $iMin_B + 2^$i

                    $iComparisons += 1

                    If $aBar[$Lo_B] = $sFind Then
                        $iMin_B = $Lo_B

                        ; each match should be added to the output [perhaps an array]
                        ;ConsoleWrite($sFind & " found at $aFoo[" & $iMin_F & "] = $aBar[" & $Lo_B & "]" & @LF)
                        ExitLoop 2

                    ElseIf $aBar[$Lo_B] > $sFind Then
                        $Hi_B = $Lo_B -1
                        $iMin_B += Floor(2^($i -1))
                        $Lo_B = $iMin_B
                        ContinueLoop 2
                    EndIf
                Next
                $iMin_B = $Lo_B ; minimum increment is one
            WEnd
        EndIf

    WEnd
EndFunc ;==> ParallelExponetialSearch


Func StandardBinarySearch($aFoo, $aBar)
    Local $Lo = 0, $Hi, $iMax_F = UBound($aFoo) -1, $sFind, $iMax_B = UBound($aBar) -1, $iMid
    For $i = 0 To $iMax_F
        $Hi = $iMax_B
        $sFind = $aFoo[$i]

        $iMid = Int(($Hi + $Lo) / 2)

        $iComparisons += 1

        If $aBar[$Lo] > $sFind Or $aBar[$Hi] < $sFind Then ContinueLoop
        ; Search
        While $Lo <= $iMid And $sFind <> $aBar[$iMid]

            $iComparisons += 1

            If $sFind < $aBar[$iMid] Then
                $Hi = $iMid - 1
            Else
                $Lo = $iMid + 1
            EndIf
            $iMid = Int(($Hi + $Lo) / 2)
        WEnd
        If $Lo > $Hi Then ContinueLoop

        ;ConsoleWrite($sFind & " found at $aFoo[" & $i & "] = $aBar[" & $iMid & "]" & @LF)
    Next
EndFunc

The parallel exponential search algorithm is likely to be a very efficient method (regardless of language) whenever two sorted lists share many elements, as they do in this test.

Results (may vary slightly on subsequent runs):

ParallelExponetialSearch ==> $iComparisons = 5109
StandardBinarySearch ==> $iComparisons = 39397

This test is a little rough and ready, but the results are as I would expect. A more accurate test would not make much difference to these results.
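
Since the debate that prompted this thread also covered linear methods, a plain linear parallel walk could be added to the test as a third candidate. The following is only a sketch under stated assumptions: the function name _LinearParallelSketch and the ByRef counter $iCount are hypothetical, both arrays are assumed to be sorted and free of duplicates, and no results are claimed for it here.

```autoit
; Hypothetical baseline, not part of the original post.
; Walks both sorted arrays in step and returns the number of matching elements.
; $iCount receives the number of loop iterations (one three-way comparison each).
Func _LinearParallelSketch($aFoo, $aBar, ByRef $iCount)
    Local $iF = 0, $iB = 0, $iMatches = 0
    $iCount = 0
    While $iF < UBound($aFoo) And $iB < UBound($aBar)
        $iCount += 1
        If $aFoo[$iF] = $aBar[$iB] Then
            ; match: advance both positions
            $iMatches += 1
            $iF += 1
            $iB += 1
        ElseIf $aFoo[$iF] < $aBar[$iB] Then
            $iF += 1 ; the $aFoo element is too small to match anything further on in $aBar
        Else
            $iB += 1 ; the $aBar element is too small to match anything further on in $aFoo
        EndIf
    WEnd
    Return $iMatches
EndFunc

Local $aOdd = [1, 3, 5, 7], $aMix = [3, 4, 7, 9], $iSteps
ConsoleWrite(_LinearParallelSketch($aOdd, $aMix, $iSteps) & " matches in " & $iSteps & " steps" & @LF)
```

This walk never needs more than UBound($aFoo) + UBound($aBar) - 1 iterations, which gives a fixed upper bound against which the two counts above can be compared.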

czardas

Unfortunately there was a bug in the implementation, which I believe I have now fixed by using a While loop instead of Do...Until. Both of the above examples have been modified. Please report any problems you encounter when running this code. Thanks.

RTFC
On 9/12/2017 at 10:25 PM, czardas said:

I hope this will be useful to someone.

Definitely, many thanks for this. :D

On 9/12/2017 at 10:25 PM, czardas said:

I just needed a better understanding of Euler.

We all do. ;)
