czardas Posted September 12, 2017 Posted September 12, 2017 (edited) Haven't had much time to code recently. However the following thread inspired me. The debate about linear, parallel and binary search methods was rather interesting and, in an attempt to be diplomatic, I decided to combine @jchd's suggestion with @LarsJ's binary search example. I decided that the binary search algorithm required modification to make it more linear. As usual, 'if you invent something, it probably already exists and if it already exists, it exists for a reason'. My first attempt was not all that good. The code worked but was really a mess. I blame peer pressure (to post an example of a parallel search method). I will delete that old code in due course. With a little memory jogging and a glance at the help file, the solution turned out to be quite easy: I just needed a better understanding of Euler. Further modification will be needed to work with more complicated unicode strings. The output could be returned as an array or a delimitered string. I'm not so interested in those details. I'm just going to post the algorithm for now and anyone, who wants to, can modify it to suit their needs. Both arrays must contain at least 1 element. expandcollapse popupLocal $aFoo = [0,1,2,3,4,5,6,7,9,10,11,12,13,14,15,16,19,20,23,24,26,30,35,39,40,41] Local $aBar = [0,1,5,6,7,8,9,10,11,12,13,14,17,18,19,21,24,25,26,27,34,35,38,40] ParallelExponetialSearch($aFoo, $aBar) ; Compares two lists - returning positive matches. Each input array must be unique (individually) and in alphabetical order. Func ParallelExponetialSearch($aFoo, $aBar) Local $sFind, _ $iMin_F = -1, $iMax_F = UBound($aFoo) -1, $Lo_F = $iMin_F, $Hi_F, _ $iMin_B = -1, $iMax_B = UBound($aBar) -1, $Lo_B = $iMin_B, $Hi_B While $iMin_F < $iMax_F And $iMin_B < $iMax_B ; Toggle Arrays - Which array has most untested elements? This is the one we want to search next, ; so we can bypass more comparisons because (in theory) mismatches have a greater chance of being skipped. If $iMax_F - $iMin_F >= $iMax_B - $iMin_B Then ; $aFoo has more (or an equal number of) untested elements $Hi_F = $iMax_F $iMin_B += 1 $sFind = $aBar[$iMin_B] While $Lo_F < $Hi_F ; search $aFoo For $i = 0 To Floor(Log($Hi_F - $Lo_F) / Log(2)) $Lo_F = $iMin_F + 2^$i If $aFoo[$Lo_F] = $sFind Then $iMin_F = $Lo_F ; each match should be added to the output [perhaps an array] ConsoleWrite($sFind & " found at $aFoo[" & $Lo_F & "] = $aBar[" & $iMin_B & "]" & @LF) ExitLoop 2 ElseIf $aFoo[$Lo_F] > $sFind Then $Hi_F = $Lo_F -1 $iMin_F += Floor(2^($i -1)) $Lo_F = $iMin_F ContinueLoop 2 EndIf Next $iMin_F = $Lo_F ; minimum increment is one WEnd Else ; $aBar has more untested elements $Hi_B = $iMax_B $iMin_F += 1 $sFind = $aFoo[$iMin_F] While $Lo_B < $Hi_B ; search $aBar For $i = 0 To Floor(Log($Hi_B - $Lo_B) / Log(2)) $Lo_B = $iMin_B + 2^$i If $aBar[$Lo_B] = $sFind Then $iMin_B = $Lo_B ; each match should be added to the output [perhaps an array] ConsoleWrite($sFind & " found at $aFoo[" & $iMin_F & "] = $aBar[" & $Lo_B & "]" & @LF) ExitLoop 2 ElseIf $aBar[$Lo_B] > $sFind Then $Hi_B = $Lo_B -1 $iMin_B += Floor(2^($i -1)) $Lo_B = $iMin_B ContinueLoop 2 EndIf Next $iMin_B = $Lo_B ; minimum increment is one WEnd EndIf WEnd EndFunc ;==> ParallelExponetialSearch I hope this will be useful to someone. I believe it deserved a thread of its own! Edited September 14, 2017 by czardas rootx 1 operator64 ArrayWorkshop
czardas Posted September 14, 2017 Author Posted September 14, 2017 (edited) Time for a comparison. Searching within $aBar (for each element in $aFoo) using the standard binary search algorithm, is certain to involve many more comparisons when there are numerous positive matches. In the following test, 3/4 of the elements happen to occur in both arrays. The standard binary search method requires about eight times as many comparisons as the parallel exponential search. If these comparisons were more complex - such as StringComare() - then latency will become obvious. expandcollapse popup#include <Array.au3> Global $iComparisons = 0 ConsoleWrite("generating arrays" & @LF) Local $aFoo[5000] For $i = 0 To 4999 $aFoo[$i] = Hex($i, 8) Next _ArrayShuffle($aFoo) Local $aBar = $aFoo _ArrayReverse($aBar) ReDim $aFoo[4000] ReDim $aBar[4000] _ArraySort($aFoo) _ArraySort($aBar) ConsoleWrite("running tests" & @LF) ParallelExponetialSearch($aFoo, $aBar) ConsoleWrite("ParallelExponetialSearch ==> $iComparisons = " & $iComparisons & @LF) $iComparisons = 0 StandardBinarySearch($aFoo, $aBar) ConsoleWrite("StandardBinarySearch ==> $iComparisons = " & $iComparisons & @LF) Func ParallelExponetialSearch($aFoo, $aBar) Local $sFind, _ $iMin_F = -1, $iMax_F = UBound($aFoo) -1, $Lo_F = $iMin_F, $Hi_F, _ $iMin_B = -1, $iMax_B = UBound($aBar) -1, $Lo_B = $iMin_B, $Hi_B While $iMin_F < $iMax_F And $iMin_B < $iMax_B ; Toggle Arrays - Which array has most untested elements? This is the one we want to search next, ; so we can bypass more comparisons because (in theory) mismatches have a greater chance of being skipped. If $iMax_F - $iMin_F >= $iMax_B - $iMin_B Then ; $aFoo has more (or an equal number of) untested elements $Hi_F = $iMax_F $iMin_B += 1 $sFind = $aBar[$iMin_B] While $Lo_F < $Hi_F ; search $aFoo For $i = 0 To Floor(Log($Hi_F - $Lo_F) / Log(2)) $Lo_F = $iMin_F + 2^$i $iComparisons += 1 If $aFoo[$Lo_F] = $sFind Then $iMin_F = $Lo_F ; each match should be added to the output [perhaps an array] ;ConsoleWrite($sFind & " found at $aFoo[" & $Lo_F & "] = $aBar[" & $iMin_B & "]" & @LF) ExitLoop 2 ElseIf $aFoo[$Lo_F] > $sFind Then $Hi_F = $Lo_F -1 $iMin_F += Floor(2^($i -1)) $Lo_F = $iMin_F ContinueLoop 2 EndIf Next $iMin_F = $Lo_F ; minimum increment is one WEnd Else ; $aBar has more untested elements $Hi_B = $iMax_B $iMin_F += 1 $sFind = $aFoo[$iMin_F] While $Lo_B < $Hi_B ; search $aBar For $i = 0 To Floor(Log($Hi_B - $Lo_B) / Log(2)) $Lo_B = $iMin_B + 2^$i $iComparisons += 1 If $aBar[$Lo_B] = $sFind Then $iMin_B = $Lo_B ; each match should be added to the output [perhaps an array] ;ConsoleWrite($sFind & " found at $aFoo[" & $iMin_F & "] = $aBar[" & $Lo_B & "]" & @LF) ExitLoop 2 ElseIf $aBar[$Lo_B] > $sFind Then $Hi_B = $Lo_B -1 $iMin_B += Floor(2^($i -1)) $Lo_B = $iMin_B ContinueLoop 2 EndIf Next $iMin_B = $Lo_B ; minimum increment is one WEnd EndIf WEnd EndFunc ;==> ParallelExponetialSearch Func StandardBinarySearch($aFoo, $aBar) Local $Lo = 0, $Hi, $iMax_F = UBound($aFoo) -1, $sFind, $iMax_B = UBound($aBar) -1, $iMid For $i = 0 To $iMax_F $Hi = $iMax_B $sFind = $aFoo[$i] $iMid = Int(($Hi + $Lo) / 2) $iComparisons += 1 If $aBar[$Lo] > $sFind Or $aBar[$Hi] < $sFind Then ContinueLoop ; Search While $Lo <= $iMid And $sFind <> $aBar[$iMid] $iComparisons += 1 If $sFind < $aBar[$iMid] Then $Hi = $iMid - 1 Else $Lo = $iMid + 1 EndIf $iMid = Int(($Hi + $Lo) / 2) WEnd If $Lo > $Hi Then ContinueLoop ;ConsoleWrite($sFind & " found at $aFoo[" & $i & "] = $aBar[" & $iMid & "]" & @LF) Next EndFunc The ParallelExponentialSearch() algorithm is frequently going to be a super-efficient method (regardless of language). Results (may vary slightly on subsequent runs): ParallelExponetialSearch ==> $iComparisons = 5109 StandardBinarySearch ==> $iComparisons = 39397 This test is a little rough and ready, but the results are as I would expect. A more accurate test would not make much difference to these results. Edited September 14, 2017 by czardas operator64 ArrayWorkshop
czardas Posted September 14, 2017 Author Posted September 14, 2017 (edited) Unfortunately there was a bug in the code (implementation), which I believe I have now fixed: using a While loop (instead of Do Until). Both the above examples have been modified. Please report if you encounter any problems running this code - thanks. Edited September 14, 2017 by czardas operator64 ArrayWorkshop
RTFC Posted September 16, 2017 Posted September 16, 2017 On 9/12/2017 at 10:25 PM, czardas said: I hope this will be useful to someone. Definitely, many thanks for this. On 9/12/2017 at 10:25 PM, czardas said: I just needed a better understanding of Euler. We all do. czardas 1 My Contributions and Wrappers Spoiler BitMaskSudokuSolver BuildPartitionTable CodeCrypter CodeScanner DigitalDisplay Eigen4AutoIt FAT Suite HighMem MetaCodeFileLibrary OSgrid Pool RdRand SecondDesktop SimulatedAnnealing Xbase I/O
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now