czardas Posted August 25, 2017 Posted August 25, 2017 (edited) 19 hours ago, AspirinJunkie said: I'm not sure if this was adressed to me I was not aiming my comment at you or anyone in particular. This is the approach I would take in particular cases (as mentioned). I thought my comments might be potentially valuable to some readers. In fact a combination of both approaches also seems doable. It really does depend on the array discrepancies (which we all seem to agree about). Today I'm busy. I'll try it some time during the week. Edited August 25, 2017 by czardas operator64 ArrayWorkshop
czardas Posted August 27, 2017 Posted August 27, 2017 (edited) As an experiment, I have reinvented the binary search algorithm. I'm sure this method has been tried before and it might turn out to be a fruitless exercise. However - given the idea that many duplicate file names appear in some backup folder along with various differences, using the standard binary search algorithm might not necessarily be the most optimal approach. This code has not been thoroughly tested. CODE DELETED See here: https://www.autoitscript.com/forum/topic/190340-parallel-exponential-search-algorithm/ This is a proof of concept and will need further adaptation to work with all unicode strings. Incremental of powers of two are used for the binary search (not division by two). Numbers are used here for testing the algorithm. It's also a parallel search. I might have overlooked something. Edited September 12, 2017 by czardas operator64 ArrayWorkshop
czardas Posted August 28, 2017 Posted August 28, 2017 (edited) I have just run a couple of precursory tests so far, using the following similar arrays: #include <Array.au3> Local $aOne[50001] For $i = 0 To 50000 $aOne[$i] = Hex($i, 8) Next _ArrayShuffle($aOne) Local $aTwo = $aOne _ArrayReverse($aTwo) ReDim $aOne[35000] ReDim $aTwo[35000] _ArraySort($aOne) _ArraySort($aTwo) _ArrayDisplay($aOne) _ArrayDisplay($aTwo) It appears that binary search is less efficient than a straight forward linear search on these arrays. A parallel linear search might be best when there are more matches. Current code differences produce unfair tests, so I haven't got a straight forward answer yet. I can still imagine how these methods will compare given a variety of starting conditions. Inverted/exponential binary search needs testing separately against the standard binary search (outside of this context). Again varied starting conditions will play an important role. Test results were as follows: Parallel Binary Search: 603.656865104227 Average Binary search: 2276.60675885885 Average Linear scan: 160.069939565546 Edited August 28, 2017 by czardas operator64 ArrayWorkshop
czardas Posted September 12, 2017 Posted September 12, 2017 An improved version of the code I posted earlier can be found here: operator64 ArrayWorkshop
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now