sshrum Posted March 5, 2015

Are there any functions for analyzing 2 arrays, e.g. to find similar and/or dissimilar entries and output them to a 3rd array, short of doing it myself?

I'm creating a DIR-cmd array (array 1) that will be compared to a database-created array (array 2). I'm trying to create a new 3rd array of any entries that are in the DIR-cmd array but not in the database (new files).

Subsequently, I want to do the same sort of thing but create another array listing files that are in the database array but not in the DIR array (deleted files).

TIA

Sean Shrum :: http://www.shrum.net
All my published AU3-based apps and utilities
'Make it idiot-proof, and someone will make a better idiot'
mpower Posted March 5, 2015 (edited)

Not sure if there is a simpler way than just doing a loop which searches every value in the generated array against your control array. For example:

    #include <Array.au3>

    ; StringSplit stores the element count in [0], so the data starts at index 1
    Local $control_Array = StringSplit("apple,bread,dog,cat,engine,frog,giant,horse,indigo", ",")
    ; _ArrayBinarySearch needs sorted input; sort from index 1 so the count stays in [0]
    _ArraySort($control_Array, 0, 1)
    Local $generated_Array = StringSplit("zed,bread,yale,cat,engine,kite,giant,lion,indigo", ",")
    Dim $final_Array[1]
    Local $j = 0, $i, $r
    For $i = 1 To UBound($generated_Array) - 1
        $r = _ArrayBinarySearch($control_Array, $generated_Array[$i], 1)
        If $r <> -1 Then
            ; entry exists in both arrays: grow the result by one and store it
            ReDim $final_Array[UBound($final_Array) + 1]
            $final_Array[$j] = $generated_Array[$i]
            $j += 1
        EndIf
    Next
    ;clean up (remove the last empty entry)
    _ArrayDelete($final_Array, UBound($final_Array) - 1)
    _ArrayDisplay($final_Array, '$final_Array')

Edited March 5, 2015 by mpower
Moderators Melba23 Posted March 5, 2015

sshrum,

A quick search brought up this thread.

M23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

My UDFs:
ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ---------- Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area
mpower Posted March 5, 2015

Actually, here is a comparison of a few methods, and Scripting Dictionary seems to be extremely fast compared to all the other methods. This method is new to me, so I am not sure of any limitations.

    #include <Array.au3>

    ; Flag 2 = return a 0-based array without the count in [0]
    Local $control_Array = StringSplit("5,7,8,12,18,19,20,24,25,36,38,40,41,42,46,50,54,61,62,64,65,66,67,68,69,70,71," & _
            "72,76,77,81,84,86,88,89,95,96,99,101,102,103,105,106,113,117,122,125,130,132,137," & _
            "143,145,146,147,156,157,160,163,165,168,172,173,176,178,180,181,193,198,204,205," & _
            "211,218,220,222,225,231,234,235,237,240,241,243,244,249,251,256,257,260,265,267," & _
            "270,271,272,274,275,276,279,280,284,290,291,295,297,299,302,308,309,310,311,324," & _
            "326,327,329,339,345,346,350,356,357,358,360,368,369,370,371,375,376,377,381,390," & _
            "392,393,398,402,403,404,406,407,408,420,421,423,427,430,436,437,446,453,458,460," & _
            "461,467,473,474,475,478,481,484,496,500,502,504,505,507,508,511,514,517,518,520," & _
            "521,529,532,535,541,545,546,547,549,551,552,553,556,557,559,560,573,577,581,582," & _
            "585,587,589,594,604,607,609,615,626,628,634,649,650,655,662,667,668,670,673,675," & _
            "681,683,685,694,696,700,704,712,714,718,720,726,729,732,734,736,737,744,751,760," & _
            "761,763,769,770,776,778,783,795,800,805,806,809,812,814,815,820,822,824,826,827," & _
            "832,841,848,849,850,855,865,868,874,878,879,881,883,885,886,887,888,890,893,894," & _
            "899,901,910,916,917,924,929,934,935,939,942,943,946,947,948,951,953,956,958,959," & _
            "960,966,971,975,980,986,989,990,994,996", ",", 2)
    _ArraySort($control_Array)
    Local $generated_Array = StringSplit("1,2,3,12,13,15,17,19,23,26,32,35,42,43,47,49,52,56,58,60,67,68,69,72,73,76,78," & _
            "84,85,86,88,90,91,98,103,104,105,108,113,116,118,119,120,122,123,124,125,129,130," & _
            "134,135,138,142,144,145,163,168,173,174,175,180,181,183,188,189,192,193,194,199," & _
            "200,204,208,217,220,224,227,228,234,238,241,251,252,254,260,274,277,283,287,290," & _
            "292,297,302,305,313,316,322,323,324,328,333,334,337,341,342,344,348,355,357,360," & _
            "363,367,371,373,374,381,382,384,385,389,392,407,408,409,411,413,418,422,424,425," & _
            "430,431,434,444,449,450,451,453,454,457,466,469,475,477,479,480,487,491,495,499," & _
            "500,503,508,510,511,512,513,517,544,546,549,556,560,567,569,570,571,572,576,578," & _
            "580,585,587,595,599,600,601,608,615,618,619,624,627,629,633,636,637,639,642,643," & _
            "646,650,651,659,662,665,670,673,686,689,690,692,693,697,702,704,713,718,720,721," & _
            "724,726,727,731,733,736,739,742,743,746,748,749,751,753,754,759,762,767,770,772," & _
            "773,777,781,783,788,790,794,798,800,812,815,820,824,825,826,827,829,834,835,836," & _
            "837,840,841,842,843,844,851,857,858,866,871,874,880,882,888,889,898,899,901,906," & _
            "907,909,913,914,920,921,924,925,926,927,931,936,937,942,943,945,947,949,952,954," & _
            "956,958,959,960,962,965,975,984,986,994,996", ",", 2)

    ; Method 1: _ArrayBinarySearch against the sorted control array
    ; (the result keeps one empty trailing element because it grows by one per match)
    Dim $final_Array[1]
    Local $j = 0, $i, $r
    $timer = TimerInit()
    For $i = 0 To UBound($generated_Array) - 1
        $r = _ArrayBinarySearch($control_Array, $generated_Array[$i])
        If $r <> -1 Then
            ReDim $final_Array[UBound($final_Array) + 1]
            $final_Array[$j] = $generated_Array[$i]
            $j += 1
        EndIf
    Next
    $tdiff_asb = TimerDiff($timer)
    _ArrayDisplay($final_Array, '$final_Array _ArrayBinarySearch method')

    ; Method 2: linear _ArraySearch in both directions (each match is added twice)
    Local $aBoth[0]
    $timer2 = TimerInit()
    For $i = UBound($control_Array) - 1 To 0 Step -1
        $iMatch = _ArraySearch($generated_Array, $control_Array[$i])
        If $iMatch <> -1 Then _ArrayAdd($aBoth, $control_Array[$i])
    Next
    For $i = UBound($generated_Array) - 1 To 0 Step -1
        $iMatch = _ArraySearch($control_Array, $generated_Array[$i])
        If $iMatch <> -1 Then _ArrayAdd($aBoth, $generated_Array[$i])
    Next
    $tdiff_as = TimerDiff($timer2)
    ; show $aBoth here (the original displayed $final_Array by mistake)
    _ArrayDisplay($aBoth, '$aBoth _ArraySearch method')

    ; Method 3: _Separate
    $timer3 = TimerInit()
    _Separate($control_Array, $generated_Array)
    $tdiff_sep = TimerDiff($timer3)

    Func _Separate(ByRef $in0, ByRef $in1)
        $in0 = _ArrayUnique($in0, 0, Default, Default, 0)
        $in1 = _ArrayUnique($in1, 0, Default, Default, 0)
        Local $z[2] = [UBound($in0), UBound($in1)], $low = 1 * ($z[0] > $z[1]), $aTemp[$z[Not $low]][3], $aOut = $aTemp, $aNdx[3]
        For $i = 0 To $z[Not $low] - 1
            If $i < $z[0] Then $aTemp[$i][0] = $in0[$i]
            If $i < $z[1] Then $aTemp[$i][1] = $in1[$i]
        Next
        For $i = 0 To $z[$low] - 1
            $x = _ArrayFindAll($aTemp, $aTemp[$i][$low], 0, 0, 1, 0, Not $low)
            If Not @error Then ; both
                For $j = 0 To UBound($x) - 1
                    $aTemp[$x[$j]][2] = 1
                Next
                $aOut[$aNdx[2]][2] = $aTemp[$i][$low]
                $aNdx[2] += 1
            Else ; only in $low
                $aOut[$aNdx[$low]][$low] = $aTemp[$i][$low]
                $aNdx[$low] += 1
            EndIf
        Next
        For $i = 0 To $z[Not $low] - 1
            If $aTemp[$i][2] <> 1 Then
                $aOut[$aNdx[Not $low]][Not $low] = $aTemp[$i][Not $low]
                $aNdx[Not $low] += 1
            EndIf
        Next
        ReDim $aOut[_ArrayMax($aNdx)][3]
        Return $aOut
    EndFunc   ;==>_Separate

    ; Method 4: Scripting.Dictionary
    ; Reading .Item() for a key that does not exist yet adds that key to the dictionary
    $timer4 = TimerInit()
    $sda = ObjCreate("Scripting.Dictionary")
    $sdb = ObjCreate("Scripting.Dictionary")
    $sdc = ObjCreate("Scripting.Dictionary")
    For $i In $control_Array
        $sda.Item($i)
    Next
    For $i In $generated_Array
        $sdb.Item($i)
    Next
    For $i In $control_Array
        If $sdb.Exists($i) Then $sdc.Item($i)
    Next
    $asd3 = $sdc.Keys()
    $tdiff_scr = TimerDiff($timer4)
    _ArrayDisplay($asd3, '$asd3 Scripting Dictionary method')

    ConsoleWrite('_ArrayBinarySearch method took ' & Round($tdiff_asb, 2) & ' ms' & @CRLF)
    ConsoleWrite('_ArraySearch method took ' & Round($tdiff_as, 2) & ' ms' & @CRLF)
    ConsoleWrite('_Separate method took ' & Round($tdiff_sep, 2) & ' ms' & @CRLF)
    ConsoleWrite('Scripting Dictionary method took ' & Round($tdiff_scr, 2) & ' ms' & @CRLF)
sshrum Posted March 5, 2015

The thing to keep in mind is that I'm dealing with 2 arrays with over 70,000 records each. Most of the search options presented, while effective on small sets of data, tend to choke on this many records. Even _ArraySearch, or _ArrayBinarySearch after an _ArraySort, still proves too time-consuming (let's just say I haven't sat around long enough for it to complete before breaking it).
Moderators Melba23 Posted March 5, 2015

sshrum,

If your data sets are that size, it sounds as if you need to think about a database solution, especially as one of your sets comes from a database already.

M23
sshrum Posted March 5, 2015

For now I'm running this. It's sluggish at first but gets faster as it goes, because I'm removing entries that match, so the search deals with less and less over time....

    #include <Array.au3>
    #include <File.au3>

    ; Assumes $aFiles and $aDatabase are sorted arrays with the element count in [0],
    ; so the data rows are 1..$aFiles[0] (the original loop ran from $aFiles[0]-1 down to 0,
    ; which skipped the last row and tested the count element instead)
    For $i = $aFiles[0] To 1 Step -1
        $iMatch = _ArrayBinarySearch($aDatabase, $aFiles[$i], 1)
        If $iMatch <> -1 Then
            ; present in both arrays: drop it from both
            _ArrayDelete($aDatabase, $iMatch)
            _ArrayDelete($aFiles, $i)
            ConsoleWrite("=")
        Else
            ConsoleWrite("/")
        EndIf
    Next
    ; what is left in $aDatabase was deleted from disk; what is left in $aFiles is new
    _FileWriteFromArray($sPlayer & "\deleted.txt", $aDatabase, 1)
    _FileWriteFromArray($sPlayer & "\new.txt", $aFiles, 1)

Not sure if it's 100% but will see after I wake up tomorrow...hopefully I'll have 2 files with the results I want.
sshrum Posted March 5, 2015 (edited)

The database route is still an option...UNION and all, but my SQL-foo hasn't been used in a while. :-P

...actually, if someone has a code snippet on doing the union comparisons, that would be awesome. Each array is just 1 field with the full pathname.

Edited March 5, 2015 by sshrum
mpower Posted March 5, 2015 (edited)

I just tried this:

    #include <Array.au3>
    #include <File.au3>

    Global $control_Array, $generated_Array
    _FileReadToArray(@ScriptDir & '\control_array.txt', $control_Array)
    _ArraySort($control_Array)
    _FileReadToArray(@ScriptDir & '\generated_array.txt', $generated_Array)
    _ArraySort($generated_Array)
    ConsoleWrite('$control_Array has ' & UBound($control_Array) - 1 & ' items.' & @CRLF)
    ConsoleWrite('$generated_Array has ' & UBound($generated_Array) - 1 & ' items.' & @CRLF)

    Dim $final_Array[1]
    Local $j = 0, $i, $r
    $timer = TimerInit()
    For $i = 0 To UBound($generated_Array) - 1
        $r = _ArrayBinarySearch($control_Array, $generated_Array[$i])
        If $r <> -1 Then
            ReDim $final_Array[UBound($final_Array) + 1]
            $final_Array[$j] = $generated_Array[$i]
            $j += 1
        EndIf
    Next
    _ArrayDelete($final_Array, UBound($final_Array) - 1)
    $final_Array = _ArrayUnique($final_Array)
    _ArraySort($final_Array)
    $tdiff_asb = TimerDiff($timer)

    $timer2 = TimerInit()
    $sda = ObjCreate("Scripting.Dictionary")
    $sdb = ObjCreate("Scripting.Dictionary")
    $sdc = ObjCreate("Scripting.Dictionary")
    For $i In $control_Array
        $sda.Item($i)
    Next
    For $i In $generated_Array
        $sdb.Item($i)
    Next
    For $i In $control_Array
        If $sdb.Exists($i) Then $sdc.Item($i)
    Next
    $asd3 = $sdc.Keys()
    $asd3 = _ArrayUnique($asd3)
    $tdiff_scr = TimerDiff($timer2)

    ConsoleWrite('_ArrayBinarySearch method took ' & Round($tdiff_asb / 1000, 2) & ' seconds. Matches found: ' & UBound($final_Array) - 1 & @CRLF)
    ConsoleWrite('Scripting Dictionary method took ' & Round($tdiff_scr / 1000, 2) & ' seconds. Matches found: ' & UBound($asd3) - 1 & @CRLF)

My results were:

    $control_Array has 90000 items.
    $generated_Array has 77508 items.
    _ArrayBinarySearch method took 54.58 seconds. Matches found: 17008
    Scripting Dictionary method took 1.78 seconds. Matches found: 17008

Attachments: generated_array.txt, control_array.txt

Edited March 5, 2015 by mpower
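The dictionary timings above are for the entries the two arrays have in common, while the original question asks for the entries that are in one array but not in the other. A minimal sketch of that variant follows, assuming 0-based arrays of path strings. The helper name `_ArrayDifference` and the sample file names are made up for illustration and are not part of Array.au3; setting `CompareMode` to text compare is also an assumption (Windows paths are normally case-insensitive) and can be dropped.

```autoit
#include <Array.au3>

; Hypothetical sample data; in the thread these would be the DIR listing and the database rows.
Local $aFiles = StringSplit("a.mp3,b.mp3,c.mp3,d.mp3", ",", 2)   ; flag 2 = no count in [0]
Local $aDatabase = StringSplit("b.mp3,c.mp3,e.mp3", ",", 2)

Local $aNew = _ArrayDifference($aFiles, $aDatabase)       ; on disk, not in the database
Local $aDeleted = _ArrayDifference($aDatabase, $aFiles)   ; in the database, not on disk

_ArrayDisplay($aNew, "New files")
_ArrayDisplay($aDeleted, "Deleted files")

; Returns the entries of $aLeft that do not occur in $aRight (duplicates collapsed).
; _ArrayDifference is a hypothetical helper name, not a standard UDF.
Func _ArrayDifference(ByRef $aLeft, ByRef $aRight)
    Local $oSeen = ObjCreate("Scripting.Dictionary")
    Local $oOnlyLeft = ObjCreate("Scripting.Dictionary")
    If Not IsObj($oSeen) Or Not IsObj($oOnlyLeft) Then Return SetError(1, 0, 0)

    ; 1 = text compare (case-insensitive keys); must be set while the dictionary is empty
    $oSeen.CompareMode = 1
    $oOnlyLeft.CompareMode = 1

    ; Index everything on the right-hand side once
    For $sItem In $aRight
        $oSeen.Item($sItem) = True
    Next

    ; Keep the left-hand entries that the index does not know
    For $sItem In $aLeft
        If Not $oSeen.Exists($sItem) Then $oOnlyLeft.Item($sItem) = True
    Next

    Return $oOnlyLeft.Keys()
EndFunc   ;==>_ArrayDifference
```

With the sample data, a.mp3 and d.mp3 come back as new and e.mp3 as deleted. Each array is walked once per call, so the work grows roughly linearly with the number of entries rather than with the product of the two sizes.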
jchd Posted March 5, 2015

You'd do better to use the power of your database engine to perform the comparison instead of relying on pedestrian, slower application code. Here are some topics which you could use to get started:

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)
sshrum Posted March 7, 2015 (Solution)

Found it was quick enough for my use to just create a 3rd array and load it with the index values of the entries in the files array that failed _ArrayBinarySearch on the database array. I guess you'd call that a key-reference array. Got the idea from the code snippets above. Thx.

Searches against my two 70,000+ entry arrays now take ~6 seconds.
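The accepted approach is described only in prose, so here is a rough sketch of what such a key-reference array might look like. It is not sshrum's actual code: the sample data and the name `$aNewIdx` are made up, and both arrays are assumed to carry their element count in [0].

```autoit
#include <Array.au3>

; Hypothetical sample data with the element count in [0], as StringSplit returns by default.
Local $aFiles = StringSplit("a.mp3,b.mp3,c.mp3,d.mp3", ",")
Local $aDatabase = StringSplit("b.mp3,c.mp3,e.mp3", ",")

; _ArrayBinarySearch needs sorted input; sort only the data rows (index 1 onwards).
_ArraySort($aDatabase, 0, 1)

; Preallocate the index array at its largest possible size and trim it once at the end.
Local $aNewIdx[$aFiles[0] + 1]
Local $iCount = 0
For $i = 1 To $aFiles[0]
    If _ArrayBinarySearch($aDatabase, $aFiles[$i], 1) = -1 Then
        $iCount += 1
        $aNewIdx[$iCount] = $i ; store the index into $aFiles, not the path itself
    EndIf
Next
$aNewIdx[0] = $iCount
ReDim $aNewIdx[$iCount + 1]

; Resolve the stored indexes back to file names
For $i = 1 To $aNewIdx[0]
    ConsoleWrite("New file: " & $aFiles[$aNewIdx[$i]] & @CRLF)
Next
```

Neither input array is modified, so no _ArrayDelete calls are needed inside the loop. Swapping the roles of the two arrays (sort `$aFiles`, walk `$aDatabase`) gives the deleted files in the same way.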
Zedna Posted March 7, 2015 (edited)

Quote (mpower):
    I just tried this:

        For $i = 0 To UBound($generated_Array) - 1
            $r = _ArrayBinarySearch($control_Array, $generated_Array[$i])
            If $r <> -1 Then
                ReDim $final_Array[UBound($final_Array) + 1]
                $final_Array[$j] = $generated_Array[$i]
                $j += 1
            EndIf
        Next

    My results were:

This is very inefficient because of the ReDim inside the loop! Allocate the final array at the same size as the input array BEFORE the main loop, and AFTER the main loop do just one ReDim to the correct size.

Edited March 7, 2015 by Zedna

Resources UDF
ResourcesEx UDF
AutoIt Forum Search
mpower Posted March 7, 2015

Thanks Zedna! You are absolutely right, moving ReDim outside the loop has increased the function's speed 10-fold!!!

    ; Allocate room for every possible match up front
    ; (the original allocated UBound - 1, one short if every entry matches)
    Dim $final_Array[UBound($generated_Array)]
    Local $j = 0, $i, $r
    For $i = 0 To UBound($generated_Array) - 1
        $r = _ArrayBinarySearch($control_Array, $generated_Array[$i])
        If $r <> -1 Then
            $final_Array[$j] = $generated_Array[$i]
            $j += 1
        EndIf
    Next
    ; a single ReDim trims the array to the number of matches
    ReDim $final_Array[$j]
    $final_Array = _ArrayUnique($final_Array)
    _ArraySort($final_Array)

Now this function can compare the two arrays (one with ~90k rows and the other with ~77k rows) in just 5 seconds (previously nearly 55 seconds)!!

Still, the Scripting Dictionary method is faster (1.72 seconds).
Zedna Posted March 7, 2015 (edited)

Try SQLite (with an in-memory database); it's very good, especially for this kind of task.

Edited March 7, 2015 by Zedna
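For anyone wanting to try the in-memory route, a minimal sketch using the standard SQLite.au3 UDF follows. The table names, the sample paths and the helper `_LoadTable` are assumptions made up for illustration. Note that the set operator that fits the question is EXCEPT (rows in the first query that are missing from the second) rather than UNION, and that SQLite compares TEXT case-sensitively by default.

```autoit
#include <Array.au3>
#include <SQLite.au3>

; Hypothetical sample data (0-based arrays of full path names).
Local $aFiles = StringSplit("C:\m\a.mp3,C:\m\b.mp3,C:\m\d.mp3", ",", 2)
Local $aDatabase = StringSplit("C:\m\b.mp3,C:\m\e.mp3", ",", 2)

_SQLite_Startup() ; needs the SQLite DLL somewhere the script can reach
If @error Then Exit MsgBox(16, "SQLite", "The SQLite DLL could not be loaded")
_SQLite_Open() ; no file name = in-memory database

_SQLite_Exec(-1, "CREATE TABLE files (path TEXT);")
_SQLite_Exec(-1, "CREATE TABLE db (path TEXT);")
_LoadTable("files", $aFiles)
_LoadTable("db", $aDatabase)

Local $aNew, $aDeleted, $iRows, $iCols
; EXCEPT keeps the rows of the first SELECT that the second SELECT does not return
_SQLite_GetTable2d(-1, "SELECT path FROM files EXCEPT SELECT path FROM db;", $aNew, $iRows, $iCols)
_SQLite_GetTable2d(-1, "SELECT path FROM db EXCEPT SELECT path FROM files;", $aDeleted, $iRows, $iCols)

_ArrayDisplay($aNew, "New files") ; row 0 holds the column name
_ArrayDisplay($aDeleted, "Deleted files")

_SQLite_Close()
_SQLite_Shutdown()

; _LoadTable is a hypothetical helper: inserts every array entry into $sTable.
Func _LoadTable($sTable, ByRef $aRows)
    _SQLite_Exec(-1, "BEGIN;") ; one transaction keeps bulk inserts fast
    For $sPath In $aRows
        ; _SQLite_FastEscape quotes the value and doubles embedded single quotes
        _SQLite_Exec(-1, "INSERT INTO " & $sTable & " VALUES (" & _SQLite_FastEscape($sPath) & ");")
    Next
    _SQLite_Exec(-1, "COMMIT;")
EndFunc   ;==>_LoadTable
```

If one of the lists already lives in a database, as in the original question, loading only the DIR listing into a table next to it and running the two EXCEPT queries there would avoid building the second array at all.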
mpower Posted March 8, 2015

Zedna, is there a way to include SQLite capability without Administrator rights?
jchd Posted March 8, 2015

For us AutoIt users, SQLite is nothing more than a simple DLL. So running as admin is no more of an issue than for any other non-SQLite script, provided of course that you have the required DLL in some user-reachable place.
Zedna Posted March 8, 2015

Quote (mpower):
    Thanks Zedna! You are absolutely right, moving ReDim outside the loop has increased the functions speed 10-fold!!!

You can speed this up a little more by making a local, modified copy of Func _ArrayBinarySearch() and removing all the unnecessary checks from the beginning of that function (IsArray, UBound, $iStart, $iEnd). You can do some of the checking (for example the array boundaries) only once, before the main loop, and remove it from Func _ArrayBinarySearch().
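As an illustration of that suggestion, here is a minimal sketch of a stripped-down binary search. It is not the UDF's own code: the name `_LeanBinarySearch` is hypothetical, and the function assumes a 1D array that was sorted in ascending order with _ArraySort and valid `$iStart` and `$iEnd` indexes, none of which it checks.

```autoit
#include <Array.au3>

; Validate once, outside the hot loop
Local $aControl = StringSplit("apple,bread,cat,dog", ",", 2)
_ArraySort($aControl)
Local $iLast = UBound($aControl) - 1

ConsoleWrite(_LeanBinarySearch($aControl, "cat", 0, $iLast) & @CRLF) ; index of "cat"
ConsoleWrite(_LeanBinarySearch($aControl, "zed", 0, $iLast) & @CRLF) ; -1 = not found

; _LeanBinarySearch is a hypothetical name. No IsArray/UBound/range checks are done here;
; the caller is responsible for passing a sorted 1D array and valid bounds.
Func _LeanBinarySearch(Const ByRef $aSorted, $vValue, $iStart, $iEnd)
    Local $iMid
    While $iStart <= $iEnd
        $iMid = Int(($iStart + $iEnd) / 2)
        If $aSorted[$iMid] = $vValue Then Return $iMid
        If $aSorted[$iMid] < $vValue Then
            $iStart = $iMid + 1 ; target lies in the upper half
        Else
            $iEnd = $iMid - 1 ; target lies in the lower half
        EndIf
    WEnd
    Return -1
EndFunc   ;==>_LeanBinarySearch
```

The saving per call is small, so this mainly pays off when the function is called tens of thousands of times, as in the loops discussed above.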
Gianni Posted March 10, 2015 (edited)

Quote (sshrum):
    The database route is still an option...UNION and all but my SQL-foo hasn't been used in awhile. :-P ...actually if someone has a code snippet on doing the union comparisons that would be awesome. Each array is just 1 field with the full pathname.

I have used this problem to post a possible solution, as an example of using my ArraySQL UDF, in this post on the "Example Scripts" forum. Have a look if you are interested in trying it with SQL.

Edited March 10, 2015 by Chimp

Chimp
small minds discuss people, average minds discuss events, great minds discuss ideas.... and use AutoIt....