nullschritt

detecting a difference between two arrays

44 posts in this topic

I was wondering if there is a way to quickly spot a difference between two arrays. Say I have two arrays: as soon as they mismatch, I want to return the position where they differ.

Now that I think about it, I can probably code something that does this, though I don't know if it will be the most efficient way. My idea is to loop through the arrays and, as soon as the value from array 1 <> the value from array 2, return the index. Is there a better way?


Looping through the arrays can be very slow if they are large.

On the forum I could only find functions that check whether two arrays are the same or not, so for personal use I made this one to return the differences (with the line number).

It is very fast, even with arrays of several thousand lines.

#include <SQLite.au3>
#include <SQLite.dll.au3>
#include <Array.au3>

; example WindowsConstants.au3 
$file = FileRead(StringRegExpReplace(@Autoitexe, '(.+)\\[^\\]+', "$1") & "\Include\WindowsConstants.au3")
$aLines = StringRegExp($file, '(?m)(^.*)\R?', 3)   ; array 1

$aLines2 = $aLines   ; array 2
For $i = 15 to UBound($aLines)-1 step 100    ; introduce differences
    _ArrayInsert($aLines, $i, "Global Const $WS_FAKE = extra line*" & Ceiling($i/100))
    _ArrayInsert($aLines2, $i+3, "Global Const $WS_EX_FAKE = extra line 2*" & Ceiling($i/100))
    _ArrayInsert($aLines2, $i+6, "Global Const $WM_FAKE = extra line 2*" & Ceiling($i/100) & "bis")
Next
; _ArrayDisplay($aLines)


$res = _ArrayCompareAndGetResults($aLines, $aLines2, 0)
 _ArrayDisplay($res)


Func _ArrayCompareAndGetResults($array1, $array2, $flag = 0)
  If $flag < 0 OR $flag > 2 Then $flag = 0
  Local $array, $aTemp, $iRows, $iColumns
  _SQLite_Startup()
  _SQLite_Open()   ; ':memory:'
  _SQLite_Exec (-1, "CREATE TABLE table1 (id, items1); CREATE TABLE table2 (id, items2);") 
  _SQLite_Exec(-1, "Begin;")
  For $i = 0 to UBound($array1)-1
        _SQLite_Exec(-1, "INSERT INTO table1 VALUES (" & $i & ", " & _SQLite_FastEscape($array1[$i]) & ");")
  Next
  For $i = 0 to UBound($array2)-1
        _SQLite_Exec(-1, "INSERT INTO table2 VALUES (" & $i & ", " & _SQLite_FastEscape($array2[$i]) & ");")
  Next
  _SQLite_Exec(-1, "Commit;")

 Switch $flag
   Case 1
     _SQLite_GetTable2d(-1, "SELECT * FROM table1 " & _
        "WHERE items1 NOT IN (SELECT items2 FROM table2) ;", $array, $iRows, $iColumns)   
   Case 2
     _SQLite_GetTable2d(-1, "SELECT * FROM table2 " & _
        "WHERE items2 NOT IN (SELECT items1 FROM table1) ;", $array, $iRows, $iColumns)  
   Case 0
     _SQLite_GetTable2d(-1, "SELECT * FROM table1 " & _
        "WHERE items1 NOT IN (SELECT items2 FROM table2) ;", $array, $iRows, $iColumns)   
     _SQLite_GetTable2d(-1, "SELECT * FROM table2 " & _ 
        "WHERE items2 NOT IN (SELECT items1 FROM table1) ;", $aTemp, $iRows, $iColumns)  
      Local $n = UBound($array)-1, $m = UBound($aTemp)-1
      Local $s = ($n > $m) ? $n : $m
      Redim $array[$s+1][4]
      For $i = 0 to $m
        $array[$i][2] = $aTemp[$i][0]
        $array[$i][3] = $aTemp[$i][1]
      Next
      $array[0][0] = $n
      $array[0][1] = ""
      $array[0][2] = $m
      $array[0][3] = ""
  EndSwitch
  _SQLite_Close()
  _SQLite_Shutdown()
    Return $array
EndFunc


#7 ·  Posted (edited)

Why do you have two of the same arrays?

They aren't always the same. One holds the current data, the other holds the data after it has been modified. I need to find the point at which the data changed so that I can update a database without clearing it out and re-inserting all values.

 

 

Looping through the arrays can be very slow if they are large.

On the forum I could only find functions that check whether two arrays are the same or not, so for personal use I made this one to return the differences (with the line number).

It is very fast, even with arrays of several thousand lines.

 

I'll check it out soon. One array is 1D and the other is 2D, but I can easily scale the 2D one down for comparison. Your function seems redundant, though. You are still looping through every value of both arrays in order to place them in the database. In fact, you're using two separate loops to do it.

This can be accomplished with one loop and an If test, so perhaps the way I thought up is the most efficient.
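For the 1D versus 2D case mentioned above, the 2D array does not necessarily have to be scaled down first. A minimal sketch, assuming the values to compare sit in one column of the 2D array; the function name `_FirstDiff1Dvs2D` and the sample data are made up for illustration. It uses `==` so that strings differing only in case count as changed, which the thread's `<>` would not detect.

```autoit
; Hypothetical helper: first index where $a1D[$i] differs from $a2D[$i][$iCol].
; Returns -1 if no difference is found; sets @error to 1 if the row counts differ.
Func _FirstDiff1Dvs2D(Const ByRef $a1D, Const ByRef $a2D, $iCol = 0)
    If UBound($a1D) <> UBound($a2D, 1) Then Return SetError(1, 0, -1)
    For $i = 0 To UBound($a1D) - 1
        ; == is a case-sensitive comparison, <> would treat "a" and "A" as equal
        If Not ($a1D[$i] == $a2D[$i][$iCol]) Then Return $i
    Next
    Return -1
EndFunc

; Made-up sample data: only row 1 differs in column 0
Local $aOld[3] = ["a", "b", "c"]
Local $aNew[3][2] = [["a", 1], ["x", 2], ["c", 3]]
ConsoleWrite("First difference at index: " & _FirstDiff1Dvs2D($aOld, $aNew) & @CRLF)
```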

Edited by nullschritt


#8 ·  Posted (edited)

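Func _arraygetdifference($array1, $array2)
    ; -1 means "no difference found". The string 'Null' evaluates to 0 in a
    ; numeric comparison, so it could be confused with a difference at index 0.
    Local $pos = -1
    For $i = 0 To UBound($array1) - 1
        If $array1[$i] <> $array2[$i] Then
            $pos = $i
            ExitLoop
        EndIf
    Next

    Return $pos
EndFunc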

Edited by nullschritt


If you control the creation of the arrays, the fastest way would be to compare against the old data as you add each value to the current data.
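A rough sketch of that idea, assuming the new array is filled element by element in the same script. `_ReadValue()` is a hypothetical stand-in for wherever the new data actually comes from, and the sample values are invented.

```autoit
; Hypothetical data source, standing in for the real origin of the new values
Func _ReadValue($i)
    Local $aSource[4] = ["a", "b", "CHANGED", "d"]
    Return $aSource[$i]
EndFunc

Local $aOld[4] = ["a", "b", "c", "d"]
Local $aNew[4]
Local $iChanged = -1 ; -1 means "nothing changed so far"

For $i = 0 To UBound($aNew) - 1
    $aNew[$i] = _ReadValue($i)
    ; Compare while filling, so no second pass over the arrays is needed
    If $iChanged = -1 And Not ($aNew[$i] == $aOld[$i]) Then $iChanged = $i
Next

ConsoleWrite("Changed index: " & $iChanged & @CRLF)
```

The loop still visits every element, but the comparison piggybacks on work that has to be done anyway.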


I'm not sure how you are using the arrays exactly, but do you really only want to find the "first" difference and skip the rest?  By your design can there only be one changed value?

My thought on the approach would be to have a two-dimensional array, with one dimension holding the old values and the other holding the new values. This would allow you to scrub through the whole array and see every change.

Just my two cents, but like I said...I'm not sure the scope so it may not suit your purpose.
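One way to read that suggestion is a function that collects every change into a 2D result instead of stopping at the first one. This is only a sketch under that assumption; the name `_ArrayAllDiffs` and the layout (row 0 holds the count, following the convention of the SQLite function earlier in the thread) are illustrative choices.

```autoit
; Hypothetical helper: returns a 2D array with one row per changed position,
; as [index, old value, new value]. Row 0, column 0 holds the number of changes.
; Sets @error to 1 if the two 1D arrays differ in size.
Func _ArrayAllDiffs(Const ByRef $aOld, Const ByRef $aNew)
    Local $iMax = UBound($aOld)
    If $iMax <> UBound($aNew) Then Return SetError(1, 0, 0)
    Local $aDiff[$iMax + 1][3], $iCount = 0
    For $i = 0 To $iMax - 1
        If Not ($aOld[$i] == $aNew[$i]) Then ; case-sensitive comparison
            $iCount += 1
            $aDiff[$iCount][0] = $i
            $aDiff[$iCount][1] = $aOld[$i]
            $aDiff[$iCount][2] = $aNew[$i]
        EndIf
    Next
    $aDiff[0][0] = $iCount
    ReDim $aDiff[$iCount + 1][3] ; trim the unused rows
    Return $aDiff
EndFunc

; Made-up sample data with two changed positions
Local $aBefore[4] = ["a", "b", "c", "d"]
Local $aAfter[4] = ["a", "B", "c", "D"]
Local $aChanges = _ArrayAllDiffs($aBefore, $aAfter)
For $i = 1 To $aChanges[0][0]
    ConsoleWrite($aChanges[$i][0] & ": " & $aChanges[$i][1] & " -> " & $aChanges[$i][2] & @CRLF)
Next
```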


I'm not sure how you are using the arrays exactly, but do you really only want to find the "first" difference and skip the rest?  By your design can there only be one changed value?

My thought on the approach would be to have a two-dimensional array, with one dimension holding the old values and the other holding the new values. This would allow you to scrub through the whole array and see every change.

Just my two cents, but like I said...I'm not sure the scope so it may not suit your purpose.

By design, my application only allows one change to occur at a time. Also, the second array of new data is returned when an event occurs. The maximum number of rows that can be returned is limited to 25,000 by design, which takes at most about 0.1 seconds to enumerate. The returned data doesn't always get routed to the same function either. Each function keeps track of the previous data for itself, and that is what the newly returned data is compared against.


nullschritt,

I made this function to grab the content and the line number of ALL lines that differ between array1/array2 AND array2/array1.

For this, the SQLite engine is a lot faster and more efficient than the usual approaches.

Of course, if all you need to know is whether the arrays are different or not, there are plenty of simple ArrayCompare examples on the forum.


I have to imagine there is a more elegant way to handle the change of one item/entity/element (what have you) without having to process an entire array, but without knowing the inner workings I can't really recommend anything. It just seems inefficient to me, especially when you could have 25,000 results to filter through (even if it does so quickly). It will always be faster to deal with the single change than to dig for it.


I have to imagine there is a more elegant way to handle the change of one item/entity/element (what have you) without having to process an entire array, but without knowing the inner workings I can't really recommend anything. It just seems inefficient to me, especially when you could have 25,000 results to filter through (even if it does so quickly). It will always be faster to deal with the single change than to dig for it.

There's no way of locating a point within an array without enumerating it at some stage. Even in the function that gets the changed data (from a GUI control), I would have to enumerate all the values of the control to check what has changed.


#15 ·  Posted (edited)

There's no way of locating a point within an array without enumerating it at some stage. Even in the function that gets the changed data (from a GUI control), I would have to enumerate all the values of the control to check what has changed.

Understood. What I was alluding to is that there may be an alternative method to track individual changes without having to use an array.

edit:

I don't disagree that an array is a good way to store data for a series of controls, but rather than searching through an array, a mechanism could be made to tie a control to an index of the array and go straight to the desired value. It just seems overkill and inefficient to me to have to scrub an array for a single change.
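One possible shape for such a mechanism, sketched with a `Scripting.Dictionary` COM object that maps a control ID to its array index. The control IDs here are made-up numbers standing in for the values returned by the GUICtrlCreate* functions, and `_OnControlChanged` is a hypothetical handler name; how the change event actually arrives depends on the application.

```autoit
; Data array plus a lookup table from control ID to array index
Local $aData[3] = ["a", "b", "c"]
Local $aCtrlIDs[3] = [101, 102, 103] ; made-up IDs for illustration
Local $oIndexOf = ObjCreate("Scripting.Dictionary")
For $i = 0 To UBound($aCtrlIDs) - 1
    $oIndexOf.Add($aCtrlIDs[$i], $i)
Next

; Hypothetical handler: writes the new value straight to the mapped row.
; Returns the row index, or -1 with @error = 1 for an unknown control ID.
Func _OnControlChanged($iCtrlID, $vNewValue, ByRef $aTarget, $oLookup)
    If Not $oLookup.Exists($iCtrlID) Then Return SetError(1, 0, -1)
    Local $iIndex = $oLookup.Item($iCtrlID)
    $aTarget[$iIndex] = $vNewValue
    Return $iIndex
EndFunc

; Control 102 reports a change: no scan of $aData is needed to find its row
Local $iRow = _OnControlChanged(102, "changed", $aData, $oIndexOf)
ConsoleWrite("Updated row: " & $iRow & @CRLF)
```

The changed row is then known at the moment of the event, so the database update can target it directly.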

Edited by spudw2k


#16 ·  Posted (edited)

Func _arraygetdifference($array1, $array2)
    Local $pos = 0
    For $i = 0 To UBound($array1) - 1
        If $array1[$i] <> $array2[$i] Then
            $pos = $i
            ExitLoop
        EndIf
    Next

    Return $pos
EndFunc

That assumes the arrays are the same size.

Edited by guinness



That assumes the arrays are the same size.

Again, by design the two sets of data are always identical in size. Only the values can change.
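If that guarantee ever stops holding, the comparison can be guarded with UBound at little cost. A sketch, with `_ArrayFirstDiff` as a hypothetical name: it compares only the overlapping part and treats a size mismatch as a difference at the first index that exists in just one array.

```autoit
; Hypothetical helper: first differing index of two 1D arrays, or -1 if equal.
; If one array is a prefix of the other, returns the shorter length and sets
; @extended to 1 to flag the size mismatch.
Func _ArrayFirstDiff(Const ByRef $a1, Const ByRef $a2)
    Local $n1 = UBound($a1), $n2 = UBound($a2)
    Local $iMin = ($n1 < $n2) ? $n1 : $n2
    For $i = 0 To $iMin - 1
        If Not ($a1[$i] == $a2[$i]) Then Return $i ; case-sensitive comparison
    Next
    If $n1 <> $n2 Then Return SetExtended(1, $iMin)
    Return -1
EndFunc

; Made-up sample data: same leading values, different sizes
Local $aShort[2] = ["a", "b"]
Local $aLong[3] = ["a", "b", "c"]
Local $iPos = _ArrayFirstDiff($aShort, $aLong)
ConsoleWrite("First difference: " & $iPos & ", size mismatch flag: " & @extended & @CRLF)
```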


Understood. What I was alluding to is that there may be an alternative method to track individual changes without having to use an array.

edit:

I don't disagree that an array is a good way to store data for a series of controls, but rather than searching through an array, a mechanism could be made to tie a control to an index of the array and go straight to the desired value. It just seems overkill and inefficient to me to have to scrub an array for a single change.

Care to share an example?


That assumes the arrays are the same size.

 

And even then, if a single element is inserted at index 0 of array2, an index-by-index comparison can report up to UBound($array) differences, even though only the indexes have shifted.
