Jump to content
Sign in to follow this  
Jewtus

Diff two arrays with a progress bar

Recommended Posts

Jewtus

I'm working with two csv files that I'm parsing into two arrays. I'm then comparing them to find the duplication and remove them from the first array. This works great on 100 or so records, but I'm trying to compare arrays with more than 70,000 records so I wanted to add in a loading bar so I can tell how far/how much longer it will take.

This is my code:

ProgressSet(0,0&"%","Checking already searched")
$aProcess = _ParseCSV($oOutfile,"|","",0)
$aAlreadyChecked = _ParseCSV($AlreadyProcessed,"|","",0)
For $a = UBound($aProcess) -1 to 0 Step -1
                for $b = 0 to UBound($aAlreadyChecked) -1
                                if $aProcess[$a][0] = $aAlreadyChecked[$b][0] Then
                                                _ArrayDelete($aProcess, $a)
                                                MsgBox(0,"",($a-UBound($aProcess)) & @TAB & $b)
                                                ProgressSet(($b/$a),Round($b/$a)&"%","Cleaning up")
                                                ExitLoop
                                EndIf
                Next
Next

I cannot get the percentage logic to show anything that seems rational or accurate. Does anyone know of a more efficient way of doing this or how to fix the progressset to actually show how far in the process it already is?

Share this post


Link to post
Share on other sites
UEZ

Is that working for you?

#include <Array.au3>

Global $array1[100000], $array2[111111], $i, $t, $fProgress
ConsoleWrite("Creating test array... ")
$t = TimerInit()
For $i = 0 To UBound($array1) - 1
    $array1[$i] = Random(0, 100000, 1)
    $array2[$i] = Random(0, 111111, 1)
Next
ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF)

Global $aResult = ArrayCompare($array1, $array2)
ConsoleWrite(UBound($aResult) & @CRLF)
;~ _ArrayDisplay($aResult)


Func ArrayCompare(ByRef $a1, $a2)
    ConsoleWrite("Sorting 2nd array... ")
    Local $t = TimerInit()
    _ArraySort($a2)
    ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF)
    Local $i, $c = 0, $iUB = UBound($a1) > UBound($a2) ? UBound($a1) : UBound($a2), $aNew[$iUB], $iUB = UBound($a1) - 1
    ConsoleWrite("Searching " & $iUB & " elements in " & UBound($a2) - 1 & " elements ... ")
    AdlibRegister("Show_Progress", 500)
    $fProgress = 0
    ProgressOn("Progress Meter", "Be patient, searching for duplicates...", "0%")
    $t = TimerInit()
    For $i = 0 To $iUB
        If _ArrayBinarySearch($a2, String($a1[$i])) > -1 Then
            ContinueLoop
        Else
            $aNew[$c] = $a1[$i]
            $c += 1
        EndIf
        $fProgress = $i / $iUB * 100
    Next
    ReDim $aNew[$c]
    ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF & @CRLF)
    AdlibUnRegister("Show_Progress")
    ProgressOff()
    Return $aNew
EndFunc

Func Show_Progress()
    ProgressSet($fProgress, StringFormat("%.2f %", $fProgress))
EndFunc

Br,

UEZ

Edited by UEZ
  • Like 1

Please don't send me any personal message and ask for support! I will not reply!

Selection of finest graphical examples at Codepen.io

The own fart smells best!
Her 'sikim hıyar' diyene bir avuç tuz alıp koşma!
¯\_(ツ)_/¯  ٩(●̮̮̃•̃)۶ ٩(-̮̮̃-̃)۶ૐ

Share this post


Link to post
Share on other sites
Jewtus

I tried replacing the arrays but they are 2D arrays so I get an error "array variable has incorrect number of subscripts or subscript dimensions range exceeded"

What would I need to do it fix that? I tried this:

Global $i, $t, $fProgress
ConsoleWrite("Creating test array... ")
$t = TimerInit()

Global $aResult = ArrayCompare($aProcess, $aAlreadyChecked)
ConsoleWrite(UBound($aResult) & @CRLF)
_ArrayDisplay($aResult)


Func ArrayCompare(ByRef $a1, $a2)
    ConsoleWrite("Sorting 2nd array... ")
    Local $t = TimerInit()
    _ArraySort($a2)
    ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF)
    Local $i, $c = 0, $iUB = UBound($a1) > UBound($a2) ? UBound($a1) : UBound($a2), $aNew[$iUB], $iUB = UBound($a1) - 1
    ConsoleWrite("Searching " & $iUB & " elements in " & UBound($a2) - 1 & " elements ... ")
    AdlibRegister("Show_Progress", 500)
    $fProgress = 0
    ProgressOn("Progress Meter", "Be patient, searching for duplicates...", "0%")
    $t = TimerInit()
    For $i = 0 To $iUB
        If _ArrayBinarySearch($a2, String($a1[$i][0])) > -1 Then
            ContinueLoop
        Else
            $aNew[$c] = $a1[$i][0]
            $c += 1
        EndIf
        $fProgress = $i / $iUB * 100
    Next
    ReDim $aNew[$c]
    ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF & @CRLF)
    AdlibUnRegister("Show_Progress")
    ProgressOff()
    Return $aNew
EndFunc

Func Show_Progress()
    ProgressSet($fProgress, StringFormat("%.2f %", $fProgress))
EndFunc

which seems to function, but it doesn't seem to be able to see the difference in the two files. The result array ends up being a 1D version of the first array.

Share this post


Link to post
Share on other sites
kylomas

Jewtus,

I'm then comparing them to find the duplication

 

Please define what you mean by "duplication"...

kylomas


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

Share this post


Link to post
Share on other sites
Jewtus

Something that exists in both arrays

EX: 

Array1

[1,2,3,4]

Array2

[3,4,5,6]

I want to remove 3 and 4 from array1 because they exist in both lists.

Edited by Jewtus

Share this post


Link to post
Share on other sites
kylomas

Jewtus,

These are 1D arrays.  In post #3 you allude to a 2D aray.  Do you want to eliminate dups anywhere they exist, or, only by column?

kylomas


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

Share this post


Link to post
Share on other sites
Jewtus

I want to eliminate the entire row if there is a match in the first column

 

I'm looking at search results and I'm comparing them to a new set of search results, but I'm trying to avoid doing more work on the results that I've already processed.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

  • Similar Content

    • Iceburg
      By Iceburg
      Hi everyone, I'm at best a noobie.  I have read through the Array helps, and specifically the 2D array help file, and I'm struggling to get my code working.
      I have an array that is read from a file, thats working great.  I'm trying to do some math on the array, so I can find the largest, average, lowest, day over day change %, etc.
      The array read working fine, I get 43 lines, line 0 is 44, and then I get data that looks like
      0519 $10,000
      0520 $10,001
      0521 $10,002
      The data in this array is a single 1D array, breaking it out into 2 columns so I can do the math is what I can get to happen.  
      How do I reference the array to store this data?  Second, how do I assign this data to the appropriate row/column?
      Thanks in advance.
      Dim $all_cash_amounts[UBound($aInput)][2] Dim $max_amount_in_account Dim $min_amount_in_account _FileReadToArray($LC_Check_file_path, $aInput) _ArrayDisplay($aInput) local $date = StringRegExp($aInput[1], "(\d\d\d\d)", 1) local $cash = StringRegExp($aInput[1], "\d+\s(-?[0-9\.\,]+)", 1) ConsoleWrite("Date is: " & $date & @CRLF) For $i = 1 To UBound($aInput)-1     $date = StringRegExp($aInput[$i], "(\d\d\d\d)", 1)     $all_cash_amounts[$i][2] = $date[$i][0], $cash[$i][1]      Next _ArrayDisplay($all_cash_amounts)  
    • OldGuyWalking
      By OldGuyWalking
      Given an array with multiple columns that is displayed in a listview,
       ===> What is the fastest/most efficient way to create and manage multiple filters and display results in ListView.
      I have a text file that loads into a listview that has string, numeric, and date columns.  The main file contains about 5100 rows. It's loaded into an array and (in this ListView) it's pre-filtered to display a range of rows based on a start and end date.  On the form I have menu options for various filters. (see below).
      I have options to filter on an "Air Date" column (=Today, >=Today, <=Today) and on a numeric field that is either 1 or 0 that indicate Active or Ended.

      For each filter option I have a prebuilt array that holds a subset of the main array based on a single filter.  For the list above I have the main Array and 5 additional arrays.  None of the arrays are updated since this is for "view only" purposes.  This is a short list and I could have done the filtering "live" but I have several of these forms and so kept the same functionality in each. I have another ListView that displays the complete 5100 row list with 3 filters that, when building the filters live was considerably slower than using prebuilt arrays.
      If I want to expand past simple single column filtering, using an array for each filter becomes cumbersome especially if I want to combine filters using AND & OR.
      The text file I'm working with has 16 columns. If I setup filters for 4 columns and include AND / OR capability that would require prebuilding 24 arrays to cover the various combinations.
      If using the slower method of building a filtered array in real time each time a different filter is selected is the only way to go with this then I'll live with it. It is less overhead. .
      Below is the code I'm currently using to "filter" an array.  My next change was going to add AND / OR functionality (see the info above the header for where I was going with this) .
      ; Description ...: Delete rows from an array and only keep rows that meet the crtieria of identified columns. ; ; Next Change: Add AND/OR to combine filters. Use array to hold multiple criteria and values? ; ; Local $aCriteria[][] = [["",$iColNbr1, $sOperator1, $vValue1], ["AND",$iColNbr2, $sOperator2, $vValue2], ["OR",$iColNbr3, $sOperator3, $vValue3]] ; The first set of criteria ["", $iColNbr1, $sOperator1, $vValue1] must start with a "". ; If anything is entered in that first parameter it will be ignored. ; If the first parameter in any additional criteria set is left blank, or it is not OR, it will default to AND. ; If $aArray is 1 dimension with more than one set of criteria, only the first set will be used. ; Any criteria that uses a column that is less than 0 or higher than the total number of columns in the array will return an error. ; ; Recognized data types for this function are: S (String), D (Date), N (Number). ; ; Recognized Operators are: "EQ", "NEQ", "IN", "GT", "GE", "LT", "LE", "BETWEEN". ; ****** Not all operators work with all data types. ; #FUNCTION# ==================================================================================================================== ; Name ..........: _ArrayFilter ; Description ...: Delete rows from an array and only keep rows that meet the crtieria of identified columns. ; Syntax ........: _ArrayFilter(Byref $aArray[, $iCol = 0[, $sOperator = "EQ"[, $vValue = ""[, $iOptionBase = 0]]]]) ; Parameters ....: $aArray - Array being filtered. ; $iCol - [optional] Column to filter. Default is 0. ; $sOperator - [optional] Operator. Default is "EQ". ; $vValue - [optional] Criteria to compare the column/row value against. ; $iOptionBase - [optional] Starting row. Default is 0. ; Return values .: None ; Author ........: OldGuyWalking ; Modified ......: ; Remarks .......: ; Related .......: ; Link ..........: ; Example .......: No ; =============================================================================================================================== Func _ArrayFilter(ByRef $aArray, $iCol = 0, $sOperator = "EQ", $vValue = "", $iOptionBase = 0) Local $hFunc = _ArrayFilter $vValue = StringStripWS($vValue, 3) If $vValue = "[Today]" Then $vValue = _NowCalcDate() EndIf Local $sMsg Local $sMsgHdr Local $n1 Local $sDeleteIndex Local $aDeleteIndex Local $iCnt = 0 Local $iRows Local $iColMax Local $iDim Local $sData Local $sVType Local $sDType Local $LBound Local $iDiff If $iOptionBase <> 0 Then $iOptionBase = 1 EndIf If _IsValueEmpty($aArray) Then Return SetError(1, 0, "") EndIf $iDim = UBound($aArray, $UBOUND_DIMENSIONS) If $iDim = 1 Then If $iCol <> 0 Then $iCol = 0 EndIf EndIf If $iDim = 2 Then $iColMax = UBound($aArray, $UBOUND_COLUMNS) - 1 If $iCol > $iColMax Or $iCol < 0 Then Return SetError(1, 0, "") EndIf EndIf If Not _IsBetween($iDim, 1, 2) Then ;############### MSG2 - START ############### $sMsgHdr = FuncName($hFunc) & " :Line: " & @ScriptLineNumber & " :Error= " & @error $sMsg = "Invalid Dimensioned Array. Must be a 1 or 2 dimensional array." MsgBox(0, $sMsgHdr, $sMsg) Return SetError(1, 0, "") ;############### MSG2 - END ############### EndIf ; Identify what the value is ; If it is not a String, Int, Number, or Date then skip. Select Case _DateIsValid($vValue) = 1 $sVType = "D" Case IsNumber($vValue) = 1 $sVType = "N" Case IsString($vValue) = 1 $sVType = "S" Case Else ;############### MSG2 - START ############### $sMsgHdr = FuncName($hFunc) & " :Line: " & @ScriptLineNumber & " :Error= " & @error $sMsg = "Comparison value must be a " & @CRLF & _ "1. Date in YYYY/MM/DD format " & @CRLF & _ "2. A string " & @CRLF & _ "3. A number " & @CRLF MsgBox(0, $sMsgHdr, $sMsg) Return SetError(1, 0, "") ;############### MSG2 - END ############### EndSelect $iCnt = 0 For $n1 = UBound($aArray) - 1 To $iOptionBase Step -1 If $iDim = 1 Then $sData = StringStripWS($aArray[$n1], 3) ElseIf $iDim = 2 Then $sData = StringStripWS($aArray[$n1][$iCol], 3) EndIf Select Case _DateIsValid($sData) = 1 $sDType = "D" Case IsNumber($sData) = 1 $sDType = "N" Case IsString($sData) = 1 $sDType = "S" Case Else $sDType = "U" EndSelect If _IsValueEmpty($sData) Then $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop ; $sDType = $sVType EndIf If Not _IsValueEmpty($sData) And $sDType <> $sVType Then $sDeleteIndex = $sDeleteIndex & $n1 & "," $iCnt += 1 ContinueLoop EndIf Select Case $sOperator = "EQ" Switch $sDType Case "D" $iDiff = _DateDiff("D", $vValue, $sData) If $iDiff = 0 Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop Case "S" If $sData = $vValue Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop Case "N" If $sData = $vValue Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop EndSwitch Case $sOperator = "NEQ" Switch $sDType Case "D" If $sData <> $vValue Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop Case "S" If $sData <> $vValue Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop Case "N" If $sData <> $vValue Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop EndSwitch Case $sOperator = "IN" Switch $sDType Case "S" If StringInStr($sData, $vValue) Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop EndSwitch Case $sOperator = "GT" Switch $sDType Case "N" If $sData > $vValue Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop Case "D" $iDiff = _DateDiff("D", $vValue, $sData) If $iDiff > 0 Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop EndSwitch Case $sOperator = "GE" Switch $sDType Case "N" If $sData >= $vValue Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop Case "D" $iDiff = _DateDiff("D", $vValue, $sData) If $iDiff >= 0 Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop EndSwitch Case $sOperator = "LT" Switch $sDType Case "N" If $sData < $vValue Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop Case "D" $iDiff = _DateDiff("D", $vValue, $sData) If $iDiff < 0 Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop EndSwitch Case $sOperator = "LE" Switch $sDType Case "N" If $sData <= $vValue Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop Case "D" $iDiff = _DateDiff("D", $vValue, $sData) If $iDiff <= 0 Then ContinueLoop EndIf $sDeleteIndex &= $n1 & "," $iCnt += 1 ContinueLoop EndSwitch EndSelect Next If $iCnt > 0 Then _DeleteArrayRows($aArray, $sDeleteIndex) EndIf EndFunc ;==>_ArrayFilter Thanks in advance.
      OldGuyWalking
    • MyEarth
      By MyEarth
      Hello. I have a 1D array. Is made in this way:
      1. This is a line Messages: a message etc Context: a context etc 2. This is a line Messages: a message etc Context: a context etc 3. This is a line Messages: a message etc Correction: a correction etc Context: a context etc I need to make something like:
      1. This is a line|Messages: a message etc|Context: a context etc 2. This is a line|Messages: a message etc|Context: a context etc 3. This is a line|Messages: a message etc|Correction: a correction|Context: a context etc For exporting in another software. I need to split every time there is a number when the line start, can be 1. until something like 3.976. Since i don't know if there a 2 line after a number or 3 i have opened this thread. Thanks
    • SOF-TECH
      By SOF-TECH
      Dear all,
      Can someone show  me how to en hance the below function to write in CSV  into column  and rows the input values ? 
      I am getting this result: 

      I would like the result to be as this 

      From A1:C1 is for headers
      From A2:C2 is for input Data
      Global Const $GUI_EVENT_CLOSE = -3 $sDataFilePath = @ScriptDir & "\Records.csv" #region ### START Koda GUI section ### Form= $Form1 = GUICreate("Demo1: New Record", 580, 115) $Input1 = GUICtrlCreateInput("", 10, 30, 270, 21) $Input2 = GUICtrlCreateInput("", 300, 30, 270, 21) $Input3 = GUICtrlCreateInput("", 10, 80, 270, 21) $Label1 = GUICtrlCreateLabel("Name:", 10, 10, 35, 17) $Label2 = GUICtrlCreateLabel("ID:", 300, 10, 18, 17) $Label3 = GUICtrlCreateLabel("Phone No:", 10, 60, 55, 17) $Button1 = GUICtrlCreateButton("Save to CSV", 450, 70, 120, 30) GUISetState(@SW_SHOW) #endregion ### END Koda GUI section ### While 1 $nMsg = GUIGetMsg() Switch $nMsg Case $GUI_EVENT_CLOSE Exit Case $Button1 _ExportData() MsgBox(64, @ScriptName, "Record Saved.") EndSwitch WEnd Func _ExportData() If Not FileExists($sDataFilePath) Then FileWriteLine($sDataFilePath, "Name;ID;Phone No.;") EndIf For $i = $Input1 To $Input3 FileWrite($sDataFilePath, GUICtrlRead($i) & ";") Next FileWriteLine($sDataFilePath, "") EndFunc ;==>_ExportData May be Excel UDF has be to be added but I can manage that my self  
      Thank you in advance
    • FMS
      By FMS
      Hello,
      The last couple of day's I was searching on this forum for the best way to put array's inside array's.
      The best example's i found where a little outdatet (2010) whit a lot of pro's and con's.
      Now I've a big script where a lot of computations and big array's are involved, so speed is a big issue.
      Also I wanna try the script below but don't know iff speed is a problem this way or maybe there is a better way to do this.

      Does somebody know's the best way to put array's inside array's and get the data back from them?
      I've made an example of something I was thinking about.
      (maybe something totaly wrong but I'm open for sugestions)
      I'm doing it this way because I don't think I can access the data inside the array (and doing some calculations to it) some other way if the array is inside another array.
      Or is there?
      Thanks in advanced.
      #include <Array.au3> Global $aArray[Random(5,10,1)][Random(5,10,1)] Global $Holder[2][2] For $x = 0 To UBound($aArray,1) - 1 For $y = 0 To UBound($aArray,2) - 1 $aArray[$x][$y] = Round(Random(-1,1),4) Next Next $Holder[0][0] = $aArray $aArray = "" Global $get_array = $Holder[0][0] _ArrayDisplay($get_array)  
×

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.