Diff two arrays with a progress bar

Jewtus

I'm working with two CSV files that I'm parsing into two arrays. I'm then comparing them to find the duplication and remove the matching records from the first array. This works great on 100 or so records, but I'm trying to compare arrays with more than 70,000 records, so I wanted to add a progress bar that shows how far along the comparison is and roughly how much longer it will take.

This is my code:

ProgressSet(0,0&"%","Checking already searched")
$aProcess = _ParseCSV($oOutfile,"|","",0)
$aAlreadyChecked = _ParseCSV($AlreadyProcessed,"|","",0)
For $a = UBound($aProcess) - 1 To 0 Step -1
    For $b = 0 To UBound($aAlreadyChecked) - 1
        If $aProcess[$a][0] = $aAlreadyChecked[$b][0] Then
            _ArrayDelete($aProcess, $a)
            MsgBox(0,"",($a-UBound($aProcess)) & @TAB & $b)
            ProgressSet(($b/$a),Round($b/$a)&"%","Cleaning up")
            ExitLoop
        EndIf
    Next
Next

I cannot get the percentage logic to show anything that seems rational or accurate. Does anyone know of a more efficient way of doing this, or how to fix the ProgressSet call so it actually shows how far along the process is?
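
One likely reason the numbers look irrational: $b/$a divides the inner-loop index by the outer-loop index, so it does not measure how much of $aProcess has been handled, and ProgressSet only has a visible effect after ProgressOn has opened the progress window. A minimal sketch of one way to compute the percentage from the outer loop alone is shown here. The sample arrays and the names $iTotal and $iPercent are assumptions made up for illustration; the real data would come from _ParseCSV as in the code above.

```autoit
#include <Array.au3>

; Hypothetical stand-ins for the parsed CSV data (column 0 is the key).
Local $aProcess[5][2] = [["a", 1], ["b", 2], ["c", 3], ["d", 4], ["e", 5]]
Local $aAlreadyChecked[2][2] = [["b", 0], ["d", 0]]

; Remember the original row count before any rows are deleted.
Local $iTotal = UBound($aProcess), $iPercent = 0

; ProgressSet has nothing to update until ProgressOn has created the window.
ProgressOn("Cleaning up", "Checking already searched", "0%")

For $a = $iTotal - 1 To 0 Step -1
    For $b = 0 To UBound($aAlreadyChecked) - 1
        If $aProcess[$a][0] = $aAlreadyChecked[$b][0] Then
            _ArrayDelete($aProcess, $a)
            ExitLoop
        EndIf
    Next
    ; Rows handled so far as a share of the original row count.
    $iPercent = Round(($iTotal - $a) / $iTotal * 100)
    ProgressSet($iPercent, $iPercent & "%")
Next

ProgressOff()
_ArrayDisplay($aProcess)
```

The bar is updated once per outer-loop pass, whether or not a row was deleted, so it moves steadily from 0 to 100. With 70,000 rows, the nested loop and the repeated _ArrayDelete calls are probably still the slow part; the progress bar only makes the wait visible.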

UEZ

Does this work for you?

#include <Array.au3>

Global $array1[100000], $array2[111111], $i, $t, $fProgress
ConsoleWrite("Creating test array... ")
$t = TimerInit()
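For $i = 0 To UBound($array1) - 1
    $array1[$i] = Random(0, 100000, 1)
Next
; Fill the second array in its own loop, because it is longer than the first.
For $i = 0 To UBound($array2) - 1
    $array2[$i] = Random(0, 111111, 1)
Next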
ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF)

Global $aResult = ArrayCompare($array1, $array2)
ConsoleWrite(UBound($aResult) & @CRLF)
;~ _ArrayDisplay($aResult)


Func ArrayCompare(ByRef $a1, $a2)
    ConsoleWrite("Sorting 2nd array... ")
    Local $t = TimerInit()
    _ArraySort($a2)
    ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF)
    ; $aNew can never hold more entries than $a1, so size it from $a1 and declare $iUB only once.
    Local $i, $c = 0, $aNew[UBound($a1)], $iUB = UBound($a1) - 1
    ConsoleWrite("Searching " & $iUB & " elements in " & UBound($a2) - 1 & " elements ... ")
    AdlibRegister("Show_Progress", 500)
    $fProgress = 0
    ProgressOn("Progress Meter", "Be patient, searching for duplicates...", "0%")
    $t = TimerInit()
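    For $i = 0 To $iUB
        ; Keep only the values that are not found in the sorted second array.
        If _ArrayBinarySearch($a2, String($a1[$i])) < 0 Then
            $aNew[$c] = $a1[$i]
            $c += 1
        EndIf
        ; Update on every pass, including the ones where a duplicate was skipped.
        $fProgress = $i / $iUB * 100
    Next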
    ReDim $aNew[$c]
    ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF & @CRLF)
    AdlibUnRegister("Show_Progress")
    ProgressOff()
    Return $aNew
EndFunc

Func Show_Progress()
    ; "%%" produces a literal percent sign in the format string.
    ProgressSet($fProgress, StringFormat("%.2f %%", $fProgress))
EndFunc

Br,

UEZ

Jewtus

I tried substituting my own arrays, but they are 2D arrays, so I get the error "array variable has incorrect number of subscripts or subscript dimensions range exceeded".

What would I need to do to fix that? I tried this:

Global $i, $t, $fProgress
ConsoleWrite("Creating test array... ")
$t = TimerInit()

Global $aResult = ArrayCompare($aProcess, $aAlreadyChecked)
ConsoleWrite(UBound($aResult) & @CRLF)
_ArrayDisplay($aResult)


Func ArrayCompare(ByRef $a1, $a2)
    ConsoleWrite("Sorting 2nd array... ")
    Local $t = TimerInit()
    _ArraySort($a2)
    ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF)
    Local $i, $c = 0, $iUB = UBound($a1) > UBound($a2) ? UBound($a1) : UBound($a2), $aNew[$iUB], $iUB = UBound($a1) - 1
    ConsoleWrite("Searching " & $iUB & " elements in " & UBound($a2) - 1 & " elements ... ")
    AdlibRegister("Show_Progress", 500)
    $fProgress = 0
    ProgressOn("Progress Meter", "Be patient, searching for duplicates...", "0%")
    $t = TimerInit()
    For $i = 0 To $iUB
        If _ArrayBinarySearch($a2, String($a1[$i][0])) > -1 Then
            ContinueLoop
        Else
            $aNew[$c] = $a1[$i][0]
            $c += 1
        EndIf
        $fProgress = $i / $iUB * 100
    Next
    ReDim $aNew[$c]
    ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF & @CRLF)
    AdlibUnRegister("Show_Progress")
    ProgressOff()
    Return $aNew
EndFunc

Func Show_Progress()
    ProgressSet($fProgress, StringFormat("%.2f %%", $fProgress))
EndFunc

This seems to run, but it doesn't detect the differences between the two files. The result array ends up being a 1D version of the first array.

kylomas

Jewtus,

"I'm then comparing them to find the duplication"

Please define what you mean by "duplication"...

kylomas


Jewtus

Something that exists in both arrays.

Example:

Array1

[1,2,3,4]

Array2

[3,4,5,6]

I want to remove 3 and 4 from array1 because they exist in both lists.

kylomas

Jewtus,

These are 1D arrays.  In post #3 you allude to a 2D array.  Do you want to eliminate duplicates anywhere they exist, or only by column?

kylomas


Jewtus

I want to eliminate the entire row if there is a match in the first column.

 

I'm looking at search results and I'm comparing them to a new set of search results, but I'm trying to avoid doing more work on the results that I've already processed.
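
A rough sketch of that requirement follows: drop every row of the first array whose first column also appears in the first column of the second array, and keep the result 2D. The function name _RemoveKnownRows, its signature, and the sample data are hypothetical and only serve the illustration. It uses a Scripting.Dictionary COM object for the key lookup instead of _ArraySort plus _ArrayBinarySearch, which is a swapped-in technique and not what was posted earlier in the thread. Note also that the ArrayCompare attempt in post #3 declares $aNew with a single dimension and copies only $a1[$i][0] into it, which is why its result comes back 1D.

```autoit
#include <Array.au3>

; Hypothetical sample data standing in for the two parsed CSV files.
Local $aProcess[4][2] = [["1", "row one"], ["2", "row two"], ["3", "row three"], ["4", "row four"]]
Local $aAlreadyChecked[4][2] = [["3", "x"], ["4", "x"], ["5", "x"], ["6", "x"]]

Local $aFiltered = _RemoveKnownRows($aProcess, $aAlreadyChecked)
_ArrayDisplay($aFiltered)

; Returns a 2D copy of $aSource without the rows whose first column also
; occurs in the first column of $aKnown. Assumes both arrays are non-empty
; and 2D. Sets @error and returns 0 if no row survives.
Func _RemoveKnownRows(ByRef $aSource, ByRef $aKnown)
    Local $iRows = UBound($aSource, 1), $iCols = UBound($aSource, 2)
    Local $i, $j, $iPercent

    ; Collect the known keys once. Dictionary keys are compared
    ; case-sensitively by default, unlike AutoIt's "=" operator.
    Local $oKnown = ObjCreate("Scripting.Dictionary")
    For $i = 0 To UBound($aKnown, 1) - 1
        $oKnown.Item(String($aKnown[$i][0])) = 1
    Next

    Local $aOut[$iRows][$iCols], $iKept = 0
    ProgressOn("Cleaning up", "Removing already processed rows...", "0%")
    For $i = 0 To $iRows - 1
        If Not $oKnown.Exists(String($aSource[$i][0])) Then
            ; Copy the whole row, not just the key column.
            For $j = 0 To $iCols - 1
                $aOut[$iKept][$j] = $aSource[$i][$j]
            Next
            $iKept += 1
        EndIf
        ; Refresh the bar every 500 rows to keep the overhead small.
        If Mod($i, 500) = 0 Then
            $iPercent = Round(($i + 1) / $iRows * 100)
            ProgressSet($iPercent, $iPercent & "%")
        EndIf
    Next
    ProgressOff()

    If $iKept = 0 Then Return SetError(1, 0, 0)
    ReDim $aOut[$iKept][$iCols]
    Return $aOut
EndFunc
```

With the sample data, the rows keyed "1" and "2" should remain. Building the lookup once and copying kept rows into a new array avoids calling _ArrayDelete inside the loop, which is likely to matter at 70,000 rows.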
