Diff two arrays with a progress bar

Jewtus

I'm working with two CSV files that I'm parsing into two arrays. I'm then comparing them to find the duplication and remove the duplicated records from the first array. This works great on 100 or so records, but I need to compare arrays with more than 70,000 records, so I want to add a progress bar that shows how far along the process is and roughly how much longer it will take.

This is my code:

ProgressSet(0,0&"%","Checking already searched")
$aProcess = _ParseCSV($oOutfile,"|","",0)
$aAlreadyChecked = _ParseCSV($AlreadyProcessed,"|","",0)
For $a = UBound($aProcess) - 1 To 0 Step -1
    For $b = 0 To UBound($aAlreadyChecked) - 1
        If $aProcess[$a][0] = $aAlreadyChecked[$b][0] Then
            _ArrayDelete($aProcess, $a)
            MsgBox(0, "", ($a - UBound($aProcess)) & @TAB & $b)
            ProgressSet(($b / $a), Round($b / $a) & "%", "Cleaning up")
            ExitLoop
        EndIf
    Next
Next

I cannot get the percentage logic to show anything that seems rational or accurate. Does anyone know a more efficient way of doing this, or how to fix the ProgressSet call so that it shows how far along the process actually is?
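
One possible way to make the percentage meaningful, as a rough sketch: $b / $a mixes the inner and outer loop counters, so it does not measure anything. Basing the value on the outer loop alone (rows handled so far divided by the original row count) gives a bar that moves from 0 to 100. The sample arrays below are hypothetical stand-ins for the parsed CSV data, the MsgBox is dropped because it pauses the loop on every match, and ProgressOn is called first because ProgressSet only updates a progress window that already exists.

```autoit
#include <Array.au3>

; Hypothetical sample data standing in for the two parsed CSV arrays
Local $aProcess[4][2] = [["a", 1], ["b", 2], ["c", 3], ["d", 4]]
Local $aAlreadyChecked[2][2] = [["b", 0], ["d", 0]]

; Remember the original size: UBound($aProcess) shrinks as rows are deleted
Local $iTotal = UBound($aProcess), $iPercent = 0

ProgressOn("Cleaning up", "Checking already searched", "0%")
For $a = $iTotal - 1 To 0 Step -1
    For $b = 0 To UBound($aAlreadyChecked) - 1
        If $aProcess[$a][0] = $aAlreadyChecked[$b][0] Then
            _ArrayDelete($aProcess, $a)
            ExitLoop
        EndIf
    Next
    ; Rows handled so far = $iTotal - $a, no matter how many were deleted
    $iPercent = Round(($iTotal - $a) / $iTotal * 100)
    ProgressSet($iPercent, $iPercent & "%")
Next
ProgressOff()
```

This only fixes the display. The nested loops and the repeated _ArrayDelete calls are still likely to be slow on 70,000 rows.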

UEZ

Does this work for you?

#include <Array.au3>

Global $array1[100000], $array2[111111], $i, $t, $fProgress
ConsoleWrite("Creating test array... ")
$t = TimerInit()
For $i = 0 To UBound($array1) - 1
    $array1[$i] = Random(0, 100000, 1)
    $array2[$i] = Random(0, 111111, 1)
Next
ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF)

Global $aResult = ArrayCompare($array1, $array2)
ConsoleWrite(UBound($aResult) & @CRLF)
;~ _ArrayDisplay($aResult)


Func ArrayCompare(ByRef $a1, $a2)
    ConsoleWrite("Sorting 2nd array... ")
    Local $t = TimerInit()
    _ArraySort($a2)
    ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF)
    ; The result can never hold more elements than $a1, so size $aNew from $a1 and declare $iUB only once
    Local $i, $c = 0, $aNew[UBound($a1)], $iUB = UBound($a1) - 1
    ConsoleWrite("Searching " & $iUB & " elements in " & UBound($a2) - 1 & " elements ... ")
    AdlibRegister("Show_Progress", 500)
    $fProgress = 0
    ProgressOn("Progress Meter", "Be patient, searching for duplicates...", "0%")
    $t = TimerInit()
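    For $i = 0 To $iUB
        ; Update the shared progress value first, so iterations that skip ahead still count
        $fProgress = $i / $iUB * 100
        ; Values found in the sorted second array are duplicates and are not copied
        If _ArrayBinarySearch($a2, String($a1[$i])) > -1 Then ContinueLoop
        $aNew[$c] = $a1[$i]
        $c += 1
    Next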
    ReDim $aNew[$c]
    ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF & @CRLF)
    AdlibUnRegister("Show_Progress")
    ProgressOff()
    Return $aNew
EndFunc

Func Show_Progress()
    ProgressSet($fProgress, StringFormat("%.2f %", $fProgress))
EndFunc

Br,

UEZ

Jewtus

I tried replacing the arrays, but mine are 2D arrays, so I get the error "array variable has incorrect number of subscripts or subscript dimensions range exceeded".

What would I need to do to fix that? I tried this:

Global $i, $t, $fProgress
ConsoleWrite("Creating test array... ")
$t = TimerInit()

Global $aResult = ArrayCompare($aProcess, $aAlreadyChecked)
ConsoleWrite(UBound($aResult) & @CRLF)
_ArrayDisplay($aResult)


Func ArrayCompare(ByRef $a1, $a2)
    ConsoleWrite("Sorting 2nd array... ")
    Local $t = TimerInit()
    _ArraySort($a2)
    ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF)
    Local $i, $c = 0, $iUB = UBound($a1) > UBound($a2) ? UBound($a1) : UBound($a2), $aNew[$iUB], $iUB = UBound($a1) - 1
    ConsoleWrite("Searching " & $iUB & " elements in " & UBound($a2) - 1 & " elements ... ")
    AdlibRegister("Show_Progress", 500)
    $fProgress = 0
    ProgressOn("Progress Meter", "Be patient, searching for duplicates...", "0%")
    $t = TimerInit()
    For $i = 0 To $iUB
        If _ArrayBinarySearch($a2, String($a1[$i][0])) > -1 Then
            ContinueLoop
        Else
            $aNew[$c] = $a1[$i][0]
            $c += 1
        EndIf
        $fProgress = $i / $iUB * 100
    Next
    ReDim $aNew[$c]
    ConsoleWrite("done in " & Round(TimerDiff($t), 2) & " ms." & @CRLF & @CRLF)
    AdlibUnRegister("Show_Progress")
    ProgressOff()
    Return $aNew
EndFunc

Func Show_Progress()
    ProgressSet($fProgress, StringFormat("%.2f %", $fProgress))
EndFunc

That seems to run, but it doesn't seem to be able to see the differences between the two files. The result array ends up being a 1D version of the first array.

kylomas

Jewtus,

"I'm then comparing them to find the duplication"

Please define what you mean by "duplication".

kylomas


Jewtus

Something that exists in both arrays.

Example:

Array1: [1,2,3,4]
Array2: [3,4,5,6]

I want to remove 3 and 4 from Array1 because they exist in both lists.
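
For the 1D example above, a minimal sketch using _ArraySort and _ArrayBinarySearch from Array.au3 could look like the following. The variable names are illustrative only. With the two sample arrays, it should leave 1 and 2 in the result.

```autoit
#include <Array.au3>

Local $aArray1[4] = [1, 2, 3, 4]
Local $aArray2[4] = [3, 4, 5, 6]

; A binary search needs the lookup array in ascending order
_ArraySort($aArray2)

; Collect the values of Array1 that are not present in Array2
Local $aKept[UBound($aArray1)], $iCount = 0
For $i = 0 To UBound($aArray1) - 1
    ; _ArrayBinarySearch returns -1 when the value is not found
    If _ArrayBinarySearch($aArray2, $aArray1[$i]) > -1 Then ContinueLoop
    $aKept[$iCount] = $aArray1[$i]
    $iCount += 1
Next

; Trim the unused slots at the end
ReDim $aKept[$iCount]

_ArrayDisplay($aKept, "Array1 without the shared values")
```

Copying the survivors into a new array avoids calling _ArrayDelete once per match, which resizes the array every time.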

kylomas

Jewtus,

These are 1D arrays. In post #3 you allude to a 2D array. Do you want to eliminate duplicates anywhere they exist, or only by column?

kylomas


Jewtus

I want to eliminate the entire row if there is a match in the first column.

I'm looking at search results and comparing them to a new set of search results, and I'm trying to avoid doing more work on the results that I've already processed.
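
A sketch of one way to do this for 2D arrays, offered as an assumption rather than a tested solution for the real CSV data: index the first column of the already-processed array in a Scripting.Dictionary COM object, then copy every row whose first-column value is not in that index into a new 2D array, updating the progress bar along the way. The function name _RemoveProcessedRows and the sample data are hypothetical. CompareMode is set to 1 (text compare) so that matching is case-insensitive, like AutoIt's = operator.

```autoit
#include <Array.au3>

; Hypothetical sample data: column 0 is the key, column 1 is a payload
Local $aProcess[4][2] = [["a", "new"], ["b", "new"], ["c", "new"], ["d", "new"]]
Local $aAlreadyChecked[2][2] = [["b", "done"], ["d", "done"]]

Local $aResult = _RemoveProcessedRows($aProcess, $aAlreadyChecked)
_ArrayDisplay($aResult, "Rows still to process")

; Returns the rows of $aRows whose first column does not occur in the first column of $aDone.
; Returns 0 and sets @error to 1 when no rows are left.
Func _RemoveProcessedRows(ByRef $aRows, ByRef $aDone)
    ; Index the keys of the already-processed array for fast lookups
    Local $oSeen = ObjCreate("Scripting.Dictionary")
    $oSeen.CompareMode = 1 ; text compare, must be set while the dictionary is empty
    For $i = 0 To UBound($aDone) - 1
        $oSeen.Item(String($aDone[$i][0])) = 1
    Next

    Local $iRows = UBound($aRows), $iCols = UBound($aRows, 2)
    If $iRows = 0 Then Return SetError(1, 0, 0)
    Local $aKept[$iRows][$iCols], $iCount = 0, $iPercent = 0

    ProgressOn("Cleaning up", "Removing already processed rows", "0%")
    For $i = 0 To $iRows - 1
        ; Refresh the bar every 500 rows so the GUI update does not dominate the loop
        If Mod($i, 500) = 0 Then
            $iPercent = Round($i / $iRows * 100)
            ProgressSet($iPercent, $iPercent & "%")
        EndIf
        ; Skip rows whose key was already processed
        If $oSeen.Exists(String($aRows[$i][0])) Then ContinueLoop
        ; Copy the whole row, not just the first column
        For $j = 0 To $iCols - 1
            $aKept[$iCount][$j] = $aRows[$i][$j]
        Next
        $iCount += 1
    Next
    ProgressOff()

    If $iCount = 0 Then Return SetError(1, 0, 0)
    ; Trim the unused rows at the end
    ReDim $aKept[$iCount][$iCols]
    Return $aKept
EndFunc
```

With the sample data this should keep the rows for "a" and "c". Each row costs one dictionary lookup instead of a scan through the second array, and the result stays 2D because every column is copied.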
