Posted

Hello, I have a large file of ~2 million lines (max 20 characters per line), and I need to loop through all the lines repeatedly and do some calculations (each line's length, the total length, and so on).

#include <Array.au3>

For $i = 1 to 2000000 step +1
    $readline1 = FileReadLine(@ScriptDir & "\data.csv", $i)
    For $z = 1 to 2000000 step +1
        $readline2 = FileReadLine(@ScriptDir & "\data.csv", $z)
        ;ToolTip($readline1 & $z &"-"& $readline2)
        ;Calculate
    Next
Next
#include <Array.au3>

$arr = FileReadToArray(@ScriptDir & "\data.csv") ; returns a 0-based array
For $b = 0 to UBound($arr) - 1
    For $v = 0 to UBound($arr) - 1
        ;ToolTip($arr[$b] &"-"& $arr[$v])
        ;Do stuff
    Next
Next
_ArrayDisplay($arr)

The problem is that it takes too long to loop through all those lines millions of times.

Is there a better approach to this?

Thank you.

  • Developers
Posted
  On 3/26/2019 at 6:59 PM, JohnyX said:

Is there a better approach to this?

Before even trying to answer that: Why do you need to loop through the whole file for each record?

Jos

SciTE4AutoIt3 Full installer Download page   - Beta files       Read before posting     How to post scriptsource   Forum etiquette  Forum Rules 
 
Live for the present,
Dream of the future,
Learn from the past.
  :)

Posted (edited)

Thanks for your answer.

Like I said, I need to compare and perform calculations for each record. As a last resort I will import the CSV into Excel, but I will have to split the file because Excel can only handle a little over 1 million rows.

Edited by JohnyX
  • Developers
Posted

Hence my question....    and you are still utterly vague on what it is you need to do with the content of the record so how can you expect to get any proper help? ;) 

Jos

Posted

Going by just the script you posted, this is the fastest way to do it. You read through the whole file 2,000,000 times, because of your 2 loops. Do you REALLY need to do that?

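#include <Array.au3>
$hFile1 = FileOpen(@ScriptDir & "\data.csv") ; default mode is read-only
If $hFile1 = -1 Then Exit 1 ; the file could not be opened
While 1
    $readline1 = FileReadLine($hFile1) ; no line number: reads the next line
    If @error = -1 Then ExitLoop ; end of file reached
    ;ToolTip($readline1)
    ;Calculate
WEnd
FileClose($hFile1)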

If you pass a file handle to FileReadLine instead of the file name, it's much faster, because the file isn't opened and closed on every call. With no line number given, FileReadLine also advances by one line automatically each time you call it.

If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator

Posted

I'd also suggest you check out:

FileReadToArray()

; and for delimited data

_FileReadToArray()

In my experience with large files it is always faster to read the entire file into an array and then work with the array.
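As a minimal sketch of that suggestion: the helper below reads the whole file with a single FileReadToArray() call and then works only on the in-memory array. The function name _TotalLineLength is hypothetical (it is not part of AutoIt or of this thread), and the data.csv path is taken from the first post.

```autoit
; Hypothetical helper (not from the thread): returns the total number of
; characters across all lines of a file, or -1 if the file cannot be read.
Func _TotalLineLength($sPath)
    Local $aLines = FileReadToArray($sPath) ; the whole file is read in one call
    If @error Then Return SetError(1, 0, -1) ; file missing, unreadable or empty
    Local $iTotal = 0
    For $i = 0 To UBound($aLines) - 1 ; the returned array is 0-based
        $iTotal += StringLen($aLines[$i])
    Next
    Return $iTotal
EndFunc

; Assumes data.csv sits next to the script, as in the first post.
ConsoleWrite(_TotalLineLength(@ScriptDir & "\data.csv") & @CRLF)
```

Using UBound() for the loop limit instead of a hard-coded 2000000 keeps the loop inside the array whatever the real line count is.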

Problem solving step 1: Write a simple, self-contained, running, replicator of your problem.

Posted

Hi, sorry for the late reply and for my vague previous answers; I was in a rush.

This is what my script should do:

#include <Array.au3>

For $i = 1 to 2000000 step +1
    $readline1 = FileReadLine(@ScriptDir & "\data.csv", $i)
    For $z = 1 to 2000000 step +1
        $readline2 = FileReadLine(@ScriptDir & "\data.csv", $z)
        ToolTip($readline1 & $z &"-"& $readline2)
        ;Calculate
        If StringLen($readline1) <> StringLen($readline2) Then
            $len = StringLen($readline1 & $readline2)
            If $len >= 21 And StringLeft($readline1, 1) = "t" Then
                FileWriteLine(@ScriptDir & "\output.txt", $readline1&$readline2 &"-"& $len)
            ElseIf $len < 21 And StringRight($readline2,1) = "v" Then
                FileWriteLine(@ScriptDir & "\output.txt", $readline1&$readline2 &"-"& $len)
            EndIf
        EndIf
        ;Done
    Next
Next

I am looping so many times because I can't think of another way to get the total length and the first and last character of the two records.

Posted
  On 3/27/2019 at 3:57 PM, JohnyX said:

two records.

What are you trying to achieve by comparing the two? From what I can see, you read the first line of the file, then compare it to the 2,000,000 other lines in the file. After that you read the second line of the file, and proceed to loop through 2,000,000 lines of the file again to compare that one. Then you write up to 4 trillion lines (2,000,000 × 2,000,000 pairs) of seemingly useless data to another file.
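For reference, the tests in the posted script use only three properties of each line: its length, whether it starts with "t", and whether it ends with "v". The sketch below, which is an assumption about how this could be restructured and not something proposed in the thread, reads the file once, caches those three values per line, and keeps the output file open. The name _WritePairs is hypothetical. It still visits every pair of lines (2,000,000 × 2,000,000 = 4 × 10^12 for the full file), so it only removes the per-pair disk reads and repeated string calls.

```autoit
; Hypothetical helper (not from the thread): writes the same pairs as the
; posted script and returns how many lines were written.
Func _WritePairs($sInPath, $sOutPath)
    Local $aLines = FileReadToArray($sInPath) ; input is read from disk once
    If @error Then Return SetError(1, 0, 0)
    Local $iLast = UBound($aLines) - 1

    ; One pass: cache each line's length and its first/last character tests.
    Local $aLen[$iLast + 1], $aStartsT[$iLast + 1], $aEndsV[$iLast + 1]
    For $i = 0 To $iLast
        $aLen[$i] = StringLen($aLines[$i])
        $aStartsT[$i] = (StringLeft($aLines[$i], 1) = "t")
        $aEndsV[$i] = (StringRight($aLines[$i], 1) = "v")
    Next

    Local $hOut = FileOpen($sOutPath, 1) ; 1 = append mode, opened only once
    If $hOut = -1 Then Return SetError(2, 0, 0)

    Local $iWritten = 0, $iSum
    For $i = 0 To $iLast
        For $j = 0 To $iLast
            If $aLen[$i] = $aLen[$j] Then ContinueLoop ; only lines of different length
            $iSum = $aLen[$i] + $aLen[$j] ; equals StringLen of the joined lines
            If ($iSum >= 21 And $aStartsT[$i]) Or ($iSum < 21 And $aEndsV[$j]) Then
                FileWriteLine($hOut, $aLines[$i] & $aLines[$j] & "-" & $iSum)
                $iWritten += 1
            EndIf
        Next
    Next
    FileClose($hOut)
    Return $iWritten
EndFunc

; Example call (try it on a small sample file first):
; ConsoleWrite(_WritePairs(@ScriptDir & "\data.csv", @ScriptDir & "\output.txt") & @CRLF)
```

Whether the number of pairs itself can be reduced depends on what the output is needed for, which the thread leaves open.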
