20,073,124 array - How would you do it ?

argumentum · Saturday at 11:19 PM

I have a CSV file.
It is 3.12 GB (3,357,068,152 bytes).
FileReadToArray() loaded it ( just the count ) and UBound() says that is 20073124, which just happens to be a tiny bit more than the max of 16 million elements an array can hold.

How can a file that size be read, chopped into @CRLF chunks, filtered ( If Not StringInStr($aFile[$n], ".au3") Then ContinueLoop ) and the result written to another file ?

Anything goes: SQLite, memory, etc.

Thanks

Nine · 2026-01-25T03:52:18Z

Read it line by line. First thing that comes to mind...

argumentum · 2026-01-25T04:27:44Z

35 minutes ago, Nine said:

First thing that comes to mind...

First thing that came to my mind was to read up to X position, split and work on that, then read the next until done 🤷‍♂️

Line by line would be kind of slow I think

Edited yesterday at 04:28 AM by argumentum

jchd · 2026-01-25T10:19:52Z

Import it into SQLite and apply whatever SQL magic massage your data needs. You can just use the SQLite CLI (the command-line interface) with an in-memory (default) or disk-based database. You can define row and column separators if needed, .import the file as --csv, apply SQL to filter out unwanted rows, then write the resulting table back out as CSV or whatever format you wish. If this operation becomes routine, store all these commands in a file and run it with the CLI.

Use a recent sqlite3.exe, it's now full of rich features.

Else slap a good regex in the face of the file!

Edited 20 hours ago by jchd

jchd · 2026-01-25T10:25:39Z

11 hours ago, argumentum said:

FileReadToArray() loaded ( just the count ) and UBound() say that is 20073124 and that just happens to be a tiny bit more than the max of 16 million elements an array can hold.

IIRC the 16M limit is just an order of magnitude; the actual limit significantly depends on the volume of data.

Nine · 2026-01-25T12:13:35Z

8 hours ago, argumentum said:

Line by line would be kind of slow I think

In fact it would be faster. You do not need to create a useless intermediate array. Reading a bunch of lines at a time won't make it faster either, because the OS preemptively reads the file into memory buffers. I remember trying an overlapped read to perform tasks while reading the file: it ended up with very little gain but made the script way more complex. Try it and you will be gladly surprised.
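
For reference, the line-by-line approach could look roughly like the sketch below. It is only an illustration, not code posted in the thread: the two file names are hypothetical placeholders, and the filter is the same StringInStr() test for ".au3" from the first post.

```autoit
#include <FileConstants.au3>

; Hypothetical paths, not from the thread: adjust to the real files.
Local Const $sInPath = @ScriptDir & "\input.csv"
Local Const $sOutPath = @ScriptDir & "\filtered.csv"

; Open the source for reading and the target for writing (overwriting any old copy).
Local $hIn = FileOpen($sInPath, $FO_READ)
If $hIn = -1 Then Exit 1

Local $hOut = FileOpen($sOutPath, $FO_OVERWRITE)
If $hOut = -1 Then
    FileClose($hIn)
    Exit 2
EndIf

Local $sLine
While True
    ; With a file handle, each call returns the next line without its line ending.
    $sLine = FileReadLine($hIn)
    If @error Then ExitLoop ; -1 = end of file, 1 = read error

    ; Same filter as in the first post: skip lines that do not contain ".au3".
    If Not StringInStr($sLine, ".au3") Then ContinueLoop

    ; FileWriteLine() appends @CRLF when the line has no line ending of its own.
    FileWriteLine($hOut, $sLine)
WEnd

FileClose($hOut)
FileClose($hIn)
```

Only one line is held in memory at a time, so no array is built and the 16 million element limit never comes into play. Opening both files once and passing the handles around matters here: calling FileReadLine() with a file name instead of a handle would reopen the file on every call.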

Edited 18 hours ago by Nine

argumentum · 2026-01-25T21:44:26Z

9 hours ago, Nine said:

Try it and you will be gladly surprised.

True. I did and yes, it was good
