_ArrayTo2DArray() - Parse Large Text Files To 2D Array Quickly [With Chunk Size]

DJKMan


This script is fairly straightforward, and it may help if you have ever worked with large files. By large I mean files of 2 MB or so. That may not sound big, but going through such a file and parsing it into a 2D array all at once took an astronomical amount of time, so I wrote my own function to handle it. I discovered that chunking a large array can speed up iterating through its elements, and in theory this should keep performance steady no matter how large the array is. I know there is room for improvement, so please feel free to contribute!

Note: I wasn't able to fully test this on larger files (around 200 MB, for example) because AutoIt reported a memory allocation error while executing _FileReadToArray(). Any help is appreciated.
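One thing that might be worth trying for the memory problem is to avoid building the 1D array at all and walk the file line by line instead. Below is a minimal sketch of a first pass that only counts lines and the widest row, so that a 2D array could later be sized once. The function name _Sketch_ScanFile and the comma delimiter are assumptions made for illustration; they are not part of the attached script, and I have not verified that this avoids the allocation error on a 200 MB file.

```autoit
#include <FileConstants.au3>

; Hypothetical helper, not part of _ArrayTo2DArray.au3.
; Reads a file one line at a time instead of loading it into an array.
; Returns the line count and puts the widest column count in @extended.
Func _Sketch_ScanFile($sPath, $sDelim = ",")
    Local $hFile = FileOpen($sPath, $FO_READ)
    If $hFile = -1 Then Return SetError(1, 0, 0)

    Local $iLines = 0, $iMaxCols = 0, $sLine, $aFields
    While 1
        $sLine = FileReadLine($hFile)
        If @error Then ExitLoop ; end of file, or a read error
        $iLines += 1
        ; Flag 2 ($STR_NOCOUNT): fields only, no count in element 0
        $aFields = StringSplit($sLine, $sDelim, 2)
        If UBound($aFields) > $iMaxCols Then $iMaxCols = UBound($aFields)
    WEnd

    FileClose($hFile)
    Return SetExtended($iMaxCols, $iLines)
EndFunc

; Usage: capture @error and @extended right after the call
Local $iLines = _Sketch_ScanFile("LARGE TEXT.txt")
Local $iErr = @error, $iCols = @extended
If $iErr Then
    ConsoleWrite("Could not open the file" & @CRLF)
Else
    ConsoleWrite($iLines & " lines, up to " & $iCols & " columns" & @CRLF)
EndIf
```

The resulting 2D array still has to fit in memory, so this only removes the intermediate 1D copy, not the overall limit.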

Features:

  • Chunking (performance should not degrade as the input grows, e.g. whether the file has 200 lines or 20,000)
  • Automatically resizes to fit a variable number of columns
  • Preserves columns while parsing
  • Fast! (I can parse a file of 24,000 lines, with a variable number of columns up to 8, in under a second.)
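For readers who want a feel for the idea without opening the attachment, here is a minimal sketch of chunked growth with dynamic columns. It is not the attached _ArrayTo2DArray.au3. The function name _Sketch_LinesTo2D, the comma delimiter, the default chunk size of 1000, and the reading of "chunking" as growing the array with ReDim a block of rows at a time are all assumptions on my part.

```autoit
#include <Array.au3>

; Hypothetical helper, NOT the attached script.
; Converts a 1D array of delimited lines into a 2D array, reserving
; $iChunk rows at a time instead of resizing once per line.
Func _Sketch_LinesTo2D(Const ByRef $aLines, $sDelim = ",", $iChunk = 1000)
    Local $iRows = 0, $iCols = 1, $iCapacity = $iChunk
    Local $aOut[$iCapacity][$iCols]
    Local $aFields, $bResize

    For $i = 0 To UBound($aLines) - 1
        ; Flag 2 ($STR_NOCOUNT): fields only, no count in element 0
        $aFields = StringSplit($aLines[$i], $sDelim, 2)
        $bResize = False

        ; Out of spare rows: reserve another chunk
        If $iRows = $iCapacity Then
            $iCapacity += $iChunk
            $bResize = True
        EndIf
        ; Wider line than any seen so far: add columns
        If UBound($aFields) > $iCols Then
            $iCols = UBound($aFields)
            $bResize = True
        EndIf
        If $bResize Then
            ReDim $aOut[$iCapacity][$iCols] ; keeps existing cells
        EndIf

        For $j = 0 To UBound($aFields) - 1
            $aOut[$iRows][$j] = $aFields[$j]
        Next
        $iRows += 1
    Next

    If $iRows = 0 Then Return SetError(1, 0, 0)
    ; Trim the unused rows left over from the last chunk
    ReDim $aOut[$iRows][$iCols]
    Return $aOut
EndFunc

; Usage with a tiny chunk size so the growth path is exercised
Local $aLines[3] = ["a,b", "c,d,e,f", "g"]
Local $aSheet = _Sketch_LinesTo2D($aLines, ",", 2)
_ArrayDisplay($aSheet) ; 3 rows by 4 columns, short rows padded with empty cells
```

Note that the sketch treats every element as a data line, so if the input array carries a line count in element 0, that count will show up as the first row.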

Script:

_ArrayTo2DArray.au3

Example usage:

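#include <File.au3> ;Needed for _FileReadToArray()
#include "_ArrayTo2DArray.au3" ;The attached script, assumed to be saved next to this one
Local $aExport ;Declare the variable that will receive the array
_FileReadToArray("LARGE TEXT.txt", $aExport) ;Fills $aExport with a 1D array of the file's lines
Local $aSheet = _ArrayTo2DArray($aExport) ;Converts it to 2D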

Example Text File:

LARGE TEXT.txt

 

This script was inspired by this post.

*Updated attachment: Minor bug fixes*

 

*UPDATE June 6, 2013: I apologize! I just realized I made a complete mess of the algorithm. I'm working on a fix now.*

*UPDATE June 6, 2013: Bug fixed! It's attached in the post now.*

Edited by DJKMan

My work in AutoIt (Not many yet):

Parse Large Text Files To 2D Array Quickly [With Chunk Size]

 

My artificial intelligence project coded entirely in AutoIt. Meet Alice Assistant: http://facebook.com/ProjectAliceAI

DJKMan

I'm glad you like it! I have an idea for improving it: have the function time itself and automatically adjust the chunk size. The algorithm could then tune itself toward the best performance it can achieve, which should make parsing even faster!

Plus, it will be much easier to use, since we will no longer have to work out the optimum chunk size manually on each platform! :)
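A rough sketch of how such self-timing could look: run a small sample workload with a few candidate chunk sizes, time each with TimerInit()/TimerDiff(), and keep the fastest. Both functions below are hypothetical. _Sketch_Workload only stands in for the real parsing function, and the candidate sizes and sample size are arbitrary assumptions.

```autoit
; Hypothetical stand-in for the real parser: builds a 2D array of
; $iTotal rows, growing it $iChunk rows at a time.
Func _Sketch_Workload($iTotal, $iChunk)
    Local $iCapacity = $iChunk
    Local $aOut[$iCapacity][2]
    For $i = 0 To $iTotal - 1
        If $i = $iCapacity Then
            $iCapacity += $iChunk
            ReDim $aOut[$iCapacity][2]
        EndIf
        $aOut[$i][0] = $i
    Next
    Return $aOut
EndFunc

; Times the workload once per candidate and returns the fastest chunk size.
Func _Sketch_BestChunk(Const ByRef $aCandidates, $iSampleRows = 5000)
    Local $iBest = $aCandidates[0], $fBestMs = -1
    Local $hTimer, $fMs
    For $i = 0 To UBound($aCandidates) - 1
        $hTimer = TimerInit()
        _Sketch_Workload($iSampleRows, $aCandidates[$i])
        $fMs = TimerDiff($hTimer) ; elapsed milliseconds
        If $fBestMs < 0 Or $fMs < $fBestMs Then
            $fBestMs = $fMs
            $iBest = $aCandidates[$i]
        EndIf
    Next
    Return $iBest
EndFunc

Local $aSizes[4] = [100, 500, 1000, 5000]
ConsoleWrite("Fastest chunk size in this run: " & _Sketch_BestChunk($aSizes) & @CRLF)
```

A single timing per candidate is noisy, so a real version would probably want to repeat each measurement and compare averages.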



