gcue

filereadtoarray to read CSV log


#1 ·  Posted (edited)

I am trying to create an array from a text file.

I am trying to use _FileReadToArray, but each line has commas separating the data fields.

user1,prog,08/05/2009 10:15:36,k:\tools\das,@hostname

user2,prog,08/05/2009 07:16:39,C:\CRTools\Das,@hostname

user3,prog,08/05/2009 08:24:12,C:\Das,@hostname

I would like to use each of those commas to mark a new column in the array,

so in the example shown, the array would have 5 columns per row.

this file can get very large - what would be the fastest way to do this?

thanks!

Edited by gcue



#include <Array.au3>

Global Const $sFile = @ScriptDir & '\file.csv'
Global $hFile
Global $sText
Global $avRowArray, $avColArray, $avCSVArray[1][1] = [[0]]

$hFile = FileOpen($sFile, 0)
$sText = FileRead($hFile)
FileClose($hFile)

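; Split the file into rows; flag 1 treats @CRLF as a single delimiter
$avRowArray = StringSplit($sText, @CRLF, 1)
If @error = 0 Then
    ReDim $avCSVArray[$avRowArray[0]+1][1]
    $avCSVArray[0][0] = $avRowArray[0] ; the row count is kept in [0][0]
    
    For $i = 1 To $avRowArray[0]
        $avColArray = StringSplit($avRowArray[$i], ',')
        
        ; @error is set when the row contains no comma, so such rows are skipped
        If Not @error Then
            ; Widen the array only when this row has more fields than any row so far
            If $avColArray[0] > UBound($avCSVArray, 2) Then ReDim $avCSVArray[$avCSVArray[0][0]+1][$avColArray[0]]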
        
            For $j = 1 To $avColArray[0]
                $avCSVArray[$i][$j-1] = $avColArray[$j]
            Next
        EndIf
    Next
    
    _ArrayDisplay($avCSVArray)
EndIf

Seems like you'll need to manipulate the resulting array using a loop.
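
On the "5 dimensions" wording in the question: what the script above builds is a 2D array (rows by columns), with the row count stored in element [0][0]. As a minimal sketch of how such an array could be walked afterwards, assuming that layout and using two of the sample rows hard-coded in place of the file:

```autoit
; Sample data standing in for the array built from file.csv (assumed layout: [0][0] = row count)
Local $avCSVArray[3][5] = [[2], _
        ["user1", "prog", "08/05/2009 10:15:36", "k:\tools\das", "@hostname"], _
        ["user2", "prog", "08/05/2009 07:16:39", "C:\CRTools\Das", "@hostname"]]

; Rows start at index 1 because row 0 only carries the count
For $i = 1 To $avCSVArray[0][0]
    Local $sLine = ""
    For $j = 0 To UBound($avCSVArray, 2) - 1
        $sLine &= $avCSVArray[$i][$j] & " | "
    Next
    ConsoleWrite(StringTrimRight($sLine, 3) & @CRLF) ; drop the trailing " | "
Next
```

UBound($avCSVArray, 2) gives the column count, so the same loop works if a wider row forced the array to grow.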


Thanks Authenticity, I'll try it =)


#4 ·  Posted (edited)

This method of getting a csv file into an AutoIt 2D array also appears to work.

;
; This example reads any csv file to a AutoIt 2D array.
#include <array.au3>

Local $sFile = "Button63x17.csv"

;Get number of lines / rows
Local $aREResult = StringRegExp(FileRead($sFile), ".+(?=\v+|$)", 3) ; returns array of every line
Local $iNumLines = UBound($aREResult)
ConsoleWrite("$iNumLines; " & $iNumLines & @CRLF)

;Get number of commas / columns.
StringReplace(FileRead($sFile), ",", ",") ; the replacement count (number of commas in file) lands in @extended
Local $iNumCommas = @extended
Local $iNumCols = Int($iNumCommas / $iNumLines) + 1 ; assumes every row has the same number of fields
ConsoleWrite("Columns per row; " & $iNumCols & @CRLF)

Global $aMain[$iNumLines][$iNumCols], $iRow = 0 ; Array for csv file

_CSVFileToArray($sFile) ; Fill array from file

_ArrayDisplay($aMain, "csv file Results")


Func _CSVFileToArray($sFile)
    Execute(StringTrimRight(StringRegExpReplace(StringRegExpReplace(FileRead($sFile), '"', '""'), "(\V+)(\v+|$)", 'Test1(StringRegExp("\1","([^,]+)(?:,|$)",3)) & '), 3))
EndFunc   ;==>_CSVFileToArray

; Fills each row of the required 2D array
Func Test1($aArr)
    For $x = 0 To UBound($aArr) - 1
        $aMain[$iRow][$x] = $aArr[$x]
    Next
    $iRow += 1
    Return
EndFunc   ;==>Test1
;

Edit: Fixed the count of commas in the file.

Fixed the handling of double quotes in the file.
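
The one-liner in _CSVFileToArray can be hard to read, so here is a small sketch of what it appears to do: it rewrites every line of the file into a `Test1(StringRegExp(...))` call, joins the calls with `&`, and hands the resulting string to Execute(). The snippet below builds the same string from made-up sample data (`$sCsv` is an assumption, not part of the original script) and prints it instead of executing it.

```autoit
; Illustration only: show the expression string that would be passed to Execute()
Local $sCsv = 'a,b' & @CRLF & 'c,"d"' ; made-up two-line sample

; Double every quote so the line survives inside an AutoIt string literal
Local $sEscaped = StringRegExpReplace($sCsv, '"', '""')

; Wrap each line (\V+ = run of non-vertical-whitespace) in a Test1(StringRegExp(...)) call
Local $sCalls = StringRegExpReplace($sEscaped, "(\V+)(\v+|$)", 'Test1(StringRegExp("\1","([^,]+)(?:,|$)",3)) & ')

; Drop the trailing " & " left after the last call
Local $sExpr = StringTrimRight($sCalls, 3)
ConsoleWrite($sExpr & @CRLF)
```

Because the inner pattern `([^,]+)` needs at least one character per field, empty fields (two commas in a row) are skipped rather than stored as empty cells, which is worth keeping in mind for files with missing values.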

Edited by Malkey


works great!!!

many thanks!


#6 ·  Posted (edited)

It does get used, via StringRegExpReplace(). It's one of Malkey's new techniques to mimic a regex callback. If you know your file is going to be organized in a consistent format, i.e. every row consists of 5 comma-separated fields, you can drop the conditions inside the loop; they take more time than you might think, I think.

Edit: As I expected, the conditions consume most of the processing time. This one took ~13 seconds on my 8-year-old PC (no kidding, heh):

#include <Array.au3>

Global Const $sFile = @ScriptDir & '\file.csv'
Global $hFile
Global $sText
Global $avRowArray, $avColArray

Global $iInit = TimerInit()
$hFile = FileOpen($sFile, 0)
$sText = FileRead($hFile)
FileClose($hFile)

$avRowArray = StringSplit($sText, @CRLF, 1)
If @error = 0 Then
    Dim $avCSVArray[$avRowArray[0]+1][5]
    $avCSVArray[0][0] = $avRowArray[0]
    
    For $i = 1 To $avRowArray[0]
        $avColArray = StringSplit($avRowArray[$i], ',')
        For $j = 1 To $avColArray[0]
            $avCSVArray[$i][$j-1] = $avColArray[$j]
        Next
    Next
    
    ConsoleWrite(TimerDiff($iInit) & @CRLF)
    ;_ArrayDisplay($avCSVArray)
EndIf

The file is simply the OP's initial example duplicated up to 100K lines, with 5 values on each line.

Edit: This one specific to this example took ~20 seconds to fill array out of 160K lines:

#include <Array.au3>

Global Const $sFile = @ScriptDir & '\file.csv'
Global $hFile
Global $sText
Global $aMatch

Global $iInit = TimerInit()
$hFile = FileOpen($sFile, 0)
$sText = FileRead($hFile)
FileClose($hFile)

$aMatch = StringRegExp($sText, '([^,\r]+)(?:,|\r\n)?', 3)

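Global Const $iUpperBound = UBound($aMatch)
Global Const $iCols = 5 ; fixed column count for this specific file
Global Const $iRows = Int($iUpperBound / $iCols) ; Int() keeps the dimension whole if the last row is short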
Global $avCSVArray[$iRows][$iCols]
Global $iCounter = 0

For $i = 0 To $iUpperBound-$iCols Step $iCols
    For $j = 0 To $iCols-1
        $avCSVArray[$iCounter][$j] = $aMatch[$i+$j]
    Next
    
    $iCounter += 1
Next

ConsoleWrite(TimerDiff($iInit) & @LF)

Edited by Authenticity


Thanubis

Thanks for the feedback.

You made me have another look at my script.

I noticed that if a double quote was present in the csv file, my script did not work.

Also, if two commas were next to each other, the wrong number of commas in the file was returned.

Both these bugs have been fixed, and I have re-posted the corrected script.

Ready for your scrutiny once more.

Malkey


You sure got the time. You must have a nice computer if it took you ~3 seconds to parse 36,000 lines; it took me about three times as long.

I've tried a few more experiments and found one that works quite nicely without shifting, and it may also allow the last value to be omitted, though that can be problematic. It doesn't record empty lines; they are completely ignored. It's just a simple change:

$aMatch = StringRegExp($sText, '([^,\r]*+)(?>,|(?>\r\n)*)', 3)


#9 ·  Posted (edited)

I'm not sure it is the best or fastest way, but I have made a function for that.

This function takes care of three things.

The first is that a line can have more elements than the previous line.

The second is that a csv file sometimes contains empty lines that you don't want.

The third is that you can choose the separator.

It produces a multidimensional array sized according to the number of elements.

I tried it on a 36,000-line file, and the time seems reasonable: about 3 seconds on my C2D 1.66 GHz.

Take the file attached to this post: http://www.autoitscript.fr/forum/viewtopic.php?p=16275#p16275

Examples are included.

I hope that's what you are looking for.
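
The attached file is not reproduced in this thread, so the following is only a rough sketch of a function with the three properties listed above (rows of different lengths, empty lines skipped, selectable separator). The name `_CsvTextTo2D` and its parameters are hypothetical and are not taken from Tlem's script.

```autoit
; Hypothetical helper, not Tlem's function: converts delimited text to a 2D array.
; Element [0][0] of the result holds the number of data rows; data starts at row 1.
Func _CsvTextTo2D($sText, $sDelim = ",")
    ; Normalise line endings so CRLF, CR and LF files are all handled
    $sText = StringReplace(StringReplace($sText, @CRLF, @LF), @CR, @LF)
    Local $aLines = StringSplit($sText, @LF)
    Local $aOut[$aLines[0] + 1][1], $iRows = 0, $aFields

    For $i = 1 To $aLines[0]
        ; Skip lines that are empty or contain only whitespace
        If StringStripWS($aLines[$i], 8) = "" Then ContinueLoop
        $aFields = StringSplit($aLines[$i], $sDelim, 1) ; flag 1: use the whole separator string
        ; Widen the array only when a row has more fields than any row before it
        If $aFields[0] > UBound($aOut, 2) Then ReDim $aOut[UBound($aOut, 1)][$aFields[0]]
        $iRows += 1
        For $j = 1 To $aFields[0]
            $aOut[$iRows][$j - 1] = $aFields[$j]
        Next
    Next

    ; Trim the rows reserved for skipped lines and store the final row count
    ReDim $aOut[$iRows + 1][UBound($aOut, 2)]
    $aOut[0][0] = $iRows
    Return $aOut
EndFunc   ;==>_CsvTextTo2D

; Usage: one full row, one empty line, one short row
Local $sSample = "user1,prog,08/05/2009 10:15:36,k:\tools\das,@hostname" & @CRLF & @CRLF & "user2,prog"
Local $aCsv = _CsvTextTo2D($sSample)
ConsoleWrite($aCsv[0][0] & " rows, " & UBound($aCsv, 2) & " columns" & @CRLF)
```

The ReDim inside the loop only runs when a wider row appears, so for files with a consistent column count it happens once. Quoted fields that contain the separator are not handled by this sketch.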

Edited by Tlem

Best regards, Thierry


wow tlem, that was insanely fast!!! it did mine in like 3 seconds as well..

amazing job!
