
UDF: _FileReadTo2DArray


mooseydoom

This function reads a delimited file into an array. =)

Version 2:

Taking Blindwig's advice, this only reads the file twice, once to get the dimensions of the array and once to read the info into the array.

;===============================================================================
;
; Description:    Returns the maximum number of occurrences of a substring
;                  in a single line of the specified file
; Syntax:          _FileMaxOccurances( $sFilePath, $sSubString [, $sIgnore = "#" [, $iSkipHeaderLines = 0 [, $iMaxLinesToSearch = 0 ]]] )
; Parameter(s):  $sFilePath - Path and filename of the file to be read
;                  $sSubString - the substring to search for on each line (case-sensitive)
;                  [optional]$sIgnore - if a line begins with these characters,
;                     it is considered a comment line and is not searched (default is #)
;                  [optional]$iSkipHeaderLines - skip this many lines at the beginning of the file
;                    (default is 0)
;                  [optional]$iMaxLinesToSearch - only use this many lines (not counting
;                    commented lines or header lines). Use 0 to search all lines.  (default is 0)
; Requirement(s):   None
; Return Value(s):  On Success - The maximum number of occurrences of the substring
;                  in one line
;                  @extended is set to the number of valid lines which were
;                    searched.  (lines which weren't comment lines and weren't
;                    header lines.  This number won't exceed the
;                    $iMaxLinesToSearch if $iMaxLinesToSearch is greater than zero.)
;                  On Failure - Returns -1 and sets @error = -1
; Author(s):        Kristi Tsukida <kristi.tsukida@gmail.com>
;
;===============================================================================
Func _FileMaxOccurances($sFilePath, $sSubString, $sIgnore = "#", $iSkipHeaderLines = 0, $iMaxLinesToSearch = 0)
    Local $ERROR = -1
    Local $READMODE = 0
    Local $FILEOPENERROR = -1
    Local $iMaxOccurances = 0
    Local $iMaxOccuranceLine = 0
    Local $iLineNum = 0;current line number
    Local $sLine;one line of text
    Local $iCountSearchedLines = 0
    
;Determine if we are ignoring comment lines
    Local $iIgnoreComments = IsString($sIgnore) And StringLen($sIgnore) > 0
    
;Open the file
    Local $handle = FileOpen($sFilePath, $READMODE)
    If $handle == $FILEOPENERROR Then
    ;error opening the file
        SetError($ERROR)
        Return $ERROR
    EndIf
    
    While 1
    ;Search each line for occurrences
        $sLine = FileReadLine($handle)
        If @error <> 0 Then ExitLoop
        $iLineNum = $iLineNum + 1
        If $iLineNum <= $iSkipHeaderLines Then ContinueLoop
        If $iIgnoreComments And StringInStr($sLine, $sIgnore, 1) == 1 Then ContinueLoop
    ;StringReplace() reports the number of replacements in @extended, so the
    ;return value is discarded and only the count is used
        StringReplace($sLine, $sSubString, "", 0, 1)
        If @extended > $iMaxOccurances Then
            $iMaxOccurances = @extended
            $iMaxOccuranceLine = $iLineNum
        EndIf
        $iCountSearchedLines = $iCountSearchedLines + 1
        If $iMaxLinesToSearch > 0 AND $iCountSearchedLines >= $iMaxLinesToSearch Then ExitLoop
    WEnd
;Close the file
    FileClose($handle)
;Return the number of valid (non-comment, non-header) lines which were searched
    SetExtended($iCountSearchedLines)
;Return maximum number of times the substring appeared
    Return $iMaxOccurances
EndFunc  ;==>_FileMaxOccurances

;===============================================================================
;
; Description:    Reads in a delimited file into a 2D array
; Syntax:          _FileReadTo2DArray( $sFilePath [, $sDelim [, $sIgnore [, $iSkipHeaderLines [, $iMaxRowsToUse]]]] )
; Parameter(s):  $sFilePath - Path and filename of the file to be read
;                  [optional] $sDelim - Characters which separate columns (default is tab)
;                  [optional] $sIgnore - Ignore lines starting with this 
;                     character. (default is #)
;                  [optional] $iSkipHeaderLines - skip this many lines at the beginning of the file
;                    (default is 0)
;                  [optional] $iMaxRowsToUse - if a number greater than zero is given, 
;                    use up to this many lines in the array.  (The maximum number of rows
;                    in the returned array)
; Requirement(s):   _FileMaxOccurances UDF (in file2.au3)
; Return Value(s):  On Success - Returns a 2 dimensional array (0-indexed) of the delimited file
;                    @extended = number of rows
;                  On Failure - Returns 0 and sets @error = 1
; Author(s):        Kristi Tsukida <kristi.tsukida@gmail.com>
; Version:        2.0
;
;===============================================================================
Func _FileReadTo2DArray($sFilePath, $sDelim = @TAB, $sIgnore = "#", $iSkipHeaderLines = 0, $iMaxRowsToUse=0)
    Local $ERROR = 1; @error value if there's an error reading the file
    Local $RETURN = 0; value to return if there's an error reading the file
    Local $READMODE = 0; Open the file in read mode
    Local $FILEOPENERROR = -1; this is an invalid file handle
    Local $IGNORELOCATION = 1; location of the $sIgnore char: beginning of a line
    
;Determine if we are ignoring comment lines
    Local $iIgnoreComments = IsString($sIgnore) And StringLen($sIgnore) > 0
    
;Check if file exists
    If Not FileExists($sFilePath) Then
    ;error - there is no file!
        SetError($ERROR)
        Return $RETURN
    EndIf
    
;Set the array dimensions
;Func _FileMaxOccurances($sFilePath, $sSubString, $sIgnore = "#", $iSkipHeaderLines = 0, $iMaxLinesToSearch = 0)
    Local $numCols = _FileMaxOccurances($sFilePath, $sDelim, $sIgnore, $iSkipHeaderLines, $iMaxRowsToUse) + 1
    Local $numRows = @extended
;DEBUG MsgBox(0, "Array dimensions", "rows: " & $numRows & "  cols: " & $numCols)
    If $numRows <= 0 Then
    ;No data found (or the file could not be opened): fail as documented
        SetError($ERROR)
        Return $RETURN
    EndIf
    Local $aArray[$numRows][$numCols]
    
;Open the file
    Local $handle = FileOpen($sFilePath, $READMODE)
    If $handle == $FILEOPENERROR Then
    ;error opening the file
        SetError($ERROR)
        Return $RETURN
    EndIf
    
; Read in lines of text until the EOF is reached
    Local $linenum = 0
    Local $rowIndex = 0
    
    Local $line, $arrayRow
    While 1
        $line = FileReadLine($handle)
    ; Check for EOF (or any other read error)
        If @error <> 0 Then ExitLoop
    ; Since the line was successfully read, advance the line number
        $linenum = $linenum + 1
    ; Skip lines
        If $linenum <= $iSkipHeaderLines Then ContinueLoop
    ; Check for lines to skip if they're commented out
    ; (case-sensitive, to match the test in _FileMaxOccurances)
        If $iIgnoreComments And StringInStr($line, $sIgnore, 1) == $IGNORELOCATION Then ContinueLoop
        
    ; Split the line on the whole delimiter string (flag 1), which is how
    ; _FileMaxOccurances counted it when the array was sized
        $arrayRow = StringSplit($line, $sDelim, 1)
        
    ; Assign the elements into the original array
        For $col = 0 To $arrayRow[0] - 1
            $aArray[$rowIndex][$col] = $arrayRow[$col + 1]
        Next
        $rowIndex = $rowIndex + 1
        
        If $iMaxRowsToUse > 0 AND $rowIndex >= $iMaxRowsToUse Then ExitLoop
    WEnd
    
    FileClose($handle)
    SetExtended($numRows)
    Return $aArray
EndFunc  ;==>_FileReadTo2DArray
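
A note on how the column count above works: _FileMaxOccurances never uses the string that StringReplace() returns, only the replacement count it leaves in @extended. Here is a minimal standalone sketch of that trick. The sample line and the variable names are made up for illustration.

```autoit
; Count delimiters in one line via StringReplace()'s @extended value.
; $sLine is hypothetical sample data, not taken from a real file.
Local $sLine = "a" & @TAB & "b" & @TAB & "c"

; 0 = replace every occurrence, 1 = case-sensitive search.
; The returned string is thrown away; only the count matters.
StringReplace($sLine, @TAB, "", 0, 1)
Local $iDelims = @extended ; read it before any other function call resets it

; A line with N delimiters has N + 1 columns.
ConsoleWrite("delimiters: " & $iDelims & ", columns: " & ($iDelims + 1) & @CRLF)
```

This should print "delimiters: 2, columns: 3". The count is per whole substring, which is why the read pass has to split on the whole delimiter string as well.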
Edited by mooseydoom

Nice function!

But do you realize that you read the entire file 3 times? Why don't you combine your _FileCountUnCommentedLines() and _FileMaxOccurances() functions, then you'd only need to read it twice. Or write your routine to adjust the array using ReDim, then you could get by only reading it once.

Here's the function I use:

;Function reads a given file and returns a 2d array based on $sDelim and EOL for the dimensions.
;$vFile can be a string file path, or a handle to an already open file
;$sDelim tells which delimiters to use to split each line
;$iDelimFlag affects $sDelim.  See the helpfile for StringSplit() for details
;$sComment holds the string that when a line begins with that string, it will not be added to the array
;The return array is 1-based in both dimensions:
;[0][0] = number of lines
;[0][1] = maximum number of items
;[x][0] = number of items for line x
;Written by Mike Ratzlaff, 20050728
Func _FileReadToArray2d($vFile, $sDelim, $iDelimFlag=0, $sComment='')
    Local $hFile, $sLine, $aTemp, $aResult[100][10], $iCommentLen = StringLen($sComment)
    $aResult[0][0]=0
    $aResult[0][1]=2
    
;open the file if needed
    If IsInt($vFile) Then;$vFile is a filehandle, assumed to be already open
        $hFile = $vFile
    Else;$vFile is a string containing a filename
        $hFile = FileOpen($vFile, 0)
    EndIf

    Local $New = 0, $NewH = UBound($aResult, 1), $NewW = UBound($aResult, 2), $i, $IsComment
    $sLine = FileReadLine($hFile)
    While Not @error
    ;check to see if this is a commented line
        $IsComment = 0
        If $iCommentLen And StringLeft(StringStripWS($sLine, 1), $iCommentLen) = $sComment Then $IsComment = 1
    ;Process a non-comment line
        If Not $IsComment Then
        ;split read line into temp array
            $aTemp = StringSplit($sLine, $sDelim, $iDelimFlag)
        ;check output array height
            $aResult[0][0] = $aResult[0][0] + 1
            If $aResult[0][0] >= $NewH Then
                $NewH = $NewH * 2
                $New = 1
            EndIf
        ;check array output width
            If $aTemp[0] > $aResult[0][1] Then $aResult[0][1] = $aTemp[0]
            If $aTemp[0] >= $NewW Then
                $NewW = $aTemp[0] * 2
                $New = 1
            EndIf
        ;adjust output array size if necessary
            If $New Then
                ReDim $aResult[$NewH][$NewW]
                $New = 0
            EndIf
            $aResult[$aResult[0][0]][0] = $aTemp[0]
        ;add temp array to output array
            For $i = 1 To $aTemp[0]
                $aResult[$aResult[0][0]][$i] = $aTemp[$i]
            Next
        EndIf
    ;read the next line
        $sLine = FileReadLine($hFile)
    WEnd

;close the file only if this function opened it (a handle passed in stays open)
    If Not IsInt($vFile) Then FileClose($hFile)
    
;Clean out output array and return it
    ReDim $aResult[$aResult[0][0] + 1][$aResult[0][1] + 1]
    Return $aResult
EndFunc

Edit: Per request, I have changed the CODEBOX tags to just CODE tags.

Edited by blindwig

Grrr blindwig, why did you never comment on my func?

Isn't it more or less the same... :)


It looks pretty different actually. Your function takes a string and returns a string. Also, the inner split is only done on the first occurrence of the delimiter. So your function could handle "a,b;c,d;e,f" but wouldn't do "a,b,c;d,e,f" properly.
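
For comparison, here is a minimal sketch of a full two-level split on a string, where the inner split is done on every delimiter. The other function isn't shown in this thread, so this is only an illustration; $sData and the variable names are made up.

```autoit
; Split a string into rows on ";" and then split every row on ",".
; $sData is hypothetical sample input.
Local $sData = "a,b,c;d,e,f"
Local $aRows = StringSplit($sData, ";") ; $aRows[0] holds the row count
Local $aCols

For $i = 1 To $aRows[0]
    $aCols = StringSplit($aRows[$i], ",") ; splits on every comma, not just the first
    ConsoleWrite("row " & $i & " has " & $aCols[0] & " items: " & $aCols[1])
    For $j = 2 To $aCols[0]
        ConsoleWrite(" | " & $aCols[$j])
    Next
    ConsoleWrite(@CRLF)
Next
```

Both rows come out with 3 items each ("a | b | c" and "d | e | f").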

My function (and mooseydoom's function) take a file as input, read each line, and return a 2d array that was split first by lines and then by delimiters.

Try this:

$aTemp = _FileReadToArray2d(@WindowsDir & '\Win.Ini', '=', 0, ';')

_Array1Box($aTemp, 'Demo of _FileReadToArray2d')

and you'll see what I mean.
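
_Array1Box isn't defined in this thread, so here is a rough sketch of walking the 1-based layout described in the function header using plain ConsoleWrite. The array below is built by hand with made-up values and just stands in for a real _FileReadToArray2d() result.

```autoit
; Hand-built array in the documented layout (hypothetical sample data):
;   [0][0] = number of lines, [0][1] = maximum number of items,
;   [x][0] = number of items on line x, [x][1..n] = the items themselves
Local $aDemo[3][3] = [[2, 2, ""], [2, "name", "moose"], [1, "flag", ""]]
Local $sOut

For $iRow = 1 To $aDemo[0][0]
    $sOut = "line " & $iRow & ":"
    ; Only read as many cells as this line really has.
    For $iCol = 1 To $aDemo[$iRow][0]
        $sOut &= " [" & $aDemo[$iRow][$iCol] & "]"
    Next
    ConsoleWrite($sOut & @CRLF)
Next
```

Using the per-line count in [x][0] as the loop bound skips the empty padding cells on short lines.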


Yeah, I do read the file three times. =\ I like your function better :). I didn't know ReDim could add cells to an array. Gotta try that. Thanks for the feedback.
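
A quick sketch of what ReDim does when an array grows, in case it helps anyone else reading along. The array name and values are made up for illustration.

```autoit
; Start with a 2 x 2 array of sample values.
Local $aGrid[2][2] = [["a", "b"], ["c", "d"]]

; Grow both dimensions. Cells that already existed keep their values;
; the new cells start out empty.
ReDim $aGrid[3][4]
$aGrid[2][3] = "new"

ConsoleWrite(UBound($aGrid, 1) & " x " & UBound($aGrid, 2) & @CRLF)
ConsoleWrite($aGrid[1][1] & " / " & $aGrid[2][3] & @CRLF)
```

This should print "3 x 4" and then "d / new". ReDim can be slow on big arrays, which is presumably why the function above doubles the size each time instead of growing one row at a time.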

