Jump to content
DXRW4E

_XLSXReadToArray (using only Autoit)

Recommended Posts

DXRW4E

The _XLSXReadToArray Fuction Reads the EXCEL XLSX Sheet into an Array, not needed that Office is installed and do not use Object just use the REGEXP so_XLSXReadToArray works very fast, in a test done by me here with BulletinSearch_20121008_170143.xlsx (1,5 Mb, Rows = 12148 & Column = 15), the _XLSXReadToArray Return Array in 10 seconds, instead to Excel 2013 them have served more than 20 seconds to open the file ect ect, Normally should not happen ever of not finding the "[Content_Types].xml" or other files mentioned in "[Content_Types].xml", but if anything happens already Added the >_FileListToArrayEx to try to search the *.xml files in "Temp Extract Diretory"

 

#include-once

; #UDF# =======================================================================================================================
; Title .........: XLSX Read To Array
; AutoIt Version : 3.3.8.1
; Language ......: English
; Description ...: Fuction Reads the EXCEL XLSX Sheet into an Array
; Author(s) .....: DXRW4E
; Notes .........:
; ===============================================================================================================================

; #CURRENT# =====================================================================================================================
;~ _XLSXReadToArray
;~ _XLSXSheetGetColumnNumber
;~ _SSNToDate
;~ _DateToSSN
;~ _FileListToArrayEx
; ===============================================================================================================================

If Not ObjEvent("AutoIt.Error") Then Global Const $_XLSXZip_COMErrorFunc = ObjEvent("AutoIt.Error", "_XLSXZip_COMErrorFunc")
Global $DateSSN[27] = [0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335]

; #FUNCTION# ================================================================================================================================
; Name...........: _XLSXReadToArray
; Description ...: The _XLSXReadToArray Fuction Reads the EXCEL XLSX Sheet into an Array
; Syntax.........: _XLSXReadToArray($XLSXFilePath, $iFlag)
; Parameters ....: $XLSXFilePath - Path and filename of the XLSX file to be read.
;                  $iFlag   - Optional
;                  |$iFlag = 0 (Default) None
;                  |$iFlag = 1 if there are also add HyperLinks in the Array
;                    Strings\Test in Column\Rows will separate from HyperLink by @LF, example "Value" & @LF & http://www.autoitscript.com/forum/
;                  |$Cols   - Optional, Columns Number to Read (read only X column)
;                  |$Rows   - Optional, Rows Number to Read (read only X Row)
;                  |$iSheet - Optional, Number of Sheet*.xml to Read, Default is 1
; Return values .: Success  - Return
;                    Array ($Array[0][0] = Rows Number & @Extended = Column Nmmber)
;                    If Set $Cols and $Rows Return is String Data, if Return Strigs = "" @Extended is Set to 1
;                  @Error - Set Error
;                  |1 = XLSX file not found or invalid (Can not Read\Extract XLSX file)
;                  |2 = [Content_Types].xml not found or invalid
;                  |3 = sheet1.xml file not found or invalid
;                  |4 = Sheet Dimension Not found Or is Greater than 15999999 (Array Size Limit)
;                  |5 = No SheetDate (Columns & Rows) Found
; Author ........: DXRW4E
; Modified.......:
; Remarks .......:
; Related .......:
; Link ..........:
; Example .......: __XLSXReadToArray("C:\file.xlsx")
; Note ..........:
; ===========================================================================================================================================
Func _XLSXReadToArray($XLSXFilePath, $iFlag = 0, $Cols = 0, $Rows = 0, $iSheet = 1)
    If ($Cols * ($Rows + 1)) > 15999999 Then Return SetError(4, 0, "")
    Local $XLSXExtractDir = @WindowsDir & "\Temp\XLSX_" & @YEAR & @MON & @MDAY & @HOUR & @MIN & @SEC & @MSEC, $XLSXZip, $oShell, $X

    $XLSXZip = FileCopy($XLSXFilePath, $XLSXExtractDir & "\__xlsx.zip", 9)
    If Not $XLSXZip Then Return SetError(1, DirRemove($XLSXExtractDir, 1), "")
    $oShell = ObjCreate("shell.application")
    $oShell.Namespace($XLSXExtractDir).CopyHere($oShell.Namespace($XLSXExtractDir & "\__xlsx.zip").items, 20)

    Local $ContentTypesXML = StringReplace(FileRead($XLSXExtractDir & "\[Content_Types].xml"), "/", "\", 0, 1)
    If Not $ContentTypesXML Then
        $ContentTypesXML = _FileListToArrayEx($XLSXExtractDir, "*Content*Types*.xml", 37)
        If Not @Error Then $ContentTypesXML = StringReplace(FileRead($XLSXExtractDir & "\" & $ContentTypesXML[0]), "/", "\", 0, 1)
        If Not $ContentTypesXML Then Return SetError(2, DirRemove($XLSXExtractDir, 1), "")
    EndIf
    Local $SharedStringsXMLPath = StringRegExp($ContentTypesXML, '(?si)<Override\s*PartName="([^"]*\\sharedStrings\.xml)"', 1)
    If @Error Then $SharedStringsXMLPath = _FileListToArrayEx($XLSXExtractDir, "sharedStrings.xml", 165)
    If Not @Error Then $SharedStringsXMLPath = $XLSXExtractDir & $SharedStringsXMLPath[0]
    Local $SheetXMLPath = StringRegExp($ContentTypesXML, '(?si)<Override\s*PartName="([^"]*\\sheet' & $iSheet & '\.xml)"', 1)
    If @Error Then $SheetXMLPath = _FileListToArrayEx($XLSXExtractDir, "sheet" & $iSheet & ".xml", 165)
    If @Error Then $SheetXMLPath = StringRegExp($ContentTypesXML, '(?si)<Override\s*PartName="([^"]*\\sheet[^"]*\.xml)"', 1)
    If @Error Then $SheetXMLPath = _FileListToArrayEx($XLSXExtractDir, "sheet*.xml", 165)
    If Not @Error Then $SheetXMLPath = $XLSXExtractDir & $SheetXMLPath[0]

    Local $WorkSheet = FileRead($SheetXMLPath)
    If Not $WorkSheet Then Return SetError(3, DirRemove($XLSXExtractDir, 1), "")
    Local $SharedStringsXML = FileRead($SharedStringsXMLPath)

    ;;      Example Get File Path using the StringRegExpReplace(), but more slowly than StringRegExp Mod.
    ;;  Local $WorkBookXMLPath = $XLSXExtractDir & StringRegExpReplace($ContentTypesXML, '(?si).*<Override\s+PartName="([^"]*\\workbook\.xml)".*', "$1")
    ;;  Local $StylesXMLPath = $XLSXExtractDir & StringRegExpReplace($ContentTypesXML, '(?si).*<Override\s+PartName="([^"]*\\styles\.xml)".*', "$1")
    ;;  Local $SharedStringsXMLPath = $XLSXExtractDir & StringRegExpReplace($ContentTypesXML, '(?si).*<Override\s+PartName="([^"]*\\sharedStrings\.xml)".*', "$1")
    ;;  Local $SheetXMLPath = $XLSXExtractDir & StringRegExpReplace($ContentTypesXML, '(?si).*<Override\s+PartName="([^"]*\\sheet1\.xml)".*', "$1")
    ;;
    ;;      ;; read other ect ect ect
    ;;  Local $WorkBookXMLPath = StringRegExp($ContentTypesXML, '(?si)<Override\s*PartName="([^"]*\\workbook\.xml)"', 1)
    ;;  If @Error Then $WorkBookXMLPath = _FileListToArrayEx($XLSXExtractDir, "workbook.xml", 165)
    ;;     If Not @Error Then $WorkBookXMLPath = $XLSXExtractDir & $WorkBookXMLPath[0]
    ;;  $WorkBookXML = FileRead($WorkBookXMLPath)
    ;;    ;Example using the StringRegExpReplace()
    ;;    Local $SheetName = $XLSXExtractDir & StringRegExpReplace($WorkBookXML, '(?si).*<sheet\s+name="([^"]*)".*', "$1")
    ;;
    ;;  Local $AppXMLPath = StringRegExp($ContentTypesXML, '(?si)<Override\s*PartName="([^"]*\\app\.xml)"', 1)
    ;;  If @Error Then $AppXMLPath = _FileListToArrayEx($XLSXExtractDir, "app.xml", 165)
    ;;     If Not @Error Then $AppXMLPath = $XLSXExtractDir & $AppXMLPath[0]
    ;;  $AppXML = FileRead($AppXMLPath)
    ;;    ;Example using the StringRegExpReplace()
    ;;    Local $AppVersion = $XLSXExtractDir & StringRegExpReplace($AppXML, '(?si).*<AppVersion>([^<]*)</AppVersion>.*', "$1")
    ;;
    ;;  Local $CoreXMLPath = StringRegExp($ContentTypesXML, '(?si)<Override\s*PartName="([^"]*\\core\.xml)"', 1)
    ;;  If @Error Then $CoreXMLPath = _FileListToArrayEx($XLSXExtractDir, "core.xml", 165)
    ;;     If Not @Error Then $CoreXMLPath = $XLSXExtractDir & $CoreXMLPath[0]
    ;;  $CoreXML = FileRead($CoreXMLPath)
    ;;    ;Example using the StringRegExpReplace()
    ;;    Local $Modified = $XLSXExtractDir & StringRegExpReplace($CoreXML, '(?si).*<dcterms\:modified[^>]([^<]*)</dcterms:modified>.*', "$1")
    ;;
    ;;  Local $StylesXMLPath = StringRegExp($ContentTypesXML, '(?si)<Override\s*PartName="([^"]*\\styles\.xml)"', 1)
    ;;  If @Error Then $StylesXMLPath = _FileListToArrayEx($XLSXExtractDir, "styles.xml", 165)
    ;;  If Not @Error Then $StylesXMLPath = $XLSXExtractDir & $StylesXMLPath[0]
    ;;    $StylesXML = FileRead($StylesXMLPath)
    ;;  ;ect ect ect ect
    ;;
    ;;  Local $SheetViews = StringRegExp($WorkSheet, '(?si)<sheetView\s+tabSelected="([^"]*)".*?\sworkbookViewId="([^"]*)".*?\stopLeftCell="([A-Z]{1,3})([0-9]+)".*?\sactiveCell="([A-Z]{1,3})([0-9]+)".*?\sdefaultRowHeight="([^"]*)"', 1)
    ;;  If @Error Then Local $SheetViews[7]
    ;;  $SheetViews[2] = _XLSXSheetGetColumnNumber($SheetViews[2])
    ;;  $SheetViews[4] = _XLSXSheetGetColumnNumber($SheetViews[4])
    ;;  ;;;;  $SheetViews....
    ;;  ;;;;    $SheetViews[0] = tabSelected
    ;;  ;;;;    $SheetViews[1] = workbookViewId
    ;;  ;;;;    $SheetViews[2] = top Left Cell - Column Number
    ;;  ;;;;    $SheetViews[3] = top Left Cell - Rows Nmmber
    ;;  ;;;;    $SheetViews[4] = active Cell - Column Number
    ;;  ;;;;    $SheetViews[5] = active Cell - Rows Nmmber
    ;;  ;;;;    $SheetViews[6] = default Row Height
    DirRemove($XLSXExtractDir, 1)
    Local $nCols = Number($Cols), $nRows = "[0-9]+", $X = StringRegExp($WorkSheet, '(?si)<([^:><]*:?)?worksheet\s+', 1)
    If Not @Error Then $X = $X[0]
    If $Rows > 0 Then    ;;;;    StringRegExp($WorkSheet, '(?s)<' & $X & 'row\s+r="' & $Rows  & '".*?</' & $X & 'row>', 1)
        $nRows = $Rows
        $Rows = 1
    EndIf
    Local $SheetDimension = StringRegExp($WorkSheet, '(?si)<' & $X & '(?:dimension|autoFilter)\s+ref="([A-Z]{1,3})([0-9]+):([A-Z]{1,3})(?i)([0-9]+)', 1)
    If Not @Error Then
        $Cols = _XLSXSheetGetColumnNumber($SheetDimension[2])
        If $nRows = "[0-9]+" Then $Rows = $SheetDimension[3]
    EndIf
    $SheetDimension = StringRegExp($WorkSheet, '(?si)<' & $X & 'col\s+min="?(\d+)[^>]*></' & $X & 'cols>', 1)
    If Not @Error And $SheetDimension[0] > $Cols Then $Cols = $SheetDimension[0]
    If $nRows = "[0-9]+" Then
        $SheetDimension = StringRegExp($WorkSheet, '(?si).*<' & $X & 'c\s+r="?[A-Z]*(\d+)', 1)
        If Not @Error And $SheetDimension[0] > $Rows Then $Rows = $SheetDimension[0]
    EndIf
    If $nCols > ($Cols + 1) Then Return SetError(5, 0, "")
    If $Cols < 1 Or $Rows < 1 Or ($Cols * ($Rows + 1)) > 15999999 Then Return SetError(4, 0, "")
    Local $SheetData = StringRegExp($WorkSheet, '(?s)<' & $X & 'c\s+r="([A-Z]{1,3})(?i)(' & $nRows & ')"\s*(?:s=")?([0-9]*)"?\s*(?:t=")?([^">]*)"?\s*><' & $X & 'v>([^<]*)\s*</' & $X & 'v>\s*</' & $X & 'c>', 3)
    If @Error Then Return SetError(5, 0, "")
    If $nCols Then $Cols = 1
    Local $SheetDataA[($Rows + 1)][$Cols] = [[UBound($SheetData) - 1]], $ColumnName, $ColumnNumber, $ColumnSize, $SharedStringsXMLSize
    If $SharedStringsXML Then
        Local $S = StringRegExp($SharedStringsXML, '(?si)<([^:><]*:?)?sst\s+', 1)
        If Not @Error Then $S = $S[0]
        $SharedStringsXML = StringRegExp($SharedStringsXML, '(?si)<' & $S & 'si>(?:<' & $S & 'r>.*?)?<' & $S & 't(?:/|\s[^>]*)?>(.*?)(?:</' & $S & 't>)?(?:</' & $S & 'r>)?</' & $S & 'si>', 3)
        If Not @Error Then
            $SharedStringsXMLSize = UBound($SharedStringsXML)
            For $i = 0 To $SharedStringsXMLSize - 1
                If StringInStr($SharedStringsXML[$i], "<", 1) Then $SharedStringsXML[$i] = StringRegExpReplace($SharedStringsXML[$i], '</' & $S & 't>.*?<' & $S & 't>', "")
                If StringInStr($SharedStringsXML[$i], "&", 1) Then $SharedStringsXML[$i] = StringReplace(StringReplace(StringReplace($SharedStringsXML[$i], "&lt;", "<", 0, 1), "&gt;", ">", 0, 1), "&amp;", "&", 0, 1)
            Next
        EndIf
    EndIf
    For $i = 0 To $SheetDataA[0][0] Step 5
        $ColumnSize = StringLen($SheetData[$i]) - 1
        If Not $ColumnSize Then
            $ColumnNumber = Asc($SheetData[$i]) - 65
        Else
            $ColumnName = StringToASCIIArray($SheetData[$i])
            $ColumnNumber = $ColumnName[$ColumnSize] - 65
            $ColumnNumber += 26 * ($ColumnName[$ColumnSize - 1] - 64)    ;(26 ^ 1) * ($ColumnName[1] - 64)
            If $ColumnSize > 1 Then $ColumnNumber += 676 * ($ColumnName[0] - 64)    ;(26 ^ 2) * ($ColumnName[0] - 64)
            ;;;$ColumnNumber = _XLSXSheetGetColumnNumber($SheetData[$i], 1)
        EndIf
        If $nCols Then
            If $nCols <> ($ColumnNumber + 1) Then ContinueLoop
            $ColumnNumber = 0
        EndIf
        If $Rows = 1 Then $SheetData[$i + 1] = 1
        If $SheetData[$i + 3] = "s" And $SharedStringsXMLSize > $SheetData[$i + 4] Then
            $SheetDataA[$SheetData[$i + 1]][$ColumnNumber] = $SharedStringsXML[$SheetData[$i + 4]]
        ElseIf $SheetData[$i + 2] = 2 Then
            $SheetDataA[$SheetData[$i + 1]][$ColumnNumber] = _SSNToDate($SheetData[$i + 4])
        Else
            $SheetDataA[$SheetData[$i + 1]][$ColumnNumber] = $SheetData[$i + 4]
        EndIf
    Next
    $SheetDataA[0][0] = $Rows
    If $iFlag Then
        $HyperLinks = StringRegExp($WorkSheet, '(?si)<' & $X & 'hyperlink\s+ref="([A-Z]{1,3})(?i)(' & $nRows & ')".*?\s+display="([^"]*)"', 3)
        ;;$HyperLinks = StringRegExp($WorkSheet, '(?si)<' & $X & 'hyperlink\s+ref="([A-Z]{1,3})(?i)' & $nRows & '"\s+r:id="([^"]*)"\s+display="([^"]*)"', 3)
        If Not @Error Then
            Local $HyperLinksSize = UBound($HyperLinks) - 1
            For $i = 0 To $HyperLinksSize Step 3
                $ColumnNumber = _XLSXSheetGetColumnNumber($HyperLinks[$i], 1)
                If $Rows = 1 Then $HyperLinks[$i + 1] = 1
                $SheetDataA[$HyperLinks[$i + 1]][$ColumnNumber] &= @LF & $HyperLinks[$i + 2]
            Next
        EndIf
    EndIf
    If $nCols And $Rows = 1 Then Return SetError(0, $SheetDataA[1][0] = "", $SheetDataA[1][0])
    Return SetError(0, UBound($SheetDataA, 2), $SheetDataA)
EndFunc   ;==>_XLSXReadToArray


; #FUNCTION# =================================================================================================================
; Name...........: _XLSXSheetGetColumnNumber
; Description ...: The _XLSXSheetGetColumnNumber Fuction return Column Number of EXCEL XLSX Sheet
; Syntax.........: _XLSXSheetGetColumnNumber($ColumnName)
; Parameters ....: $ColumnName - [A-Z] Uppercase Caracter\String, are not Supported line with more than 3 characters
;                  $iFlag      - Optional
;                  |$iFlag = 0 (Default) Column Number
;                  |$iFlag = 1 Column Number - 1 (for Array Index 0)
; Return values .: Success  - Return Column Number
;                  Failure - @Error
; Author ........: DXRW4E
; Modified.......:
; Remarks .......: Limit is 18278 (A = 1 & AB = 27 & ZZZ = 18278)
; Related .......:
; Link ..........:
; Example .......: _XLSXSheetGetColumnNumber("ABC")
; Note ..........:
; ============================================================================================================================
Func _XLSXSheetGetColumnNumber($ColumnName, $iFlag = 0)
    If Not StringRegExp($ColumnName, '^[A-Z]{1,3}$') Or $iFlag < 0 Or $iFlag > 1 Then Return SetError(1, 0, 0)
    Local $ColumnNumber, $SheetDimension = StringLen($ColumnName) - 1
    If Not $SheetDimension Then
        $ColumnNumber = Asc($ColumnName) - 64 - $iFlag
    Else
        $ColumnName = StringToASCIIArray($ColumnName)
        $ColumnNumber = $ColumnName[$SheetDimension] - 64 - $iFlag
        $ColumnNumber += 26 * ($ColumnName[$SheetDimension - 1] - 64)    ;(26 ^ 1) * ($ColumnName[1] - 64)
        If $SheetDimension > 1 Then $ColumnNumber += 676 * ($ColumnName[0] - 64)    ;(26 ^ 2) * ($ColumnName[0] - 64)
    EndIf
    Return $ColumnNumber
EndFunc   ;==>_XLSXSheetGetColumnNumber


; #FUNCTION# =================================================================================================================
; Name...........: _SSNToDate
; Description ...: The _SSNToDate Fuction return Date from sequential serial number
; Syntax.........: _SSNToDate($iDay)
; Parameters ....: $iDay - sequential serial number (generated from DATE fuction on EXCEL, example 39637 = 7/8/2008)
; Return values .: Success  - Return DATE
;                  Failure - @Error
; Author ........: DXRW4E
; Modified.......:
; Remarks .......: DATE String is Month/Day/Year, Year - is number is 1900 to 9999
; Related .......:
; Link ..........:
; Example .......: _SSNToDate(39637)
; Note ..........:
; ============================================================================================================================
Func _SSNToDate($iDay)
    If $iDay < 1 Or $iDay > 2958465 Then Return SetError(1, 0, "")
    $DateSSN[0] = Int($iDay / 365)
    $DateSSN[25] = $DateSSN[0] / 4
    $DateSSN[26] = IsFloat($DateSSN[25])
    $iDay = $iDay - ($DateSSN[0] * 365) - Int($DateSSN[25]) - $DateSSN[26]
    If $iDay < 1 Then
        $DateSSN[0] -= 1
        $DateSSN[25] = IsInt(($DateSSN[0] -1) / 4)
        $iDay += 365 + $DateSSN[25]
        $DateSSN[26] = Int($DateSSN[25] = 0)
    EndIf
    $DateSSN[2] -= $DateSSN[26]
    For $iMonth = 1 To 11
        If $DateSSN[$iMonth] >= $iDay Then ExitLoop
        $iDay -= $DateSSN[$iMonth]
    Next
    $DateSSN[2] += $DateSSN[26]
    Return $iMonth & "/" & $iDay & "/" & (1900 + $DateSSN[0])
EndFunc   ;==>_SSNToDate


; #FUNCTION# =================================================================================================================
; Name...........: _DateToSSN
; Description ...: The _DateToSSN Fuction return sequential serial number that represent a particular Date
; Syntax.........: _DateToSSN($iYear, $iMonth, $iDay)
; Parameters ....: $iYear  - Year  - is number is 1900 to 9999
;                      Required. The value of the year argument can include one to four digits. Excel interprets the year argument
;                      according to the date system your computer is using. By default, Microsoft Excel for Windows uses the 1900 date system.
;                      We recommend using four digits for the year argument to prevent unwanted results. For example, "07" could mean "1907" or "2007." Four digit years prevent confusion.
;                      If year is between 0 (zero) and 1899 (inclusive), Excel adds that value to 1900 to calculate the year. For example, DATE(108,1,2) returns January 2, 2008 (1900+108).
;                      If year is between 1900 and 9999 (inclusive), Excel uses that value as the year. For example, _DateToSSN((2008,1,2) returns January 2, 2008.
;                      If year is less than 0 or is 10000 or greater, _DateToSSN returns the @Error
;                  $iMonth - is number is 1 to 12, If Month is less than 0 or is 13 or greater, _DateToSSN returns the @Error
;                  $iDay   - Required. A positive or negative integer representing the day of the month from 1 to 31.
;                      If day is greater than the number of days in the month specified, day adds that number of days to the first day in the month.
;                        For example, _DateToSSN(2008,1,35) returns the serial number representing February 4, 2008.
;                      If day is less than 1, day subtracts the magnitude that number of days, plus one, from the first day of the month specified.
;                        For example, _DateToSSN(2008,1,-15) returns the serial number representing December 16, 2007.
; Return values .: Success  - Return Sequential Serial Number
;                  Failure - @Error
; Author ........: DXRW4E
; Modified.......:
; Remarks .......: Sequential Serial Number, _DateToSSN(2008, 7, 8) Return 39637, that represent 7/8/2008
;
;                    NOTE - Excel stores dates as sequential serial numbers so that they can be used in calculations. January 1, 1900 is serial number 1,
;                      and January 1, 2008 is serial number 39448 because it is 39,447 days after January 1, 1900.
;
;                    _DateToSSN NOT SUPPORT FOR NOW, THIS
;                      $iMonth  Required. A positive or negative integer representing the month of the year from 1 to 12 (January to December).
;                        If month is greater than 12, month adds that number of months to the first month in the year specified. For example,
;                        DATE(2008,14,2) returns the serial number representing February 2, 2009.
;                      If month is less than 1, month subtracts the magnitude of that number of months, plus 1, from the first month in the
;                        year specified. For example, DATE(2008,-3,2) returns the serial number representing September 2, 2007.
; Related .......:
; Link ..........:
; Example .......: _DateToSSN(39637)
; Note ..........:
; ============================================================================================================================
Func _DateToSSN($iYear, $iMonth, $iDay)
    If $iYear < 1900 Or $iYear > 9999 Or $iMonth < 1 Or $iMonth > 12 Then Return SetError(1, 0, "")
    $iYear -= 1900
    $DateSSN[0] = $iYear / 4
    If IsFloat($DateSSN[0]) And $iMonth < 3 Then $iDay += 1
    Return ($iYear * 365) + Int($DateSSN[0]) + $DateSSN[$iMonth + 12] + $iDay
EndFunc   ;==>_DateToSSN


; #FUNCTION# =======================================================================================================================================================
; Name...........: _FileListToArrayEx
; Description ...: Lists files and\or folders in a specified path (Similar to using Dir with the /B Switch)
; Syntax.........: _FileListToArrayEx($sPath[, $sFilter = "*"[, $iFlag = 0]])
; Parameters ....: $sPath   - Path to generate filelist for.
;                  $sFilter - Optional the filter to use, default is *. (Multiple filter groups such as "All "*.png|*.jpg|*.bmp") Search the Autoit3 helpfile for the word "WildCards" For details.
;                  $iFlag   - Optional: specifies whether to return files folders or both Or Full Path (add the flags together for multiple operations):
;                  |$iFlag = 0 (Default) Return both files and folders
;                  |$iFlag = 1 Return files only
;                  |$iFlag = 2 Return Folders only
;                  |$iFlag = 4 Search SubDirectory
;                  |$iFlag = 8 Return Full Path
;                  |$iFlag = 16 $sFilter do Case-Sensitive matching (By Default $sFilter do Case-Insensitive matching)
;                  |$iFlag = 32 Disable the return the count in the first element - effectively makes the array 0-based (must use UBound() to get the size in this case).
;                    By Default the first element ($array[0]) contains the number of file found, the remaining elements ($array[1], $array[2], etc.)
;                  |$iFlag = 64 $sFilter is REGEXP Mod, See Pattern Parameters in StringRegExp (Can not be combined with flag 16)
;                  |$iFlag = 128 Return Backslash at the beginning of the file name, example Return "\Filename1.xxx" (Can not be combined with flag 8)
; Return values .: Failure - @Error
;                  |1 = Path not found or invalid
;                  |2 = Invalid $sFilter
;                  |3 = No File(s) Found
; Author ........: DXRW4E
; Modified.......:
; Remarks .......: The array returned is one-dimensional and is made up as follows:
;                                $array[0] = Number of Files\Folders returned
;                                $array[1] = 1st File\Folder
;                                $array[2] = 2nd File\Folder
;                                $array[3] = 3rd File\Folder
;                                $array[n] = nth File\Folder
; Related .......:
; Link ..........:
; Example .......: Yes
; Note ..........: Special Thanks to SolidSnake & Tlem
; ==================================================================================================================================================================
Func _FileListToArrayEx($sPath, $sFilter = "*", $iFlag = 0)
    $sPath = StringRegExpReplace($sPath & "\", "[\\/]+", "\\")
    If Not FileExists($sPath) Then Return SetError(1, 1, "")
    If StringRegExp($sFilter, StringReplace('^\s*$|\v|[\\/:><"]|^\||\|\||\|$', "[" & Chr(BitAND($iFlag, 64) + 28) & '\/:><"]|^\||\|\||\|$', "\\\\")) Then Return SetError(2, 2, "")
    Local $hSearch, $sFile, $sFileList, $sSubDir = BitAND($iFlag, 4), $sDelim = "|", $sDirFilter = StringReplace($sFilter, "*", "")
    $hSearch = FileFindFirstFile($sPath & "*")
    If @Error Then Return SetError(3, 3, "")
    Local $hWSearch = $hSearch, $hWSTMP, $SearchWD, $Extended, $iFlags = StringReplace(BitAND($iFlag, 1) + BitAND($iFlag, 2), "3", "0")
    If BitAND($iFlag, 8) Then $sDelim &= $sPath
    If BitAND($iFlag, 128) Then $sDelim = "|\"
    If Not BitAND($iFlag, 64) Then $sFilter = StringRegExpReplace(BitAND($iFlag, 16) & "(?i)(", "16\(\?\i\)|\d+", "") & StringRegExpReplace(StringRegExpReplace(StringRegExpReplace(StringRegExpReplace($sFilter, "[^*?|]+", "\\Q$0\\E"), "\\E(?=\||$)", "$0\$"), "(?<=^|\|)\\Q", "^$0"), "\*+", ".*") & ")"
    While 1
        $sFile = FileFindNextFile($hWSearch)
        If @Error Then
            If $hWSearch = $hSearch Then ExitLoop
            FileClose($hWSearch)
            $hWSearch -= 1
            $SearchWD = StringLeft($SearchWD, StringInStr(StringTrimRight($SearchWD, 1), "\", 1, -1))
        ElseIf $sSubDir Then
            $Extended = @Extended
            If ($iFlags + $Extended <> 2) Then
                If $sDirFilter Then
                    If StringRegExp($sFile, $sFilter) Then $sFileList &= $sDelim & $SearchWD & $sFile
                Else
                    $sFileList &= $sDelim & $SearchWD & $sFile
                EndIf
            EndIf
            If Not $Extended Then ContinueLoop
            $hWSTMP = FileFindFirstFile($sPath & $SearchWD & $sFile & "\*")
            If $hWSTMP = -1 Then ContinueLoop
            $hWSearch = $hWSTMP
            $SearchWD &= $sFile & "\"
        Else
            If ($iFlags + @Extended = 2) Or StringRegExp($sFile, $sFilter) = 0 Then ContinueLoop
            $sFileList &= $sDelim & $sFile
        EndIf
    WEnd
    FileClose($hSearch)
    If Not $sFileList Then Return SetError(3, 3, "")
    Return StringSplit(StringTrimLeft($sFileList, 1), "|", StringReplace(BitAND($iFlag, 32), "32", 2))
EndFunc   ;==>_FileListToArrayEx


Func _XLSXZip_COMErrorFunc()
    Return SetError(1, 0, "")
EndFunc   ;==>_XLSXZip_COMErrorFunc


example

#include <Array.au3>
    #include <_XLSXReadToArray.au3>
    
    local $a, $t, $Error, $Extended
    $t = TimerInit()
    InetGet("http://go.microsoft.com/fwlink/?LinkID=245778", @WindowsDir & "\Temp\BulletinSearch.xlsx")
    $a = _XLSXReadToArray(@WindowsDir & "\Temp\BulletinSearch.xlsx", 0, 0, 0)
    $t = TimerDiff($t)
    _ArrayDisplay($a, $t)
    
    ;;or
    $t = TimerInit()
    $a = _XLSXReadToArray(@WindowsDir & "\Temp\BulletinSearch.xlsx", 0, 2, 11)
    $Error = @Error
    $Extended = @Extended
    $t = TimerDiff($t)
    ConsoleWrite("Timer = " & $t & @LF)
    ConsoleWrite("Return = " & $a & " - $Error = " & $Error & " - $Extended = " & $Extended & @LF)
    
    exit

_XLSXReadToArray.au3

Edited by DXRW4E

apps-odrive.pngdrive_app_badge.png box-logo.png new_logo.png MEGA_Logo.png

Share this post


Link to post
Share on other sites
DXRW4E

Updated

Ciao.


apps-odrive.pngdrive_app_badge.png box-logo.png new_logo.png MEGA_Logo.png

Share this post


Link to post
Share on other sites
Digisoul

Great work, DXRW4E.

It will be great if you give any feature to extract data from a single row.


73 108 111 118 101 65 117 116 111 105 116

Share this post


Link to post
Share on other sites
DXRW4E

Hi Digisoul, Thank you

already done, use flags $Cols and $Rows

Ciao.

Edited by DXRW4E

apps-odrive.pngdrive_app_badge.png box-logo.png new_logo.png MEGA_Logo.png

Share this post


Link to post
Share on other sites
Digisoul

Great. Isn't it good to extract the excel file in init() function?

I mean if i want to read the rows in loop then how can i accomplish the task? without reading all the file in memory.


73 108 111 118 101 65 117 116 111 105 116

Share this post


Link to post
Share on other sites
DXRW4E

Hi Digisoul, impossible to read the entire file using the flag $Cols and $Rows (I mean we do not recommended because it takes a long time), basic logic that Microsoft uses in its files xml and inf ect ect, first load the file and then start the loop ec ect (is the fastest way to read these types of files), example

local $a, $b, $c, $t, $Error, $Extended
;;    $t = TimerInit()
;;    InetGet("http://go.microsoft.com/fwlink/?LinkID=245778", @UserProfileDir & "DownloadsBulletinSearch.xlsx")
;;    $a = _XLSXReadToArray(@UserProfileDir & "DownloadsBulletinSearch.xlsx")
;;    $Error = @Error
;;    $Extended = @Extended
;;    $t = TimerDiff($t)
;;    ConsoleWrite("Timer = " & $t & @LF)
;;    ConsoleWrite("Return = " & $a & " - $Error = " & $Error & " - $Extended = " & $Extended & @LF)
;;        Return
;;        $a = 2D Array
;;        Timer = 11600.4528335857
;;        $Error = 0
;;        $Extended = 15
;;        _ArrayDisplay($a, $t)
$a = _XLSXReadToArray(@UserProfileDir & "DownloadsBulletinSearch.xlsx")
If Not @Error Then
    ;use $b = @Extended - 1 or $b = UBound($a, 2) - 1
    For $i = 1 To $a[0][0]
        For $y = 0 To $b
            $c &= $a[$i][$y] & "|" ; ect ect
        Next
    Next
EndIf


;;    $t = TimerInit()
;;    $a = _XLSXReadToArray(@UserProfileDir & "DownloadsBulletinSearch.xlsx", 0, 7)
;;    $Error = @Error
;;    $Extended = @Extended
;;    $t = TimerDiff($t)
;;    ConsoleWrite("Timer = " & $t & @LF)
;;    ConsoleWrite("Return = " & $a & " - $Error = " & $Error & " - $Extended = " & $Extended & @LF)
;;        Return
;;        $a = 2D Array
;;        Timer = 8768.37496942515
;;        $Error = 0
;;        $Extended = 1
;;        _ArrayDisplay($a, $t)
$a = _XLSXReadToArray(@UserProfileDir & "DownloadsBulletinSearch.xlsx", 0, 7)
If Not @Error Then
    ; use $b = @Extended - 1 or $b = UBound($a, 2) - 1, @Extended or UBound($a, 2) is always 1 in this case since it has been chosen read only 1 Column
    ; even if only one Column, return always is 2D Array, done so as not to create confusion them
    For $i = 1 To $a[0][0]
        $c &= $a[$i][0] & "|" ; ect ect
    Next
EndIf


;;    $t = TimerInit()
;;    $a = _XLSXReadToArray(@UserProfileDir & "DownloadsBulletinSearch.xlsx", 0, 0, 1)
;;    $Error = @Error
;;    $Extended = @Extended
;;    $t = TimerDiff($t)
;;    ConsoleWrite("Timer = " & $t & @LF)
;;    ConsoleWrite("Return = " & $a & " - $Error = " & $Error & " - $Extended = " & $Extended & @LF)
;;        Return
;;        $a =  2D Array
;;        Timer = 2342.78631833739
;;        $Error = 0
;;        $Extended = 15
;;        _ArrayDisplay($a, $t)
$a = _XLSXReadToArray(@UserProfileDir & "DownloadsBulletinSearch.xlsx", 0, 0, 1)
If Not @Error Then
    ; use $b = @Extended - 1 or $b = UBound($a, 2) - 1
    ; $a[0][0] is always 1 in this case since it has been chosen read only 1 Row
    For $i = 0 To $b
        $c &= $a[1][$b] & "|" ; ect ect
    Next
EndIf


;;    $t = TimerInit()
;;    $a = _XLSXReadToArray(@UserProfileDir & "DownloadsBulletinSearch.xlsx", 0, 7, 4)
;;    $Error = @Error
;;    $Extended = @Extended
;;    $t = TimerDiff($t)
;;    ConsoleWrite("Timer = " & $t & @LF)
;;    ConsoleWrite("Return = " & $a & " - $Error = " & $Error & " - $Extended = " & $Extended & @LF)
;;        Return
;;        $a = Windows Server 2008 R2 for Itanium-Based Systems
;;        Timer = 2164.4779710023
;;        $Error = 0
;;        $Extended = 0
$a = _XLSXReadToArray(@UserProfileDir & "DownloadsBulletinSearch.xlsx", 0, 7, 4)
If Not @Error Then
    ;;If @Extended = 1, the ColumnRows was found, but is empty ($a = "")
    $c = $a 
EndIf

sorry again for my English

Ciao a tutti.

Edited by DXRW4E

apps-odrive.pngdrive_app_badge.png box-logo.png new_logo.png MEGA_Logo.png

Share this post


Link to post
Share on other sites
annybaby

thanks for sharing

but,what a pity,it doesn't work in my PC

Share this post


Link to post
Share on other sites
DXRW4E

Hi annybaby, Return error code ??

send me the file you are trying to read

Ciao.


apps-odrive.pngdrive_app_badge.png box-logo.png new_logo.png MEGA_Logo.png

Share this post


Link to post
Share on other sites
SirWayNe

Hi DXRW4E,

first of all: thx for sharing, looks nice, but i got several reading errors using your script.

$aScriptLanguageFile = _XLSXReadToArray($sScriptLanguageFile)
MsgBox("",@extended,@error)
_ArrayDisplay($aScriptLanguageFile)

This code shows the way i use your Script, and it fills the given array improperly. i will attach the .xlsx File on this post.

I am using Microsoft Office 2007 with updates applied.

I got reading i.e. Errors on Line 476. It does not Exist in the array and other Array entries got different Values then it should be.

Hope you can help me, because i really want to use a .xlsx File for different Languages instead of using different .txt Files.....

Greetings SirWayNe

Language.xlsx

Share this post


Link to post
Share on other sites
DXRW4E

Hi SirWayNe, thanks for reporting, try now?

Ciao.

  • Like 1

apps-odrive.pngdrive_app_badge.png box-logo.png new_logo.png MEGA_Logo.png

Share this post


Link to post
Share on other sites
SirWayNe

Yep, that looks fine now. Thanks for fast Reply finally i can use my LanguageFile in a proper way.

You will be given Credits for your awesome work :D

Share this post


Link to post
Share on other sites
gicandles

I would appreciate a little help.

I would like to use this to do the following:

I have designed a system for selecting the proper label to print and send the print commands to the correct label printer. This part I have done with no problems.

Now I am adding a part that allows the user to select a Client name from a xlsx file and the system then sends the data (tag line like www.client1.com) to the printer. I can set it up to that it prints fine.

What I am having difficulty with is populating a combo box with the names in column 1 and then having it save the value of column 2 from the row of the selected client in the combo box into a variable. I can display the individual cells and then save a specific cell to a variable, but the problem that I am having is the linking. The part where it automatically searches for the tag line in the same row and saves it to a variable.

I also must admit that I don't understand how to use the $cols and $rows flags above.

Any additional insight would be much appreciated. I tried to read through your code to see if I could gleam anything, but I am just not familiar enough to do so with any success.

Thanks for any help or suggestions you can come up with.

Share this post


Link to post
Share on other sites
DXRW4E

hi gicandles, _XLSXReadToArray() (in some system needs admin right) return full array, after you can do anything you want them (Look here for more http://www.autoitscript.com/autoit3/docs/libfunctions/Array%20Management.htm )

#include <Array.au3>
#include <SendMessage.au3>
#include <GUIConstantsEx.au3>
#include <ComboConstants.au3>
#include <WindowsConstants.au3>
#include <_XLSXReadToArray.au3>

#Region ### START Koda GUI section ### Form=
$Form1 = GUICreate("Form1", 615, 437)
$hCombo = GUICtrlCreateCombo("ALL", 120, 192, 377, 25, BitOR($GUI_SS_DEFAULT_COMBO,$CBS_SIMPLE,$CBS_DISABLENOSCROLL,$WS_HSCROLL))
GUISetState(@SW_SHOW)
#EndRegion ### END Koda GUI section ###

Local $aXLSX, $iError, $iExtended, $sXLSX
$aXLSX = _XLSXReadToArray("C:\xxxx.xlsx", 0, 0, 0)
$iError = @Error
$iExtended = @Extended
ConsoleWrite("$iError = " & $iError & " - $iExtended = " & $iExtended & @LF)
;_ArrayDisplay($aXLSX)


For $i = 2 To $aXLSX[0][0]
    $sXLSX &= $aXLSX[$i][0] & "|"
Next
_SendMessage(GUICtrlGetHandle($hCombo), $CB_RESETCONTENT)
GUICtrlSetData($hCombo, $sXLSX & "All", "All")

While 1
    $nMsg = GUIGetMsg()
    Switch $nMsg
        Case $GUI_EVENT_CLOSE
            Exit
        Case $hCombo
            $sXLSX = GUICtrlRead($hCombo)
            For $i = 2 To $aXLSX[0][0]
                If $sXLSX = $aXLSX[$i][0] Then
                    MsgBox(0, $i, $aXLSX[$i][1])
                    ;;do something
                    ;;$aXLSX[$i][1]
                    ;;$aXLSX[$i][2]
                    ;;$aXLSX[$i][3]
                    ExitLoop
                EndIf
            Next
    EndSwitch
WEnd
Edited by DXRW4E

apps-odrive.pngdrive_app_badge.png box-logo.png new_logo.png MEGA_Logo.png

Share this post


Link to post
Share on other sites
gicandles

Arg. Never fails. When I ask I figure it out.

Here is what I finally did. If anyone has any improvements or suggestions please send them my way.

First here is the structure of my data after import with _XLSXReadtToArray

$ClientList[0][0]:=2

$ClientList[0][1]:=

$ClientList[0][2]:=

$ClientList[1][0]:=Name1

$ClientList[1][1]:=Tag

$ClientList[1][2]:=Date Effective

$ClientList[2][0]:=Name2

$ClientList[2][1]:=Tag2

$ClientList[2][2]:=

Obviously this is just a sample file for test purposes. What I was looking to do was have 2,1 3,1 4,1 5,1 etc display in a combo box, and then have the corresponding 2,2 3,2 4,2 5,2 etc sent on to my print script.

First I used the following two functions to create the array and then to create a var with the data to load into the combo box.

Func _LoadClients ()
$t = TimerInit()
$ClientList = _XLSXReadToArray("L:\Databases\clients.xlsx")
$Error = @Error
$Extended = @Extended
$t = TimerDiff($t)
;~   ConsoleWrite("Timer = " & $t & @LF)
;~   ConsoleWrite("Return = " & $ClientList & " - $Error = " & $Error & " - $Extended = " & $Extended & @LF)
EndFunc

Func _DisplayClientName ()
$s = $ClientList[0][0]
For $i = 1 To $s Step 1
$list123 &= "|" & $ClientList[$i][0]
Next
EndFunc

Next I loaded both functions sequentially when the GUI with the print interface is loaded (Actually when the button is pushed to open that GUI)

I also had the same button load the $list123 into the combo box with GUICtrlSetData($privateComboList, $list123)

Finally when my user clicks the "print" button on my GUI it calls the following script

Func _SearchClientTag ()
$ClientName = GUICtrlRead($privateComboList)
ConsoleWrite ("Begin Search"&@LF)
ConsoleWrite ($ClientName & @LF)
$result = _ArraySearch($ClientList, $ClientName)
ConsoleWrite ("error code is: " & @error & @LF)
ConsoleWrite ("Client Name is : " & $result & @LF)
$aftertag = $ClientList[$result][1]
ConsoleWrite ("The Tag line is: " & $aftertag & @LF)
EndFunc

$aftertag is a global variable I have set that is later used to replace a identifier in an xml file that triggers the actual printing.

Also I should note that I have declarded $ClientList and $list123 as globals elsewhere.

Hope this helps someone.

Share this post


Link to post
Share on other sites
gicandles

Adventurer,

Thanks for the post. I did not see it until after I posted my solution. To tired at the moment to go through it in great detail, but it looks like I ended up doing something very similar. I appreciate the help and will review a little closer in the morning.

Thanks.

Share this post


Link to post
Share on other sites
SirWayNe

Hey its me again :)

I just had another issue with your currently well working Script.

I uploaded a Foto that shows the Issue. Its with my ID 924-927.

The Array on the left side is created by your Function, _XLSXReadToArray.

Hope you can help me again, because i am using your Script to apply different languages to my Script.

Greetings, SirWayNe

post-76274-0-04848300-1370269559_thumb.j

Language_AudioParser.xlsx

Share this post


Link to post
Share on other sites
DXRW4E

Hi SirWayNe, thanks for reporting, try now?
 

Ciao.


apps-odrive.pngdrive_app_badge.png box-logo.png new_logo.png MEGA_Logo.png

Share this post


Link to post
Share on other sites
SirWayNe

Yep all fine now. Thx again.

Share this post


Link to post
Share on other sites
fenhanxue
Posted (edited)

<snip>

thanks for so great udf  !

however, it do not work correctly with my xlsx file as follow:

can  you  help me ?

111.xlsx

Edited by Melba23
Huge quote removed

Share this post


Link to post
Share on other sites
BrewManNH

Post the script you're using.


If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Similar Content

    • Skeletor
      By Skeletor
      Hi Virtual People,
      My array works perfectly fine. However, what is the best practice if the line in the array doesn't have the correct amount of columns and if I can add a placeholder?

       
      For $count = 1 To _FileCountLines($FileRead1) Step 1 $string = FileReadLine($FileRead1, $count) $input = StringSplit($string, ",", 1) $value1 = $input[1] $value2 = $input[2] $value3 = $input[3] _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, $value2, "A1") _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, $value1, "B1") _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, $value3, "C1") Next  
    • Skeletor
      By Skeletor
      Hi Guys,
      Can anyone point me in the right direction. I'm trying to accomplished conditional formatting with arrows, but the code I have is wrong... 
      .Range("=$A3:$A4000").FormatConditions.Add = (xlCellValue, xlGreater, "=$A:$A") .IconSet(xl3Arrows) ;ActiveWorkbook.iconsets I also tried it like this:
      .Range("=$A3:$A4000").FormatConditions _ .Add(xlCellValue, xlGreater, "=$A:$A") .IconSet(xl3Arrows) ;ActiveWorkbook.iconsets  
    • MrCheese
      By MrCheese
      hi all,
      reviewing the forum, this thread is applicable: 
       
       
      I wanted to know if there is now a better way to do this?
      In essence, I load a tab delimited txt file into an array (works well). I used tab, as some fields in the original csv contains commas.
      However, I needed autoit to manipulate this array, and output it as a csv.
      IF my array contains items with a comma, without double quotes around the field, then how best do I get a csv out of this?
      My current workaround is to filewritefromarray tab delimited, then open it in excel and save as a csv. I will need to check this to see how the address fields behave that contain a comma.
       
      Any thoughts would be appreciated.
       
    • Skeletor
      By Skeletor
      Hi Guys,
      How would you use _Excel_FilterSet to filter the excel sheet from largest number to smallest number?
      _Excel_FilterSet($oWorkbook, "Sheet1", "A2:I2", 1, ">20", 1, "<40") ; OR _Excel_FilterSet($oWorkbook, "Sheet1", "A2:I2", 1, ">", 1, "<") Excel Sheet attached. Trying to filter on the last column "I2"
      Inventory.xlsx
    • Skeletor
      By Skeletor
      Hi All,
      Has anyone come across this before?
      Code below:
      $SheetList = _Excel_SheetList($oWorkbook) _FileWriteFromArray("C:\" & $SheetListOutput, $SheetList, 1) Result:
      ResultABC| ResultDEF| ResultGHI| ResultJKL| ResultMNO| It created these "|" vertical bars?
×