Popular Post Gianni Posted February 23, 2015 Popular Post Posted February 23, 2015 (edited) This is for extraction of data from HTML tables to an array. It uses an raw html source file as input, and does not relies on any browser. You can get the source of the html using commands like InetGet(), InetRead(), _INetGetSource(), _IEDocReadHTML() for example, or load an html file from disc as well. It also takes care of the data position in the table due to rowspan and colspan trying to keep the same layout in the generated array. It has the option to fill the cells in the array corresponding with the "span" zones all with the same value of the first "span" cell of the corresponding area. expandcollapse popup; save this as _HtmlTable2Array.au3 #include-once #include <array.au3> ; ; #FUNCTION# ==================================================================================================================== ; Name ..........: _HtmlTableGetList ; Description ...: Finds and enumerates all the html tables contained in an html listing (even if nested). ; if the optional parameter $i_index is passed, then only that table is returned ; Syntax ........: _HtmlTableGetList($sHtml[, $i_index = -1]) ; Parameters ....: $sHtml - A string value containing an html page listing ; $i_index - [optional] An integer value indicating the number of the table to be returned (1 based) ; with the default value of -1 an array with all found tables is returned ; Return values .: Success; Returns an 1D 1 based array containing all or single html table found in the html. ; element [0] (and @extended as well) contains the number of tables found (or 0 if no tables are returned) ; if an error occurs then an ampty string is returned and the following @error code is setted ; @error: 1 - no tables are present in the passed HTML ; 2 - error while parsing tables, (opening and closing tags are not balanced) ; 3 - error while parsing tables, (open/close mismatch error) ; 4 - invalid table index request (requested table nr. is out of boundaries) ; =============================================================================================================================== Func _HtmlTableGetList($sHtml, $i_index = -1) Local $aTables = _ParseTags($sHtml, "<table", "</table>") If @error Then Return SetError(@error, 0, "") ElseIf $i_index = -1 Then Return SetError(0, $aTables[0], $aTables) Else If $i_index > 0 And $i_index <= $aTables[0] Then Local $aTemp[2] = [1, $aTables[$i_index]] Return SetError(0, 1, $aTemp) Else Return SetError(4, 0, "") ; bad index EndIf EndIf EndFunc ;==>_HtmlTableGetList ; #FUNCTION# ==================================================================================================================== ; Name ..........: _HtmlTableWriteToArray ; Description ...: It writes values from an html table to a 2D array. It tries to take care of the rowspan and colspan formats ; Syntax ........: _HtmlTableWriteToArray($sHtmlTable[, $bFillSpan = False[, $iFilter = 0]]) ; Parameters ....: $sHtmlTable - A string value containing the html code of the table to be parsed ; $bFillSpan - [optional] Default is False. If span areas have to be filled by repeating the data ; contained in the first cell of the span area ; $iFilter - [optional] Default is 0 (no filters) data extracted from cells is returned unchanged. ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return values .: Success: 2D array containing data from the html table ; Faillure: An empty strimg and sets @error as following: ; @error: 1 - no table content is present in the passed HTML ; 2 - error while parsing rows and/or columns, (opening and closing tags are not balanced) ; 3 - error while parsing rows and/or columns, (open/close mismatch error) ; =============================================================================================================================== Func _HtmlTableWriteToArray($sHtmlTable, $bFillSpan = False, $iFilter = 0) $sHtmlTable = StringReplace(StringReplace($sHtmlTable, "<th", "<td"), "</th>", "</td>") ; th becomes td ; rows of the wanted table Local $iError, $aTempEmptyRow[2] = [1, ""] Local $aRows = _ParseTags($sHtmlTable, "<tr", "</tr>") ; $aRows[0] = nr. of rows If @error Then Return SetError(@error, 0, "") Local $aCols[$aRows[0] + 1], $aTemp For $i = 1 To $aRows[0] $aTemp = _ParseTags($aRows[$i], "<td", "</td>") $iError = @error If $iError = 1 Then ; check if it's an empty row $aTemp = $aTempEmptyRow ; Empty Row Else If $iError Then Return SetError($iError, 0, "") EndIf If $aCols[0] < $aTemp[0] Then $aCols[0] = $aTemp[0] ; $aTemp[0] = max nr. of columns in table $aCols[$i] = $aTemp Next Local $aResult[$aRows[0]][$aCols[0]], $iStart, $iEnd, $aRowspan, $aColspan, $iSpanY, $iSpanX, $iSpanRow, $iSpanCol, $iMarkerCode, $sCellContent Local $aMirror = $aResult For $i = 1 To $aRows[0] ; scan all rows in this table $aTemp = $aCols[$i] ; <td ..> xx </td> ..... For $ii = 1 To $aTemp[0] ; scan all cells in this row $iSpanY = 0 $iSpanX = 0 $iY = $i - 1 ; zero base index for vertical ref $iX = $ii - 1 ; zero based indexes for horizontal ref ; following RegExp kindly provided by SadBunny in this post: ; http://www.autoitscript.com/forum/topic/167174-how-to-get-a-number-located-after-a-name-from-within-a-string/?p=1222781 $aRowspan = StringRegExp($aTemp[$ii], "(?i)rowspan\s*=\s*[""']?\s*(\d+)", 1) ; check presence of rowspan If IsArray($aRowspan) Then $iSpanY = $aRowspan[0] - 1 If $iSpanY + $iY > $aRows[0] Then $iSpanY -= $iSpanY + $iY - $aRows[0] + 1 EndIf EndIf ; $aColspan = StringRegExp($aTemp[$ii], "(?i)colspan\s*=\s*[""']?\s*(\d+)", 1) ; check presence of colspan If IsArray($aColspan) Then $iSpanX = $aColspan[0] - 1 ; $iMarkerCode += 1 ; code to mark this span area or single cell If $iSpanY Or $iSpanX Then $iX1 = $iX For $iSpY = 0 To $iSpanY For $iSpX = 0 To $iSpanX $iSpanRow = $iY + $iSpY If $iSpanRow > UBound($aMirror, 1) - 1 Then $iSpanRow = UBound($aMirror, 1) - 1 EndIf $iSpanCol = $iX1 + $iSpX If $iSpanCol > UBound($aMirror, 2) - 1 Then ReDim $aResult[$aRows[0]][UBound($aResult, 2) + 1] ReDim $aMirror[$aRows[0]][UBound($aMirror, 2) + 1] EndIf ; While $aMirror[$iSpanRow][$iX1 + $iSpX] ; search first free column $iX1 += 1 ; $iSpanCol += 1 If $iX1 + $iSpX > UBound($aMirror, 2) - 1 Then ReDim $aResult[$aRows[0]][UBound($aResult, 2) + 1] ReDim $aMirror[$aRows[0]][UBound($aMirror, 2) + 1] EndIf WEnd Next Next EndIf ; $iX1 = $iX ; following RegExp kindly provided by mikell in this post: ; http://www.autoitscript.com/forum/topic/167309-how-to-remove-from-a-string-all-between-and-pairs/?p=1224207 $sCellContent = StringRegExpReplace($aTemp[$ii], '<[^>]+>', "") If $iFilter Then $sCellContent = _HTML_Filter($sCellContent, $iFilter) For $iSpX = 0 To $iSpanX For $iSpY = 0 To $iSpanY $iSpanRow = $iY + $iSpY If $iSpanRow > UBound($aMirror, 1) - 1 Then $iSpanRow = UBound($aMirror, 1) - 1 EndIf While $aMirror[$iSpanRow][$iX1 + $iSpX] $iX1 += 1 If $iX1 + $iSpX > UBound($aMirror, 2) - 1 Then ReDim $aResult[$aRows[0]][$iX1 + $iSpX + 1] ReDim $aMirror[$aRows[0]][$iX1 + $iSpX + 1] EndIf WEnd $aMirror[$iSpanRow][$iX1 + $iSpX] = $iMarkerCode ; 1 If $bFillSpan Then $aResult[$iSpanRow][$iX1 + $iSpX] = $sCellContent Next $aResult[$iY][$iX1] = $sCellContent Next Next Next ; _ArrayDisplay($aMirror, "Debug") Return SetError(0, $aResult[0][0], $aResult) EndFunc ;==>_HtmlTableWriteToArray ; ; #FUNCTION# ==================================================================================================================== ; Name ..........: _HtmlTableGetWriteToArray ; Description ...: extract the html code of the required table from the html listing and copy the data of the table to a 2D array ; Syntax ........: _HtmlTableGetWriteToArray($sHtml[, $iWantedTable = 1[, $bFillSpan = False[, $iFilter = 0]]]) ; Parameters ....: $sHtml - A string value containing the html listing ; $iWantedTable - [optional] An integer value. The nr. of the table to be parsed (default is first table) ; $bFillSpan - [optional] Default is False. If all span areas have to be filled by repeating the data ; contained in the first cell of the span area ; $iFilter - [optional] Default is 0 (no filters) data extracted from cells is returned unchanged. ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return values .: success: 2D array containing data from the wanted html table. ; faillure: An empty string and sets @error as following: ; @error: 1 - no tables are present in the passed HTML ; 2 - error while parsing tables, (opening and closing tags are not balanced) ; 3 - error while parsing tables, (open/close mismatch error) ; 4 - invalid table index request (requested table nr. is out of boundaries) ; =============================================================================================================================== Func _HtmlTableGetWriteToArray($sHtml, $iWantedTable = 1, $bFillSpan = False, $iFilter = 0) Local $aSingleTable = _HtmlTableGetList($sHtml, $iWantedTable) If @error Then Return SetError(@error, 0, "") Local $aTableData = _HtmlTableWriteToArray($aSingleTable[1], $bFillSpan, $iFilter) If @error Then Return SetError(@error, 0, "") Return SetError(0, $aTableData[0][0], $aTableData) EndFunc ;==>_HtmlTableGetWriteToArray ; #FUNCTION# ==================================================================================================================== ; Name ..........: _ParseTags ; Description ...: searches and extract all portions of html code within opening and closing tags inclusive. ; Returns an array containing a collection of <tag ...... </tag> lines. one in each element (even if are nested) ; Syntax ........: _ParseTags($sHtml, $sOpening, $sClosing) ; Parameters ....: $sHtml - A string value containing the html listing ; $sOpening - A string value indicating the opening tag ; $sClosing - A string value indicating the closing tag ; Return values .: success: an 1D 1 based array containing all the portions of html code representing the element ; element [0] af the array (and @extended as well) contains the counter of found elements ; faillure: An empty string and sets @error as following: ; @error: 1 - no tables are present in the passed HTML ; 2 - error while parsing tables, (opening and closing tags are not balanced) ; 3 - error while parsing tables, (open/close mismatch error) ; 4 - invalid table index request (requested table nr. is out of boundaries) ; =============================================================================================================================== Func _ParseTags($sHtml, $sOpening, $sClosing) ; example: $sOpening = '<table', $sClosing = '</table>' ; it finds how many of such tags are on the HTML page StringReplace($sHtml, $sOpening, $sOpening) ; in @xtended nr. of occurences Local $iNrOfThisTag = @extended ; I assume that opening <tag and closing </tag> tags are balanced (as should be) ; (so NO check is made to see if they are actually balanced) If $iNrOfThisTag Then ; if there is at least one of this tag ; $aThisTagsPositions array will contain the positions of the ; starting <tag and ending </tag> tags within the HTML Local $aThisTagsPositions[$iNrOfThisTag * 2 + 1][3] ; 1 based (make room for all open and close tags) ; 2) find in the HTML the positions of the $sOpening <tag and $sClosing </tag> tags For $i = 1 To $iNrOfThisTag $aThisTagsPositions[$i][0] = StringInStr($sHtml, $sOpening, 0, $i) ; start position of $i occurrence of <tag opening tag $aThisTagsPositions[$i][1] = $sOpening ; it marks which kind of tag is this $aThisTagsPositions[$i][2] = $i ; nr of this tag $aThisTagsPositions[$iNrOfThisTag + $i][0] = StringInStr($sHtml, $sClosing, 0, $i) + StringLen($sClosing) - 1 ; end position of $i^ occurrence of </tag> closing tag $aThisTagsPositions[$iNrOfThisTag + $i][1] = $sClosing ; it marks which kind of tag is this Next _ArraySort($aThisTagsPositions, 0, 1) ; now all opening and closing tags are in the same sequence as them appears in the HTML Local $aStack[UBound($aThisTagsPositions)][2] Local $aTags[Ceiling(UBound($aThisTagsPositions) / 2)] ; will contains the collection of <tag ..... </tag> from the html For $i = 1 To UBound($aThisTagsPositions) - 1 If $aThisTagsPositions[$i][1] = $sOpening Then ; opening <tag $aStack[0][0] += 1 ; nr of tags in html $aStack[$aStack[0][0]][0] = $sOpening $aStack[$aStack[0][0]][1] = $i ElseIf $aThisTagsPositions[$i][1] = $sClosing Then ; a closing </tag> was found If Not $aStack[0][0] Or Not ($aStack[$aStack[0][0]][0] = $sOpening And $aThisTagsPositions[$i][1] = $sClosing) Then Return SetError(3, 0, "") ; Open/Close mismatch error Else ; pair detected (the reciprocal tag) ; now get coordinates of the 2 tags ; 1) extract this tag <tag ..... </tag> from the html to the array $aTags[$aThisTagsPositions[$aStack[$aStack[0][0]][1]][2]] = StringMid($sHtml, $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0], 1 + $aThisTagsPositions[$i][0] - $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0]) ; 2) remove that tag <tag ..... </tag> from the html $sHtml = StringLeft($sHtml, $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0] - 1) & StringMid($sHtml, $aThisTagsPositions[$i][0] + 1) ; 3) adjust the references to the new positions of remaining tags For $ii = $i To UBound($aThisTagsPositions) - 1 $aThisTagsPositions[$ii][0] -= StringLen($aTags[$aThisTagsPositions[$aStack[$aStack[0][0]][1]][2]]) Next $aStack[0][0] -= 1 ; nr of tags still in html EndIf EndIf Next If Not $aStack[0][0] Then ; all tags where parsed correctly $aTags[0] = $iNrOfThisTag Return SetError(0, $iNrOfThisTag, $aTags) ; OK Else Return SetError(2, 0, "") ; opening and closing tags are not balanced EndIf Else Return SetError(1, 0, "") ; there are no of such tags on this HTML page EndIf EndFunc ;==>_ParseTags ; #============================================================================= ; Name ..........: _HTML_Filter ; Description ...: Filter for strings ; AutoIt Version : V3.3.0.0 ; Syntax ........: _HTML_Filter(ByRef $sString[, $iMode = 0]) ; Parameter(s): .: $sString - String to filter ; $iMode - Optional: (Default = 0) : removes nothing ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return Value ..: Success - Filterd String ; Failure - Input String ; Author(s) .....: Thorsten Willert, Stephen Podhajecki {gehossafats at netmdc. com} _ConvertEntities ; Date ..........: Wed Jan 27 20:49:59 CET 2010 ; modified ......: by Chimp Removed a double " " entities declaration, ; replace it with char(160) instead of chr(32), ; declaration of the $aEntities array as Static instead of just Local ; ============================================================================== Func _HTML_Filter(ByRef $sString, $iMode = 0) If $iMode = 0 Then Return $sString ;16 simple HTML tag / entities converter If $iMode >= 16 And $iMode < 32 Then Static Local $aEntities[95][2] = [[""", 34],["&", 38],["<", 60],[">", 62],[" ", 160] _ ,["¡", 161],["¢", 162],["£", 163],["¤", 164],["¥", 165],["¦", 166] _ ,["§", 167],["¨", 168],["©", 169],["ª", 170],["¬", 172],["­", 173] _ ,["®", 174],["¯", 175],["°", 176],["±", 177],["²", 178],["³", 179] _ ,["´", 180],["µ", 181],["¶", 182],["·", 183],["¸", 184],["¹", 185] _ ,["º", 186],["»", 187],["¼", 188],["½", 189],["¾", 190],["¿", 191] _ ,["À", 192],["Á", 193],["Ã", 195],["Ä", 196],["Å", 197],["Æ", 198] _ ,["Ç", 199],["È", 200],["É", 201],["Ê", 202],["Ì", 204],["Í", 205] _ ,["Î", 206],["Ï", 207],["Ð", 208],["Ñ", 209],["Ò", 210],["Ó", 211] _ ,["Ô", 212],["Õ", 213],["Ö", 214],["×", 215],["Ø", 216],["Ù", 217] _ ,["Ú", 218],["Û", 219],["Ü", 220],["Ý", 221],["Þ", 222],["ß", 223] _ ,["à", 224],["á", 225],["â", 226],["ã", 227],["ä", 228],["å", 229] _ ,["æ", 230],["ç", 231],["è", 232],["é", 233],["ê", 234],["ë", 235] _ ,["ì", 236],["í", 237],["î", 238],["ï", 239],["ð", 240],["ñ", 241] _ ,["ò", 242],["ó", 243],["ô", 244],["õ", 245],["ö", 246],["÷", 247] _ ,["ø", 248],["ù", 249],["ú", 250],["û", 251],["ü", 252],["þ", 254]] $sString = StringRegExpReplace($sString, '(?i)<p.*?>', @CRLF & @CRLF) $sString = StringRegExpReplace($sString, '(?i)<br>', @CRLF) Local $iE = UBound($aEntities) - 1 For $x = 0 To $iE $sString = StringReplace($sString, $aEntities[$x][0], Chr($aEntities[$x][1]), 0, 2) Next For $x = 32 To 255 $sString = StringReplace($sString, "&#" & $x & ";", Chr($x)) Next $iMode -= 16 EndIf ;8 Tag filter If $iMode >= 8 And $iMode < 16 Then ;$sString = StringRegExpReplace($sString, '<script.*?>.*?</script>', "") $sString = StringRegExpReplace($sString, "<[^>]*>", "") $iMode -= 8 EndIf ; 4 remove all double cr, lf If $iMode >= 4 And $iMode < 8 Then $sString = StringRegExpReplace($sString, "([ \t]*[\n\r]+[ \t]*)", @CRLF) $sString = StringRegExpReplace($sString, "[\n\r]+", @CRLF) $iMode -= 4 EndIf ; 2 remove all double withespaces If $iMode = 2 Or $iMode = 3 Then $sString = StringRegExpReplace($sString, "[[:blank:]]+", " ") $sString = StringRegExpReplace($sString, "\n[[:blank:]]+", @CRLF) $sString = StringRegExpReplace($sString, "[[:blank:]]+\n", "") $iMode -= 2 EndIf ; 1 remove all non ASCII (remove all chars with ascii code > 127) If $iMode = 1 Then $sString = StringRegExpReplace($sString, "[^\x00-\x7F]", " ") EndIf Return $sString EndFunc ;==>_HTML_Filter This simple demo allow to test those functions, showing what it can extract from the html tables in a web page of your choice or loading the html file from the disc. expandcollapse popup; #include <_HtmlTable2Array.au3> ; <--- udf already included (hard coded) at bottom of this demo #include <GUIConstantsEx.au3> #include <EditConstants.au3> #include <WindowsConstants.au3> #include <File.au3> ; needed for _FileWriteFromArray() #include <array.au3> #include <IE.au3> Local $oIE1 = _IECreateEmbedded(), $oIE2 = _IECreateEmbedded(), $iFilter = 0 Local $sHtml_File, $iIndex, $aTable, $aMyArray, $sFilePath GUICreate("Html tables to array demo", 1000, 450, (@DesktopWidth - 1000) / 2, (@DesktopHeight - 450) / 2 _ , $WS_OVERLAPPEDWINDOW + $WS_CLIPSIBLINGS + $WS_CLIPCHILDREN) GUICtrlCreateObj($oIE1, 010, 10, 480, 360) ; left browser GUICtrlCreateTab(500, 10, 480, 360) GUICtrlCreateTabItem("view table") GUICtrlCreateObj($oIE2, 502, 33, 474, 335) ; right browser GUICtrlCreateTabItem("view html") Local $idLabel_HtmlTable = GUICtrlCreateInput("", 502, 33, 474, 335, $ES_MULTILINE + $ES_AUTOVSCROLL) GUICtrlSetFont(-1, 10, 0, 0, "Courier new") GUICtrlCreateTabItem("") Local $idInputUrl = GUICtrlCreateInput("", 10, 380, 440, 20) Local $idButton_Go = GUICtrlCreateButton("Go", 455, 380, 25, 20) Local $idButton_Load = GUICtrlCreateButton("Load html from disk", 10, 410, 480, 30) Local $idButton_Prev = GUICtrlCreateButton("Prev <-", 510, 375, 50, 30) Local $idLabel_NunTable = GUICtrlCreateLabel("00 / 00", 570, 375, 40, 30) GUICtrlSetFont(-1, 9, 700) Local $idButton_Next = GUICtrlCreateButton("Next ->", 620, 375, 50, 30) GUICtrlCreateGroup("Fill Span", 680, 370, 80, 40) Local $iFillSpan = GUICtrlCreateCheckbox("", 715, 388, 15, 15) GUICtrlCreateGroup("", -99, -99, 1, 1) ;close group Local $idButton_Array0 = GUICtrlCreateButton("Preview array", 770, 375, 100, 30) Local $idButton_Array1 = GUICtrlCreateButton("Write array to file", 880, 375, 100, 30) ; options for filtering GUICtrlCreateGroup("Filters", 510, 410, 470, 35) Local $iFilter01 = GUICtrlCreateCheckbox("non ascii", 520, 425, 85, 15) Local $iFilter02 = GUICtrlCreateCheckbox("double spaces", 610, 425, 85, 15) Local $iFilter04 = GUICtrlCreateCheckbox("double @LF", 700, 425, 85, 15) Local $iFilter08 = GUICtrlCreateCheckbox("html-tags", 790, 425, 85, 15) Local $iFilter16 = GUICtrlCreateCheckbox("tags to entities", 880, 425, 85, 15) GUICtrlCreateGroup("", -99, -99, 1, 1) ;close group GUISetState(@SW_SHOW) ;Show GUI ; _IEDocWriteHTML($oIE2, "<HTML></HTML>") GUICtrlSetData($idInputUrl, "http://www.danshort.com/HTMLentities/") ; GUICtrlSetData($idInputUrl, "http://www.mojotoad.com/sisk/projects/HTML-TableExtract/tables.html") ; example page ControlClick("", "", $idButton_Go) ; _IEAction($oIE1, "stop") Do; Waiting for user to close the window $iMsg = GUIGetMsg() Select Case $iMsg = $idButton_Go _IENavigate($oIE1, GUICtrlRead($idInputUrl)) ; _IEAction($oIE1, "stop") $aTables = _HtmlTableGetList(_IEBodyReadHTML($oIE1)) If Not @error Then ; _ArrayDisplay($aTables, "Tables contained in this html") $iIndex = 1 _IEBodyWriteHTML($oIE2, "<html>" & $aTables[$iIndex] & "</html>") ControlClick("", "", $idButton_Prev) _IEAction($oIE2, "stop") Else MsgBox(0, 0, "@error " & @error) EndIf Case $iMsg = $idButton_Load ConsoleWrite("$idButton_Load" & @CRLF) $sHtml_File = FileOpenDialog("Choose an html file", @ScriptDir & "\", "html page (*.htm;*.html)") If Not @error Then GUICtrlSetData($idInputUrl, $sHtml_File) ControlClick("", "", $idButton_Go) EndIf Case $iMsg = $idButton_Next If IsArray($aTables) Then $iIndex += $iIndex < $aTables[0] GUICtrlSetData($idLabel_NunTable, "Table" & @CRLF & $iIndex & " / " & $aTables[0]) GUICtrlSetData($idLabel_HtmlTable, $aTables[$iIndex]) _IEBodyWriteHTML($oIE2, "<html>" & $aTables[$iIndex] & "</html>") _IEAction($oIE2, "stop") EndIf Case $iMsg = $idButton_Prev If IsArray($aTables) Then $iIndex -= $iIndex > 1 GUICtrlSetData($idLabel_NunTable, "Table" & @CRLF & $iIndex & " / " & $aTables[0]) GUICtrlSetData($idLabel_HtmlTable, $aTables[$iIndex]) _IEBodyWriteHTML($oIE2, "<html>" & $aTables[$iIndex] & "</html>") _IEAction($oIE2, "stop") EndIf Case $iMsg = $idButton_Array0 ; Preview Array If IsArray($aTables) Then $iFilter = 1 * _IsChecked($iFilter01) + 2 * _IsChecked($iFilter02) + 4 * _IsChecked($iFilter04) + 8 * _IsChecked($iFilter08) + 16 * _IsChecked($iFilter16) $aMyArray = _HtmlTableWriteToArray($aTables[$iIndex], _IsChecked($iFillSpan), $iFilter) If Not @error Then _ArrayDisplay($aMyArray) EndIf Case $iMsg = $idButton_Array1 ; Saves the array in a csv file of your choice If IsArray($aTables) Then $iFilter = 1 * _IsChecked($iFilter01) + 2 * _IsChecked($iFilter02) + 4 * _IsChecked($iFilter04) + 8 * _IsChecked($iFilter08) + 16 * _IsChecked($iFilter16) $aMyArray = _HtmlTableWriteToArray($aTables[$iIndex], _IsChecked($iFillSpan), $iFilter) If Not @error Then $sFilePath = FileSaveDialog("Choose a file to save to", @ScriptDir, "(*.csv)") If $sFilePath <> "" Then If Not _FileWriteFromArray($sFilePath, $aMyArray, 0, Default, ",") Then MsgBox(0, "Error on file write", "Error code is " & @error & @CRLF & @CRLF & "@error meaning:" & @CRLF & _ "1 - Error opening specified file" & @CRLF & _ "2 - $aArray is not an array" & @CRLF & _ "3 - Error writing to file" & @CRLF & _ "4 - $aArray is not a 1D or 2D array" & @CRLF & _ "5 - Start index is greater than the $iUbound parameter") EndIf EndIf EndIf EndIf EndSelect Until $iMsg = $GUI_EVENT_CLOSE GUIDelete() ; returns 1 if CheckBox is checked Func _IsChecked($idControlID) ; $GUI_CHECKED = 1 Return GUICtrlRead($idControlID) = $GUI_CHECKED EndFunc ;==>_IsChecked ; ------------------------------------------------------------------------ ; Following code should be included by the #include <_HtmlTable2Array.au3> ; hard coded here for easy load an run to try the example ; ------------------------------------------------------------------------ #include-once #include <array.au3> ; ; #FUNCTION# ==================================================================================================================== ; Name ..........: _HtmlTableGetList ; Description ...: Finds and enumerates all the html tables contained in an html listing (even if nested). ; if the optional parameter $i_index is passed, then only that table is returned ; Syntax ........: _HtmlTableGetList($sHtml[, $i_index = -1]) ; Parameters ....: $sHtml - A string value containing an html page listing ; $i_index - [optional] An integer value indicating the number of the table to be returned (1 based) ; with the default value of -1 an array with all found tables is returned ; Return values .: Success; Returns an 1D 1 based array containing all or single html table found in the html. ; element [0] (and @extended as well) contains the number of tables found (or 0 if no tables are returned) ; if an error occurs then an ampty string is returned and the following @error code is setted ; @error: 1 - no tables are present in the passed HTML ; 2 - error while parsing tables, (opening and closing tags are not balanced) ; 3 - error while parsing tables, (open/close mismatch error) ; 4 - invalid table index request (requested table nr. is out of boundaries) ; =============================================================================================================================== Func _HtmlTableGetList($sHtml, $i_index = -1) Local $aTables = _ParseTags($sHtml, "<table", "</table>") If @error Then Return SetError(@error, 0, "") ElseIf $i_index = -1 Then Return SetError(0, $aTables[0], $aTables) Else If $i_index > 0 And $i_index <= $aTables[0] Then Local $aTemp[2] = [1, $aTables[$i_index]] Return SetError(0, 1, $aTemp) Else Return SetError(4, 0, "") ; bad index EndIf EndIf EndFunc ;==>_HtmlTableGetList ; #FUNCTION# ==================================================================================================================== ; Name ..........: _HtmlTableWriteToArray ; Description ...: It writes values from an html table to a 2D array. It tries to take care of the rowspan and colspan formats ; Syntax ........: _HtmlTableWriteToArray($sHtmlTable[, $bFillSpan = False[, $iFilter = 0]]) ; Parameters ....: $sHtmlTable - A string value containing the html code of the table to be parsed ; $bFillSpan - [optional] Default is False. If span areas have to be filled by repeating the data ; contained in the first cell of the span area ; $iFilter - [optional] Default is 0 (no filters) data extracted from cells is returned unchanged. ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return values .: Success: 2D array containing data from the html table ; Faillure: An empty strimg and sets @error as following: ; @error: 1 - no table content is present in the passed HTML ; 2 - error while parsing rows and/or columns, (opening and closing tags are not balanced) ; 3 - error while parsing rows and/or columns, (open/close mismatch error) ; =============================================================================================================================== Func _HtmlTableWriteToArray($sHtmlTable, $bFillSpan = False, $iFilter = 0) $sHtmlTable = StringReplace(StringReplace($sHtmlTable, "<th", "<td"), "</th>", "</td>") ; th becomes td ; rows of the wanted table Local $iError, $aTempEmptyRow[2] = [1, ""] Local $aRows = _ParseTags($sHtmlTable, "<tr", "</tr>") ; $aRows[0] = nr. of rows If @error Then Return SetError(@error, 0, "") Local $aCols[$aRows[0] + 1], $aTemp For $i = 1 To $aRows[0] $aTemp = _ParseTags($aRows[$i], "<td", "</td>") $iError = @error If $iError = 1 Then ; check if it's an empty row $aTemp = $aTempEmptyRow ; Empty Row Else If $iError Then Return SetError($iError, 0, "") EndIf If $aCols[0] < $aTemp[0] Then $aCols[0] = $aTemp[0] ; $aTemp[0] = max nr. of columns in table $aCols[$i] = $aTemp Next Local $aResult[$aRows[0]][$aCols[0]], $iStart, $iEnd, $aRowspan, $aColspan, $iSpanY, $iSpanX, $iSpanRow, $iSpanCol, $iMarkerCode, $sCellContent Local $aMirror = $aResult For $i = 1 To $aRows[0] ; scan all rows in this table $aTemp = $aCols[$i] ; <td ..> xx </td> ..... For $ii = 1 To $aTemp[0] ; scan all cells in this row $iSpanY = 0 $iSpanX = 0 $iY = $i - 1 ; zero base index for vertical ref $iX = $ii - 1 ; zero based indexes for horizontal ref ; following RegExp kindly provided by SadBunny in this post: ; http://www.autoitscript.com/forum/topic/167174-how-to-get-a-number-located-after-a-name-from-within-a-string/?p=1222781 $aRowspan = StringRegExp($aTemp[$ii], "(?i)rowspan\s*=\s*[""']?\s*(\d+)", 1) ; check presence of rowspan If IsArray($aRowspan) Then $iSpanY = $aRowspan[0] - 1 If $iSpanY + $iY > $aRows[0] Then $iSpanY -= $iSpanY + $iY - $aRows[0] + 1 EndIf EndIf ; $aColspan = StringRegExp($aTemp[$ii], "(?i)colspan\s*=\s*[""']?\s*(\d+)", 1) ; check presence of colspan If IsArray($aColspan) Then $iSpanX = $aColspan[0] - 1 ; $iMarkerCode += 1 ; code to mark this span area or single cell If $iSpanY Or $iSpanX Then $iX1 = $iX For $iSpY = 0 To $iSpanY For $iSpX = 0 To $iSpanX $iSpanRow = $iY + $iSpY If $iSpanRow > UBound($aMirror, 1) - 1 Then $iSpanRow = UBound($aMirror, 1) - 1 EndIf $iSpanCol = $iX1 + $iSpX If $iSpanCol > UBound($aMirror, 2) - 1 Then ReDim $aResult[$aRows[0]][UBound($aResult, 2) + 1] ReDim $aMirror[$aRows[0]][UBound($aMirror, 2) + 1] EndIf ; While $aMirror[$iSpanRow][$iX1 + $iSpX] ; search first free column $iX1 += 1 ; $iSpanCol += 1 If $iX1 + $iSpX > UBound($aMirror, 2) - 1 Then ReDim $aResult[$aRows[0]][UBound($aResult, 2) + 1] ReDim $aMirror[$aRows[0]][UBound($aMirror, 2) + 1] EndIf WEnd Next Next EndIf ; $iX1 = $iX ; following RegExp kindly provided by mikell in this post: ; http://www.autoitscript.com/forum/topic/167309-how-to-remove-from-a-string-all-between-and-pairs/?p=1224207 $sCellContent = StringRegExpReplace($aTemp[$ii], '<[^>]+>', "") If $iFilter Then $sCellContent = _HTML_Filter($sCellContent, $iFilter) For $iSpX = 0 To $iSpanX For $iSpY = 0 To $iSpanY $iSpanRow = $iY + $iSpY If $iSpanRow > UBound($aMirror, 1) - 1 Then $iSpanRow = UBound($aMirror, 1) - 1 EndIf While $aMirror[$iSpanRow][$iX1 + $iSpX] $iX1 += 1 If $iX1 + $iSpX > UBound($aMirror, 2) - 1 Then ReDim $aResult[$aRows[0]][$iX1 + $iSpX + 1] ReDim $aMirror[$aRows[0]][$iX1 + $iSpX + 1] EndIf WEnd $aMirror[$iSpanRow][$iX1 + $iSpX] = $iMarkerCode ; 1 If $bFillSpan Then $aResult[$iSpanRow][$iX1 + $iSpX] = $sCellContent Next $aResult[$iY][$iX1] = $sCellContent Next Next Next ; _ArrayDisplay($aMirror, "Debug") Return SetError(0, $aResult[0][0], $aResult) EndFunc ;==>_HtmlTableWriteToArray ; ; #FUNCTION# ==================================================================================================================== ; Name ..........: _HtmlTableGetWriteToArray ; Description ...: extract the html code of the required table from the html listing and copy the data of the table to a 2D array ; Syntax ........: _HtmlTableGetWriteToArray($sHtml[, $iWantedTable = 1[, $bFillSpan = False[, $iFilter = 0]]]) ; Parameters ....: $sHtml - A string value containing the html listing ; $iWantedTable - [optional] An integer value. The nr. of the table to be parsed (default is first table) ; $bFillSpan - [optional] Default is False. If all span areas have to be filled by repeating the data ; contained in the first cell of the span area ; $iFilter - [optional] Default is 0 (no filters) data extracted from cells is returned unchanged. ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return values .: success: 2D array containing data from the wanted html table. ; faillure: An empty string and sets @error as following: ; @error: 1 - no tables are present in the passed HTML ; 2 - error while parsing tables, (opening and closing tags are not balanced) ; 3 - error while parsing tables, (open/close mismatch error) ; 4 - invalid table index request (requested table nr. is out of boundaries) ; =============================================================================================================================== Func _HtmlTableGetWriteToArray($sHtml, $iWantedTable = 1, $bFillSpan = False, $iFilter = 0) Local $aSingleTable = _HtmlTableGetList($sHtml, $iWantedTable) If @error Then Return SetError(@error, 0, "") Local $aTableData = _HtmlTableWriteToArray($aSingleTable[1], $bFillSpan, $iFilter) If @error Then Return SetError(@error, 0, "") Return SetError(0, $aTableData[0][0], $aTableData) EndFunc ;==>_HtmlTableGetWriteToArray ; #FUNCTION# ==================================================================================================================== ; Name ..........: _ParseTags ; Description ...: searches and extract all portions of html code within opening and closing tags inclusive. ; Returns an array containing a collection of <tag ...... </tag> lines. one in each element (even if are nested) ; Syntax ........: _ParseTags($sHtml, $sOpening, $sClosing) ; Parameters ....: $sHtml - A string value containing the html listing ; $sOpening - A string value indicating the opening tag ; $sClosing - A string value indicating the closing tag ; Return values .: success: an 1D 1 based array containing all the portions of html code representing the element ; element [0] af the array (and @extended as well) contains the counter of found elements ; faillure: An empty string and sets @error as following: ; @error: 1 - no tables are present in the passed HTML ; 2 - error while parsing tables, (opening and closing tags are not balanced) ; 3 - error while parsing tables, (open/close mismatch error) ; 4 - invalid table index request (requested table nr. is out of boundaries) ; =============================================================================================================================== Func _ParseTags($sHtml, $sOpening, $sClosing) ; example: $sOpening = '<table', $sClosing = '</table>' ; it finds how many of such tags are on the HTML page StringReplace($sHtml, $sOpening, $sOpening) ; in @xtended nr. of occurences Local $iNrOfThisTag = @extended ; I assume that opening <tag and closing </tag> tags are balanced (as should be) ; (so NO check is made to see if they are actually balanced) If $iNrOfThisTag Then ; if there is at least one of this tag ; $aThisTagsPositions array will contain the positions of the ; starting <tag and ending </tag> tags within the HTML Local $aThisTagsPositions[$iNrOfThisTag * 2 + 1][3] ; 1 based (make room for all open and close tags) ; 2) find in the HTML the positions of the $sOpening <tag and $sClosing </tag> tags For $i = 1 To $iNrOfThisTag $aThisTagsPositions[$i][0] = StringInStr($sHtml, $sOpening, 0, $i) ; start position of $i occurrence of <tag opening tag $aThisTagsPositions[$i][1] = $sOpening ; it marks which kind of tag is this $aThisTagsPositions[$i][2] = $i ; nr of this tag $aThisTagsPositions[$iNrOfThisTag + $i][0] = StringInStr($sHtml, $sClosing, 0, $i) + StringLen($sClosing) - 1 ; end position of $i^ occurrence of </tag> closing tag $aThisTagsPositions[$iNrOfThisTag + $i][1] = $sClosing ; it marks which kind of tag is this Next _ArraySort($aThisTagsPositions, 0, 1) ; now all opening and closing tags are in the same sequence as them appears in the HTML Local $aStack[UBound($aThisTagsPositions)][2] Local $aTags[Ceiling(UBound($aThisTagsPositions) / 2)] ; will contains the collection of <tag ..... </tag> from the html For $i = 1 To UBound($aThisTagsPositions) - 1 If $aThisTagsPositions[$i][1] = $sOpening Then ; opening <tag $aStack[0][0] += 1 ; nr of tags in html $aStack[$aStack[0][0]][0] = $sOpening $aStack[$aStack[0][0]][1] = $i ElseIf $aThisTagsPositions[$i][1] = $sClosing Then ; a closing </tag> was found If Not $aStack[0][0] Or Not ($aStack[$aStack[0][0]][0] = $sOpening And $aThisTagsPositions[$i][1] = $sClosing) Then Return SetError(3, 0, "") ; Open/Close mismatch error Else ; pair detected (the reciprocal tag) ; now get coordinates of the 2 tags ; 1) extract this tag <tag ..... </tag> from the html to the array $aTags[$aThisTagsPositions[$aStack[$aStack[0][0]][1]][2]] = StringMid($sHtml, $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0], 1 + $aThisTagsPositions[$i][0] - $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0]) ; 2) remove that tag <tag ..... </tag> from the html $sHtml = StringLeft($sHtml, $aThisTagsPositions[$aStack[$aStack[0][0]][1]][0] - 1) & StringMid($sHtml, $aThisTagsPositions[$i][0] + 1) ; 3) adjust the references to the new positions of remaining tags For $ii = $i To UBound($aThisTagsPositions) - 1 $aThisTagsPositions[$ii][0] -= StringLen($aTags[$aThisTagsPositions[$aStack[$aStack[0][0]][1]][2]]) Next $aStack[0][0] -= 1 ; nr of tags still in html EndIf EndIf Next If Not $aStack[0][0] Then ; all tags where parsed correctly $aTags[0] = $iNrOfThisTag Return SetError(0, $iNrOfThisTag, $aTags) ; OK Else Return SetError(2, 0, "") ; opening and closing tags are not balanced EndIf Else Return SetError(1, 0, "") ; there are no of such tags on this HTML page EndIf EndFunc ;==>_ParseTags ; #============================================================================= ; Name ..........: _HTML_Filter ; Description ...: Filter for strings ; AutoIt Version : V3.3.0.0 ; Syntax ........: _HTML_Filter(ByRef $sString[, $iMode = 0]) ; Parameter(s): .: $sString - String to filter ; $iMode - Optional: (Default = 0) : removes nothing ; - 0 = no filter ; - 1 = removes non ascii characters ; - 2 = removes all double whitespaces ; - 4 = removes all double linefeeds ; - 8 = removes all html-tags ; - 16 = simple html-tag / entities convertor ; Return Value ..: Success - Filterd String ; Failure - Input String ; Author(s) .....: Thorsten Willert, Stephen Podhajecki {gehossafats at netmdc. com} _ConvertEntities ; Date ..........: Wed Jan 27 20:49:59 CET 2010 ; modified ......: by Chimp Removed a double " " entities declaration, ; replace it with char(160) instead of chr(32), ; declaration of the $aEntities array as Static instead of just Local ; ============================================================================== Func _HTML_Filter(ByRef $sString, $iMode = 0) If $iMode = 0 Then Return $sString ;16 simple HTML tag / entities converter If $iMode >= 16 And $iMode < 32 Then Static Local $aEntities[95][2] = [[""", 34],["&", 38],["<", 60],[">", 62],[" ", 160] _ ,["¡", 161],["¢", 162],["£", 163],["¤", 164],["¥", 165],["¦", 166] _ ,["§", 167],["¨", 168],["©", 169],["ª", 170],["¬", 172],["­", 173] _ ,["®", 174],["¯", 175],["°", 176],["±", 177],["²", 178],["³", 179] _ ,["´", 180],["µ", 181],["¶", 182],["·", 183],["¸", 184],["¹", 185] _ ,["º", 186],["»", 187],["¼", 188],["½", 189],["¾", 190],["¿", 191] _ ,["À", 192],["Á", 193],["Ã", 195],["Ä", 196],["Å", 197],["Æ", 198] _ ,["Ç", 199],["È", 200],["É", 201],["Ê", 202],["Ì", 204],["Í", 205] _ ,["Î", 206],["Ï", 207],["Ð", 208],["Ñ", 209],["Ò", 210],["Ó", 211] _ ,["Ô", 212],["Õ", 213],["Ö", 214],["×", 215],["Ø", 216],["Ù", 217] _ ,["Ú", 218],["Û", 219],["Ü", 220],["Ý", 221],["Þ", 222],["ß", 223] _ ,["à", 224],["á", 225],["â", 226],["ã", 227],["ä", 228],["å", 229] _ ,["æ", 230],["ç", 231],["è", 232],["é", 233],["ê", 234],["ë", 235] _ ,["ì", 236],["í", 237],["î", 238],["ï", 239],["ð", 240],["ñ", 241] _ ,["ò", 242],["ó", 243],["ô", 244],["õ", 245],["ö", 246],["÷", 247] _ ,["ø", 248],["ù", 249],["ú", 250],["û", 251],["ü", 252],["þ", 254]] $sString = StringRegExpReplace($sString, '(?i)<p.*?>', @CRLF & @CRLF) $sString = StringRegExpReplace($sString, '(?i)<br>', @CRLF) Local $iE = UBound($aEntities) - 1 For $x = 0 To $iE $sString = StringReplace($sString, $aEntities[$x][0], Chr($aEntities[$x][1]), 0, 2) Next For $x = 32 To 255 $sString = StringReplace($sString, "&#" & $x & ";", Chr($x)) Next $iMode -= 16 EndIf ;8 Tag filter If $iMode >= 8 And $iMode < 16 Then ;$sString = StringRegExpReplace($sString, '<script.*?>.*?</script>', "") $sString = StringRegExpReplace($sString, "<[^>]*>", "") $iMode -= 8 EndIf ; 4 remove all double cr, lf If $iMode >= 4 And $iMode < 8 Then $sString = StringRegExpReplace($sString, "([ \t]*[\n\r]+[ \t]*)", @CRLF) $sString = StringRegExpReplace($sString, "[\n\r]+", @CRLF) $iMode -= 4 EndIf ; 2 remove all double withespaces If $iMode = 2 Or $iMode = 3 Then $sString = StringRegExpReplace($sString, "[[:blank:]]+", " ") $sString = StringRegExpReplace($sString, "\n[[:blank:]]+", @CRLF) $sString = StringRegExpReplace($sString, "[[:blank:]]+\n", "") $iMode -= 2 EndIf ; 1 remove all non ASCII (remove all chars with ascii code > 127) If $iMode = 1 Then $sString = StringRegExpReplace($sString, "[^\x00-\x7F]", " ") EndIf Return $sString EndFunc ;==>_HTML_Filter Any error reports or suggestions for enhancements are welcome Edited June 26, 2016 by Chimp Added the _HTML_Filter() function in the UDF and updated the example script GoogleDude, mLipok, iSan and 11 others 13 1 Â Chimp small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....
Gianni Posted June 28, 2015 Author Posted June 28, 2015 After seeing the function __HTML_Filter() in this topic by Stilgar (https://www.autoitscript.com/forum/topic/124330-_htmlau3-v101/) I thought I'd include that function also in this script.the purpose of that function is to clean the extracted data from the table by those codes that are not visible in the browser but are visible as code "dirty" in the data when they are picked up from the table.Updated the udf and the example script in first post.To see the difference in the extracted data with or without the use of the HTML_Filter() function, just extract the table data from the example page by clicking on the "Preview array" button with the filter CheckBox "tags to entities" one time unchecked and then checked instead. Â Chimp small minds discuss people average minds discuss events great minds discuss ideas.... and use AutoIt....
barbossa Posted April 7, 2016 Posted April 7, 2016 Hi Chimp! Thanks a lot for your example - it saved me a lot work! I had to parse a table with almost 1400 rows (and lots of rowspans) in an 1.5MB HTML file, and got some performance issues. Here is how I solved them: First, I adapted the HTML tag position search in _ParseTags to search starting on the last tag found position, so StringInStr doesn't need to count thousands of "<tr" tags every iteration. Then, _ArraySort failed (too many rows...). So, to get the tag list pre-sorted, I search for the first opening and first closing tag. If the opening is before the closing, write to $aThisTagsPositions and find the next opening; if the closing is before the next opening, write to $aThisTagsPositions and find the next closing. This made it possible to read that huge HTML file in less than 90 seconds. Just replace the code on lines 208-216 with this: Local $iNextOpenPosition = StringInStr($sHtml, $sOpening, 0, 1) Local $iNextClosePosition = StringInStr($sHtml, $sClosing, 0, 1) Local $iOpenCount = 1 ; 2) find in the HTML the positions of the $sOpening <tag and $sClosing </tag> tags For $i = 1 To $iNrOfThisTag * 2 ;search all the opening and closing tags If ($iNextOpenPosition < $iNextClosePosition) And $iNextOpenPosition <> 0 Then $aThisTagsPositions[$i][0] = $iNextOpenPosition $aThisTagsPositions[$i][1] = $sOpening ; it marks which kind of tag is this $aThisTagsPositions[$i][2] = $iOpenCount; nr of this tag $iOpenCount += 1 $iNextOpenPosition = StringInStr($sHtml, $sOpening, 0, 1, $aThisTagsPositions[$i][0] + 1) Else $aThisTagsPositions[$i][0] = $iNextClosePosition + StringLen($sClosing) - 1 $aThisTagsPositions[$i][1] = $sClosing ; it marks which kind of tag is this $iNextClosePosition = StringInStr($sHtml, $sClosing, 0, 1, $aThisTagsPositions[$i][0] + 1) EndIf Next  Gianni 1
MarcoMonte Posted March 20, 2023 Posted March 20, 2023 Sorry, I cannot find the link for downloanding the UDF library... someone could help me?
Developers Jos Posted March 20, 2023 Developers Posted March 20, 2023 The first post contains the UDF's, which you simply cut&paste into a file yourself.  MarcoMonte 1 SciTE4AutoIt3 Full installer Download page  - Beta files    Read before posting   How to post scriptsource   Forum etiquette Forum Rules  Live for the present, Dream of the future, Learn from the past.Â
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now