Scrape JavaScript-Rendered Directory


I'm trying to get the latest ChromeDriver from https://chromedriver.storage.googleapis.com/index.html, but the page is built entirely by JavaScript after the user visits it, so standard scraping isn't working. For example:

#include <IE.au3>

$Url = "https://chromedriver.storage.googleapis.com/index.html"
$oIE = _IECreate($Url)
$oRows = _IETagnameGetCollection($oIE, "tr") 
ConsoleWrite("oRows:" & $oRows.Length & @CRLF)

Results in the following output:

Thanks for any ideas and help 🙂



This is what I have so far. It seems I had to add a Sleep before scraping, but this code is slow as molasses:


#include <IE.au3>

$Url = "https://chromedriver.storage.googleapis.com/index.html"
$oIE = _IECreate($Url)
Sleep(2000) ; So far sleeping 2 seconds here gets results
$oRows = _IETagNameGetCollection($oIE, "tr")
ConsoleWrite("oRows:" & $oRows.Length & @CRLF)
For $oRow In $oRows
    $innerString = ''
    ; Skip empty rows and the "-" placeholder row
    If StringLen($oRow.innerText) > 0 And $oRow.innerText <> "-" Then
        $aChromeVer = StringSplit($oRow.innerText, ".")
        ; Keep rows whose first dot-separated field is a number greater than 2
        If StringIsDigit($aChromeVer[1]) And Number($aChromeVer[1]) > 2 Then
            ConsoleWrite("->" & $aChromeVer[1] & @CRLF)
            $oDatas = _IETagNameGetCollection($oRow, "td") ; <-- this slowed the process down, looking for the subset of data
            ; (rest of the loop body was not posted)
        EndIf
    EndIf
Next

Extracting the data takes much longer than loading the page. Is there a way to speed this up?


EDIT/UPDATE: I found out that the next step, $oDatas = _IETagNameGetCollection($oRow, "td"), was what slowed down the scraping.
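
If the per-row _IETagNameGetCollection call is the bottleneck, one thing that might be worth trying is reading the cells through the row's own DOM "cells" collection instead. This is only a sketch: it assumes each data row keeps the entry name in its second cell, which I have not confirmed against the live page, and it keeps the same fixed Sleep as the code above.

```autoit
#include <IE.au3>

; Sketch: use the <tr> element's DOM "cells" collection instead of
; calling _IETagNameGetCollection($oRow, "td") once per row.
Local $oIE = _IECreate("https://chromedriver.storage.googleapis.com/index.html")
Sleep(2000) ; same fixed wait as in the code above
Local $oRows = _IETagNameGetCollection($oIE, "tr")
Local $oCells
For $oRow In $oRows
    $oCells = $oRow.cells ; standard HTML DOM property of a table row
    If $oCells.length > 1 Then
        ; Assumption: the second cell (index 1) holds the entry name
        ConsoleWrite(StringStripWS($oCells.item(1).innerText, 3) & @CRLF)
    EndIf
Next
```

Whether this is actually faster will depend on the page and the IE version, so it is worth timing both variants with TimerInit/TimerDiff.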

@NassauSky Or you could use the table functions, like this:

#include <IE.au3>
#include <Array.au3>

$Url = "https://chromedriver.storage.googleapis.com/index.html"
$oIE = _IECreate($Url)
Sleep(1000) ; So far sleeping 1 second here gets results

$oTable = _IETableGetCollection($oIE,0)
$aTable = _IETableWriteToArray($oTable, True)
_ArrayDisplay ($aTable)

Instant fast :)
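
The fixed Sleep in these snippets is a guess at how long the script on the page needs. A rough alternative is to poll until at least one row exists, with a timeout. _WaitForRows below is a hypothetical helper, not part of IE.au3, and the 10 second timeout and 100 ms poll interval are arbitrary choices.

```autoit
#include <IE.au3>

; Hypothetical helper: poll for <tr> elements instead of sleeping a fixed time.
; Returns the row collection, or sets @error = 1 if nothing shows up in time.
Func _WaitForRows($oIE, $iTimeoutMs = 10000)
    Local $oRows
    Local $hTimer = TimerInit()
    While TimerDiff($hTimer) < $iTimeoutMs
        $oRows = _IETagNameGetCollection($oIE, "tr")
        If Not @error And $oRows.Length > 0 Then Return $oRows
        Sleep(100) ; short pause between polls
    WEnd
    Return SetError(1, 0, 0)
EndFunc

Local $oIE = _IECreate("https://chromedriver.storage.googleapis.com/index.html")
Local $oRows = _WaitForRows($oIE)
If @error Then
    ConsoleWrite("Timed out waiting for rows" & @CRLF)
Else
    ConsoleWrite("oRows:" & $oRows.Length & @CRLF)
EndIf
```

Note that this returns as soon as the first rows appear, so if the listing is rendered in stages it may still be incomplete at that point.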

You could also just use childNodes.item(index) as well, which is useful for getting attributes.

#include <IE.au3>

$Url = "https://chromedriver.storage.googleapis.com/index.html"
$oIE = _IECreate($Url)
Sleep(2000) ; So far sleeping 2 seconds here gets results
$oRows = _IETagNameGetCollection($oIE, "tr")
For $oRow In $oRows
    ; Only rows with five child nodes are directory entries
    If $oRow.childNodes.Length = 5 Then
        ; The second child holds the name link; keep numeric names greater than 2
        If Int($oRow.childNodes.item(1).innerText) > 2 Then
            ConsoleWrite(Int($oRow.childNodes.item(1).innerText) & " - " & $oRow.childNodes.item(1).childNodes.item(0).href & @CRLF)
        EndIf
    EndIf
Next


