Problem with IE9 and _IEDocReadHTML


I want to access the "final" HTML of a dynamically (JavaScript) generated page.

I tried _IEDocReadHTML, but it returns the same page as InetGet :S

Can anyone show me the error in this code?

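#include <IE.au3>

$URL = "http://somesite"
$oIE = _IECreate($URL, 0, 0, 1, 0) ; no attach, hidden window, wait for load, no focus
$sHTML = _IEBodyReadHTML($oIE)
WriteFile($filename2, $sHTML) ; WriteFile and $filename2 are not shown in this post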

One more question: is there an equivalent function in the FF.au3 UDF?


After some more digging I managed to get to the content I need via _IETableGetCollection.

But then I ran into another problem: a cell's content is in the form of

<a href="www.abit.com">ABIT</a>

_IETableGetCollection returns just the text "ABIT", not the hyperlink, which is what I need :S (or better, the whole content)


You need to explain yourself better or show some code because you are coming to incorrect conclusions.

_IEDocReadHTML and all other IE.au3 functions access the final markup AFTER all client processing. If it is the same as InetGet, it would only be because there were no changes on the client side.

_IETableGetCollection does not return a text string, but rather an object.

Dale

Free Internet Tools: DebugBar, AutoIt IE Builder, HTTP UDF, MODIV2, IE Developer Toolbar, IEDocMon, Fiddler, HTML Validator, WGet, curl

MSDN docs: InternetExplorer Object, Document Object, Overviews and Tutorials, DHTML Objects, DHTML Events, WinHttpRequest, XmlHttpRequest, Cross-Frame Scripting, Office object model

Automate input type=file (Related)

Alternative to _IECreateEmbedded? better: _IECreatePseudoEmbedded  Better Better?

IE.au3 issues with Vista - Workarounds

SciTe Debug mode - it's magic: #AutoIt3Wrapper_run_debug_mode=Y

Doesn't work needs to be ripped out of the troubleshooting lexicon. It means that what you tried did not produce the results you expected. It begs the questions 1) what did you try?, 2) what did you expect? and 3) what happened instead?

Reproducer: a small (the smallest?) piece of stand-alone code that demonstrates your trouble


$URL = "http://somesite"
$oIE = _IECreate($URL, 0, 0, 1, 0)
$sHTML = _IEBodyReadHTML($oIE)
WriteFile("BodyRead.txt", $sHTML)
Local $hDownload = InetGet($URL, "InetGet.txt", 1, 1) ; force reload, download in background
Do
    Sleep(250)
Until InetGetInfo($hDownload, 2) ; Check if the download is complete.
Local $nBytes = InetGetInfo($hDownload, 0) ; Bytes read so far.
InetClose($hDownload) ; Close the handle to release resources.

Both files are identical and look like the page before final processing, i.e. there is a table that contains no data in the files, but is visible and filled with data in the browser.

Could it be something to do with the system being IE9 on Win7 x64? I'll try to find some sites with dynamic content and check what's wrong. Maybe the site is using something unusual :S, I'm not fully aware of all the new techniques in web design and generation.

Btw, on this same site your great-looking IE Builder is not working: it says the browser version is too old. I tried to find the version reported by IE Builder in its source (maybe it is reporting v2.x?), but I only skimmed the source...

I was wrong about _IETableGetCollection: it returns an object, which I was viewing with _ArrayDisplay, hence the problem.

Solved it by accessing the object like this:

$oEachRow = $oTable.rows(5)
$plLink = $oEachRow.cells(1).innerHTML

which can be parsed later ;)
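
To get the markup of every cell rather than one hard-coded cell, the same idea can be put in a loop. This is only a sketch: the URL is the placeholder from above, table index 0 is an assumption, and getElementsByTagName is the plain DOM method called on the cell object rather than an IE.au3 function.

```autoit
#include <IE.au3>

; Assumptions for illustration: placeholder URL, first table on the page.
Local $oIE = _IECreate("http://somesite", 0, 0, 1, 0)
Local $oTable = _IETableGetCollection($oIE, 0)
If Not IsObj($oTable) Then Exit 1 ; no table found

Local $oRows = $oTable.rows
For $oRow In $oRows
    Local $oCells = $oRow.cells
    For $oCell In $oCells
        ; innerText is the visible text only, innerHTML keeps the <a href=...> markup
        ConsoleWrite($oCell.innerText & " => " & $oCell.innerHTML & @CRLF)

        ; List the target of each link inside the cell, if there are any
        Local $oLinks = $oCell.getElementsByTagName("a")
        For $oLink In $oLinks
            ConsoleWrite("    href: " & $oLink.href & @CRLF)
        Next
    Next
Next

_IEQuit($oIE)
```

Reading .href from the link object avoids having to parse the innerHTML string afterwards.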

So I solved my initial problem (accessing page content after final content generation), though I have no idea what's wrong with _IEBodyReadHTML.

PS: it looks like the site uses AJAX too; maybe that's the problem :huh2: and thus accessing the info via the DOM is the right and only way?
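
If the table really is filled by an AJAX call that finishes after the page load, one thing to try is polling until the data shows up before reading the HTML. This is a guess at the cause, not something confirmed in this thread; the table index, the "more than one row" condition and the 10 second timeout are all assumptions.

```autoit
#include <IE.au3>

; Assumptions for illustration: placeholder URL, the first table is the one
; of interest, and "filled" means it has more than one row.
Local $oIE = _IECreate("http://somesite", 0, 0, 1, 0)
Local $oTable, $iRows = 0
Local $hTimer = TimerInit()

Do
    Sleep(250)
    $oTable = _IETableGetCollection($oIE, 0)
    If IsObj($oTable) Then $iRows = $oTable.rows.length
Until $iRows > 1 Or TimerDiff($hTimer) > 10000 ; give up after 10 seconds

; Read the markup only after the wait loop has finished
Local $sHTML = _IEBodyReadHTML($oIE)
FileWrite("BodyRead.txt", $sHTML)
_IEQuit($oIE)
```

If BodyRead.txt still shows an empty table after the loop times out, the cause is probably something else.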


PS: it looks like the site uses AJAX too; maybe that's the problem and thus accessing the info via the DOM is the right and only way?

IE.au3 uses the DOM and accesses the current, final markup, only. You'll need to provide a reproducer to look at or my conclusion is that you are confused somehow.

Dale
