redrum Posted August 22, 2010
Can someone help, please? I have been using AutoIt successfully for several months on several websites. I am now trying to get data from a website where I cannot seem to access the HTML. There are no names or controls on the elements I need. Using DebugBar, I can read the HTML and see the data I need in it. (I can zero in on each element on the web page and see its HTML as normal.) When I do an _IEBodyReadHTML and inspect the result, it is totally different from what I see in DebugBar. (I am doing the same _IEBodyReadHTML on several other websites and it works fine.)
There is a frame that has a name. I can get an object variable to it using _IEGetObjByName and can then get the tag name (Frame) using oOBJECT.tagname. But if I try to get the HTML using oOBJECT.innerhtml or .innertext, I get no text.
I'm at the point where I am beginning to wonder whether a website can inhibit access to its HTML even though it displays in DebugBar. Is this possible, or has anyone else run into this problem?
Thanks,
Doug
tobject Posted August 22, 2010 (edited August 22, 2010 by tobject)
It is probably dynamic HTML, and maybe you're accessing the wrong object or accessing it at the wrong time. Maybe there's a timer that loads additional code on some event. Post some code and the website name.
DaleHohm Posted August 22, 2010
You are either not getting a reference to the correct frame, or a COM error that you do not mention is being thrown because of cross-site scripting limitations. Add _IEErrorHandlerRegister() to your code to see whether you are getting "Access is Denied", and make sure to run from SciTE with F5. I also suggest using the View Source icon in the DebugBar toolbar to easily see all of the frames and their source.
BTW, dynamic HTML is not the issue, since _IEBodyReadHTML reads the final HTML markup, not the original source as the IE View Source menu item does (unless it is a timing issue, as tobject suggests).
Dale
Free Internet Tools: DebugBar, AutoIt IE Builder, HTTP UDF, MODIV2, IE Developer Toolbar, IEDocMon, Fiddler, HTML Validator, WGet, curl
MSDN docs: InternetExplorer Object, Document Object, Overviews and Tutorials, DHTML Objects, DHTML Events, WinHttpRequest, XmlHttpRequest, Cross-Frame Scripting, Office object model
Automate input type=file (Related)
Alternative to _IECreateEmbedded? better: _IECreatePseudoEmbedded Better Better?
IE.au3 issues with Vista - Workarounds
SciTe Debug mode - it's magic: #AutoIt3Wrapper_run_debug_mode=Y
"Doesn't work" needs to be ripped out of the troubleshooting lexicon. It means that what you tried did not produce the results you expected. It begs the questions 1) what did you try?, 2) what did you expect? and 3) what happened instead?
Reproducer: a small (the smallest?) piece of stand-alone code that demonstrates your trouble
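The two checks Dale describes (register the IE error handler, then confirm which frames are actually reachable) could look roughly like the sketch below. The URL is a placeholder, not the site from this thread, and the loop is only one possible way to list frames; reading the URL of a frame from another domain may itself raise the "Access is Denied" COM error, which the registered handler would then report in the SciTE console.

```autoit
#include <IE.au3>

; Register the default IE.au3 COM error handler so that errors such as
; "Access is denied" show up in the SciTE console (run with F5).
_IEErrorHandlerRegister()

; Placeholder URL (assumption): replace with the page you are inspecting.
Local $oIE = _IECreate("http://www.example.com/")

; Called without an index, _IEFrameGetCollection returns the collection
; and puts the number of frames in @extended.
Local $oFrames = _IEFrameGetCollection($oIE)
Local $iNumFrames = @extended
ConsoleWrite("Frames found: " & $iNumFrames & @CRLF)

; List each frame's URL and the size of its body HTML.
For $i = 0 To $iNumFrames - 1
    Local $oFrame = _IEFrameGetCollection($oIE, $i)
    If @error Then
        ConsoleWrite("Frame " & $i & ": could not get a reference, @error = " & @error & @CRLF)
        ContinueLoop
    EndIf
    ConsoleWrite("Frame " & $i & ": " & _IEPropertyGet($oFrame, "locationurl") & @CRLF)
    ConsoleWrite("  body HTML length: " & StringLen(_IEBodyReadHTML($oFrame)) & @CRLF)
Next
```

If a frame is listed but its body length is 0 or an error is printed, that points to the cross-site restriction rather than to a wrong frame reference.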
redrum Posted August 23, 2010 (Author)
Thanks for the comments.
As suggested, I am attaching a test script that illustrates the problem. I have had no success in reading the HTML from this website, even though DebugBar reads the HTML just fine and it contains the data elements that I am looking for. The attached script contains code that attempts to read some HTML elements. Some lines are currently commented out because they result in errors, but they are included to show what I have tried that I thought might work.
Any help or suggestions on what the problem is would be greatly appreciated!
Regards,
Doug
Attachment: NASDAQ html test.au3
Tvern Posted August 23, 2010
I think you (often) need to load the actual page that the IFrame embeds to be able to fiddle with its source or objects. If you don't need anything from the first page and the embedded page always has the same address, you can navigate there directly. Otherwise you need to get the IFrame's source address from the main page and then download it or navigate there.
The example below demonstrates how you can get those values when loading the IFrame's page directly. I'm using InetRead and StringRegExp, but you could do the same with the _IE functions (InetRead is faster, though). The StringRegExp patterns in the example are pretty crude, and at the very least they need error checking, but it shows how it could work.

    Local $sSource, $aTotalShares, $aInstOwnership
    ; I got this directly from the IFrame's "src=" value
    Local $sUrl = "http://holdings.nasdaq.com/asp/Institutional.asp?CIK=&HolderName=&LinesPerPage=5&PageNum=1&SortBy=&Descending=&strFilter=&site=nasdaq&symbol=AKAM&FormType=INSTITUTIONAL&Selected=AKAM&market=NASDAQ-GS&coname=Akamai+Technologies%2C+Inc%2E&LogoPath=http%3A%2F%2Fcontent%2Enasdaq%2Ecom%2Flogos%2FAKAM%2EGIF&pageName="
    ; Read the IFrame's source page (you could do this with _IENavigate + _IEBodyReadHTML too)
    $sSource = InetRead($sUrl)
    $sSource = BinaryToString($sSource) ; convert to string
    ; Get the value for Shares Out Standing (needs error checking)
    $aTotalShares = StringRegExp($sSource, '(?i)(?s)Total Shares Out Standing.*?"Holdnum">.*?(\d+)', 3)
    ; Get the value for Institutional Ownership (needs error checking)
    $aInstOwnership = StringRegExp($sSource, '(?i)(?s)Institutional Ownership.*?"Holdnum">.*?(\d+)', 3)
    ; Display results
    ConsoleWrite("Total Shares Out Standing (millions):" & $aTotalShares[0] & @CRLF)
    ConsoleWrite("Institutional Ownership:" & $aInstOwnership[0] & "%" & @CRLF)
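The error checking that Tvern says is missing might be added along these lines. This is only a sketch: `_ExtractHoldingValue` and `_PrintHoldingValue` are hypothetical helper names introduced here, not part of AutoIt or of the original post, and the pattern is the same crude one from the example with the label made literal.

```autoit
; Hypothetical helper (not from the post): returns the first number that follows
; $sLabel and a "Holdnum" cell in $sSource, or sets @error = 1 if nothing matches.
Func _ExtractHoldingValue($sSource, $sLabel)
    ; \Q...\E makes the label match literally; the rest is the pattern from the post.
    Local $aMatch = StringRegExp($sSource, '(?is)\Q' & $sLabel & '\E.*?"Holdnum">.*?(\d+)', 3)
    If @error Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc   ;==>_ExtractHoldingValue

; Hypothetical wrapper: downloads the page and prints one labelled value.
; Sets @error = 1 if the download fails, 2 if the value is not found.
Func _PrintHoldingValue($sUrl, $sLabel)
    Local $dData = InetRead($sUrl)
    If @error Then
        ConsoleWrite("Download failed for " & $sUrl & @CRLF)
        Return SetError(1, 0, False)
    EndIf
    Local $sValue = _ExtractHoldingValue(BinaryToString($dData), $sLabel)
    If @error Then
        ConsoleWrite("No value found for: " & $sLabel & @CRLF)
        Return SetError(2, 0, False)
    EndIf
    ConsoleWrite($sLabel & ": " & $sValue & @CRLF)
    Return True
EndFunc   ;==>_PrintHoldingValue
```

With `$sUrl` set to the IFrame address, `_PrintHoldingValue($sUrl, "Total Shares Out Standing")` and `_PrintHoldingValue($sUrl, "Institutional Ownership")` would replace the two unchecked array accesses, so a failed download or a changed page layout produces a message instead of an array subscript error.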
DaleHohm Posted August 23, 2010
You seem to be aware of the frame, but then you don't drill into it to try to find what you are after. Take a look at this code:

    #include <IE.au3>
    #include <Array.au3>

    $oIE = _IECreate("http://www.nasdaq.com/asp/holdings.asp?symbol=AKAM&selected=AKAM&FormType=Institutional")
    ; Get a reference to the frame by name, then work inside it
    $oFrame = _IEFrameGetObjByName($oIE, "frmMain")
    ; Table index 5 within the frame
    $oTable = _IETableGetCollection($oFrame, 5)
    ; True = transpose the resulting array
    $aTable = _IETableWriteToArray($oTable, True)
    _ArrayDisplay($aTable)

Dale
redrum Posted August 23, 2010 (Author)
Many thanks for the replies.
I have drilled down into frames before, but what I used on earlier web pages didn't work here. In studying your response code I am finding things, as usual, that I didn't know.
Regards,
Doug
redrum Posted August 23, 2010 (Author)
Quoting tobject: "probably dynamic HTML and maybe you're accessing wrong object or at a wrong time maybe there's a timer which loads additional code on some event post some code and website name"
I tried to use the View Source icon based on your suggestion ("Also suggest using the View Source icon in DebugBar toolbar to easily see all of the frames and their source"), but I cannot find it. Is this toolbar only available with the Corporate version?
Also, I really appreciate the code you supplied. I didn't realize how _IETableGetCollection and _IETableWriteToArray work. This will really simplify my code compared with what I have been doing.
Thanks,
Doug
DaleHohm Posted August 23, 2010
No, the View Source icon is in all versions. It is the one to the left of the eyeball icon.
Dale