Sokko, posted April 19, 2007:

I'm working on a small project in which I need to download a web page and extract the source code that is inside a tag with a particular property. For instance, locate the div with an ID of "features" and extract the source inside it, being careful not to be tripped up by any divs nested inside that one.

Since I don't feel like writing my own cut-down HTML parser just for this project, I looked at IE.au3 for a function that could do this. So far I haven't been able to find anything useful. If I could get a handle on the div I could use _IEPropertyGet with innerHTML to pull out the contents, but there is no function (or if there is, I'm blind) that will even let me find a particular tag on the page, much less a tag with a specific ID, class, etc.

Can this be done with the IE functions, or is there another way? (I can't think of a RegEx that would work for this sort of thing at the moment.)
DaleHohm, posted April 19, 2007:

_IEGetObjByName() will get an element by name or ID. _IETagNameGetCollection() will get a collection of all elements with that tag, or, if you pass a zero-based index, a reference to a specific element.

Dale

Free Internet Tools: DebugBar, AutoIt IE Builder, HTTP UDF, MODIV2, IE Developer Toolbar, IEDocMon, Fiddler, HTML Validator, WGet, curl
MSDN docs: InternetExplorer Object, Document Object, Overviews and Tutorials, DHTML Objects, DHTML Events, WinHttpRequest, XmlHttpRequest, Cross-Frame Scripting, Office object model
Automate input type=file (Related)
Alternative to _IECreateEmbedded? better: _IECreatePseudoEmbedded Better Better?
IE.au3 issues with Vista - Workarounds
SciTe Debug mode - it's magic: #AutoIt3Wrapper_run_debug_mode=Y
"Doesn't work" needs to be ripped out of the troubleshooting lexicon. It means that what you tried did not produce the results you expected. It begs the questions 1) what did you try?, 2) what did you expect? and 3) what happened instead?
Reproducer: a small (the smallest?) piece of stand-alone code that demonstrates your trouble
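For the original question (a div with an ID of "features"), the lookup Dale describes might be sketched as follows. This is only an illustration, not a tested solution for any particular page: the URL is a placeholder, and the variable names are made up for the example. It assumes _IEGetObjByName() matches on ID as Dale says, and that it sets @error when nothing matches.

```autoit
#include <IE.au3>

; Hypothetical URL and element ID, used only for illustration.
Local $sUrl = "http://www.example.com/"
Local $sId = "features"

; Open the page in a new IE window; _IECreate waits for the load by default.
Local $oIE = _IECreate($sUrl)

; Look the element up by its name or ID.
Local $oDiv = _IEGetObjByName($oIE, $sId)
If @error Then
    ConsoleWrite("No element named '" & $sId & "' was found." & @CRLF)
Else
    ; innerHTML is produced by the browser's own parser, so nested divs
    ; inside the element are included without any manual tag matching.
    Local $sHtml = _IEPropertyGet($oDiv, "innerhtml")
    ConsoleWrite($sHtml & @CRLF)
EndIf

; Close the browser window opened above.
_IEQuit($oIE)
```

Because the browser has already parsed the page, this sidesteps the nested-div problem that makes a RegEx approach awkward.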
Sokko, posted April 19, 2007:

Is there a way to get an element by looking at some other property (class comes to mind)? I'm guessing this would have to be done with _IETagNameGetCollection, but once you obtain the collection, how would you find out which elements have the class or other property you want?
DaleHohm, posted April 19, 2007 (edited):

Example:

    $oDivs = _IETagNameGetCollection($oIE, "div")
    For $oDiv In $oDivs
        If String($oDiv.className) = "the one I'm looking for" Then
            ; Yahoo! I found one
            ; do something
        EndIf
    Next

Dale
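Dale's loop can be wrapped in a small helper so it also copes with elements that carry several classes (class="box features"). The sketch below is an assumption-laden illustration: _FindByClass is a made-up name, not part of IE.au3, the URL is a placeholder, and it assumes class names are separated by single spaces.

```autoit
#include <IE.au3>

; Hypothetical usage: fetch the first div whose class list contains "features".
Local $oIE = _IECreate("http://www.example.com/")
Local $oDiv = _FindByClass($oIE, "div", "features")
If Not @error Then ConsoleWrite(_IEPropertyGet($oDiv, "innerhtml") & @CRLF)
_IEQuit($oIE)

; Hypothetical helper (not part of IE.au3): returns the first $sTag element
; whose class list contains $sClass, or 0 with @error = 1 if none is found.
Func _FindByClass($oIE, $sTag, $sClass)
    Local $oElements = _IETagNameGetCollection($oIE, $sTag)
    If @error Then Return SetError(1, 0, 0)
    For $oElement In $oElements
        ; String() guards against a non-string className. The padding spaces
        ; make StringInStr match a whole class token rather than a substring.
        If StringInStr(" " & String($oElement.className) & " ", " " & $sClass & " ") Then
            Return $oElement
        EndIf
    Next
    Return SetError(1, 0, 0)
EndFunc
```

StringInStr is case-insensitive by default, which matches the behaviour of the = comparison in Dale's example.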
Sokko, posted April 19, 2007 (edited):

Thanks for that example! But how did you know to use "className", and what else could you put in its place? It doesn't match up with the actual name of the property in the HTML, which is just "class", so I'm not sure how to extend this to other properties of tags.

Also, why did you put a String function around it?
DaleHohm, posted April 19, 2007:

>Thanks for that example!

You're welcome.

>But how did you know to use "className", and what else could you put in its place?
>It doesn't match up with the actual name of the property in the HTML, which is just "class",
>so I'm not sure how to extend this to other properties of tags.

See the link for the MSDN Document Object documentation in my Sig... then drill down to the DIV tag to see what properties it has.

>Also, why did you put a String function around it?

Experience. If a div has no class name, then $oDiv.className returns a numeric 0 instead of a null string as you might expect. Since AutoIt uses variants rather than typed variables, it assumes that since the left side of the comparison is numeric, you want to do a numeric comparison, so it converts the right side to numeric as well - and all non-numeric strings evaluate to 0 as numerics... so you get [If 0 = 0], which evaluates to True, instead of [If "" = "what I'm looking for"], which would evaluate to False. Using String() on the left side forces a string comparison.

You'll likely forget this, like I often do, but hopefully you'll remember when you are getting really strange results sometime and you just can't figure it out... and then, Doh!

Dale
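The comparison pitfall Dale describes can be reproduced without a browser. In this minimal sketch, $vClass is just a stand-in for the numeric 0 that he reports $oDiv.className returning when no class is set; the variable names and the "features" value are made up for the example.

```autoit
; Stand-in for $oDiv.className on a div that has no class (per Dale's description).
Local $vClass = 0
Local $sWanted = "features"

; Numeric comparison: the left side is a number, so the string on the right
; is converted to a number before the two are compared.
ConsoleWrite("Without String(): " & ($vClass = $sWanted) & @CRLF)

; String comparison: String(0) is "0", which is compared as text.
ConsoleWrite("With String():    " & (String($vClass) = $sWanted) & @CRLF)
```

Going by Dale's explanation, the first line should report True (a false match) and the second False. The == operator also forces a string comparison, but it is case-sensitive, so it is not a drop-in replacement for String() with =.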
Sokko, posted April 20, 2007:

>See the link for the MSDN Document Object documentation in my Sig... then drill down to the DIV tag to see what properties it has.

Ah, I see. Took me about five minutes to figure out you had to choose the "Collections" button and click the "HTML Elements" link under the description for the "childNodes" collection.

Thanks for the tip about String, I certainly hope I won't forget it.
DaleHohm, posted April 20, 2007:

Ah yes... sorry I made it hard. I've now added a "DHTML Objects" link to my sig that I will direct others to for similar things in the future.

Dale