litlmike Posted April 15, 2008

Hello everybody!

I want to grab the movie titles for the movies 'Now Playing' at the web address below. Normally I would use _IETableWriteToArray() to get the data, but I can already foresee problems getting just the movie title and not a bunch of other data. When that doesn't work, I normally fall back to _IELinkGetCollection() and iterate through the links with some criterion that returns only the links I want, but that won't work here because all of the links have the same properties (as far as I can tell). Maybe it would work if there were some way to use _IELinkGetCollection() on just one indexed table. Or, as is common practice, the AutoIt community will enlighten me with some alternative I have never thought of. I am partial to working with IE.au3 or COM, so I would like to stay in that framework if possible. I wanted some feedback to see if there are other options that I am overlooking. Thanks!

    $sUrl = "http://www.fandango.com/edwardsfresnostadi...fyg/theaterpage"

_ArrayPermute() _ArrayUnique() Excel.au3 UDF
DaleHohm Posted April 16, 2008

If you use DebugBar to look at the structure of the page and look for a pattern, you'll see that the movie titles are the text of a link (<a>) inside an <h4> element. So if you get the collection of h4's with _IETagNameGetCollection, loop through them, and use _IEPropertyGet to read the innertext of the first link inside each h4, you should have what you want. Take a stab at it and post some code if you have trouble.

Dale

Free Internet Tools: DebugBar, AutoIt IE Builder, HTTP UDF, MODIV2, IE Developer Toolbar, IEDocMon, Fiddler, HTML Validator, WGet, curl
MSDN docs: InternetExplorer Object, Document Object, Overviews and Tutorials, DHTML Objects, DHTML Events, WinHttpRequest, XmlHttpRequest, Cross-Frame Scripting, Office object model
Automate input type=file (Related)
Alternative to _IECreateEmbedded? better: _IECreatePseudoEmbedded Better Better?
IE.au3 issues with Vista - Workarounds
SciTe Debug mode - it's magic: #AutoIt3Wrapper_run_debug_mode=Y
"Doesn't work" needs to be ripped out of the troubleshooting lexicon. It means that what you tried did not produce the results you expected. It begs the questions 1) what did you try?, 2) what did you expect? and 3) what happened instead?
Reproducer: a small (the smallest?) piece of stand-alone code that demonstrates your trouble
litlmike Posted April 16, 2008

Pfffft... 2 hours later, litlmike taps out.

I looked through the help file for DebugBar (lol), then found it in your sig. It looks awesome, but I don't understand how to use it to debug; I can learn more about it later. I had already noticed the pattern you mentioned just by looking through the HTML, but seeing a pattern and manipulating it are different things. MSDN's search engine isn't working at the moment, which makes learning about something I know nothing about very cumbersome.

I think I get what you are saying, though: I need to get a collection of 'h4' (I think it means headers, text size 4), then get the innertext property from that collection of h4, then voilà. But I can't find ANYWHERE in MSDN how to work with headers or h4. I assume _IETagNameGetCollection returns a collection object, but how do I tell that Func to get the h4? 'head' doesn't seem to return what I am looking for. I tried to find a list of the things Tagname applies to, but whatever I found didn't produce the result. I tried A, head, etc.

I abandoned the first set of code below for the second set, which returned too much data, but at least some of it contained the intended result. From there, I did not know how to refine it to h4 elements.
This may be TMI, but I thought it important to show my work to the teacher.

    #include <IE.au3>
    $oIE = _IECreate ("http://www.fandango.com/edwardsfresnostadium22andimax_aafyg/theaterpage", 1)
    $oInputs = _IETagNameGetCollection ($oIE, "h4")
    For $oInput In $oInputs
        ConsoleWrite ( _IEPropertyGet ($oInputs, "innertext") & @CRLF)
    Next

    #include <IE.au3>
    $oIE = _IECreate ("http://www.fandango.com/edwardsfresnostadium22andimax_aafyg/theaterpage", 1)
    $oLinks = _IELinkGetCollection ($oIE)
    For $oLink In $oLinks
        ConsoleWrite ( _IEPropertyGet ($oLink, "innertext") & @CRLF)
    Next
nobbe Posted April 16, 2008

Hi,

maybe another approach would help? This is not as elegant as Dale's solution, but I approached the source from a different side: split out all the relevant parts of the HTML, then split the results again, and so on. Coding time was 5 minutes.

CODE
    ; theater
    #include <INET.au3>
    #include <array.au3>
    #include <String.au3>

    $start_url = "http://www.fandango.com/edwardsfresnostadium22andimax_aafyg/theaterpage"
    $sCode = _INetGetSource($start_url)
    ClipPut($sCode) ; just to check what we get

    ; general selection: everything between the "Now Playing" heading and the end of its table
    $tmp = _StringBetween($sCode, 'Now Playing</a></h2>', '</table>')
    If @error <> 1 Then
        ; _ArrayDisplay($tmp, 'Stringbetween Search')
        ; all entries: one <LI> per movie
        $tmp1 = _StringBetween($tmp[0], '<LI>', '</LI>')
        If @error <> 1 Then
            ; _ArrayDisplay($tmp1, 'Stringbetween Search')
            For $iCC = 0 To UBound($tmp1, 1) - 1
                ; the title is the text between the end of the opening tag and </a>
                $tmp2 = _StringBetween($tmp1[$iCC], '>', '</a>')
                If @error <> 1 Then
                    $title = $tmp2[0]
                    MsgBox(0, "Now playing", $title) ; --> get now all titles
                EndIf
            Next
        EndIf
    EndIf
nobbe Posted April 16, 2008

OK, me back again. After another 3 minutes of research I came up with this:

CODE
    #include <IE.au3>
    $oIE = _IECreate ("http://www.fandango.com/edwardsfresnostadium22andimax_aafyg/theaterpage", 1)
    $oLinks = _IELinkGetCollection ($oIE)
    For $oLink In $oLinks
        ; keep only the links whose href ends in "date="
        If StringRight($oLink.href, 5) == "date=" Then
            ConsoleWrite ( $oLink.href & @CRLF)
            ConsoleWrite ( _IEPropertyGet ($oLink, "innertext") & @CRLF)
        EndIf
    Next
litlmike Posted April 16, 2008

Well, this does work, but for learning purposes I would like to bridge the gap with Dale's method of grabbing the H4 elements, because I think that will come in handy in the future.

Also, how is it that this method works? When I look at the HTML, here is how the link is coded, and I see no mention of "date=":

    <a href="http://www.fandango.com/madeofhonor_102109/movieoverview">Made of Honor</a>
nobbe Posted April 16, 2008

Hi,

so I guess I can't help any more. In life and in programming there is more than one way to solve a problem.
litlmike Posted April 16, 2008

Well, you can still help by answering this question from my last post:

How is it that this method works? When I look at the HTML, here is how the link is coded, and I see no mention of "date=":

    <a href="http://www.fandango.com/madeofhonor_102109/movieoverview">Made of Honor</a>
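The thread never answers this question. One way to investigate it, offered only as a minimal sketch, is to print each link's text next to the href that IE's DOM actually reports and compare that with the HTML source. The href property returns the fully resolved URL, and scripts on a page can add or change links after it loads, so the two views do not have to match; whether either of those explains the "date=" suffix here is not confirmed anywhere in the thread. The helper name _ListLinks is hypothetical.

```autoit
#include <IE.au3>

; Hypothetical helper (not part of IE.au3): returns one "text -> href" line
; per link, using the values that IE's DOM reports for the loaded page.
Func _ListLinks($oIE)
    Local $sReport = ""
    Local $oLinks = _IELinkGetCollection($oIE)
    If @error Then Return SetError(1, 0, "")
    For $oLink In $oLinks
        $sReport &= _IEPropertyGet($oLink, "innertext") & " -> " & $oLink.href & @CRLF
    Next
    Return $sReport
EndFunc
```

To use it, create the browser as in the posts above and call ConsoleWrite(_ListLinks($oIE)), then look for the movie title in the output and check which href it is paired with.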
DaleHohm Posted April 16, 2008 (edited)

OK, look this over:

    #include <IE.au3>
    $oIE = _IECreate ("http://www.fandango.com/edwardsfresnostadium22andimax_aafyg/theaterpage", 1)
    $oH4s = _IETagNameGetCollection ($oIE, "h4")
    For $oH4 In $oH4s ; loop through the <H4>s
        $oA = _IETagNameGetCollection($oH4, "a", 0) ; get the first <A> inside the <H4>
        ConsoleWrite("MovieName: " & _IEPropertyGet($oA, "innertext") & @CR)
    Next

Let me know if you have questions.

Dale

p.s. Regarding DebugBar: drag the target icon over the element you are interested in on the webpage, then examine the source it shows you on the left.

Edited April 16, 2008 by DaleHohm
litlmike Posted April 16, 2008

Ahhh, I was so close yet so far. This makes sense now: once you have the collection of h4, you get the collection of links inside each one, then get the innertext property.

How could I have known which Tagnames are acceptable? The help file mentions IMG and TR, but where in MSDN can I find a list of all the elements that can be collected? I see I have more to learn about how HTML is structured with objects, etc.

I originally thought that I could only use _IETagName once, and that it had to be either 'h4', 'a', 'head' or something like that. It is helpful to understand that once I have a collection of objects, I can search within its members and get back another collection.

How do you find that DebugBar helps you in scripting? Is it for identifying where objects are located in the HTML, or are there more uses?

Thanks as always
DaleHohm Posted April 16, 2008

TagNames are the core elements of HTML... BODY, TABLE, A, UL, LI, DIV, IMG, TR, OBJECT, P, etc. - essentially anything inside <>

There are many types of collections... TagName collections are just one of them.

It isn't easy and the documentation can be confusing -- if this were not the case, there would be much less need for IE.au3.

I use DebugBar primarily to understand the HTML and document structure and to examine page source. It is also good for finding and digging into frames, examining scripts and more...

Dale
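To see which tag names a given page actually contains, one option is to walk every element and record each distinct tagName. The following is a minimal sketch, assuming IE.au3's _IETagNameAllGetCollection and the DOM tagName property; the helper name _ListTagNames is hypothetical, and IE may also report non-element entries such as comments in this collection.

```autoit
#include <IE.au3>

; Hypothetical helper (not part of IE.au3): returns the distinct tag names in
; the document as a "|"-delimited string, for example "|HTML|HEAD|BODY|H4|A|".
Func _ListTagNames($oIE)
    Local $sSeen = "|", $sTag
    Local $oAll = _IETagNameAllGetCollection($oIE)
    If @error Then Return SetError(1, 0, "")
    For $oElement In $oAll
        $sTag = $oElement.tagName
        ; add the name only the first time it is encountered
        If Not StringInStr($sSeen, "|" & $sTag & "|") Then $sSeen &= $sTag & "|"
    Next
    Return $sSeen
EndFunc
```

Any name in the returned list can then be passed as the second argument of _IETagNameGetCollection.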
litlmike Posted April 16, 2008

"essentially anything inside <>"

Excellent to know! It seems, then, that one could potentially use _IETagNameGetCollection($oIE, "table") instead of _IETableGetCollection() (though probably not advisable). Interesting to see the interconnectivity.

Thanks for your help and for creating the IE UDF; it does wonders for me on a daily basis.
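That equivalence can be checked rather than assumed. The sketch below fetches the tables both ways and compares the sizes of the two collections through the DOM length property; the helper name _SameTableCount is hypothetical, and a matching count only suggests, without proving, that both calls return the same elements.

```autoit
#include <IE.au3>

; Hypothetical helper (not part of IE.au3): True when both calls report the
; same number of <table> elements for the loaded document.
Func _SameTableCount($oIE)
    Local $oTables = _IETableGetCollection($oIE)
    If @error Then Return SetError(1, 0, False)
    Local $oByTag = _IETagNameGetCollection($oIE, "table")
    If @error Then Return SetError(2, 0, False)
    Return ($oTables.length = $oByTag.length)
EndFunc
```

Calling ConsoleWrite(_SameTableCount($oIE) & @CRLF) after _IECreate prints True when the counts agree.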
litlmike Posted April 16, 2008

Oops, I just noticed we need an extra step in this script. The script you produced returns too much data: it should only display the movies under the heading "Showtimes", but it also includes those under "Tickets Now Available for these Coming Attractions". After looking at DebugBar, it looks like <UL class=showtimes> is another pattern, and it is not shared by the data I don't need. So I gave it a try, but I failed miserably; the lesson is never try. Below is the code I am working on, along with my interpretation of what my script and your script are saying.

    #include <IE.au3>
    $oIE = _IECreate ("http://www.fandango.com/edwardsfresnostadium22andimax_aafyg/theaterpage", 0)
    $oULs = _IETagNameGetCollection ($oIE, "UL")
    For $oUL In $oULs
        $oShowTimes = _IETagNameGetCollection ($oULs, "showtimes")
        For $oShowTime In $oShowTimes ; loop through
            $oH4s = _IETagNameGetCollection ($oIE, "h4")
            For $oH4 In $oH4s ; loop through the <H4>s
                $oA = _IETagNameGetCollection($oH4, "a", 0) ; get the first <A> inside the <H4>
                ConsoleWrite("MovieName: " & _IEPropertyGet($oA, "innertext") & @CR)
            Next
        Next
    Next

    #cs Dale's
    Make a collection object that gets all the h4
    From that object, make a collection object, return the 1st indexed
    From that object, PropertyGet the innertext
    #ce

    #cs Mine
    Make a collection object that gets all UL
    But how do I refine it to class = showtimes? [this should refine it to only the elements under Showtimes (I believe)]
    From that object, make a collection object, loop and get the Tagname collection of h4s
    From that object, make a collection object, loop and get the Tagname collection of <a>
    From that object, PropertyGet the innertext
    #ce
DaleHohm Posted April 16, 2008

You were on the right track. The string in class= is not a tag, however; UL is. You also need to know that the property for class is className. Therefore:

    #include <IE.au3>
    $oIE = _IECreate ("http://www.fandango.com/edwardsfresnostadium22andimax_aafyg/theaterpage", 1)
    $oULs = _IETagNameGetCollection ($oIE, "ul")
    For $oUL In $oULs
        ; only look inside the <UL class=showtimes> lists
        If String($oUL.className) = "showtimes" Then
            $oH4s = _IETagNameGetCollection ($oUL, "h4")
            For $oH4 In $oH4s
                $oA = _IETagNameGetCollection($oH4, "a", 0) ; first <A> inside the <H4>
                ConsoleWrite("MovieName: " & _IEPropertyGet($oA, "innertext") & @CR)
            Next
        EndIf
    Next

Dale
litlmike Posted April 16, 2008

Niiiiice.... there is so much satisfaction in discovering a solution. Thanks.
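For reuse, the same traversal can be wrapped in a function that returns the titles instead of printing them. This is a sketch along the lines of Dale's script, not his code: the helper name _GetShowtimeTitles is hypothetical, and the "showtimes" class name is taken from the 2008 page described in the thread, so it is an assumption for any other page.

```autoit
#include <IE.au3>

; Hypothetical helper (not part of IE.au3): returns the text of the first link
; inside every <h4> found under a <ul class="showtimes">, one title per line.
Func _GetShowtimeTitles($oIE)
    Local $sTitles = "", $oH4s, $oA
    Local $oULs = _IETagNameGetCollection($oIE, "ul")
    If @error Then Return SetError(1, 0, "")
    For $oUL In $oULs
        ; skip lists that are not marked with the "showtimes" class
        If String($oUL.className) <> "showtimes" Then ContinueLoop
        $oH4s = _IETagNameGetCollection($oUL, "h4")
        For $oH4 In $oH4s
            $oA = _IETagNameGetCollection($oH4, "a", 0)
            If @error Then ContinueLoop ; this <h4> has no link inside it
            $sTitles &= _IEPropertyGet($oA, "innertext") & @CRLF
        Next
    Next
    Return $sTitles
EndFunc
```

After _IECreate has loaded the page, ConsoleWrite(_GetShowtimeTitles($oIE)) prints the collected titles, and StringSplit can turn the result into an array if one is needed.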