Jdop Posted July 11, 2007

There are so many examples here that it's very hard to find a specific item. I need to load a web page and extract certain text items from within a table. Then I'd like to display those items in a custom GUI. Is there anything like that floating around, or maybe just the extracting part? The learning curve is steep enough that an example would save me a lot of time.
Qualitybit Posted July 11, 2007

The examples in the help file should be sufficient. Check _IECreate, _IETableGetCollection, and _IETableWriteToArray to obtain an array containing the desired table.

hf, 101011

Me vs. 127.0.0.1 =>> 0:2
But I never give up! >:-]
dslchurns Posted July 12, 2007

CODE

    ; Get a reference to the table at index 3 on the page
    $oTable = _IETableGetCollection($yourIEObject, 3)
    ; Read the table's cell contents into a 2D array
    $aTableData = _IETableWriteToArray($oTable)
    ; Size of the array's first dimension
    $rows = UBound($aTableData)
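For readers who want to try the three lines above end to end, here is a minimal sketch, not a definitive implementation. The URL and the table index are placeholders (assumptions, not taken from the thread), and `_ArrayDisplay` from Array.au3 is used only as a quick way to inspect the result before building a custom GUI.

```autoit
#include <IE.au3>
#include <Array.au3>

; Assumption: placeholder URL and table index; substitute your own.
Local $sUrl = "http://www.example.com/page.html"
Local $iTableIndex = 3

; Open the page in Internet Explorer and wait for it to load
Local $oIE = _IECreate($sUrl)
If @error Then Exit 1

; Table indexes are zero-based
Local $oTable = _IETableGetCollection($oIE, $iTableIndex)
If @error Then
    ConsoleWrite("Table " & $iTableIndex & " not found" & @CRLF)
    _IEQuit($oIE)
    Exit 1
EndIf

; The second parameter (True) swaps rows and columns in the output array
Local $aTableData = _IETableWriteToArray($oTable, True)

; Quick visual check of the extracted cells
_ArrayDisplay($aTableData, "Table " & $iTableIndex)

_IEQuit($oIE)
```

If the array looks sideways for your purposes, try the call with and without the second parameter and keep whichever orientation suits the GUI code.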
Jdop (Author) Posted July 12, 2007

Thanks, I was able to get rolling on this and have completed quite a bit of the project. Maybe someone can help with this issue. I'm retrieving a collection using

    $oTableLinkText = _IETableGetCollection($oIE, 4)

This works as expected. What I need are the LINKS in this table. I know I can get all links with _IELinkGetCollection, but using the same index (4) does not retrieve anything. I have to use -1, which gives me EVERY link on the web page. At that point I'd have to do some looping and comparing to get the links I need. It seems there must be a better, more direct way to do this using only the original _IETableGetCollection($oIE, 4) data.

Edited July 12, 2007 by Jdop
mikehunt114 Posted July 12, 2007

Jdop wrote: "Seems there must be a better, more direct way to do this using only the original _IETableGetCollection($oIE, 4) data."

Yup yup. If I'm understanding correctly, you're interested in all the links within a given table? This should give you some ideas:

    $oTable = _IETableGetCollection($yourIEObject, 3)
    ; Collect only the <A> elements that are inside this table
    $oLinks = _IETagNameGetCollection($oTable, "A")
    For $oLink In $oLinks
        $href = $oLink.href
        ConsoleWrite($href & @CR)
    Next

IE Dev Toolbar | MSDN: InternetExplorer Object | MSDN: HTML/DHTML Reference Guide
"It is surprising what a man can do when he has to, and how little most men will do when they don't have to." - Walter Linn
Post a reproducer with less than 100 lines of code.
PsaltyDS Posted July 12, 2007

Jdop wrote: "I'm retrieving a collection using $oTableLinkText = _IETableGetCollection($oIE, 4). This works as expected."

What is expected from that command is an object pointing to the table, not the text of anything, yet...

Jdop wrote: "What I need are the LINKS in this table. I know I can get all links with _IELinkGetCollection, but using the same index (4) does not retrieve anything. I have to use -1, which gives me EVERY link on the web page. At that point I'd have to do some looping and comparing to get the links I need. It seems there must be a better, more direct way to do this using only the original _IETableGetCollection($oIE, 4) data."

You might be able to get just the links from the table you cleverly selected the object for earlier:

    $oLinks = _IELinkGetCollection($oTableLinkText)
    $iNumLinks = @extended
    MsgBox(0, "Link Info", $iNumLinks & " links found")
    Dim $strLinks = "", $i = 0
    For $oLink In $oLinks
        ; One link per line in the message box
        $strLinks &= $i & ": " & $oLink.href & @CRLF
        $i += 1
    Next
    MsgBox(0, "Link Info", $strLinks)

Can't test without a page to try it on...

Edit: Added "$i += 1" to the loop. Had to get it in there before mikehu.... DOH!

Edited July 12, 2007 by PsaltyDS

Valuater's AutoIt 1-2-3, Class... Is now in Session!
For those who want somebody to write the script for them: RentACoder
"Any technology distinguishable from magic is insufficiently advanced." -- Geek's corollary to Clarke's law
mikehunt114 Posted July 12, 2007

*sneaks an $i += 1 into PsaltyDS's loop*
Jdop (Author) Posted July 12, 2007

Thanks for the quick help. That's what I need. Strangely though, I'm seeing a 'bug' in that the second item in the table is not being output. All the other lines are output properly. Here's what I see.

Note: This is based on the first suggested code from mikehunt114; you guys are faster than I am.

    http://www.trade-ideas.com/SingleAlertType/NHP/New_high.html
    http://www.trade-ideas.com/Help.html#NHP ?
    http://www.trade-ideas.com/Help.html#NLP ?
    http://www.trade-ideas.com/SingleAlertType...w_high_ask.html
    http://www.trade-ideas.com/Help.html#NHA ?
    http://www.trade-ideas.com/SingleAlertType...ew_low_bid.html
    http://www.trade-ideas.com/Help.html#NLB ?
    etc.

Line 3 should be http://www.trade-ideas.com/SingleAlertType/NLP/New_low.html but it is not output. Is there some bug in the AutoIt code? (I doubt it, but you never know.)

Here is the relevant raw HTML from the actual web page I'm parsing. I don't see any 'malformed' code that would cause this to happen.

    <TR><TD><IMG SRC='http://static.trade-ideas.com/Alerts/NHP.gif'></TD><TD><A HREF='http://www.trade-ideas.com/SingleAlertType/NHP/New_high.html'>New high</A></TD></TD><TD ALIGN='center'><A HREF='/Help.html#NHP' TARGET='Alerts Help'><B>?</B></A></TD></TR>
    <TR><TD><IMG SRC='http://static.trade-ideas.com/Alerts/NLP.gif'></TD><TD><A HREF='http://www.trade-ideas.com/SingleAlertType/NLP/New_low.html'>New low</A></TD></TD><TD ALIGN='center'><A HREF='/Help.html#NLP' TARGET='Alerts Help'><B>?</B></A></TD></TR>
    <TR><TD><IMG SRC='http://static.trade-ideas.com/Alerts/NHA.gif'></TD><TD><A HREF='http://www.trade-ideas.com/SingleAlertType/NHA/New_high_ask.html'>New high ask</A></TD></TD><TD ALIGN='center'><A HREF='/Help.html#NHA' TARGET='Alerts Help'><B>?</B></A></TD></TR>
    <TR><TD><IMG SRC='http://static.trade-ideas.com/Alerts/NLB.gif'></TD><TD><A HREF='http://www.trade-ideas.com/SingleAlertType/NLB/New_low_bid.html'>New low bid</A></TD></TD><TD ALIGN='center'><A HREF='/Help.html#NLB' TARGET='Alerts Help'><B>?</B></A></TD></TR>

Edited July 12, 2007 by Jdop
mikehunt114 Posted July 12, 2007

That little link collection snippet works fine for me on that bit of HTML. Try running the code on only that section of the HTML (save it as a new .htm file on your computer, then open that in IE and apply the script). Other than that, I can only suggest checking @extended after the _IELinkGetCollection call to see if it is finding the correct number of elements. If that number is correct, try looping through the links using For $i = 0 To (@extended - 1), although I have no reason to believe For...In is misbehaving.

Edit: typo

Edited July 12, 2007 by mikehunt114
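The check suggested above could look roughly like the following sketch. It assumes the page URL and table index (4) quoted elsewhere in this thread, and it uses _IETagNameGetCollection on the table rather than _IELinkGetCollection so that only the table's own links are counted. The count is saved from @extended immediately after the call, then each link is fetched by index instead of with For...In.

```autoit
#include <IE.au3>

; Assumption: URL and table index as used elsewhere in this thread
Local $oIE = _IECreate("http://www.trade-ideas.com/SingleAlertType/SD/Offer_stepping_down.html")
If @error Then Exit 1

Local $oTable = _IETableGetCollection($oIE, 4)
If @error Then Exit 1

; Called without an index, this returns the whole collection
; and sets @extended to the number of matching elements
_IETagNameGetCollection($oTable, "a")
Local $iNumLinks = @extended   ; save it right away, before any other function call resets it
ConsoleWrite("Links reported by @extended: " & $iNumLinks & @CRLF)

; Walk the links by index instead of For...In
For $i = 0 To $iNumLinks - 1
    ; With an index, the function returns the single element at that position
    Local $oLink = _IETagNameGetCollection($oTable, "a", $i)
    ConsoleWrite($i & ": " & $oLink.href & @CRLF)
Next
```

If the indexed loop and the For...In loop print the same list, the loop construct is not the cause and the page itself is the next thing to inspect.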
Jdop (Author) Posted July 12, 2007

The second code example works, but returns all the links on the entire page. I'm not sure exactly what is going on there. Interestingly, BOTH code examples are missing http://www.trade-ideas.com/SingleAlertType/NLP/New_low.html from the output.

Very odd. I wonder if it's something in the page itself or in AutoIt? There is very little room for error on my part here.

Here's the link to the original source page if someone wants to examine it:

http://www.trade-ideas.com/SingleAlertType...pping_down.html

I'm trying to extract the links in the frame titled "View alerts by type".

Edited July 12, 2007 by Jdop
Jdop (Author) Posted July 12, 2007

Y'all gave up on me already, lol. The thing should work, but doesn't. I can kludge a fix, but it would be nice to know why that single line is 'invisible' to AutoIt and the Windows DOM.
fu2m8 Posted July 13, 2007

Hey, not sure if I understood what you meant, but the following seems to be close to what you're after:

    #include <IE.au3>
    HttpSetProxy(0)
    Dim $oIE = _IECreate("http://www.trade-ideas.com/SingleAlertType/SD/Offer_stepping_down.html", 0, 0, 1, -1)
    $oTable = _IETableGetCollection($oIE, 4)
    $oLinks = _IETagNameGetCollection($oTable, "a")
    $iNumLinks = @extended
    Dim $strLinks = "", $i = 0
    For $oLink In $oLinks
        ; Checks the link to make sure it's one of the proper links, not the help file ones
        If StringInStr($oLink.href, "http://www.trade-ideas.com/SingleAlertType") Then
            $strLinks &= $i & ": " & $oLink.href
            $i += 1
            ConsoleWrite("Match Found - " & $oLink.outerText & " : " & $oLink.href & @LF)
        EndIf
    Next
    ConsoleWrite("TOTAL MATCHING LINKS FOUND: " & $i & @LF)

Was it just the middle column links that you were after in that table? The above code returns 175 links, which seems pretty close.
Jdop (Author) Posted July 13, 2007

fu2m8 (love the nicks around here ;-) ), your code seems to work around the 'bug' I was talking about in the other two versions. If you run those against the same page, you will see that the second link, 'new lows', does not get captured by the output routines.
fu2m8 Posted July 13, 2007

Hmm, I thought I was getting the same thing as you originally, but I just ran the following and line 3 (i.e. the one starting with "2: ...") seemed to have the correct output. I'm running v3.2.4.0. I may have misunderstood which links you were after, which is why I took out the help file related ones in the script I posted above. This version should return 351 links.

    #include <IE.au3>
    Dim $oIE = _IECreate("http://www.trade-ideas.com/SingleAlertType/SD/Offer_stepping_down.html", 0, 0, 1, -1)
    $oTable = _IETableGetCollection($oIE, 4)
    $oLinks = _IETagNameGetCollection($oTable, "a")
    $iNumLinks = @extended
    Dim $strLinks = "", $i = 0
    For $oLink In $oLinks
        ; Filter disabled so that every link in the table is listed
        ; If StringInStr($oLink.href, "http://www.trade-ideas.com/SingleAlertType") Then
        $strLinks &= $i & ": " & $oLink.href & @LF
        $i += 1
        ; ConsoleWrite("Match Found - " & $oLink.outerText & " : " & $oLink.href & @LF)
        ; EndIf
    Next
    MsgBox(0, 0, $strLinks)
    ConsoleWrite("TOTAL MATCHING LINKS FOUND: " & $i & @LF)

Good luck with it.
mikehunt114 Posted July 13, 2007

I had a quick look last night before I went home, and your desired table looked like it was nested in at least one more table. Maybe have a second look at the DOM structure to confirm. That said, if you referenced the first table, all links within subsequently nested tables should be returned in a collection call. I haven't given up, just busy.
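One way to take that second look at the table structure is to list every table on the page with its index, how many tables are nested inside it, and how many links it contains. This is only a sketch under the assumption that the page URL is the one quoted earlier in the thread; the output format is made up for illustration.

```autoit
#include <IE.au3>

; Assumption: URL as used elsewhere in this thread
Local $oIE = _IECreate("http://www.trade-ideas.com/SingleAlertType/SD/Offer_stepping_down.html")
If @error Then Exit 1

; Called without an index, this returns all tables and sets @extended to the table count
_IETableGetCollection($oIE)
Local $iNumTables = @extended

For $i = 0 To $iNumTables - 1
    Local $oTable = _IETableGetCollection($oIE, $i)

    ; Tables nested anywhere inside this table
    _IETagNameGetCollection($oTable, "table")
    Local $iNested = @extended

    ; Links anywhere inside this table, including its nested tables
    _IETagNameGetCollection($oTable, "a")
    Local $iLinks = @extended

    ConsoleWrite("Table " & $i & ": " & $iNested & " nested tables, " & $iLinks & " links" & @CRLF)
Next
```

A table that reports nested tables is an outer wrapper, so its link count includes the links of everything inside it; picking the innermost table that still holds the wanted rows narrows the collection.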
Jdop (Author) Posted July 17, 2007

Actually, I think I figured out what was happening. Each 'scan' has its own page, and that scan's own link is omitted from the table on its page. I don't know why I missed it when debugging, but I've just about finished the project with all the bells and whistles. AutoIt, pretty cool.