ViciousXUSMC Posted February 18, 2015

I read all the IE parsing threads I see, hoping to find somebody with a question close to mine, but usually it's just different enough that I can't quite figure it out.

I am trying to read data from a page and display the results in a neat, organized way. If I look at the page source, I can find the part of the page that relates to the data I want to get.

    <table class="datalist">
    <thead>
    <tr><th>Year</th><th>Agency</th><th>Position</th><th>Name (Last, First)</th>
    <th>Gender</th><th>Ethnic/Origin</th>
    <th>Annual Salary</th>
    </tr>
    </thead>
    <tr>
    <td class="fback">2014</td>
    <td class="fback">Board of County Commissioners (Polk)</td>
    <!-- <td>Transportation</td> -->
    <td class="fback">Service Worker/Equipment Operator III (Traffic Control Tech)</td>
    <td class="fback"><a href="/salary/detail/2014-board-of-county-commissioners-polk-transportation-service-workerequipment-operator-iii-traffic-control-tech-mosley-kenneth-r-3935360/">Mosley, Kenneth</a></td>
    <td class="fback">Male</td>
    <td class="fback">White (Not Hispanic/Latino)</td>
    <td class="fback">$39,354</td>
    </tr>
    </table>

All I need is the name, position, and pay. If that is not easy to parse, just getting everything between the tags would be good enough to start. A search can also return multiple results; each one would simply be another table row (<tr>).

Any help getting started? I will test and verify as we get some ideas rolling. If you need more information, let me know.
mikell Posted February 18, 2015 (edited)

An idea to start:

    #Include <Array.au3>

    $txt = FileRead("1.html")
    $res = StringRegExp($txt, '(?s)-->.*?>([^<]+).*?([^><]+)</a>.*?\$([^<]+)', 3)
    _ArrayDisplay($res)
Danp2 Posted February 18, 2015 (Solution)

Have you checked out _IETableWriteToArray?
kylomas Posted February 19, 2015

Try this. When possible, IE functions are preferred, because any format change might break a regexp solution.

    #include <Array.au3>
    #include <IE.au3>

    _ArrayDisplay(_GetTbl(FileRead(@ScriptDir & '\html.txt')))

    Func _GetTbl($html)
        ; Build an in-memory HTML document from the string
        Local $o_htmlfile = ObjCreate('HTMLFILE')
        If Not IsObj($o_htmlfile) Then Return SetError(-1)
        $o_htmlfile.open()
        $o_htmlfile.write($html)
        $o_htmlfile.close()

        ; Walk every table and return the first one with class "datalist"
        Local $otbls = _IETagNameGetCollection($o_htmlfile, 'TABLE')
        If Not IsObj($otbls) Then Return SetError(-2)
        For $otbl In $otbls
            If $otbl.classname = 'datalist' Then
                Return _IETableWriteToArray($otbl, True)
            EndIf
        Next

        ; No matching table
        Local $a10 = ['Not found']
        Return $a10
    EndFunc

The HTML code you posted is in the file the script points to (I was unable to upload the file for some reason).
ViciousXUSMC Posted February 19, 2015 (edited)

Quote: "Have you checked out _IETableWriteToArray?"

Nope, and that was all I needed. I got working code just from looking at the examples on the function's help page.

I'll look at the example posted just above this post, as it is more complex than what I put together, and see what the advantages are.

My simple but working code is as follows:

    #include <IE.au3>
    #include <Array.au3>

    $Name = InputBox("snip", "Name", "")
    $oIE = _IECreate("snip" & $Name, 0, 0)
    ;_IELoadWait($oIE)
    $oTable = _IETableGetCollection($oIE, 1)
    $aTableData = _IETableWriteToArray($oTable)
    _ArrayDisplay($aTableData, "Wage Lookup Results", "", 64)
    _IEQuit($oIE)
kylomas Posted February 19, 2015

Quote: "I'll look at the example posted just above this post as it is more complex than what I put together and see what the advantages are."

There are no advantages. You have the URL to create an object from; I had to make one. You know which table you want; all I knew was the table class. You posted only part of the web page source, so I did NOT assume there is only one table and checked the class name of every table instead.
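The original question asked for only the name, position, and pay, which none of the scripts above isolate. A minimal sketch of one way to trim the table array down to those three columns is below. It is an illustration, not part of the thread: `_PickColumns` is a hypothetical helper, the column positions are taken from the header row in the first post, and the array is assumed to be laid out as [row][column], which is what passing True as the second argument of _IETableWriteToArray is meant to give. The sample array stands in for a live page so the snippet runs without IE.

```autoit
#include <Array.au3>

; Sample data copied from the table in the first post, laid out as [row][column].
; With a live page this would be _IETableWriteToArray($oTable, True) instead.
Local $aSample[2][7] = [ _
        ["Year", "Agency", "Position", "Name (Last, First)", "Gender", "Ethnic/Origin", "Annual Salary"], _
        ["2014", "Board of County Commissioners (Polk)", "Service Worker/Equipment Operator III (Traffic Control Tech)", _
        "Mosley, Kenneth", "Male", "White (Not Hispanic/Latino)", "$39,354"]]

Local $aShort = _PickColumns($aSample)
If @error Then
    MsgBox(16, "Wage Lookup", "Table does not have the expected columns.")
Else
    _ArrayDisplay($aShort, "Name / Position / Salary")
EndIf

; Hypothetical helper (not an IE.au3 function): keeps Name, Position and Annual Salary.
; Column indexes are assumptions based on the header order shown in the first post.
Func _PickColumns($aTable)
    Local Const $COL_POSITION = 2, $COL_NAME = 3, $COL_SALARY = 6

    ; Must be a 2D array that is wide enough to contain the salary column
    If UBound($aTable, 0) <> 2 Then Return SetError(1, 0, 0)
    If UBound($aTable, 2) <= $COL_SALARY Then Return SetError(2, 0, 0)

    Local $iRows = UBound($aTable, 1)
    Local $aOut[$iRows][3]
    For $i = 0 To $iRows - 1
        $aOut[$i][0] = $aTable[$i][$COL_NAME]
        $aOut[$i][1] = $aTable[$i][$COL_POSITION]
        $aOut[$i][2] = $aTable[$i][$COL_SALARY]
    Next
    Return $aOut
EndFunc
```

Because the helper copies every row, multiple search results (one table row each) come through unchanged, with the header row first. If the site reorders or adds columns, the three index constants are the only thing that should need adjusting.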