jobotATX Posted December 15, 2014

Hello, I'm new at this. I'm trying to scrape the post ID number at the bottom of the page so that I can put it into a spreadsheet. The HTML for a craigslist ad is below:

    <body class="posting en desktop w1024">
    <script type="text/javascript"><!--
    function C(k){return(document.cookie.match('(^|; )'+k+'=([^;]*)')||0)[2]}
    ....etc..
    <div class="postinginfos">
    <p class="postinginfo">post id: 4806467016</p>
    <p class="postinginfo">posted: <time datetime="2014-12-15T12:58:21-0600" title="2014-12-15 12:58pm">2 hours ago</time></p>

I can get to the div using the class name, but I can't go any further. Here is my script:

    ; Script Start - Add your code below here
    #include <IE.au3>

    Local $oIE = _IECreate("http://www.craigslist.org")
    WinWait("craigslist: austin jobs, apartments, personals, for sale, services, community, and events - Internet Explorer provided by Dell")
    _IELinkClickByText($oIE, "cars+trucks")
    _IELinkClickByText($oIE, "ALL CARS & TRUCKS")
    _IELinkClickByText($oIE, "list")
    _IELinkClickByIndex($oIE, 27)

    ; - On the page, get the post ID, and copy it
    $tags = $oIE.document.GetElementsByTagName("div")
    For $tag In $tags
        $class_value = $tag.ClassName
        If $class_value = "postinginfos" Then
            ; not sure where to go from here
        EndIf
    Next

I can do things like get everything in the postinginfos div and display it in a message box, but I cannot extract the post ID out of this line:

    <p class="postinginfo">post id: 4806467016</p>

I think I need to use the .innertext property, but I am unsure how to continue. Any help is appreciated!
SmOke_N (Moderators, Solution) Posted December 15, 2014

That's the "innerText" or "outerText" of the paragraph tag. BTW, Austin? You're right up the road from me.

Pseudo code:

    #cs
    <div class="postinginfos">
    <p class="postinginfo">post id: 4806467016</p>
    #ce
    ; Collect every div on the page
    $goDiv = _IETagNameGetCollection($oIE, "div")
    If @error Then Exit 2

    Local $goPar
    For $oDiv In $goDiv
        If $oDiv.className = "postinginfos" Then
            ; Collect the paragraphs inside the matching div
            $goPar = _IETagNameGetCollection($oDiv, "p")
            If @error Then ContinueLoop
            For $oPar In $goPar
                If $oPar.className = "postinginfo" Then
                    ConsoleWrite("Found: " & $oPar.innerText & @CRLF)
                EndIf
            Next
        EndIf
    Next

Replace your $tags = section with that.
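The loop above prints the whole paragraph text ("post id: 4806467016"), not just the number. A minimal sketch of one way to pull out only the digits, assuming the text always follows the "post id: <digits>" shape shown in the question; the variable names $sText and $aMatch are made up for illustration, and the sample string stands in for $oPar.innerText so the snippet runs without a browser:

```autoit
; Sample text copied from the question's HTML; in the real script this
; would be $oPar.innerText (assumption: same "post id: <digits>" layout).
Local $sText = "post id: 4806467016"

; Flag 1 makes StringRegExp return an array of the captured groups
; and set @error when nothing matches.
Local $aMatch = StringRegExp($sText, "post id:\s*(\d+)", 1)
If @error Then
    ConsoleWrite("No post id found" & @CRLF)
Else
    ConsoleWrite("Post id: " & $aMatch[0] & @CRLF)
EndIf
```

Since two paragraphs share the postinginfo class, checking @error after the match is also a cheap way to skip the "posted:" paragraph.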
mikell Posted December 15, 2014

    $postid = StringRegExpReplace($oIE.locationurl, '\D', "")
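The one-liner above strips every non-digit character from the page URL, so it relies on the post ID being the only digits in the address. A slightly stricter sketch, assuming the posting URL ends in "/<digits>.html"; the sample URL below is hypothetical and only there so the snippet runs on its own:

```autoit
; Hypothetical posting URL, assumed for illustration only; in the real
; script the value would come from $oIE.LocationURL.
Local $sUrl = "http://austin.craigslist.org/cto/4806467016.html"

; Capture only the run of digits that sits directly before ".html",
; so digits elsewhere in the address are ignored.
Local $aMatch = StringRegExp($sUrl, "/(\d+)\.html", 1)
If @error Then
    ConsoleWrite("No post id found in URL" & @CRLF)
Else
    ConsoleWrite("Post id: " & $aMatch[0] & @CRLF)
EndIf
```

If the URLs turn out to have a different shape, the pattern needs adjusting; the '\D' version is simpler when the ID is the only number present.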
jobotATX (Author) Posted December 17, 2014

Thank you so much for the replies, I got it!