John117 Posted January 5, 2012

Hi, I am having trouble getting data from a web page. The data sometimes loads after the page reports that it has finished loading, so it is missing from what I read:

    #include <IE.au3>
    $oIE = _IECreate("http://www.westhoustoninfiniti.com/j/i/30002/UsedInventory.html", 1, 1)
    _IELoadWait($oIE)
    $Source = _IEDocReadHTML($oIE)
    ConsoleWrite($Source & @LF)

Also: I need the page to show all results, or at least 100, not only 25. So I need the results-per-page value set to 100 before reading the HTML, or some other method to pull all the data quickly. Any suggestions?

kylomas Posted January 5, 2012

John117,

If your interest is the web page source, try something like this:

    Local $src = InetGet("http://www.westhoustoninfiniti.com/j/i/30002/UsedInventory.html", "c:tmpsrc.txt")
    If $src = 0 Then MsgBox(0, '', 'inetget error = ' & @error)
    Run("notepad.exe c:tmpsrc.txt")

kylomas

Forum Rules Procedure for posting code "I like pigs. Dogs look up to us. Cats look down on us. Pigs treat us as equals." - Sir Winston Churchill

JohnOne Posted January 5, 2012

Isolate the element that holds this number, and don't read the document until it is 100.

AutoIt Absolute Beginners Require a serial Pause Script Video Tutorials by Morthawt ipify Monkey's are, like, natures humans.

John117 (Author) Posted January 5, 2012

Getting closer, but I need help submitting the change so that the page returns the results.

    #include <IE.au3>
    #include <string.au3>
    $oIE = _IECreate("http://www.westhoustoninfiniti.com/j/i/30002/UsedInventory.html", 1, 1)
    _IELoadWait($oIE)
    $Source = _IEDocReadHTML($oIE)
    $Results = StringReplace($Source, "<INPUT id=results-per-page-state value=value;25; type=text name=f7>", "<INPUT id=results-per-page-state value=value;100; type=text name=f7>")
    ;~ $oIE.submit
    $Source = _IEDocReadHTML($oIE)
    ConsoleWrite($Source & @LF)

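A note on the snippet above: StringReplace operates on the string returned by _IEDocReadHTML, so it only changes a copy held in the script and never touches the page in the browser. The following is a minimal sketch of the difference, assuming the input really has the id "results-per-page-state" as in the HTML quoted above. Whether the site reacts to a changed value in this input has not been verified here.

```autoit
#include <IE.au3>

Local $oIE = _IECreate("http://www.westhoustoninfiniti.com/j/i/30002/UsedInventory.html", 1, 1)
_IELoadWait($oIE)

; StringReplace returns a new string; the page in the browser is untouched.
Local $sSource = _IEDocReadHTML($oIE)
Local $sEdited = StringReplace($sSource, "value;25;", "value;100;")
Local $iReplaced = @extended ; number of replacements made in the copy
ConsoleWrite("Replaced in the copy only: " & $iReplaced & @CRLF)

; To change the live page, write to the element itself.
; Assumption: the input's id is "results-per-page-state" (taken from the quoted HTML).
Local $oInput = _IEGetObjById($oIE, "results-per-page-state")
If @error Then
    ConsoleWrite("Input not found, @error = " & @error & @CRLF)
Else
    _IEFormElementSetValue($oInput, "value;100;")
    ConsoleWrite("Live value is now: " & _IEFormElementGetValue($oInput) & @CRLF)
EndIf
```

Reading the HTML again right after StringReplace returns the same page as before, which is why the second _IEDocReadHTML call in the post shows no change.
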
kylomas Posted January 5, 2012

John117,

Perhaps it would help if you would tell us what you are trying to do...

kylomas

John117 (Author) Posted January 5, 2012

kylomas wrote:

    John117,
    If your interest is the web page source, try something like this:
    Local $src = InetGet("http://www.westhoustoninfiniti.com/j/i/30002/UsedInventory.html", "c:tmpsrc.txt")
    If $src = 0 Then MsgBox(0, '', 'inetget error = ' & @error)
    Run("notepad.exe c:tmpsrc.txt")
    kylomas

Thanks. From what I can tell, the page source contains the script that fetches the data, not the data itself, so grabbing the source returns very little :-)

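One way to confirm that the downloaded source lacks the car data is to fetch it without a browser and search it for a marker that only appears in the rendered results. This is a rough sketch; the marker "details-box" is an assumption (a class name expected on result boxes in the rendered page) and may need to be replaced with another string such as a known VIN.

```autoit
Local $sUrl = "http://www.westhoustoninfiniti.com/j/i/30002/UsedInventory.html"
Local $sMarker = "details-box" ; assumed marker, adjust to something visible in the rendered page

; InetRead downloads the raw response; option 1 forces a reload from the remote site.
Local $bData = InetRead($sUrl, 1)
If @error Then
    ConsoleWrite("InetRead failed, @error = " & @error & @CRLF)
Else
    Local $sHtml = BinaryToString($bData)
    ConsoleWrite("Characters downloaded: " & StringLen($sHtml) & @CRLF)
    If StringInStr($sHtml, $sMarker) Then
        ConsoleWrite("Marker found: the data is in the raw source." & @CRLF)
    Else
        ConsoleWrite("Marker not found: the data is probably added by script after loading." & @CRLF)
    EndIf
EndIf
```

If the marker is missing from the raw download but present in the browser, the data is being added after the initial load, and a browser-driven approach such as the _IE functions is needed.
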
John117 (Author) Posted January 5, 2012

kylomas wrote:

    John117,
    Perhaps it would help if you would tell us what you are trying to do...
    kylomas

If you manually open the website, you will see the car data load. If you set the results per page to 100, you will see all available car data. I want to pull all of the car data (VIN, year, make, etc.) with the results set to 100.

JohnOne Posted January 5, 2012

So you are not really trying to parse a webpage; you are trying to set an element to a particular value on a webpage. You might want to reflect that in your thread title.

John117 (Author) Posted January 5, 2012

JohnOne wrote:

    So you are not really trying to parse a webpage; you are trying to set an element to a particular value on a webpage.
    You might want to reflect that in your thread title.

No, I am trying to parse a webpage. That is the whole point. Setting the element just fetches the rest of the data to parse. As is, 25 of 39 results can be parsed without setting the element to 100. The problem is grabbing the data to parse, and it is not part of the source until some loading has happened...

kylomas Posted January 5, 2012

Ahhh, now I'm getting the complete(?) picture...

JohnOne Posted January 5, 2012

John117 wrote:

    No, I am trying to parse a webpage. That is the whole point. Setting the element just fetches the rest of the data to parse. As is, 25 of 39 results can be parsed without setting the element to 100. The problem is grabbing the data to parse, and it is not part of the source until some loading has happened...

No, it's not part of the source until you have set that element to 100. If you do a bit of searching for that, then you can parse your webpage.

John117 (Author) Posted January 5, 2012

Yes, it is part of it: the page loads 25 results by default. However, they only become part of the source after loading. I only need to set the element once I have the parsing worked out. One has little to do with the other, which is why the element question was an "also" and not the main part of the post.

JohnOne Posted January 5, 2012

Perhaps I'm just confused. Can you explain exactly what part you are having trouble with?

John117 (Author) Posted January 5, 2012

The data (car info) loads, but it does not always load before:

    $Source = _IEBodyReadHTML($oIE)

And since it is not part of the source, I must wait for it to load rather than just grabbing the source. I would like to grab that data without missing it and without having to add

    Sleep(10000)

before

    $Source = _IEBodyReadHTML($oIE)

Is there a method to pull the info myself rather than waiting to parse it (after it eventually loads)? Or a way to ensure it is ready to parse?

JohnOne Posted January 5, 2012

I think I see what you mean now. When you view the source of the page from your browser, all the info you want is there... 39 found. But when you try it with the _IE functions, the raw page source is being returned (not the browser-parsed source). Is this correct?

John117 (Author) Posted January 5, 2012

Yes, that is correct. After I do it manually and the page loads completely, the data is there. I need it to run on its own, though, and still return the data in a timely manner, not 15 seconds later or however long I have to wait just to ensure it has loaded. Is there a part of the fetch that I can mimic without actually loading the rest of the page?

Robjong Posted January 6, 2012 (edited)

Hi,

you could just set one of the select boxes for results-per-page to 100, which triggers their JavaScript to load new results. After that you can read out the HTML or do some more checks first. Here is an example...

    #include <IE.au3>

    Global $oIE = _IECreate("http://www.westhoustoninfiniti.com/j/i/30002/UsedInventory.html", 1, 0) ; go to the page

    ; get the page to display 100 results, or all if there are fewer than 100
    Local $oElems = _IETagNameGetCollection($oIE, "select") ; get all select tags
    For $oElem In $oElems
        If StringInStr($oElem.className, 'resultsPerPage') Then ; this is the results-per-page select box
            _IEFormElementOptionSelect($oElem, 100) ; set it to 100 results per page
            ExitLoop ; exit the loop, there are more 'resultsPerPage' selects, no need to set it twice
        EndIf
    Next

    ; get the result count of the page, to check if it worked
    Local $oTable = _IETableGetCollection($oIE, 4) ; get the results table (5th table, zero based so it becomes index 4)
    Local $oElem = _IETagNameGetCollection($oTable, "td", 0) ; get the first td tag, contains a string with the result count
    Local $iResultCount = Int(StringRegExpReplace($oElem.innerText, "\D+", "")) ; strip every non-digit, now we have the result count
    ConsoleWrite("-> Result count A: " & $iResultCount & @CRLF) ; gives me 39 results atm

    ; you could check if the result count changed...
    ;~ If $iResultCount <= 25 Then ; it did not work or page is not done yet
    ;~ EndIf

    Local $oElem = _IETagNameGetCollection($oIE, "div") ; get all div tags
    Local $iCount = 0, $aDetail, $oFoo, $sLabel
    For $oDiv In $oElem ; loop over all div tags to get the results (detail boxes)
        If StringInStr($oDiv.className, 'details-box') Then ; this is a result
            $iCount += 1
            ConsoleWrite("-- " & StringFormat("%03d", $iCount) & " -------------------------" & @CRLF)
            $oFoo = _IETagNameGetCollection($oDiv, "a", 1) ; get the 2nd link, this is the title of the detail box
            $sLabel = $oFoo.innertext
            ConsoleWrite("Label: " & $sLabel & @CRLF)
            $oFoo = _IETagNameGetCollection($oDiv, "div") ; get all div tags in this result
            For $oDivDetail In $oFoo
                If StringInStr($oDivDetail.className, 'details-text-item') Then ; this is a detail item
                    $aDetail = StringRegExp($oDivDetail.innerHTML, "(?i)<strong>(.*?)</strong>\s+([^>]+)", 3) ; get the key/value
                    If @error Then ContinueLoop
                    ConsoleWrite($aDetail[0] & ": " & $aDetail[1] & @CRLF)
                    ;~ Switch StringStripWS(StringLower($aDetail[0]), 3) ; here you could do something with each item
                    ;~     Case "retail price"
                    ;~     Case "price"
                    ;~     Case "mileage"
                    ;~     Case "transmission"
                    ;~     Case "stock"
                    ;~     Case "ext. color"
                    ;~     Case "int. color"
                    ;~     Case "engine"
                    ;~     Case "vin"
                    ;~ EndSwitch
                EndIf
            Next
        EndIf
    Next

Edit: added a little debugging code
Edit 2: Just read what you are attempting to do, so I rewrote the example; this works for me...

Edited January 6, 2012 by Robjong

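The commented-out check in the example above ("it did not work or page is not done yet") can be turned into a polling wait, which avoids a fixed Sleep(10000). The following is a sketch only: _CountDivsByClass and _WaitForDivs are hypothetical helper names, the class name 'details-box' is taken from the example above, and the threshold of 26 assumes the page has more than 25 results (39 were reported in this thread).

```autoit
#include <IE.au3>

; Hypothetical helper: count <div> elements whose class name contains $sClass.
Func _CountDivsByClass($oIE, $sClass)
    Local $iCount = 0
    Local $oDivs = _IETagNameGetCollection($oIE, "div")
    If @error Then Return 0
    For $oDiv In $oDivs
        If StringInStr($oDiv.className, $sClass) Then $iCount += 1
    Next
    Return $iCount
EndFunc

; Hypothetical helper: poll until at least $iMin matching divs exist.
; Returns the count on success; on timeout sets @error = 1 and @extended = last count.
Func _WaitForDivs($oIE, $sClass, $iMin, $iTimeoutMs = 15000)
    Local $hTimer = TimerInit()
    Local $iCount = 0
    Do
        $iCount = _CountDivsByClass($oIE, $sClass)
        If $iCount >= $iMin Then Return $iCount
        Sleep(250) ; short pause between checks instead of one long fixed sleep
    Until TimerDiff($hTimer) > $iTimeoutMs
    Return SetError(1, $iCount, 0)
EndFunc

Local $oIE = _IECreate("http://www.westhoustoninfiniti.com/j/i/30002/UsedInventory.html", 1, 0)
; ... set the results-per-page select box to 100 here, as in the example above ...

Local $iFound = _WaitForDivs($oIE, "details-box", 26)
If @error Then
    ConsoleWrite("Timed out, result boxes visible: " & @extended & @CRLF)
Else
    ConsoleWrite("Result boxes loaded: " & $iFound & @CRLF)
EndIf
```

The loop returns as soon as enough result boxes are present, so the script waits only as long as the page actually needs, up to the timeout. If the inventory ever holds 25 or fewer cars, the threshold would have to be lowered or the wait will always time out.
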
kylomas Posted January 6, 2012

Robjong,

Thank you... just learned a bunch about navigating a web page programmatically...

kylomas

Robjong Posted January 6, 2012

Hi,

glad you learned something from it. Did you see the new example? I had some trouble posting, so I had to clean up my own mess.

kylomas Posted January 6, 2012

yes, again tx...