Jdop Posted September 26, 2012
I'm having problems with a particular web site. Maybe it's something obvious, but I can't figure it out. This was working for me until recently; could they be doing something weird on the server side? The page loads perfectly in IE and Firefox, but the built-in AutoIt functions all of a sudden return just the HTML header. This is what I get:
    <html><head><meta http-equiv="Refresh" content="0; URL=http://www.citronresearch.com/"></head><body></body></html>
Here's the simple code for testing:
    #include <INet.au3>
    $mUrl = "http://www.citronresearch.com"
    $temp = _INetGetSource($mUrl)
    ConsoleWrite($temp & @CRLF)
TagK Posted September 26, 2012 (edited)
You get that because it is the only HTML on the page outside of a JavaScript block. You need to use the IE.au3 library and read the innerHTML, or whatever else you want to get, through that. Something like this perhaps:
    #include <IE.au3>

    $MainForm = GUICreate("hidden", 0, 0, 0, 0)
    $oIE = _IECreateEmbedded() ; make an embedded IE control
    GUICtrlCreateObj($oIE, 99999, 99999, 0, 0)
    GUISetState(@SW_HIDE) ; the GUI stays hidden, so the embedded IE control does not bother anyone
    $mUrl = "www.citronresearch.com"
    HotKeySet("{F2}", "_dostuff") ; press F2 to fetch the page
    HotKeySet("{F3}", "_noloop") ; press F3 to quit the script

    While 1
        Sleep(100)
    WEnd

    Func _dostuff()
        ConsoleWrite($mUrl & @CRLF)
        _IENavigate($oIE, $mUrl) ; waits for the page to finish loading by default
        $temp = _IEDocReadHTML($oIE)
        ConsoleWrite($temp & @CRLF)
    EndFunc ;==> _dostuff

    Func _noloop()
        Exit
    EndFunc
Jdop Posted September 26, 2012
OK, and I'm having the same problem with http://www.citronresearch.com/feed/, which is the RSS feed. Other sites with similar code work fine. Isn't _INetGetSource supposed to load the entire source for the page?
Jdop Posted September 27, 2012
If _INetGetSource is not the right call, isn't there a reliable way to retrieve the entire source of a web page into a string variable?
TagK Posted September 27, 2012
_INetGetSource works fine if the website does not hide its source within JavaScript. For those sites, use the example I posted earlier.
JohnOne Posted September 27, 2012
JavaScript hides source?
TagK Posted September 27, 2012
Not exactly; I may have chosen my words badly. Pages that use JavaScript seem to generate the source code as the page loads, and the way the InetGet function loads the source causes it to fail. So you can of course see the code yourself if you navigate to the site and view it, but _INetGetSource does not. The IE library does, because it emulates a browser and then reads the end result.
JohnOne Posted September 27, 2012
I usually see JavaScript when I use the INet* functions. Maybe it's PHP you're thinking of, or maybe I'm just wrong.
TagK Posted September 28, 2012
As far as I could see, the Citron page you originally wanted the source from does not use PHP; they use a combination of XML, HTML and JavaScript. From what I have found out, _INetGetSource freaks out when it encounters JavaScript, but then again, I do not know much about how the function was written. Long story short: the code it fetches is not the correct one. If you try it on a "pure" HTML website it works fine. For example, http://www.sau.no should give this result:
    <html>
    <head>
    <title>www.sau.no</title>
    </head>
    <body bgcolor=#FFFFFF>
    <center>
    <br> <br>
    <font face="Arial,Helvetica">
    <font size=+1><b>www.sau.no</b></font>
    <br> <br>
    <img src="sau.jpg" width=288 height=211><br><br>
    Sauer er dumme dyr.<br>Sauen er mat for bl.a. <a href="http://www.ulv.no/">ulv</a>.
    <br> <br> <br> <br>
    <font size=0>Domenet er registrert gjennom <a href="http://www.domeneshop.no/">domeneshop.no</a>.</font>
    </font>
    </center>
    </body>
    </html>
JohnOne Posted September 28, 2012 (edited)
Is this not JavaScript returned here from this page?
    #include <String.au3>
    #include <INet.au3>

    $start = '<script type="text/javascript">'
    $end = '</script>'
    $mUrl = "http://www.autoitscript.com/forum/topic/144503-inetgetsource-not-loading-valid-web-page/"
    $temp = _INetGetSource($mUrl)
    ; collect every inline script block found in the downloaded source
    $aTemp = _StringBetween($temp, $start, $end)
    For $i = 0 To UBound($aTemp) - 1
        ConsoleWrite($start & @LF)
        ConsoleWrite($aTemp[$i] & @CRLF & @CRLF & @CRLF)
        ConsoleWrite($end & @LF & @LF & @LF)
    Next
TagK Posted September 28, 2012
I did not try your code, but probably yes. The only thing I can think of that may cause that one to work and not the others is the specific page's web server configuration, and that is well beyond me. Perhaps a bug or something? I just tried the function in one of my scripts, found it did not do what I wanted, and exchanged it for something else. If you have now found a way to get it to read JavaScript, try it on the page you originally wanted to read and see if it works there.
JohnOne Posted September 28, 2012
I guess there is PHP which generates the JavaScript code.
bogQ Posted September 28, 2012 (edited)
Probably, but a better explanation is this:
    <html><head><meta http-equiv="Refresh" content="0; URL=http://www.citronresearch.com/"></head><body></body></html>
It says to refresh the page immediately. If you look at http://en.wikipedia.org/wiki/Meta_refresh you'll see how that works. They probably block the refresh after the first load (that is, on the second load) with PHP, or maybe with JavaScript generated along with it. So INet works fine; the problem is that they refresh the page on the first load.
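A reply like the one quoted above can be recognized in a script before anything tries to parse it. The following is a minimal sketch, not something posted in the thread: `_MetaRefreshTarget` is a hypothetical helper name, and the pattern assumes the refresh tag looks roughly like the one returned by the site (an `http-equiv` attribute followed by a `content` attribute containing `URL=`).

```autoit
; Hypothetical helper (not part of INet.au3): returns the target URL of a
; <meta http-equiv="Refresh"> tag, or "" if the HTML does not contain one.
Func _MetaRefreshTarget($sHtml)
    ; (?i) makes the match case-insensitive; the group captures the text after URL=
    Local $sPattern = '(?i)<meta[^>]*http-equiv\s*=\s*["'']?refresh[^>]*?url\s*=\s*([^"''>\s]+)'
    Local $aMatch = StringRegExp($sHtml, $sPattern, 1) ; 1 = return the captured groups as an array
    If @error Then Return "" ; no refresh tag found
    Return $aMatch[0]
EndFunc

; Usage with the stub quoted in this thread
Local $sStub = '<html><head><meta http-equiv="Refresh" content="0; URL=http://www.citronresearch.com/"></head><body></body></html>'
ConsoleWrite("Refresh target: " & _MetaRefreshTarget($sStub) & @CRLF)
```

An empty return value means the downloaded text is not a refresh stub, so it can be handled as normal page source.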
JohnOne Posted September 28, 2012
So maybe the WinHTTP functions would be better suited to such a site.
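One way to try that suggestion without any extra UDF is the `WinHttp.WinHttpRequest.5.1` COM object that ships with Windows. This is only a sketch under assumptions: `_WinHttpGetSource` is a hypothetical name, and nothing in the thread confirms that WinHTTP gets past the refresh stub on this particular site.

```autoit
; Sketch only: fetch a page through the WinHttpRequest COM object.
; _WinHttpGetSource is a hypothetical helper, not a built-in function.
Func _WinHttpGetSource($sUrl)
    Local $oHttp = ObjCreate("WinHttp.WinHttpRequest.5.1")
    If Not IsObj($oHttp) Then Return SetError(1, 0, "") ; COM object not available

    $oHttp.Open("GET", $sUrl, False) ; False = synchronous request
    $oHttp.Send()

    ; report any non-200 status to the caller through @extended
    If $oHttp.Status <> 200 Then Return SetError(2, $oHttp.Status, "")
    Return $oHttp.ResponseText
EndFunc

Local $sSource = _WinHttpGetSource("http://www.citronresearch.com")
If @error Then
    ConsoleWrite("Request failed, @error = " & @error & ", @extended = " & @extended & @CRLF)
Else
    ConsoleWrite($sSource & @CRLF)
EndIf
```

If the output is still the meta refresh stub, the server is treating this client the same way it treats `_INetGetSource`, and a second request is needed either way.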
Jdop Posted October 3, 2012
After reading the discussion here, I think I was able to avoid rewriting my already extensive code by loading the troublesome web page twice. The second load seems to get all the data.
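The load-twice workaround could be wrapped up roughly as follows. This is a sketch rather than the code actually used: `_GetSourceRetry`, `_IsRefreshStub`, the `$iMaxTries` parameter and the half-second pause are all assumptions made for illustration, and the check treats any page carrying a meta refresh tag as the stub.

```autoit
#include <INet.au3>

; Hypothetical wrapper around _INetGetSource: asks again while the reply
; still looks like the meta refresh stub. $iMaxTries is an assumed parameter.
Func _GetSourceRetry($sUrl, $iMaxTries = 2)
    Local $sHtml = ""
    For $i = 1 To $iMaxTries
        $sHtml = _INetGetSource($sUrl)
        If @error Then Return SetError(1, $i, "") ; download itself failed
        If Not _IsRefreshStub($sHtml) Then Return $sHtml ; real page source
        Sleep(500) ; brief pause before asking again (arbitrary value)
    Next
    Return SetError(2, $iMaxTries, $sHtml) ; still the stub after all tries
EndFunc

; True if the HTML contains a <meta http-equiv="Refresh"> tag
Func _IsRefreshStub($sHtml)
    Return StringRegExp($sHtml, '(?i)<meta[^>]*http-equiv\s*=\s*["'']?refresh') = 1
EndFunc

Local $sPage = _GetSourceRetry("http://www.citronresearch.com")
If @error Then
    ConsoleWrite("Could not get the full page, @error = " & @error & @CRLF)
Else
    ConsoleWrite($sPage & @CRLF)
EndIf
```

Existing code that calls `_INetGetSource` directly would only need the call swapped for the wrapper, which matches the goal of not rewriting the rest of the script.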