jksmurf Posted March 26, 2011 (edited March 26, 2011 by jksmurf)

Is there a difference between the way wget saves HTML and the way FileWrite does it?

I'm having an odd problem with the way FileWrite writes HTML files: a program (TVxb) that I then use to parse the data in those files does not recognise them and goes on to try to download them itself, rather than using the cached file produced by the script. TVxb uses wget (as below) to download the files. (There is a separate issue with TVxb not handling JavaScript, which is why I am using AutoIt to download the files.)

This is how the script saves the file:

filewrite("C:\Users\MyName\Desktop\SetantaCache\TVxb-setanta.hk-" & stringreplace($savedate,"/","") & ".html",$sHTML)

Here is what wget does in TVxb:

wget -E -t 5 --header="Accept-Language: en-us,en;q=0.5" http://www.setanta.com/hongkong/TV-Listings/

I attach the HTML files produced by both the script and TVxb.

Thanks!
k.

Attachment: htmls_from_two_versions.zip
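One thing that may be worth ruling out (an assumption, not a confirmed diagnosis of the TVxb problem): when FileWrite is given a filename rather than a file handle, AutoIt opens the file in append mode and writes text in its default encoding, whereas wget saves the bytes the server sent. The sketch below overwrites the cache file and writes UTF-8 bytes explicitly. The helper name _SaveHtmlBytes, the sample values and the use of @TempDir are hypothetical, and UTF-8 is only assumed to match what the site serves.

```autoit
; Hypothetical helper (not from the original script): replace the cache file
; with the UTF-8 bytes of $sHTML instead of appending text to it.
Func _SaveHtmlBytes($sPath, $sHTML)
    ; FileOpen mode 2 = erase previous contents, 16 = binary mode
    Local $hFile = FileOpen($sPath, 2 + 16)
    If $hFile = -1 Then Return SetError(1, 0, False)
    ; StringToBinary flag 4 = encode as UTF-8 (assumed encoding)
    Local $iOk = FileWrite($hFile, StringToBinary($sHTML, 4))
    FileClose($hFile)
    Return $iOk = 1
EndFunc

; Sample values so the sketch runs on its own; the real script supplies these.
Local $savedate = "2011/02/15"
Local $sHTML = "<html><body>sample</body></html>"
Local $sPath = @TempDir & "\TVxb-setanta.hk-" & StringReplace($savedate, "/", "") & ".html"

If Not _SaveHtmlBytes($sPath, $sHTML) Then ConsoleWrite("Could not write " & $sPath & @CRLF)
```

Comparing the first few bytes and the size of a file written this way against the wget version should show whether a byte-order mark, a different encoding or appended content is what TVxb is tripping over.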
jksmurf (Author) Posted March 31, 2011

Apologies for the bump; anyone?
jksmurf (Author) Posted May 13, 2011

Apologies for the bump; anyone?

The cable provider pulled the EPG, so I now really need to get this working off the "Cable Provider" Provider website.

Would anyone be able to shed some light on the difference between the way wget saves HTML and the way FileWrite does it?

Thanks,
k.
SquirrelyOne Posted May 14, 2011

I have NetZero, and my intuition and experience tell me that it isn't worth the trouble of trying to hack your way out of all the spam that comes with certain ISPs and cable service providers. Look at the situation from their perspective for a moment: they have a budget, and they have a way to prevent hackers from providing ways around the companies' anti-hacking regimen. These companies have certain expectations about their profit, and to stay in the business of giving people like you and me a chance to afford any cable or internet at all, they need to prove to the banks that help them out with financing what the budget will be.

But as for your code, we would have to see the entire AutoIt script you are running. And just what programming language is "wget" from, anyway? It looks a little familiar, but I can't remember...
jksmurf (Author) Posted May 14, 2011

Well, I'm not really trying to get past any spam. I'm just trying to set up a script that will help me download a series of web pages for a TV EPG so that they can be processed by TVxB, a "scraper" that uses wget, for use in my software-based PVR (nPVR). Unfortunately, the site I am trying to scrape uses JavaScript, so wget doesn't work.

My script:
1. Loads http://www.setanta.com/HongKong/TV-Listings/ which loads today's EPG.
2. Saves that web page to a local dir in the format TVxb-Setanta.hk-20110215.html so that TVxB can parse it.
3. Clicks the NEXT date, which uses JavaScript in the form javascript:__doPostBack('ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl02$btnDay','') to load the next day's page.
4. Saves that web page to a local dir in the format TVxb-Setanta.hk-20110216.html so that TVxB can parse it.
5. And so on.

The script itself is attached. Glad for any help at all.

k.

Attachment: SetantapBSwithUpdatePost19Mod4.zip
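Steps 2 and 4 above depend on each saved page following the TVxb-Setanta.hk-YYYYMMDD.html naming pattern. As a rough sketch of that part only (not the attached script), the names for consecutive days could be built with the Date.au3 UDF. The seven-day window and the $sCacheDir location are assumptions made for illustration.

```autoit
#include <Date.au3>

; Assumed cache folder; the original post uses a SetantaCache folder on the Desktop.
Local $sCacheDir = @ScriptDir & "\SetantaCache"

; Assumed window of 7 days starting today.
For $i = 0 To 6
    ; _NowCalcDate() and _DateAdd() both use the YYYY/MM/DD format
    Local $sDay = _DateAdd("D", $i, _NowCalcDate())
    ; Strip the slashes to get YYYYMMDD, as the original FileWrite line does
    Local $sFile = $sCacheDir & "\TVxb-Setanta.hk-" & StringReplace($sDay, "/", "") & ".html"
    ConsoleWrite($sFile & @CRLF)
Next
```

Each generated name would then be paired with the page loaded for that day before it is written to disk.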
SquirrelyOne Posted May 15, 2011

I think that you should contact the vendor(s) of the (non-AutoIt) software you are using in order to find out more about those other technologies. But good luck.