jksmurf Posted February 15, 2011

Please bear with me, it's my first script, but I'm stuck already and I've spent two hours googling about variables.

I'm trying to set up a script which will help me download a series of webpages for a TV EPG, so that they can be processed by TVxB, a "scraper" which uses wget. Unfortunately the site I am trying to scrape uses Javascript, so wget doesn't work.

So my script needs to:

1. Load http://www.setanta.com/HongKong/TV-Listings/ which loads today's EPG.
2. Save that web page to a local dir in the format TVxb-Setanta.hk-20110215.html so that TVxB can parse it.
3. Click the NEXT date, which uses Javascript in the form javascript:__doPostBack('ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl02$btnDay','') to load the next day's page.
4. Save that web page to a local dir in the format TVxb-Setanta.hk-20110216.html so that TVxB can parse it.
5. And so on.

However:

A. My problem is that running the script without declaring anything says "Variable used without being declared". It doesn't seem to recognise 'ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl02$btnDay' as a Variable.
B. Trying to declare Global ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl02$btnDay does not register it as a Variable.
C. Trying to declare Global $ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl02$btnDay does not register it as a Variable either; it says Syntax error "$ctl00$cphForm" when I run the Syntax Check from SciTE.

    ; Open Setanta Website and Save Each Javascript Generated EPG File for TVxB to Process
    ;
    Global $ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl02$btnDay
    ;
    #include <IE.au3>
    ;
    $oIE = _IECreate()
    _IENavigate($oIE,"http://www.setanta.com/HongKong/TV-Listings/")
    ;
    _IENavigate($IE,"javascript:__doPostBack('ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl02$btnDay','')")

Really appreciate some guidance here. Thanks!

k.

Setanta.au3
Moderators Melba23 Posted February 15, 2011

jksmurf,

Those incredibly long names will not be recognised as variables because they contain multiple "$" characters. AutoIt variables begin with "$" but cannot have any further "$" characters within them. So the first question is obvious: do you have to use those specific names?

M23

P.S. Welcome to the AutoIt forum, by the way!
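To illustrate the point about names: the long ASP.NET control ID only ever needs to exist as text inside a string, where "$" characters are harmless by default. A minimal sketch, assuming a made-up variable name ($sDayButton is hypothetical and not from the thread):

```autoit
; The control ID is plain text, so it is stored as a string value.
; $sDayButton is a hypothetical name; any legal AutoIt name would do.
Local $sDayButton = "ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl02$btnDay"

; Build the javascript: URL by joining strings with the & operator.
Local $sPostBack = "javascript:__doPostBack('" & $sDayButton & "','')"

; Show the resulting string in the SciTE output pane.
ConsoleWrite($sPostBack & @CRLF)
```

The variable name itself contains a single leading "$"; the extra "$" characters live only in the string contents, so no declaration of the long name is needed.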
hannes08 Posted February 15, 2011

Hello jksmurf,

There's an error in that code. In the last line you're trying to use the object "$IE" instead of "$oIE".

    $oIE = _IECreate()
    _IENavigate($oIE,"http://www.setanta.com/HongKong/TV-Listings/")
    _IENavigate($IE,"javascript:__doPostBack('ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl02$btnDay','')")

Regards,
Hannes
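One possible way to catch a slip like "$IE" versus "$oIE" early is to make declarations mandatory. The sketch below is an illustration rather than anything posted in the thread; it assumes the same URL and control ID as the original script and simply uses the corrected object name throughout:

```autoit
#include <IE.au3>

; Require every variable to be declared before use, so a mistyped
; name is reported as an error instead of passing unnoticed.
Opt("MustDeclareVars", 1)

; Create the browser object once and reuse the same name everywhere.
Local $oIE = _IECreate()
_IENavigate($oIE, "http://www.setanta.com/HongKong/TV-Listings/")

; Same postback as in the original post, now sent to $oIE.
_IENavigate($oIE, "javascript:__doPostBack('ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl02$btnDay','')")
```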
jksmurf Posted February 15, 2011 Author

Thanks for the warm welcome gents, much appreciated. Boy you folks are quick!

@M23: "So the first question is obvious: do you have to use those specific names?" It's what was on the Webpage, so I thought I had no choice?

@Hannes: Magic! It seems to work with "$oIE". I just cut/pasted code. Thanks.

Now onto saving each page as an html with a certain filename.

Cheers!

k.
AdmiralAlkex Posted February 15, 2011

jksmurf wrote: "@ M23 "So the first question is obvious: do you have to use those specific names?" It's what was on the Webpage, so I thought I had no choice?"

Hi and Welcome to the forums!

Maybe you have no choice for the string _IENavigate() needs, but that doesn't explain why you want a variable with that name. And then you are not even using the variable, so why have it at all?

Do you understand what a variable is? Look at any piece of code written in AutoIt, and "Language Reference - Variables" in the helpfile and you'll see.
kylomas Posted February 15, 2011

jksmurf,

That variable name sure looks like a bunch of variables, each preceded by a "$" sign. You sure that you didn't somehow remove a bunch of commas?

Also, something is missing here. Your code as posted can't be doing what you expect, if it will run at all!

kylomas
jksmurf Posted February 15, 2011 Author

Thank you for the feedback.

Hi AdmiralAlkex, thanks for the welcome. Well, my wife (also an Engineer) says I might not be the sharpest sword in the rack but yes, I do understand what a variable is. I'm also a chartered Engineer, so while I'm new here, to be fair, in point A above I said that I was getting an error message, i.e. "Variable used without being declared". So really the only reason I asked about variables at all, rather than inputs for _IENavigate(), is that using "$IE" instead of "$oIE", as Hannes kindly pointed out, was incorrect and led to that error being displayed. It appears I do not even need the variable :-)

Hi kylomas, I agree. I even started declaring them as 7 individual variables in the several hours before I made the first post. Very frustrating! However, if you navigate to http://www.setanta.com/HongKong/TV-Listings/ and hover over the dates for each day, you will see those javascript strings; as AdmiralAlkex noted above, "Maybe you have no choice for the string _IENavigate() needs". So it's only in the _IENavigate() command. It seems the variable issue was a red herring due to "$IE" instead of "$oIE".

You are correct, it's not 100% complete; the next step is to save each page in turn. It seems saving pages with _IEAction($oIE, "saveas") is a bit of an issue too, but I'm working on it, especially saving under a different name.
    ; Open Setanta Website and Save Each Javascript Generated EPG File for TVxB to Process
    ;
    ;Declare Variables for File Names
    $Date = @YEAR&@MON&@MDAY
    ;
    #include <IE.au3>
    ;
    $oIE = _IECreate()
    ;
    ; Open Setanta Website
    ;
    _IENavigate($oIE,"http://www.setanta.com/HongKong/TV-Listings/")
    ;
    ; Save it; InetGet is no good for Javascript pages as urls only
    ;
    _IEAction($oIE, "saveas") "C:\Utils\TVXB\TVXB04Test\cache\TVxb-setanta.hk-" & $Date & ".html"
    ;
    ; Move onto the next day using Javascript command
    ;
    _IENavigate($oIE,"javascript:__doPostBack('ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl02$btnDay','')")
    ;
    ; Save it;
    ;
    _IEAction($oIE, "saveas")
    ;
    ; Save it; Move onto the next day using Javascript command; Save it; etc etc
    ;
    _IENavigate($oIE,"javascript:__doPostBack('ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl03$btnDay','')")
    _IENavigate($oIE,"javascript:__doPostBack('ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl04$btnDay','')")
    _IENavigate($oIE,"javascript:__doPostBack('ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl05$btnDay','')")
    _IENavigate($oIE,"javascript:__doPostBack('ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl06$btnDay','')")
    _IENavigate($oIE,"javascript:__doPostBack('ctl00$cphForm$AllCols$tvlHeader$btnNextWeek','')")

Setanta.au3
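The repeated navigate-then-save steps above could be folded into a loop. The following is only a sketch under several assumptions that are not confirmed in the thread: that ctl02 is tomorrow's button (so ctl06 is five days ahead), that writing the text returned by _IEDocReadHTML() to a file is an acceptable substitute for the "saveas" dialog, and that a short pause is enough for the postback to finish. The helper _SavePage and the name $g_sCacheDir are hypothetical.

```autoit
#include <IE.au3>
#include <Date.au3>

; Cache folder taken from the post; $g_sCacheDir is a hypothetical name.
Global Const $g_sCacheDir = "C:\Utils\TVXB\TVXB04Test\cache\"

Local $oIE = _IECreate("http://www.setanta.com/HongKong/TV-Listings/")

; Today's page is already loaded, so save it with a day offset of 0.
_SavePage($oIE, 0)

; ASSUMPTION: buttons ctl02 .. ctl06 correspond to 1 .. 5 days ahead.
For $i = 2 To 6
    Local $sId = StringFormat("ctl00$cphForm$AllCols$tvlHeader$rptDays$ctl%02d$btnDay", $i)
    _IENavigate($oIE, "javascript:__doPostBack('" & $sId & "','')")
    Sleep(2000)          ; ASSUMPTION: crude pause so the postback can start
    _IELoadWait($oIE)    ; then wait for the reloaded page to finish
    _SavePage($oIE, $i - 1)
Next

; Hypothetical helper: writes the current page source to a dated file.
Func _SavePage($oBrowser, $iDayOffset)
    ; _DateAdd returns YYYY/MM/DD; removing the slashes gives YYYYMMDD.
    Local $sDate = StringReplace(_DateAdd("D", $iDayOffset, _NowCalcDate()), "/", "")
    Local $sFile = $g_sCacheDir & "TVxb-setanta.hk-" & $sDate & ".html"
    Local $hFile = FileOpen($sFile, 2) ; 2 = open for writing, erase old contents
    If $hFile = -1 Then Return SetError(1, 0, False)
    FileWrite($hFile, _IEDocReadHTML($oBrowser))
    FileClose($hFile)
    Return True
EndFunc
```

Writing the page source directly avoids driving the Save As dialog, at the cost of saving only the HTML and not images or stylesheets, which may or may not matter to TVxB.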
kylomas Posted February 16, 2011

jksmurf,

See this post for the "saveas" issue.

I've been scraping sports statistics pages for a couple of years w/o problems. Does javascript make a difference to the scraper that you are using? I process pages myself, albeit not with a great deal of sophistication.

kylomas
jksmurf Posted February 16, 2011 Author

Cheers kylomas, it looks like I might go the old Alt-F A route, rather than use the _IEAction($oIE, "saveas") alternative. Just need to understand the Syntax for cobbling together the filename.

Good to know regarding your scraper experience! I use TVxB as a scraper (for 6 years now), which uses wget on html. It's been great so far, and I've become pretty good at the TVxB ini settings, but EPGs are becoming more and more Javascript-based, and there are no switches or settings in the ini with which I can get it to go to the next date in sequence. TVxB even says specifically it can't handle javascript, which appears to be true. I can't get it to download the EPG pages in the first instance (beyond the default (today) page), so I will use AutoIt to send the Webpage javascript command, then TVxB to parse the cached files.

k.
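On the filename syntax and the Alt-F A route, a rough sketch follows. The string concatenation is standard AutoIt; the dialog handling is a guess: the window title "Save Webpage" and the control name "Edit1" are assumptions that would need checking with the AutoIt Window Info tool on the machine in question.

```autoit
#include <IE.au3>

; Strings are joined with &; macros such as @YEAR are written bare,
; outside the quotes. @MON and @MDAY are already two digits.
Local $sDate = @YEAR & @MON & @MDAY
Local $sFile = "C:\Utils\TVXB\TVXB04Test\cache\TVxb-setanta.hk-" & $sDate & ".html"

Local $oIE = _IECreate("http://www.setanta.com/HongKong/TV-Listings/")

; Bring the browser to the front, then send Alt-F followed by A.
WinActivate(_IEPropertyGet($oIE, "hwnd"))
Send("!fa")

; ASSUMPTION: the dialog is titled "Save Webpage" and its file-name
; box is the control "Edit1". Wait up to 10 seconds for it to appear.
If WinWaitActive("Save Webpage", "", 10) Then
    ControlSetText("Save Webpage", "", "Edit1", $sFile)
    Send("{ENTER}")
EndIf
```

Sending keystrokes depends on the browser window keeping focus, so this approach tends to be more fragile than reading the page source from the script.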