bissquitt

Advanced Screen scraping ...maybe?

4 posts in this topic

So I am working on a script to scrape ISBN information from my school's bookstore website. The problem is that the site only lets you view 10 departments and 20 classes at a time.

I essentially need to select the first 10 departments, click a button, select the first 20 classes, press a button, press another button (printer-friendly format for easier scraping), then either scrape the info or just output the whole page. After that, go back and select the next 20 classes, and so on until the end.

In regular programming terms it would be as simple as a nested do/while loop, but I have been unable to figure out how to press buttons on a web page from any traditional language or from PHP.
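For illustration only, the nested loop described above could be sketched like this in Python. It does not touch the website at all; it only works out which selections would be submitted together. The names `chunks`, `plan_batches`, and `classes_for` are hypothetical, and the limits of 10 and 20 are taken from the post.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def chunks(items: Sequence, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def plan_batches(
    departments: Sequence[str],
    classes_for: Callable[[str], List[str]],  # hypothetical lookup: department -> its classes
    dept_limit: int = 10,   # site shows 10 departments at a time (per the post)
    class_limit: int = 20,  # site shows 20 classes at a time (per the post)
) -> List[Tuple[List[str], List[str]]]:
    """Return (department batch, class batch) pairs in submission order."""
    plan = []
    # Outer loop: one pass per group of departments.
    for dept_batch in chunks(departments, dept_limit):
        classes = [c for d in dept_batch for c in classes_for(d)]
        # Inner loop: one pass per group of classes within those departments.
        for class_batch in chunks(classes, class_limit):
            plan.append((dept_batch, class_batch))
    return plan


if __name__ == "__main__":
    # Made-up data: 12 departments with 3 classes each.
    depts = [f"DEPT{i}" for i in range(12)]
    for dept_batch, class_batch in plan_batches(depts, lambda d: [f"{d}-{n}" for n in range(3)]):
        print(len(dept_batch), "departments,", len(class_batch), "classes")
```

Each pair in the plan corresponds to one round of "select, click, grab the printer-friendly page"; the part that actually drives the browser would slot into the inner loop.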

here is the page - http://bookstore.umbc.edu/SelectTermDept.aspx?trm=

Even something that just grabs the pages would be a huge help. I can modify the script to scrape the info, or just use PHP to do it once I have all the data. I'm an AutoIt n00b, unfortunately. :)
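Once the page text has been saved, pulling the ISBNs out is the easier half. A rough sketch in Python, assuming the ISBNs appear in the text as plain or hyphenated 10- or 13-character codes (the actual layout of the printer-friendly page is not known here). The helper names `is_valid_isbn` and `extract_isbns` are made up for this example; the checksum rules are the standard ISBN-10 and ISBN-13 ones.

```python
import re

# Candidate tokens: a digit, then 8-15 digits or hyphens, ending in a digit or X.
CANDIDATE = re.compile(r"\b\d[\d-]{8,15}[\dXx]\b")


def is_valid_isbn(code: str) -> bool:
    """Check an ISBN-10 or ISBN-13 with hyphens already removed."""
    if len(code) == 13 and code.isdigit():
        # ISBN-13: weights alternate 1, 3; total must be divisible by 10.
        total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(code))
        return total % 10 == 0
    if len(code) == 10 and code[:9].isdigit() and (code[9].isdigit() or code[9] == "X"):
        # ISBN-10: weights 10 down to 1; a final X counts as 10; total divisible by 11.
        digits = [int(ch) for ch in code[:9]] + [10 if code[9] == "X" else int(code[9])]
        total = sum(d * w for d, w in zip(digits, range(10, 0, -1)))
        return total % 11 == 0
    return False


def extract_isbns(text: str) -> list:
    """Return the valid ISBNs found in `text`, in order, without duplicates."""
    found = []
    for match in CANDIDATE.finditer(text):
        code = match.group(0).replace("-", "").upper()
        if is_valid_isbn(code) and code not in found:
            found.append(code)
    return found


if __name__ == "__main__":
    sample = "Required: ISBN 978-0-306-40615-7. Older printing: 0-306-40615-2."
    print(extract_isbns(sample))
```

The checksum test is there to throw away other long numbers on the page (phone numbers, prices, course codes) that happen to look like an ISBN.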

Many thanks for any help




@bissquitt

Look up these functions in the help file (F1 in SciTE):

Send()
MouseClick()

Cheers, FireFox.


 


#3 ·  Posted (edited)

Send() and MouseClick() are very sloppy ways of interacting with a browser window.

Try looking at the _IE UDFs in the IE.au3 library. They can interact more directly with an Internet Explorer web browser.

I haven't ever used any of the functions, but there are ones that might be useful, such as:

_IEFormElementCheckboxSelect
_IEFormElementRadioSelect
_IELoadWait
_IEBodyReadText

There may be UDFs for interacting with Firefox, but I can't find them right now.

Edited by TurionAltec
