How to log in to a site, crawl a list of URLs, and save each page as HTML

So I've been tasked with logging in to a site with IE and then reading a text file that contains a list of URLs on the site to crawl. I need to visit each URL, save the HTML to a specified file name, then go to the next URL and repeat the process.

So far I am able to log in to the site in IE with this:

#include <IE.au3>
#include <INet.au3>
#include <MsgBoxConstants.au3>

; Credentials - placeholders here, fill in the real values
Local $uname = "myUserName"
Local $pwd = "myPassword"

; Get ready to login!
$oIE = _IECreate("http://localhost/books/login.aspx")
$oForm = _IEFormGetObjByName($oIE, "form1")
$oQuery1 = _IEFormElementGetObjByName($oForm, "userNameTextBox")
$oQuery2 = _IEFormElementGetObjByName($oForm, "passwordTextBox")
$oButton = _IEFormElementGetObjByName($oForm, "loginButton") ; guessing the button's name - adjust to match the page

; Fill in the form values and then simulate a click to log in
_IEFormElementSetValue($oQuery1, $uname)
_IEFormElementSetValue($oQuery2, $pwd)
_IEAction($oButton, "click")
_IELoadWait($oIE) ; wait for the post-login page to finish loading

That gets me logged in.

My text file (c:\urls.txt) is just a list of URLs on the site, one per line.
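
For example (made-up URLs, just to show the format I mean):

http://localhost/books/book1.aspx
http://localhost/books/book2.aspx
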
I then have to open IE at the first URL, save the page under a given file name, then go on to the next one (rough sketch below).
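
Here's a rough sketch of what I'm picturing for that loop, reusing the logged-in $oIE from the code above. I'm assuming _FileReadToArray and _IEDocReadHTML are the right tools for this, and the C:\saved folder and the page1.html, page2.html, ... names are just placeholders:

#include <IE.au3>
#include <File.au3>

; Read the URL list (one URL per line); $aUrls[0] will hold the line count
Local $aUrls
If Not _FileReadToArray("C:\urls.txt", $aUrls) Then
    MsgBox(16, "Error", "Could not read C:\urls.txt")
    Exit
EndIf

; Visit each URL in the already-logged-in IE session and save its HTML
Local $sHtml, $hFile
For $i = 1 To $aUrls[0]
    _IENavigate($oIE, $aUrls[$i]) ; waits for the page to finish loading by default
    $sHtml = _IEDocReadHTML($oIE) ; grab the full page source

    ; Placeholder naming scheme: page1.html, page2.html, ...
    $hFile = FileOpen("C:\saved\page" & $i & ".html", 2 + 8) ; 2 = overwrite, 8 = create the folder if needed
    FileWrite($hFile, $sHtml)
    FileClose($hFile)
Next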

Any suggestions?

