How to download all files in a website?


Hi All!

Does anybody have any ideas on how to download a whole website (root directory plus all subdirectories and files) to the computer?

thanks in advance

:)

Some Projects:[list][*]ZIP UDF using no external files[*]iPod Music Transfer [*]iTunes UDF - fully integrate iTunes with au3[*]iTunes info (taskbar player hover)[*]Instant Run - run scripts without saving them before :)[*]Get Tube - YouTube Downloader[*]Lyric Finder 2 - Find Lyrics to any of your song[*]DeskBox - A Desktop Extension Tool[/list]indifference will ruin the world, but in the end... WHO CARES :P---------------http://torels.altervista.org

Remember, AutoIt scripts things that you could normally do yourself. Find out how you would normally do it, and try to script it. If you are trying to find out how to script a certain known action, ask here. If you're trying to figure out what that action is, it might be best to ask elsewhere.

To be more helpful: it depends on whether you have 'listdir' permissions on the site (that is, whether the server shows you a directory listing). If you do, get that list and simply use InetGet() on each item. If you don't, you may have to fetch the top page with _InetGetSource() and then parse out each sub-item (this is typically what web crawlers do).
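The crawler approach described above boils down to a queue of pages still to fetch plus a set of pages already seen. Here is a minimal sketch of that loop, written in Python rather than AutoIt purely for illustration. The names `crawl`, `fetch`, `find_links` and `max_pages` are made up for this example, and the "site" is a fake in-memory dictionary so the sketch runs without any network access. In AutoIt, `fetch` would roughly correspond to _InetGetSource().

```python
from collections import deque
from typing import Callable, Iterable


def crawl(start_url: str,
          fetch: Callable[[str], str],
          find_links: Callable[[str, str], Iterable[str]],
          max_pages: int = 100) -> dict[str, str]:
    """Breadth-first crawl: fetch a page, then queue every new link found on it."""
    pages: dict[str, str] = {}   # url -> downloaded content
    queue = deque([start_url])   # pages still waiting to be fetched
    seen = {start_url}           # prevents fetching the same URL twice

    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        for link in find_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


# Offline demonstration with a fake three-page site (hypothetical URLs).
FAKE_SITE = {
    "http://example.com/": "a.html b.html",
    "http://example.com/a.html": "b.html",
    "http://example.com/b.html": "",
}


def fake_fetch(url: str) -> str:
    # Stand-in for a real HTTP download.
    return FAKE_SITE[url]


def fake_find_links(base_url: str, html: str) -> list[str]:
    # Stand-in parser: every whitespace-separated word is a file name on the site.
    return ["http://example.com/" + name for name in html.split()]


result = crawl("http://example.com/", fake_fetch, fake_find_links)
print(sorted(result))
```

The `max_pages` cap is only a safety net for this sketch, so that a site with endless generated links cannot keep the loop running forever.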

Edited by evilertoaster

So, speaking in code terms, how would that look (for the second method)?

My first thought was to brute-force the file names and download them, but that is only OK for sites with no more than about 10 pages :)

_InetGetSource() is going to get you the HTML for the page. You'll want to look through that (regular expressions or StringInStr() are your friends here) and find all the strings that start with "http://" or "www.", then check substrings to see whether each one is in the correct domain (same site).
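As a rough illustration of that parsing step, here is a small Python sketch (not AutoIt). It takes a slightly different route from searching for "http://" or "www.": it reads the values of `href` and `src` attributes with a regular expression, so relative links such as `/files/b.zip` are caught too, and then keeps only the links whose host matches the page's own host. The function name `same_site_links` and the sample URLs are invented for this example, and a regex is only an approximation of real HTML parsing.

```python
import re
from urllib.parse import urljoin, urlparse

# Captures the value of href="..." or src="..." (fragment part after '#' is dropped).
HREF_RE = re.compile(r'''(?:href|src)\s*=\s*["']([^"'#]+)''', re.IGNORECASE)


def same_site_links(page_url: str, html: str) -> list[str]:
    """Return absolute links found in html that point to the same host as page_url."""
    site = urlparse(page_url).netloc.lower()
    links: list[str] = []
    for raw in HREF_RE.findall(html):
        absolute = urljoin(page_url, raw.strip())   # resolves relative links
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue                                # skips mailto:, javascript:, etc.
        if parts.netloc.lower() != site:
            continue                                # different domain, not our site
        if absolute not in links:
            links.append(absolute)
    return links


# Hypothetical page content, for demonstration only.
sample = (
    '<a href="http://www.example.com/docs/a.html">A</a> '
    '<a href="/files/b.zip">B</a> '
    '<a href="http://other.org/c.html">C</a> '
    '<img src="img/logo.png">'
)
found = same_site_links("http://www.example.com/docs/index.html", sample)
for link in found:
    print(link)
```

Comparing the full host name is stricter than a substring check: a link to `http://other.org/?ref=www.example.com` contains the site name but is not on the site.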

WOW! wonderful idea :)

I didn't think about it :)

Thanks a lot!

Last question: how do I send a 'listdir' command to the server, and how do I get the response?

If you try to open a directory on the server and you have list-dir permissions, most web servers will give you a simple HTML 'index page'. This is very uncommon in production because it leaves you open to security risks. Most directories instead have a default redirect; take http://www.autoitscript.com/ for example. It does not end with a file name, yet it is serving you one, whereas http://www.autoitscript.com/autoit3/files/beta/ simply lists the files and folders in that directory.
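When a server does return such an index page, the listing is just HTML links, so it can be split into files and subfolders by looking at each `href`. The Python sketch below (again not AutoIt) assumes an Apache-style listing where folder links end in "/" and column-sort links start with "?". The sample page and its entries are invented for the example and are not the real contents of any directory mentioned here. Other servers format their listings differently.

```python
from html.parser import HTMLParser


class IndexListing(HTMLParser):
    """Collects file and folder links from a directory index page."""

    def __init__(self) -> None:
        super().__init__()
        self.files: list[str] = []
        self.folders: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip sort links ("?C=N;O=D"), parent-directory links and absolute links.
        if not href or href.startswith(("?", "/", "../")) or "://" in href:
            return
        if href.endswith("/"):
            self.folders.append(href)   # subdirectory: list it next
        else:
            self.files.append(href)     # plain file: download it


def parse_index(html: str) -> tuple[list[str], list[str]]:
    parser = IndexListing()
    parser.feed(html)
    return parser.files, parser.folders


# Hypothetical index page, for demonstration only.
SAMPLE = """<html><body><h1>Index of /files/</h1>
<a href="?C=N;O=D">Name</a>
<a href="/">Parent Directory</a>
<a href="docs/">docs/</a>
<a href="setup.zip">setup.zip</a>
<a href="readme.txt">readme.txt</a>
</body></html>"""

files, folders = parse_index(SAMPLE)
print("files:", files)
print("folders:", folders)
```

Each entry in `folders` would then be requested and parsed the same way, which gives the whole tree of subdirectories and files.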

Edited by evilertoaster

Ah OK, so it's just determined by the presence of the index file :)

thanks
