finet Posted October 6, 2006
Is there a way to download all files from a website directory? E.g.
http://www.site.com/docs/1.htm
http://www.site.com/docs/2.htm
http://www.site.com/docs/3.htm
http://www.site.com/docs/x.htm
I would like to download all the files in http://www.site.com/docs/ without knowing how many files there are and without knowing their names. (I have a file with URLs in it. Via _FileReadToArray and InetGet I can download them all. The problem with that is that I have to know the URL of each file.)
Thanks,
Dirk
SmOke_N (Moderators) Posted October 6, 2006 (edited)
If you made them, wouldn't you know them?
Edit: On another note, I do believe someone did this in FTP. http://www.autoitscript.com/forum/index.ph...st&p=124032
Edited October 6, 2006 by SmOke_N
Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.
finet (Author) Posted October 6, 2006
The problem is making such a URL file and keeping it up to date. Thanks for your idea on FTP, but I am no programmer, I am new to AutoIt, and I don't know anything about FTP. It would be nice to have something like InetGet("http://www.site.com/docs/*.*",...
big_daddy (Moderators) Posted October 6, 2006 (edited)
If you enter "http://www.site.com/docs/" in a web browser, does it give you a list of the files, or does it redirect you to the index page?
Edited October 6, 2006 by big_daddy
finet (Author) Posted October 6, 2006
Neither. It gives: Page not found. Error 404
big_daddy (Moderators) Posted October 6, 2006
The only other option I can think of is using a loop, but this will only work if the files are named as in your example above.
    $sURL = "http://www.site.com/docs/"
    For $i = 1 To 10
        ; Download 1.htm, 2.htm, ... into the script's folder
        InetGet($sURL & $i & ".htm", @ScriptDir & "\" & $i & ".htm")
        ; Stop at the first file that fails to download
        If @error Then ExitLoop
    Next
MrBeatnik Posted October 6, 2006
It seems it would be difficult, considering you can't get a listing of the contents. However, if the links to each file are included *somewhere* on the website, then I guess it can be done:
1) Direct your script to go to the search webpage for this site (for example).
2) Get your script to perform a search on this page, to return all results with the required name.
3) Loop through each returned result for URLs including the name.
It really depends on how the server has been set up to deliver content.
Please correct me if I am wrong in any of my posts. I like learning from my mistakes too.
finet (Author) Posted October 6, 2006
Thanks for your fast reply! But... I get only one doc in the script directory: "1.htm" or "1.html".
BTW, the real address of the webpage where all the files are listed is (Dutch-language site) http://www2.vlaanderen.be/ned/sites/ruimte...besluiten2.html
The files to download are:
ambtenarenmb.html
ambtenarennamen.html
gemachtigdeambtenaar.html
kbgewestplan.html
kleinewerken.html
merplicht.html
normatieve.html
voetgangersverkeer.html
etcetera....
Greetings,
Dirk
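Since the file names appear as links on a listing page, one way to avoid maintaining a URL file by hand might be to read that page and pull the links out of it. The following is only a minimal sketch, assuming the links are plain relative file names inside double-quoted href attributes. $sBase and $sIndexPage are hypothetical placeholders (the real address in this thread is truncated), and the regular expression is an illustration, not a general HTML parser.

```autoit
#include <Inet.au3>

; Hypothetical placeholders: replace with the real folder and listing page.
Local $sBase = "http://www.site.com/docs/"
Local $sIndexPage = $sBase & "index.html"

; Read the HTML source of the page that links to the files.
Local $sHtml = _INetGetSource($sIndexPage)
If @error Then
    MsgBox(16, "Error", "Could not read " & $sIndexPage)
    Exit 1
EndIf

; Flag 3 returns every match of the capture group as a 0-based array.
; The pattern only accepts bare file names ending in .htm or .html.
Local $aLinks = StringRegExp($sHtml, '(?i)href\s*=\s*"([^"/:?#]+\.html?)"', 3)
If @error Then
    MsgBox(16, "Error", "No matching links found on " & $sIndexPage)
    Exit 1
EndIf

; Download each linked file into the script's folder.
For $i = 0 To UBound($aLinks) - 1
    InetGet($sBase & $aLinks[$i], @ScriptDir & "\" & $aLinks[$i])
    If @error Then ConsoleWrite("Failed: " & $aLinks[$i] & @CRLF)
Next
```

This only follows links that are published on the page. It does not list a directory that the server refuses to show, and a link that appears twice on the page is simply downloaded twice.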
PsaltyDS Posted October 6, 2006
The administrator of the website can set Apache (or whatever web server service is used) to allow browsing the files. It appears the admin chose not to do so in this case. If you'd like that changed, contact the admin.
Valuater's AutoIt 1-2-3, Class... Is now in Session!
For those who want somebody to write the script for them: RentACoder
"Any technology distinguishable from magic is insufficiently advanced." -- Geek's corollary to Clarke's law
DaleHohm Posted October 6, 2006
This has nothing to do with AutoIt, but if you want to retrieve the files, check out FlashGet: http://www.flashget.com
It is a very nice, free download manager. Specifically, what you want is the "Site Explorer" in the Tools menu. Trust me, check it out.
Dale
Free Internet Tools: DebugBar, AutoIt IE Builder, HTTP UDF, MODIV2, IE Developer Toolbar, IEDocMon, Fiddler, HTML Validator, WGet, curl
MSDN docs: InternetExplorer Object, Document Object, Overviews and Tutorials, DHTML Objects, DHTML Events, WinHttpRequest, XmlHttpRequest, Cross-Frame Scripting, Office object model
Automate input type=file (Related)
Alternative to _IECreateEmbedded? better: _IECreatePseudoEmbedded Better Better?
IE.au3 issues with Vista - Workarounds
SciTe Debug mode - it's magic: #AutoIt3Wrapper_run_debug_mode=Y
"Doesn't work" needs to be ripped out of the troubleshooting lexicon. It means that what you tried did not produce the results you expected. It begs the questions 1) what did you try?, 2) what did you expect? and 3) what happened instead?
Reproducer: a small (the smallest?) piece of stand-alone code that demonstrates your trouble
PsaltyDS Posted October 6, 2006 (edited)
That link is a 404 for me. Perhaps you meant FlashGot for Firefox?
Still, that's not a security hack. If the web site admin doesn't allow it, I don't think you can list directory files. FlashGot follows the published links on the page(s); it doesn't search directories and list them if the admin didn't permit that already.
Edit: Aha, found a reference to FlashGet in Wikipedia.
Edited October 6, 2006 by PsaltyDS
DaleHohm Posted October 6, 2006
The link works fine for me. Correct, it is not bypassing the security, but they are using some sort of crawling technique to find what is on the site and in the folder.
And as I said, FlashGet, not FlashGot.
Dale
finet (Author) Posted October 6, 2006
Thank you all for your kind assistance and advice!
Dirk