DeltaRocked Posted October 5, 2011 (edited)

Hello,

This is a script which can be used as an alternative to wget. What wget doesn't do, this script will do - although it doesn't actually match up to the numerous features of wget. I use it to traverse a link and procure the resulting page as seen by the browser, which is messy to do with wget itself.

What this script does:
1: Server-based redirection - as of this moment, 302.
2: Meta-refresh redirection.
3: JavaScript-based meta-refresh redirection.
4: Frame download.
5: Script source download.
Additionally: converts relative paths to absolute paths.

If you come across any anomalies, do report them, as this is a beta script.

Thanks and Regards
DeltaR

Requires an INI - wget.ini

[EDIT] Over the past few weeks I have been extensively testing this script and found a lot of errors, especially with the response timeouts, meta-refresh handling and the StringRegExp patterns in the INI I have been using.

Modified: the ini

[reg]
http=(?i)(ftp|http|https):{0,1}//(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(/|/([\w#!:.?+=&%@!\-/]))?["']{0,1}
href=(?i)(?<=href=['|"])[^'|"]*?(?=['|"])
src=(?i)(?:<[\s*]{0,1}[^img][^>]*)src[\s*]?=[\s*]?["'](.*?)["']
action=(?i)(?:<[\s*]{0,1}form[^>]*)action[\s*]?=[\s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|\s)
frame_src=(?i)(?:<[\s*]{0,1}frame[^>]*)src[\s*]?=[\s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|\s)
script_src=(?i)(?:<[\s*]{0,1}script[^>]*)src[\s*]?=[\s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|\s)
embed_src=(?i)(?:<[\s*]{0,1}embed[^>]*)src[\s*]?=[\s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|\s)
document_write=(?i)(?<=unescape\(['|"])[^'|"]*?(?=['|"])
input_hidden=(?i)(?:<[\s*]{0,1}input[^>]*)type[\s*]?=[\s*]?["']{0,1}(hidden)['"]{0,1}(?: |>|\s)
input_button=(?i)(?:<[\s*]{0,1}input[^>]*)type[\s*]?=[\s*]?["']{0,1}(button|radio|checkbox)['"]{0,1}(?: |>|\s)
input_name=(?i)(?:<[\s*]{0,1}input[^>]*)[name|value|id][\s*]?=[\s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|\s)
meta_tag=(?i)(?:<[\s*]{0,1}meta http-equiv[\s*]{0,1}=[\s*]{0,1}["']{0,1}refresh["']{0,1}[^>]*)content[\s*]?=[\s*]?"(.*?)"
[script]
1=join(t.pop()));eval(
2=window.location = "http://www.google.com/"
3=header('Location: http://www.yoursite.com/new_page.html')
4=(?i)(?:[\s*]{0,1}document[^>]*)action[\s*]?=[\s*]?"(.*?)"
5=document.location=

Modified wget.au3 as well - it is now a lot faster, and after testing more than 10,000 sites (phishing/clean) there have been ZERO errors as far as downloading of content is concerned. Some Apache/PHP servers require additional headers before they will serve the content; these headers have been added to the script. Imagine a phishing site serving you content based on your browser language - this problem took up a lot of my time.

wget.au3

Edited October 20, 2012 by deltar
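As a quick illustration of how the [reg] patterns are meant to be applied: the meta_tag entry captures the content attribute of a refresh tag, and the redirect URL can then be split out of it. A minimal sketch in AutoIt (the pattern is simplified here, and the variable names are illustrative - they are not taken from wget.au3):

```autoit
; Sketch: pull the redirect target out of a meta-refresh tag.
; The pattern is a simplified form of the meta_tag entry in wget.ini.
Local $sHtml = '<meta http-equiv="refresh" content="5; url=http://example.com/next.html">'
Local $aContent = StringRegExp($sHtml, '(?i)(?:<\s*meta http-equiv\s*=\s*["'']?refresh["'']?[^>]*)content\s*=\s*"(.*?)"', 1)
If Not @error Then
    ; $aContent[0] now holds the raw content value, e.g. 5; url=http://example.com/next.html
    Local $aUrl = StringRegExp($aContent[0], '(?i)url\s*=\s*(.*)', 1)
    If Not @error Then ConsoleWrite('Redirect to: ' & $aUrl[0] & @CRLF)
EndIf
```

The same two-step approach (match the tag, then split the attribute value) applies to the frame_src and script_src entries as well.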
stormbreaker Posted March 30, 2012

Sounds quite interesting. Could you please post an example demonstrating a file download (with resuming)? It would make this easier to understand for networking beginners like me.

Regards
DeltaRocked Posted March 31, 2012 (edited)

I have no idea how the resume function works. The posted script is a working one: in place of

$url = 'https://' & 'www.gmail.com'

provide any URL and it will work and download the file. But since I am writing everything into the log, you will have to make some basic modifications for file download. Presently I am busy with something else, hence I cannot help you with the code, but the general idea has been outlined. To begin with, trace $sData - it holds all the data, so for each request you will have to create a different file.

Line No. 611:

If _WinHttpQueryDataAvailable($hRequest) Then
    ; $sData &= @CRLF & '*****' & @CRLF & $ldomain & $lpath & @CRLF & '*****' & @CRLF
    While 1
        $sChunk = _WinHttpReadData($hRequest)
        If @error Then ExitLoop
        $sData &= $sChunk
    WEnd
    __WinhttpClose()
    Return ;$sData
Else
    __WinhttpClose()
    __CWriteRes("Site is experiencing problems.")
    FileWriteLine('error.log', $F_url) ;& @CRLF & 'result=' & $result_log
    Return
    ;~ Exit 6
EndIf

[EDIT] After going through the code, I think I will make those modifications and update the script accordingly, because even though this is an alternative to wget, it doesn't do wget much justice yet. Initially I used this script for analysing phishing websites (the analyser part has been removed), hence you will find all the downloaded content in one single file.

Regards
Deltar

Edited October 20, 2012 by deltar
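Following the advice above to trace $sData and create a different file per request, one possible helper could look like this. The names $ldomain and $lpath follow the variables already visible in wget.au3; the function itself is a hypothetical sketch, not part of the posted script:

```autoit
; Hypothetical helper: save one response body to its own file,
; deriving the file name from the request's domain and path.
Func __SaveResponse($sData, $sDomain, $sPath)
    ; Turn "www.example.com" + "/a/b.html" into a safe file name
    ; by replacing characters Windows does not allow in names.
    Local $sName = $sDomain & StringRegExpReplace($sPath, '[\\/:*?"<>|]', '_')
    Local $hFile = FileOpen($sName, 2 + 16) ; 2 = overwrite, 16 = binary
    If $hFile = -1 Then Return SetError(1, 0, False)
    FileWrite($hFile, $sData)
    FileClose($hFile)
    Return True
EndFunc
```

Calling __SaveResponse($sData, $ldomain, $lpath) just before the Return in the snippet above would give one file per request instead of a single combined log.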
stormbreaker Posted March 31, 2012

You are really a life-saver. Thank you.

Regards