Hello,

This is a script which can be used as an alternative to wget. It does some things wget doesn't do, although it doesn't match wget's numerous features. I use it to follow a link and retrieve the resulting page as seen by the browser. With wget this is kind of messy.

What this script does:
1: Server-based redirection (as of this moment, 302 only).
2: Meta refresh redirection.
3: JavaScript-based meta refresh redirection.
4: Frame download.
5: Script source download.
Additionally: converts relative paths to absolute paths.

If you come across any anomalies, do report them, as this is a beta script.

Thanks and Regards
DeltaR

Requires an INI file: wget.ini

[EDIT]
In the past few weeks I have been extensively testing this script and found a lot of errors, especially with response timeouts, meta refresh, and the StringRegExp patterns in the INI which I have been using.

Modified: the ini

[reg]
http=(?i)(ftp|http|https):{0,1}//(w+:{0,1}w*@)?(S+)(:[0-9]+)?(/|/([w#!:.?+=&%@!-/]))?[""']{0,1}>
href=(?i)(?<=href=['|"])[^'|"]*?(?=['|"])
src=(?i)(?:<[s*]{0,1}[^img][^>]*)src[s*]?=[s*]?["'](.*?)["']
action=(?i)(?:<[s*]{0,1}form[^>]*)action[s*]?=[s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|s)
frame_src=(?i)(?:<[s*]{0,1}frame[^>]*)src[s*]?=[s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|s)
script_src=(?i)(?:<[s*]{0,1}script[^>]*)src[s*]?=[s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|s)
embed_src=(?i)(?:<[s*]{0,1}embed[^>]*)src[s*]?=[s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|s)
document_write=(?i)(?<=unescape(['|"])[^'|"]*?(?=['|"])
input_hidden=(?i)(?:<[s*]{0,1}input[^>]*)type[s*]?=[s*]?["']{0,1}(hidden)['"]{0,1}(?: |>|s)
input_button=(?i)(?:<[s*]{0,1}input[^>]*)type[s*]?=[s*]?["']{0,1}(button|radio|checkbox)['"]{0,1}(?: |>|s)
input_name=(?i)(?:<[s*]{0,1}input[^>]*)[name|value|id][s*]?=[s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|s)
meta_tag=(?i)(?:<[s*]{0,1}meta http-equiv[s*]{0,1}=[s*]{0,1}["']{0,1}refresh["']{0,1}[^>]*)content[s*]?=[s*]?"(.*?)"

[script]
1=join(t.pop()));eval(
2=window.location = "http://www.google.com/"
3=header('Location: http://www.yoursite.com/new_page.html')
4=(?i)(?:[s*]{0,1}document[^>]*)action[s*]?=[s*]?"(.*?)"
5=document.location=

Modified wget.au3: it is now a lot faster, and after testing more than 10,000 sites (phishing/clean) there have been ZERO errors as far as downloading of content is concerned. Some Apache/PHP servers require additional request headers before they serve the content; these have been added to the script. Imagine a phishing site serving you content based on your browser language. This problem took up a lot of my time.

wget.au3
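For anyone who wants to see the header issue in isolation, here is a minimal sketch in Python (not the AutoIt code from wget.au3). The post does not say which headers were added, so the User-Agent, Accept and Accept-Language values below are assumptions chosen to look like an ordinary browser, and `build_request` and `fetch` are hypothetical helper names.

```python
import urllib.request

# Assumed browser-like headers; the actual set used in wget.au3 is not listed in the post.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept": "text/html,application/xhtml+xml,*/*;q=0.8",
}


def build_request(url, language="en-US,en;q=0.9"):
    """Return a Request carrying browser-like headers, including Accept-Language."""
    headers = dict(BROWSER_HEADERS)
    headers["Accept-Language"] = language
    return urllib.request.Request(url, headers=headers)


def fetch(url, language="en-US,en;q=0.9", timeout=15):
    """Download the page body; urllib follows server redirects such as 302 by itself."""
    with urllib.request.urlopen(build_request(url, language), timeout=timeout) as resp:
        return resp.read()
```

Calling `fetch` with different `language` values (for example `"de-DE,de;q=0.9"`) is one way to check whether a site serves different content depending on the browser language.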
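The meta refresh handling and the relative-to-absolute path conversion from the feature list can be sketched roughly like this, again in Python rather than AutoIt. The pattern is a simplified one written for this example, not the `meta_tag` entry from wget.ini, and `meta_refresh_target` is a hypothetical name. It assumes `http-equiv` comes before `content` and that `content` is double-quoted.

```python
import re
from urllib.parse import urljoin

# Simplified illustrative pattern (an assumption, not the regex from wget.ini).
META_REFRESH = re.compile(
    r'<meta\b[^>]*http-equiv\s*=\s*["\']?refresh["\']?[^>]*?content\s*=\s*"([^"]*)"',
    re.IGNORECASE,
)


def meta_refresh_target(html, base_url):
    """Return the absolute URL a meta refresh points to, or None if there is none."""
    match = META_REFRESH.search(html)
    if match is None:
        return None
    # The content attribute looks like "5; url=page.html".
    _delay, sep, rest = match.group(1).partition(";")
    if not sep:
        return None  # a bare delay only reloads the same page
    target = rest.strip()
    if target.lower().startswith("url"):
        target = target[3:].lstrip()
        if target.startswith("="):
            target = target[1:]
    target = target.strip().strip("'")
    if not target:
        return None
    # Resolve a relative path against the URL of the page it was found on.
    return urljoin(base_url, target)
```

For example, a page at `http://example.com/a/b/page.html` containing `<META HTTP-EQUIV="Refresh" CONTENT="0; URL=../login/index.php">` resolves to `http://example.com/a/login/index.php`.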