
[Modified] wget alternative


DeltaRocked

Hello,

This is a script that can be used as an alternative to wget. It does some things wget doesn't, though it doesn't actually match up to wget's numerous features. I use it to traverse a link and procure the resulting page as seen by the browser; with wget this gets kind of messy.

What this script does:

1: Server-based redirection (currently 302).

2: Meta-refresh redirection.

3: JavaScript-based meta-refresh redirection.

4: Frame download.

5: Script source download.

Additionally: converts relative paths to absolute paths (a sketch of the idea follows below).
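
For the relative-to-absolute conversion, here is a minimal sketch of the idea (the function _PathToAbsolute and its handling are illustrative, not the actual routine in wget.au3):

Func _PathToAbsolute($sBase, $sLink)
    ; already absolute? leave it alone
    If StringRegExp($sLink, '(?i)^(?:https?|ftp)://') Then Return $sLink
    ; split the base URL into scheme://host and the directory part
    Local $aParts = StringRegExp($sBase, '(?i)^((?:https?|ftp)://[^/]+)(.*)$', 1)
    If @error Then Return $sLink
    Local $sHost = $aParts[0], $sPath = $aParts[1]
    ; root-relative link: hang it straight off the host
    If StringLeft($sLink, 1) = '/' Then Return $sHost & $sLink
    ; page-relative link: drop the file name from the base path, then append
    Local $iPos = StringInStr($sPath, '/', 0, -1)
    If $iPos = 0 Then Return $sHost & '/' & $sLink
    Return $sHost & StringLeft($sPath, $iPos) & $sLink
EndFunc

; _PathToAbsolute('http://example.com/a/b.html', 'img/x.png') gives http://example.com/a/img/x.png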

If you come across any anomalies, do report them, as this is a beta script.

Thanks and Regards

DeltaR

Requires an INI file: wget.ini

[EDIT]

Over the past few weeks I have been testing this script extensively and found a lot of errors, especially with the response timeouts, the meta-refresh handling, and the StringRegExp patterns in the INI.

Modified: the INI

[reg]
http=(?i)(ftp|http|https):{0,1}//(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(/|/([\w#!:.?+=&%@!\-/]))?[""']{0,1}>
href=(?i)(?<=href=['|"])[^'|"]*?(?=['|"])
src=(?i)(?:<[\s*]{0,1}[^img][^>]*)src[\s*]?=[\s*]?["'](.*?)["']
action=(?i)(?:<[\s*]{0,1}form[^>]*)action[\s*]?=[\s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|\s)
frame_src=(?i)(?:<[\s*]{0,1}frame[^>]*)src[\s*]?=[\s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|\s)
script_src=(?i)(?:<[\s*]{0,1}script[^>]*)src[\s*]?=[\s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|\s)
embed_src=(?i)(?:<[\s*]{0,1}embed[^>]*)src[\s*]?=[\s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|\s)
document_write=(?i)(?<=unescape\(['|"])[^'|"]*?(?=['|"])
input_hidden=(?i)(?:<[\s*]{0,1}input[^>]*)type[\s*]?=[\s*]?["']{0,1}(hidden)['"]{0,1}(?: |>|\s)
input_button=(?i)(?:<[\s*]{0,1}input[^>]*)type[\s*]?=[\s*]?["']{0,1}(button|radio|checkbox)['"]{0,1}(?: |>|\s)
input_name=(?i)(?:<[\s*]{0,1}input[^>]*)[name|value|id][\s*]?=[\s*]?["']{0,1}(.*?)['"]{0,1}(?: |>|\s)
meta_tag=(?i)(?:<[\s*]{0,1}meta http-equiv[\s*]{0,1}=[\s*]{0,1}["']{0,1}refresh["']{0,1}[^>]*)content[\s*]?=[\s*]?"(.*?)"
[script]
1=join(t.pop()));eval(
2=window.location = "http://www.google.com/"
3=header('Location: http://www.yoursite.com/new_page.html')
4=(?i)(?:[\s*]{0,1}document[^>]*)action[\s*]?=[\s*]?"(.*?)"
5=document.location=
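
For reference, these sections can be read back with AutoIt's built-in IniReadSection and fed straight to StringRegExp. A minimal sketch (the file names here are illustrative; wget.au3 may wire this up differently):

Local $sHtml = FileRead('page.html') ; page content to scan (illustrative)
Local $aReg = IniReadSection('wget.ini', 'reg')
If Not @error Then
    For $i = 1 To $aReg[0][0]
        ; $aReg[$i][0] holds the key (e.g. "href"), $aReg[$i][1] the pattern
        Local $aHits = StringRegExp($sHtml, $aReg[$i][1], 3) ; 3 = return all matches
        If Not @error Then ConsoleWrite($aReg[$i][0] & ': found ' & UBound($aHits) & ' match(es)' & @CRLF)
    Next
EndIf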

Modified wget.au3 - it is now a lot faster, and after testing more than 10,000 sites (phishing/clean) there have been ZERO errors as far as downloading of content is concerned.

Some Apache/PHP servers require additional request headers before they will serve the content; these have been added to the script.

Imagine a phishing site serving you content based on your browser language. This problem took up a lot of my time.
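
For illustration, this is the kind of extra request-header set involved; with the WinHttp UDF the additional headers are passed to _WinHttpSendRequest. The exact headers in wget.au3 may differ; Accept-Language is presumably the relevant one in the browser-language case:

#include <WinHttp.au3>

Local $sHeaders = 'Accept: text/html,application/xhtml+xml,*/*' & @CRLF & _
        'Accept-Language: en-US,en;q=0.8' & @CRLF & _
        'Referer: http://www.google.com/'

Local $hOpen = _WinHttpOpen('Mozilla/5.0 (Windows NT 6.1; rv:20.0) Gecko/20100101 Firefox/20.0')
Local $hConnect = _WinHttpConnect($hOpen, 'www.example.com')
Local $hRequest = _WinHttpOpenRequest($hConnect, 'GET', '/')
_WinHttpSendRequest($hRequest, $sHeaders) ; the extra headers ride along with the request
_WinHttpReceiveResponse($hRequest)
; ... read the data, then close $hRequest, $hConnect and $hOpen with _WinHttpCloseHandle()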

wget.au3


  • 5 months later...

Sounds quite interesting. Could you please post an example demonstrating a file download (with resuming)? It would be easier for networking beginners like me to understand.

Regards


1: I have no idea how the resume function works.

The posted script is a working one. In place of

$url = 'https://' & 'www.gmail.com'

provide any URL and it will work and download the content. But since I am writing everything into the log, you will have to make some basic modifications for file download.

At present I am busy with something else, so I cannot help you with the code, but the general idea is outlined below.

To begin with, trace $sData; it holds all the data, so for each request you will have to create a different file (see the sketch after the snippet below).

Line No. 611:

If _WinHttpQueryDataAvailable($hRequest) Then
    ; $sData &= @CRLF & '*****' & @CRLF & $ldomain & $lpath & @CRLF & '*****' & @CRLF
    While 1 ; pull the response body down in chunks until no more data arrives
        $sChunk = _WinHttpReadData($hRequest)
        If @error Then ExitLoop
        $sData &= $sChunk
    WEnd
    __WinhttpClose()
    Return ;$sData
Else
    ; no data available - log the failing URL and bail out
    __WinhttpClose()
    __CWriteRes("Site is experiencing problems.")
    FileWriteLine('error.log', $F_url) ;& @CRLF & 'result=' & $result_log)
    Return
    ;~ Exit 6
EndIf
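
As a starting point, a minimal sketch of that per-request modification (the naming scheme is mine, not the script's): once the While loop above has filled $sData, write it out under a name derived from the request instead of funnelling everything into one log:

; $ldomain and $lpath are the script's own variables (see the commented line above)
Local $sName = StringRegExpReplace($ldomain & $lpath, '[\\/:*?"<>|]', '_')
Local $hFile = FileOpen($sName & '.html', 2) ; 2 = open for write, erase previous contents
FileWrite($hFile, $sData)
FileClose($hFile)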

[EDIT]

After going through the code, I think I will make those modifications and update the script accordingly, because even though this is an alternative to wget, it doesn't do wget much justice yet.

Initially, I used this script for analysing phishing websites (the analyzer part has been removed); hence you will find all the downloaded content in one single file.

Regards

Deltar

