dabus

How do I get the real download-filename from a php-url like ...

dabus

A full example:

The url is something like this:

http://www.shsforums.net/index.php?automod...load&id=141

If you load it, the default filename will be fairydragon_WeiDU.zip

Nearly all browsers can get it, but I don't want to start IE, wait for a popup and copy the filename.

Is it possible? IE Management and INetGet don't seem to fit.

_INetGetSource is not suitable, since some of the files are big and I don't need the source but the name of it...

I searched the forum for an hour and found a nice HTML.udf, but I don't know HTML or PHP, so the results of my trial and error are ... uuh, bad? :)

I did find out that the other side uses Apache by testing another one, but it didn't tell me the name of the file I was going to get.

Well, it seems like I'm close, but I can't get it done.

Any thoughts?

walle

EDIT:NVM, Tired, didn't read your question fully

Took a quick look at the site; InetGet should work just fine.

Maybe this will work, I haven't tried it out yet.

#include <IE.au3>

; Untested: open the download category page in a hidden IE window
; and look for the first link that contains "confirm_download".
$URL = "http://www.shsforums.net/index.php?autocom=downloads&showcat=16"
$OIE = _IECreate($URL, 0, 0) ; Running IE in silence
$COLLINKS = _IELinkGetCollection($OIE)
$SURL = "" ; stays empty unless a matching link is found
For $OLINK In $COLLINKS
    If StringInStr($OLINK.href, "confirm_download") Then
        $SURL = $OLINK.href
        ExitLoop
    EndIf
Next
_IEQuit($OIE) ; close the hidden browser again
If $SURL <> "" Then
    $SAVE = @TempDir & "\download.zip"
    InetGet($SURL, $SAVE)
Else
    MsgBox(16, "Error", "No links found.")
EndIf
Edited by walle

torels

You could try studying the PHP code... you don't have to know it fully.

Just keep in mind that $var = $_GET['id'] reads the "id=somevalue" part of the URL.
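
To illustrate which part of the URL that is, here is a small AutoIt sketch that pulls one parameter out of a URL on the client side. The helper name _UrlParam is made up for this example and is not part of any UDF; the URL is the full download link that appears elsewhere in this thread.

```autoit
; Hypothetical helper: return the value of one query-string parameter,
; or "" (with @error set) if the URL does not contain it.
Func _UrlParam($sURL, $sName)
    ; The name must follow "?" or "&" so that "id" does not match inside "pid"
    Local $aMatch = StringRegExp($sURL, '[?&]\Q' & $sName & '\E=([^&#]*)', 1)
    If @error Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc   ;==>_UrlParam

Local $sDownloadURL = "http://www.shsforums.net/index.php?automodule=downloads&req=download&code=confirm_download&id=141"
ConsoleWrite("id = " & _UrlParam($sDownloadURL, "id") & @CRLF)
ConsoleWrite("code = " & _UrlParam($sDownloadURL, "code") & @CRLF)
```

This only shows what the PHP script sees as $_GET['id']; it does not tell you the filename, which the server decides on its own.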


Some Projects:[list][*]ZIP UDF using no external files[*]iPod Music Transfer [*]iTunes UDF - fully integrate iTunes with au3[*]iTunes info (taskbar player hover)[*]Instant Run - run scripts without saving them before :)[*]Get Tube - YouTube Downloader[*]Lyric Finder 2 - Find Lyrics to any of your song[*]DeskBox - A Desktop Extension Tool[/list]indifference will ruin the world, but in the end... WHO CARES :P---------------http://torels.altervista.org

cppman

Whenever there is a download URL like that, it usually means the server is going to send the "recommended" filename inside a header (typically Content-Disposition, which also forces a download), and it is up to your browser to use that filename. With that said, you can send an HTTP request for that URL and then parse the returned headers for the filename.
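
As a rough sketch of the parsing step only: the snippet below assumes the response headers have already been received as one string and that the server sends the name in the quoted form filename="...". The helper name _FilenameFromHeaders and the sample header text are invented for illustration.

```autoit
; Hypothetical helper (not part of any UDF): pull the suggested filename
; out of a raw HTTP header block. Returns "" and sets @error if none is found.
Func _FilenameFromHeaders($sHeaders)
    ; (?i) = case-insensitive; stay on the Content-Disposition line and
    ; capture everything between the quotes after filename=
    Local $aMatch = StringRegExp($sHeaders, '(?i)Content-Disposition:[^\r\n]*filename="([^"]*)"', 1)
    If @error Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc   ;==>_FilenameFromHeaders

; Made-up sample response, only to show the call
Local $sHeaderSample = "HTTP/1.1 200 OK" & @CRLF & _
        'Content-Disposition: attachment; filename="fairydragon_WeiDU.zip"' & @CRLF & _
        "Content-Type: application/zip" & @CRLF
ConsoleWrite(_FilenameFromHeaders($sHeaderSample) & @CRLF)
```

Servers that send an unquoted filename or the filename*= form would need a more tolerant pattern.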

Edited by cppman

Richard Robertson

The user already stated not knowing anything about the protocols. I think they need more specific advice.

torels

Isn't it much simpler to just do:

_inetgetsource(thephpfile)

and find the PHP header() command?

Anyway... how do you send an HTTP request via AutoIt and receive the response?


cppman

_INetGetSource isn't going to download the source of the PHP file. It will download the source of the generated HTML.

#include "HTTP.au3"

$socket = _HTTPConnect("shsforums.net", 80)
_HTTPGet("shsforums.net", "/index.php?automodule=downloads&req=download&code=confirm_download&id=141", $socket)

Do
    $data = TCPRecv($socket, 2048)
Until ($data <> "")

MsgBox(0, "", $data)

However, the server replies that the URL has moved permanently, so I'm not sure.

Edited by cppman

Mubo

Isn't very detailed, in the study!

Richard Robertson

If it says it's moved permanently, that would just mean that the server uses that response to point to where the file really is (a redirect normally comes with a Location header).
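
A minimal sketch of how a script could pick up such a redirect from a raw response, assuming the whole header block is available as a string. The helper name _RedirectTarget and the example.com address in the sample are placeholders, not values from the real server.

```autoit
; Hypothetical helper: return the redirect target from a raw HTTP response,
; or "" if the status is not 3xx or no Location header is present.
Func _RedirectTarget($sResponse)
    ; The status line looks like "HTTP/1.1 301 Moved Permanently"
    Local $aStatus = StringRegExp($sResponse, '^HTTP/\d\.\d (\d{3})', 1)
    If @error Then Return SetError(1, 0, "")
    If StringLeft($aStatus[0], 1) <> "3" Then Return ""
    ; Header names are case-insensitive; take the rest of the Location line
    Local $aLocation = StringRegExp($sResponse, '(?i)\nLocation:[ \t]*([^\r\n]+)', 1)
    If @error Then Return SetError(2, 0, "")
    Return $aLocation[0]
EndFunc   ;==>_RedirectTarget

; Made-up sample response, only to show the call
Local $sRedirectSample = "HTTP/1.1 301 Moved Permanently" & @CRLF & _
        "Location: http://www.example.com/files/download.zip" & @CRLF & @CRLF
ConsoleWrite(_RedirectTarget($sRedirectSample) & @CRLF)
```

The request would then be repeated against the returned address, and the filename header looked for in that second response.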

Mojo

I believe it's not possible to get the actual content (or code) of the PHP file itself. If it were, the web server running it would be misconfigured.

That's the purpose of PHP: to run on the server and generate the HTML code.

I could be wrong however, just about 99% sure!?! :)


You can fool some of the people all of the time, and all of the people some of the time, but you can not fool all of the people all of the time. Abraham Lincoln - http://www.ae911truth.org/ - http://www.freedocumentaries.org/

aec

Recently MrCreatoR made this post for CheckFileSize.

There you can find the link to the download page of the excellent AutoIt program by G. Sandler (sources available), which not only resolves the file name for PHP URLs but also reports the file size.

weaponx

; Request the download URL and read the suggested filename from the response headers
$oHTTP = ObjCreate('winhttp.winhttprequest.5.1')
$oHTTP.Open('POST', 'http://www.shsforums.net/index.php?automodule=downloads&req=download&code=confirm_download&id=141', 1) ; 1 = asynchronous
$oHTTP.SetRequestHeader('Content-Type','application/x-www-form-urlencoded')
;$oHTTP.setTimeouts(5000, 5000, 15000, 15000)
$oHTTP.Send()
$oHTTP.WaitForResponse()
$ContentDisposition = $oHTTP.GetResponseHeader("Content-Disposition")
$array = StringRegExp($ContentDisposition, 'filename="(.*)"',3)
; StringRegExp returns no array when the header has no quoted filename
If IsArray($array) Then ConsoleWrite($array[0] & @CRLF)

;ConsoleWrite($oHTTP.GetAllResponseHeaders())

Edited by weaponx

dabus

Maybe it's a little bit late, but I had a few days off and only just saw this response, so: thank you all. The solution from weaponx works nicely. You saved me some headaches. :)

Mojo

Thanks for posting this script. Unfortunately it doesn't work with Windows 2000.

I'll try MrCreatoR's solution, or otherwise try to fix or change it to work with W2K.


weaponx

It should work with 2000, do you have SP4 installed?

ResNullius

I can verify it works on a Windows 2000 install with Service Pack 4

dabus

Well, here is the code I ripped out of MrCreatoR's GUI, so all credit goes to him.

I just removed all the GUI stuff and added some feedback.

And I accept a " ' " in my filename, since I got some in the files that I need.

I hope I did not make a lot of mistakes; I tested it with 260 files and it works just as I needed. So maybe I did it just right.

It works fine, although it's a little bit "more" code, but I think I got something worthwhile in return. :)

And I can verify it works under wine. Bonus. :)

Global $HTTPUserAgent = "Opera/9.27 (Windows NT 5.1; U; en)"
Global $Limit_TimeOut = 10000
Global $HTTP_TCP_Def_Port = 80
Global $HTTP_TCP_Port = $HTTP_TCP_Def_Port
Global $LAST_SOCKET = -1


$check=Check_URL_Size_Proc('http://www.autoitscript.com/cgi-bin/getfile.pl?autoit3/autoit-v3-setup.exe')
If $check[0]<>0 Then ConsoleWrite('!Error='&$check[0]&@CR)
If $check[0]=0 Then
    ConsoleWrite('Size By='&$check[1]&@CR)
    ConsoleWrite('Size MB='&$check[2]&@CR)
    ConsoleWrite('Filename='&$check[3]&@CR)
    ConsoleWrite('Response Time='&$check[4]&@CR)
    ConsoleWrite('Resume='&$check[5]&@CR)
    ConsoleWrite('Real URL='&$check[6]&@CR)
EndIf   

Func Check_URL_Size_Proc($Read_URL_Input)
    Local $RetString[7]
    Local $Check_Response = Response_Parser($Read_URL_Input)
    $GET_RESPONSE = $Check_Response[0]
    Local $Response_Time = $Check_Response[1]
    Local $sHost = $Check_Response[2]
    Local $sPage = $Check_Response[3]
    If $Check_Response[4] Then 
        $RetString[0]=$Check_Response[4]
        Return $RetString
    EndIf   
    If StringInStr($GET_RESPONSE, "HTTP/1.1 206 Partial Content") Or _
        StringInStr($GET_RESPONSE, "Accept-Ranges:") Then $RetString[5]=1 ; Resume supported
    Local $FileName = StringRegExpReplace(_HexURLToString($sPage), "^.*/|[\?&;=\^%@#!/;<>].*$", "")
    If $FileName = "" Then $FileName = "N/A"
    If StringInStr($GET_RESPONSE, 'filename="') Then $FileName = StringStripWS(_GetMidleString($GET_RESPONSE, 'filename="', '"'), 3)
    Local $FileSize_Bytes = Number(_GetMidleString($GET_RESPONSE, "Content-Length: ", "(\n|$)"))
    If $FileSize_Bytes = 0 And StringInStr($GET_RESPONSE, "Location: ") Then
        $FileSize_Bytes = InetGetSize(StringStripWS(_GetMidleString($GET_RESPONSE, "Location: ", "(\n|$)"), 3))
    EndIf
    Local $FileSize_MBytes = Round($FileSize_Bytes / (1024*1024), 2)
    $RetString[1]=$FileSize_Bytes
    $RetString[2]=$FileSize_MBytes
    $RetString[3]=$FileName
    $RetString[4]=$Response_Time
    $RetString[6]=$Check_Response[5]; real url
    Return $RetString
EndFunc   ;==>Check_URL_Size_Proc

Func Response_Parser($sURL)
    Local $sURI_Referrer = StringRegExpReplace($sURL, "/[^/]*$", "")
    Local $sNewLocation = $sURL
    Local $aHostPage = _GetHostAndPage($sURL)
    Local $iTimer = TimerInit()
    Local $iLocationCount = 0
    Local $iCheckErr = 0
    Local $Response_Time = 0
    Local $Check_Response = ""
    
    If StringLeft($sURL, 3) = "ftp" Then
        Local $FileSize_Bytes = InetGetSize($sURL)
        If $FileSize_Bytes > 0 Then 
            $Error = 0
            $Check_Response &= @CRLF & "Content-Length: " & $FileSize_Bytes
        Else
            $Error = 1
        EndIf
        Local $aRet[6] = [$Check_Response, TimerDiff($iTimer), $aHostPage[0], $aHostPage[1], $Error, $sURL]
        Return $aRet
    EndIf
    
    $Check_Response = _HTTPGetResponse($aHostPage[0], $aHostPage[1], "HEAD", $sURI_Referrer)
    $iCheckErr = @error
    If StringRegExp($Check_Response, "(?i)Content-Type:(.*?)html") Then
        $Check_Response = _HTTPGetResponse($aHostPage[0], $aHostPage[1], "GET", $sURI_Referrer)
        $iCheckErr = @error
    EndIf
    If StringInStr($Check_Response, @CRLF & @CRLF) Then
        Local $sMetaRedirect_URL = ""
        If StringRegExp($Check_Response, '(?i)(?s).*<meta.*="Refresh" content="\d+; URL=.*"') Then
            $sMetaRedirect_URL = StringRegExpReplace($Check_Response, _
                '(?i)(?s).*<meta.*?="Refresh" content="\d+; URL=(.*)".*', '\1')
            If $sMetaRedirect_URL <> "" And $sMetaRedirect_URL <> $Check_Response Then
                $aHostPage = _GetHostAndPage($sMetaRedirect_URL)
                $Check_Response = _HTTPGetResponse($aHostPage[0], $aHostPage[1], "HEAD", $sURI_Referrer)
            EndIf
        EndIf
        $Check_Response = StringRegExpReplace($Check_Response, "(?i)(?s)" & @CRLF & @CRLF & ".*$", "")
    EndIf
    For $i = 1 To 2
        If StringInStr($Check_Response, "Location:") Then
            Local $sNewLocation = StringStripWS(_GetMidleString($Check_Response, "Location: ", "(\n|$)"), 3)
            If $sNewLocation <> "" Then
                $sNewLocation = StringReplace($sNewLocation, " ", "%20")
                If StringLeft($sNewLocation, 1) = "/" Then $sNewLocation = $aHostPage[0] & $sNewLocation
                $aHostPage = _GetHostAndPage($sNewLocation)
                If StringRegExp($aHostPage[0], ":\d+$") Then
                    $HTTP_TCP_Port = Number(StringRegExpReplace($aHostPage[0], ".*?:", ""))
                    $aHostPage[0] = StringLeft($aHostPage[0], StringInStr($aHostPage[0], ":")-1)
                EndIf
                $Check_Response_Tmp = _HTTPGetResponse($aHostPage[0], $aHostPage[1], "HEAD", $sURI_Referrer)
                $iCheckErr = @error
                If StringLeft($Check_Response_Tmp, 6) <> "<html>" And Not $iCheckErr Then $Check_Response = $Check_Response_Tmp
            EndIf
        EndIf
    Next
    $HTTP_TCP_Port = $HTTP_TCP_Def_Port
    $Response_Time = TimerDiff($iTimer)
    If StringInStr($Check_Response, @CRLF & @CRLF) Then _
        $Check_Response = StringRegExpReplace($Check_Response, "(?i)(?s)" & @CRLF & @CRLF & ".*", "")
    If StringRegExp($Check_Response, "(?i)HTTP/[0-9.]+ [0-9]+ (OK|Found)") Then
        $Error=0
    Else    
        If StringInStr($Check_Response, "400 Bad Request") Then
            $Error='Bad Request'
        ElseIf StringInStr($Check_Response, "403 Forbidden") Then
            $Error='Forbidden'
        ElseIf StringInStr($Check_Response, "404 Not Found") Then
            $Error='Not Found'
        ElseIf $iCheckErr Then
            $Error='No Connection to the Server'
        Else
            $Error='Can not find Server'
        EndIf
    EndIf
    $Check_Response = StringStripWS($Check_Response, 3)
    $aHostPage[0] = StringStripWS($aHostPage[0], 3)
    $aHostPage[1] = StringStripWS($aHostPage[1], 3)
    Local $aRet[6] = [$Check_Response, $Response_Time, $aHostPage[0], $aHostPage[1], $Error, $sNewLocation]
    Return $aRet
EndFunc   ;==>Response_Parser

Func _GetHostAndPage($sURL)
    Local $aHostPage[2]
    $aHostPage[0] = StringRegExpReplace($sURL, "\A.*?//", "")
    $aHostPage[0] = _GetMidleString($aHostPage[0], "\A", "/|.*")
    $aHostPage[0] = StringRegExpReplace($aHostPage[0], "/+$", "")
    $aHostPage[1] = _GetMidleString($sURL, $aHostPage[0], "$")
    If StringLeft($aHostPage[1], 1) <> "/" Then $aHostPage[1] = "/" & $aHostPage[1]
    $aHostPage[0] = StringStripWS($aHostPage[0], 3)
    $aHostPage[1] = StringStripWS($aHostPage[1], 3)
    Return $aHostPage
EndFunc   ;==>_GetHostAndPage

Func _GetMidleString($sString, $sStart, $sEnd, $iCase = -1, $iRetType = 0)
    Local $iCaseSence = ''
    If $iCase = -1 Then $iCaseSence = '(?i)'
    Local $aArray = StringRegExp($sString, '(?s)' & $iCaseSence & $sStart & '(.*?)' & $sEnd, 3)
    Local $IsArrayCheck = IsArray($aArray)
    If $IsArrayCheck And $iRetType = 1 Then Return $aArray
    If $IsArrayCheck Then Return $aArray[0]
    Return SetError(1, 0, "")
EndFunc   ;==>_GetMidleString

Func _EncodeURL($sURL)
    Local $BinaryString = StringReplace(StringToBinary($sURL, 4), '0x', '', 1)
    Local $UniBinLen = StringLen($BinaryString)
    Local $EncodedString, $UniBinChar
    
    For $i = 1 To $UniBinLen Step 2
        $UniBinChar = StringMid($BinaryString, $i, 2)
        If StringRegExp(BinaryToString('0x' & $UniBinChar, 4), '(?i)[a-z0-9]|-|_|\.|^/') Then
            $EncodedString &= BinaryToString('0x' & $UniBinChar)
        Else
            $EncodedString &= '%' & $UniBinChar
        EndIf
    Next
    Return $EncodedString
EndFunc   ;==>_EncodeURL

Func _HexURLToString($URLHex)
    Local $StrArray = StringSplit($URLHex, "")
    Local $RetString = "", $iDec
    Local $Ubound = UBound($StrArray)
    
    For $i = 1 To $Ubound-1
        If $StrArray[$i] = "%" And $i+2 <= $Ubound-1 Then
            $i += 2
            $iDec = Dec($StrArray[$i-1] & $StrArray[$i])
            If Not @error Then
                $RetString &= Chr($iDec)
            Else
                $RetString &= $StrArray[$i-2]
            EndIf
        Else
            $RetString &= $StrArray[$i]
        EndIf
    Next
    Return $RetString
EndFunc   ;==>_HexURLToString

Func _HTTPConnect($Host)
    TCPStartup()
    Local $Name_To_IP = TCPNameToIP($Host)
    Local $Socket = TCPConnect($Name_To_IP, $HTTP_TCP_Port)
    
    If $Socket = -1 Then
        TCPCloseSocket($Socket)
        Return SetError(1, 0, "")
    EndIf
    
    $LAST_SOCKET = $Socket
    Return $Socket
EndFunc

Func _HTTPGetResponse($Host, $Page, $sRequest="GET", $sReferrer="")
    Local $Socket = _HTTPConnect($Host)
    If @error Then Return SetError(1, 0, "")
    
    _HTTPGet($Host, $Page, $Socket, $sRequest, $sReferrer)
    If @error Then
        _HTTPClose($Socket)
        Return SetError(2, 0, "")
    EndIf
    
    Local $Recv = "", $CurrentRecv
    Local $iTimer = TimerInit()
    
    While 1
        $CurrentRecv = TCPRecv($Socket, 100)
        If @error <> 0 Then ExitLoop
        If $CurrentRecv <> "" Then $Recv &= $CurrentRecv
        
        If TimerDiff($iTimer) >= $Limit_TimeOut Then
            $Limit_TimeOutOver = True
            ExitLoop
        EndIf
    WEnd
    
    _HTTPClose($Socket)
    
    Return $Recv
EndFunc

Func _HTTPGet($Host, $Page, $Socket, $sRequest="GET", $sReferrer="")
    Local $Command = $sRequest & " " & $Page & " HTTP/1.1" & @CRLF
    $Command &= "Host: " & $Host & @CRLF
    $Command &= "User-Agent: " & $HTTPUserAgent & @CRLF
    $Command &= "Referer: " & $sReferrer & @CRLF
    $Command &= "Connection: close" & @CRLF & @CRLF
    
    Local $BytesSent = TCPSend($Socket, $Command)
    If $BytesSent = 0 Then Return SetError(1, @error, 0)
    Return $BytesSent
EndFunc

Func _HTTPClose($Socket=-1)
    TCPCloseSocket($Socket)
    TCPShutdown()
    Return 1
EndFunc
Edited by dabus

neo291

Generally such server-side scripts force IE's download manager to download the file. Moreover, the purpose of such scripts is to authenticate the users who are about to download the files.

I use DAP and have set it to monitor downloadable files (in the list of downloadable files, just put the extensions you want to download separately), and this will give you a popup each time a downloadable file is encountered on any site or link. Then you can decide whether you want to download it using "REGULAR" mode or would like DAP to download it.

Oh, you're going to need DAP 5.3 for this. I didn't try it on newer versions.


I'm not a programmer. Just a Power User.
