InetGet - Get File Name from HTTP Header


Hi,

I want to download some documents from the internet to a local folder. This works great with InetGet(...) if I know the file names, but if I don't know the file names and types I'm totally stuck. To use the files after download I would at least need some way to identify the file format. But even though this information is available in the HTTP response header, I can't use it with InetGet(), as this function seems to simply ignore the header.

So, does someone here know a way to retrieve the file name from the Content-Disposition field of the response header?

Or is there some function like InetGetHeader(), which just returns the header of the response to a URL request?

Thanks in advance.

- Michael


If you want to use the supplied filename, you'll have to do the download with WinHTTP, I think:

- open session

- connect to server

- request file

- read headers

- read data to file

- close request, connection, and session
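
The steps above can be sketched roughly as follows. This assumes the third-party WinHttp.au3 UDF is available next to the script; the host, path, and output file name are placeholders, and error checking is left out for brevity, so treat it as an outline rather than a finished downloader.

```autoit
#include "WinHttp.au3" ; third-party UDF, assumed to be in the script folder

; Hypothetical host, path, and output file, for illustration only
Local $sHost = "www.example.com"
Local $sPath = "/download?id=1"
Local $sOutFile = @ScriptDir & "\download.bin"

Local $hSession = _WinHttpOpen()                                ; open session
Local $hConnect = _WinHttpConnect($hSession, $sHost)            ; connect to server
Local $hRequest = _WinHttpOpenRequest($hConnect, "GET", $sPath) ; request file
_WinHttpSendRequest($hRequest)
_WinHttpReceiveResponse($hRequest)

; read headers: the raw header text, which contains Content-Disposition if the server sent it
Local $sHeaders = _WinHttpQueryHeaders($hRequest)
ConsoleWrite($sHeaders & @CRLF)

; read data to file, chunk by chunk, in binary mode
Local $hFile = FileOpen($sOutFile, 16 + 2) ; 16 = binary, 2 = overwrite
Local $bChunk
While 1
    $bChunk = _WinHttpReadData($hRequest, 2) ; 2 = return the data as binary
    If @error Then ExitLoop ; @error is set once no more data is available
    FileWrite($hFile, $bChunk)
WEnd
FileClose($hFile)

; close request, connection, and session
_WinHttpCloseHandle($hRequest)
_WinHttpCloseHandle($hConnect)
_WinHttpCloseHandle($hSession)
```

In a real script the file would be renamed (or opened) only after the file name has been taken from the headers.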



Hi,

I don't know if this function already exists, but now that I've written it, here you go:

#include <Array.au3> ;only to display header into an array

TCPStartup()

Global $aRespHeader = StringSplit(_INetGetHeader("http://www.autoitscript.com/forum/topic/144662-inetget-get-file-name-from-http-header/"), @CRLF, 1) ; flag 1: split on the whole CRLF pair, so no stray @LF is left on each line

_ArrayDisplay($aRespHeader)

TCPShutdown()

Func _INetGetHeader($sURL) ;By FireFox (some of it from _INetGetPostSource by GTASpider)
    Local $iSocket, $sHeader, $sRecv, $iIP, $sHost, $aRegExp, $sHttp1, $iErr, $iSend, $sRecvHeader

    If $sURL = "" Then Return SetError(1, 0, 0)

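    ; Note: this function speaks plain HTTP on port 80 only; an https:// URL is not encrypted here
    If StringLeft($sURL, 7) <> "http://" And StringLeft($sURL, 8) <> "https://" Then $sURL = "http://" & $sURL
    ; Append "/" only when the URL has no path, so a file path such as /docs/file.pdf stays intact
    If StringInStr($sURL, "/", 0, 3) = 0 Then $sURL &= "/"

    $aRegExp = StringRegExp($sURL, "^https?://(.*?)/", 3) ; "https?" matches both http and https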
    If @error Then Return SetError(2, 0, 0)

    $sHost = $aRegExp[0]
    If $sHost = "" Then Return SetError(3, 0, 0)

    $sHttp1 = StringTrimLeft($sURL, StringInStr($sURL, "/", 0, 3) - 1) ; keep the request path, starting at the third "/"
    If $sHttp1 = "" Then Return SetError(3, 0, 0)

    $sHeader = "GET " & $sHttp1 & " HTTP/1.1" & @CRLF & _
            "Host: " & $sHost & @CRLF & _
            "Connection: close" & @CRLF & _
            "User-Agent: AutoIt v3" & @CRLF & @CRLF

    TCPStartup() ;If not already done
    $iIP = TCPNameToIP($sHost)

    If $iIP = "" Or StringInStr($iIP, ".") = 0 Then Return SetError(4, 0, 0)
    $iSocket = TCPConnect($iIP, 80)
    If @error Or $iSocket < 0 Then Return SetError(5, 0, 0)

    $iSend = TCPSend($iSocket, $sHeader)
    If @error Or $iSend < 1 Then Return SetError(6, 0, 0)

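    While 1
        $sRecv = TCPRecv($iSocket, 1024)
        $iErr = @error
        If $sRecv <> "" Then
            $sRecvHeader &= $sRecv

            ; Search the accumulated data: the blank line that ends the header may span two chunks
            If StringInStr($sRecvHeader, @CRLF & @CRLF) Then
                $sRecvHeader = StringLeft($sRecvHeader, StringInStr($sRecvHeader, @CRLF & @CRLF) - 1)
                ExitLoop
            EndIf
        EndIf
        If $iErr Then
            TCPCloseSocket($iSocket) ; do not leak the socket on failure
            Return SetError(7, 0, 0)
        EndIf
        Sleep(10) ; TCPRecv does not block, so avoid a busy loop
    WEnd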

    TCPCloseSocket($iSocket)

    Return $sRecvHeader
EndFunc   ;==>_INetGetHeader

Edit: StringTrim fix.

Br, FireFox.
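
Once the header text is available, the file name asked about in the first post can be taken from the Content-Disposition line. Here is a minimal sketch; _FileNameFromHeader is a hypothetical helper name, the sample header is made up, and the pattern only handles the plain filename= form (quoted or unquoted), not the encoded filename*= variant.

```autoit
; Hypothetical helper: extract the file name from a raw HTTP response header
Func _FileNameFromHeader($sHeader)
    ; (?i) makes the match case-insensitive; [^\r\n] keeps the search on one header line
    Local $aMatch = StringRegExp($sHeader, '(?i)Content-Disposition:[^\r\n]*?filename="?([^";\r\n]+)', 1)
    If @error Then Return SetError(1, 0, "") ; no Content-Disposition file name found
    Return StringStripWS($aMatch[0], 3) ; 3 = strip leading and trailing whitespace
EndFunc   ;==>_FileNameFromHeader

; Made-up sample header, for illustration only
Local $sSample = "HTTP/1.1 200 OK" & @CRLF & _
        "Content-Type: application/pdf" & @CRLF & _
        'Content-Disposition: attachment; filename="report.pdf"'

ConsoleWrite(_FileNameFromHeader($sSample) & @CRLF)
```

If the server sends no Content-Disposition field, the function returns an empty string and sets @error to 1, so the caller can fall back to the Content-Type field or to the last part of the URL.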

Edited by FireFox

Thanks a lot for your quick responses.

What I forgot to mention is that I'm using AutoIt in a company with a rigid security environment, where normally you can't access the internet at all except via IE. For some reason, however, InetGet() can, which is one reason I'm very enthusiastic about AutoIt. But it seems that InetGet() works differently from WinHTTP and TCPConnect.

So can one of you perhaps give me a hint about how InetGet() accesses the internet? Perhaps I will be able to find a way to retrieve the header information this way, too.

@ProgAndy:

Actually, I had already tried to use WinHTTP.au3 to get the header information, but somehow (probably because of the security environment in my company) it already fails when I try to connect to the server.

@FireFox:

Your script looks pretty much like what I was hoping for, but it also fails in my environment (error code 4 - empty $iIP).
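
On the question of how InetGet() reaches the internet: as far as I know it goes through WinINet, the networking layer that IE also uses, which would explain why it gets through where WinHTTP and raw TCP sockets do not. If that assumption holds, the response header can be requested through the same layer with DllCall. The sketch below is untested in a restricted environment; _WinINetGetHeader is a hypothetical name, the numeric constants are the WinINet values noted in the comments, and the buffer size is an arbitrary choice.

```autoit
; Sketch: fetch the raw response headers through WinINet
Func _WinINetGetHeader($sURL)
    Local $hDll = DllOpen("wininet.dll")
    If $hDll = -1 Then Return SetError(1, 0, "")

    ; 0 = INTERNET_OPEN_TYPE_PRECONFIG: use the proxy settings configured for IE
    Local $aOpen = DllCall($hDll, "handle", "InternetOpenW", "wstr", "AutoIt", "dword", 0, "ptr", 0, "ptr", 0, "dword", 0)
    If @error Or Not $aOpen[0] Then
        DllClose($hDll)
        Return SetError(2, 0, "")
    EndIf

    ; 0x80000000 = INTERNET_FLAG_RELOAD: ask the server instead of the local cache
    Local $aUrl = DllCall($hDll, "handle", "InternetOpenUrlW", "handle", $aOpen[0], "wstr", $sURL, "ptr", 0, "dword", 0, "dword", 0x80000000, "dword_ptr", 0)
    If @error Or Not $aUrl[0] Then
        DllCall($hDll, "bool", "InternetCloseHandle", "handle", $aOpen[0])
        DllClose($hDll)
        Return SetError(3, 0, "")
    EndIf

    ; 22 = HTTP_QUERY_RAW_HEADERS_CRLF: all header lines, separated by CRLF
    Local $tBuffer = DllStructCreate("wchar[4096]")
    Local $sHeader = ""
    Local $aQuery = DllCall($hDll, "bool", "HttpQueryInfoW", "handle", $aUrl[0], "dword", 22, "struct*", $tBuffer, "dword*", DllStructGetSize($tBuffer), "ptr", 0)
    If Not @error And $aQuery[0] Then $sHeader = DllStructGetData($tBuffer, 1)

    ; Close the request handle, the session handle, and the DLL
    DllCall($hDll, "bool", "InternetCloseHandle", "handle", $aUrl[0])
    DllCall($hDll, "bool", "InternetCloseHandle", "handle", $aOpen[0])
    DllClose($hDll)

    If $sHeader = "" Then Return SetError(4, 0, "")
    Return $sHeader
EndFunc   ;==>_WinINetGetHeader

; Placeholder URL, for illustration only
ConsoleWrite(_WinINetGetHeader("http://www.example.com/") & @CRLF)
```

Note that InternetOpenUrlW starts a normal GET request, so the same handle could in principle also be used to read the file body instead of calling InetGet() a second time.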


it fails in my environment (error code 4 - empty $iIP).

Maybe it's a network restriction, as you said, that is blocking the TCPNameToIP function.

Can you try the ping command at the command prompt to see if it works? If so, you can use that command to get the IP for the domain name.

Or you can set the IP directly instead of the domain, or you can write a simple PHP script that converts the domain name to an IP.

Br, FireFox.


Hello FireFox, thanks for taking so much care over my problem.

Sadly ping doesn't work, not even with a specific (numeric) IP, and replacing $iIP with a numeric IP also fails when the script tries to connect the socket.

Before I started using AutoIt I tried many ways to automate downloads, but they all failed. So I really think I need to go the way InetGet() goes. (Whichever way that is :( )


not even with a specific (numeric) IP and replacing the $iIP by a numeric IP does also produce a failure when it is tried to establish a connection to the socket.

So you get error code 5?

Have you tried setting the proxy with the HttpSetProxy function?

Br, FireFox.
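
For reference, a minimal sketch of what setting the proxy with HttpSetProxy could look like. The proxy address and credentials are placeholders, and as far as I know the setting only affects the Inet functions (InetGet, InetRead), not TCPConnect.

```autoit
; Mode 0: use the proxy settings configured for Internet Explorer (the default)
HttpSetProxy(0)

; Mode 2: use an explicit proxy; host, port, and credentials below are placeholders
HttpSetProxy(2, "proxy.example.com:8080", "username", "password")

; Placeholder URL; option 1 forces a reload from the remote site
Local $bData = InetRead("http://www.example.com/", 1)
If @error Then
    ConsoleWrite("Download failed" & @CRLF)
Else
    ConsoleWrite("Received " & BinaryLen($bData) & " bytes" & @CRLF)
EndIf
```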

Edited by FireFox

Yes, with a numeric IP it is error code 5.

Setting up a proxy doesn't work either. I don't know exactly how it is done, but after the proxy login I still get error messages from a proxy at another level.

So I still think that whichever way InetGet() uses to access the internet is the way I should go. If I can't find out what that is, I will probably be able to manage the downloads with IE automation, but it would be great to do without that.

