Download PDF file


Niblob

Hi,

I need to download (save to a specific directory) a PDF file which is only accessible after logging on to an application.

I can use _IECreate(), enter the login details, and then use _IENavigate(), which opens the PDF file in the browser. I know I could save it by getting AutoIt to click File/Save As..., type the file name and click through the dialogs to confirm overwriting an existing file, but I really don't like this method: the dialog boxes may move, the file may or may not need to be overwritten, and so on. I will use it as a last resort, but I would much rather do this in a more reliable way. Here are a couple of ideas which I've already investigated a bit:

Ideally, I would like to use INetGet(), but I can't see how to get it to use an existing browser session, one that is already logged on to the application.

I have tried using the XMLHTTP object, which allows me to log on to the application and make an HTTP request for the PDF file. This seems to work, but I don't know how to access the stream that is returned by the HTTP object. It is not in the ResponseText property, which is where I would expect it to be. Or maybe it is, but I don't know how to access it.

Here's my code - I'm afraid it might confuse more than illuminate, so I apologise in advance!

#include <IE.au3>
#include <Date.au3>
#Include <Array.au3>

;Set up objects
$oxmlhttp = ObjCreate("MSXML2.XMLHTTP.4.0")

;Set up variables
$url1 = "https://the.application.url/login.php"
$url2 = "https://the.application.url/example.pdf"
$tempFile = "c:\temp\temp.pdf"
$intLimit = 30

;Start of login process
    $strPostData = ""
    $strPostData = $strPostData & "USER=user&"
    $strPostData = $strPostData & "PASSWORD=pass&"
    $strPostData = $strPostData & "SMENC=ISO-8859-1&"
    $strPostData = $strPostData & "target=https://the.application.url/&"
    $strPostData = $strPostData & "smauthreason=0"
    
    ;####Post XML Document####
    $oxmlhttp.Open ("POST", $url1, True)
    $oxmlhttp.setRequestHeader ("Content-Type", "application/x-www-form-urlencoded") ; Set content type
    $oxmlhttp.send ($strPostData)
    
    
    ;####Wait for response####
    $timBegin = _NowCalc()
    Do
        sleep(1000)
    Until $oxmlhttp.ReadyState = 4 Or _DateDiff("s", $timBegin, _NowCalc()) > $intLimit
    If $oxmlhttp.ReadyState <> 4 Then
        msgbox("","","Timed out when logging in")
        $oxmlhttp.abort
        exit
    EndIf
    msgbox("","",$oxmlhttp.ResponseText)
;end of login process

;start of file retrieval process
    $oxmlhttp.Open ("GET", $url2, False)
    $oxmlhttp.send ()
    
    ;####Wait for response####
    $timBegin = _NowCalc()
    Do
        sleep(1000)
    Until $oxmlhttp.ReadyState = 4 Or _DateDiff("s", $timBegin, _NowCalc()) > $intLimit
    If $oxmlhttp.ReadyState <> 4 Then
        msgbox("","","Timed out when getting document")
        $oxmlhttp.abort
        exit
    EndIf
    msgbox("","",StringLen($oxmlhttp.ResponseText))
    filewrite($tempFile,$oxmlhttp.ResponseText)
;end of retrieval process

Actually, I've just had a better idea. Since logging on with this method already works, it is probably clearer to post just the code that downloads a publicly accessible PDF file. Here it is:

#include <IE.au3>
#include <Date.au3>
#Include <Array.au3>

;Set up objects
$oxmlhttp = ObjCreate("MSXML2.XMLHTTP.4.0")

;Set up variables
$url2 = "http://www.sba.gov/idc/groups/public/documents/sba_homepage/serv_sstd_tablepdf.pdf"
$tempFile = "c:\temp\temp.pdf"
$intLimit = 30


;start of file retrieval process
    $oxmlhttp.Open ("GET", $url2, False)
    $oxmlhttp.send ()
    
    ;####Wait for response####
    $timBegin = _NowCalc()
    Do
        sleep(1000)
    Until $oxmlhttp.ReadyState = 4 Or _DateDiff("s", $timBegin, _NowCalc()) > $intLimit
    If $oxmlhttp.ReadyState <> 4 Then
        msgbox("","","Timed out when getting document")
        $oxmlhttp.abort
        exit
    EndIf
    ;msgbox("","",StringLen($oxmlhttp.ResponseText))
    $temp = fileopen($tempFile,2)
    filewrite($temp,$oxmlhttp.ResponseText)
;end of retrieval process

Thanks in advance for any help!


I'll focus on your second code example.

First, you don't need the timer loop: you passed False as the async argument to .Open, so .send() does not return until the response is complete. Second, you need .ResponseBody, not .ResponseText, as the file is a binary stream:

$oxmlhttp = ObjCreate("MSXML2.XMLHTTP")

;Set up variables
$url2 = "http://www.sba.gov/idc/groups/public/documents/sba_homepage/serv_sstd_tablepdf.pdf"
$tempFile = "c:\temp\temp.pdf"


;start of file retrieval process
$oxmlhttp.Open ("GET", $url2, False) ; False = synchronous request
$oxmlhttp.send ()

$temp = FileOpen($tempFile, 2 + 16) ; 2 = overwrite, 16 = binary mode
FileWrite($temp, $oxmlhttp.ResponseBody) ; ResponseBody holds the raw bytes
FileClose($temp)
;end of retrieval process
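
If you want something a little more defensive, the same idea can be wrapped in a function that checks the HTTP status before writing anything to disk. This is only a sketch: _SaveUrlToFile is a made-up helper name (not part of IE.au3 or any other UDF), the error codes are arbitrary, and it assumes that a successful download returns status 200. It opens the file with the binary flag (16) explicitly.

```autoit
; Sketch only: _SaveUrlToFile is a hypothetical helper, not a standard UDF function.
; Returns True on success; on failure returns False and sets @error
; (1 = COM object not created, 2 = unexpected HTTP status, 3 = file not opened).
Func _SaveUrlToFile($sUrl, $sPath)
    Local $oHttp = ObjCreate("MSXML2.XMLHTTP")
    If Not IsObj($oHttp) Then Return SetError(1, 0, False)

    $oHttp.Open("GET", $sUrl, False) ; False = synchronous, so no wait loop is needed
    $oHttp.Send()

    ; Assumption: anything other than 200 means we did not get the file
    ; (for example, a login page or an error page was returned instead).
    If $oHttp.Status <> 200 Then Return SetError(2, $oHttp.Status, False)

    Local $hFile = FileOpen($sPath, 2 + 16) ; 2 = overwrite, 16 = binary mode
    If $hFile = -1 Then Return SetError(3, 0, False)

    FileWrite($hFile, $oHttp.ResponseBody) ; raw bytes, not ResponseText
    FileClose($hFile)
    Return True
EndFunc

If Not _SaveUrlToFile("http://www.sba.gov/idc/groups/public/documents/sba_homepage/serv_sstd_tablepdf.pdf", "c:\temp\temp.pdf") Then
    MsgBox(16, "Download failed", "Error: " & @error & ", HTTP status: " & @extended)
EndIf
```

For the logged-in case, the login POST and the GET would need to go through the same XMLHTTP object, as in your first example, so that the session is reused. I have not tested that against your application.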

Dale

