Posted

Hi everyone,

My script uses IE11 on Win7 to log in to a site and enter data into a couple of forms. When a link is clicked, the site uses this data to generate a PDF report.

With my current set-up if I do this manually the PDF opens in a new IE tab and I can download or print it. If I right-click the link that creates the PDF and choose Save Target As the PDF is generated and the Open/Save As dialogue at the bottom of the screen opens. All good.

However, I would like the script to automatically download the PDF, close IE and then exit. Closing IE (_IEQuit) and exiting the script are easy enough, but I'm struggling to get the script to download the PDF.

The link that generates the PDF contains a unique number each time the page with the link is reached, so it's not static. Its position is stable, however: using _IELinkGetCollection I can tell that it is always the 10th link from the end of the page, so I am able to click it with the index $iNumLinks - 10.

I believe I need to use InetGet, but the link isn't static and I haven't worked out a way to read the link's URL by index. Is this possible?

Here is the website HTML for the section containing the link although I don't think it's of much use but it at least shows the format of the link (I can't post a link as it's a password protected area)...

<div class="rmButton right"><a title="Generates a PDF version of the market report in a new window." href="/rmplus/generatePdf?mr_id=60991" target="_blank">print/save as pdf</a></div>

The full link https://www.rightmove.co.uk/rmplus/generatePdf?mr_id=60991 just for completeness - visiting it will give a HTTP 500 unless logged in.

And here is the code that clicks this link opening the generated PDF in a new tab...

$oLinks = _IELinkGetCollection($oIE)
$iNumLinks = @extended
$PrintPDF = _IELinkClickByIndex($oIE, ($iNumLinks - 10))

So, how to use InetGet to visit that link? Or is there a way to Save As the newly opened tab? I've tried _IEAction($oIE, "saveas") but it seems not to work in a tab containing only a PDF.

Posted
  On 11/11/2016 at 9:59 AM, Dent said:

The full link https://www.rightmove.co.uk/rmplus/generatePdf?mr_id=60991 just for completeness - visiting it will give a HTTP 500 unless logged in.

This is what I get when I click this link:

  Quote

Sorry

Sorry there is a problem

Reference No #2301-161111-100642-10401

Signature beginning:
Please remember: "AutoIt"..... *  Wondering who uses AutoIt and what it can be used for ? * Forum Rules *
ADO.au3 UDF * POP3.au3 UDF * XML.au3 UDF * IE on Windows 11 * How to ask ChatGPT for AutoIt Codefor other useful stuff click the following button:

  Reveal hidden contents

Signature last update: 2023-04-24

Posted

Two ways (first method is easier):

- download all the page content with _IEBodyReadHTML and retrieve your mr_id through regular expressions (StringRegExp)

- use the .getAttribute method to retrieve the href attribute of the link whose title attribute is "Generates a PDF version of the market report in a new window."

Then you can build full link and use InetGet to download the file.
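
The first method could be sketched roughly as follows. This is only an illustration: _ExtractPdfUrl is a hypothetical helper name, the pattern assumes the href appears with double quotes exactly as in the HTML posted above, and the host is taken from the full link in the first post. IE may format the markup returned by _IEBodyReadHTML differently, so the pattern might need adjusting.

```autoit
; Hypothetical helper (not from the thread): pulls the generatePdf href out of
; the page HTML and prefixes the host to build an absolute URL.
Func _ExtractPdfUrl($sHTML)
    ; Flag 1 returns an array holding the captured group of the first match
    Local $aMatch = StringRegExp($sHTML, '(?i)href="(/rmplus/generatePdf\?mr_id=\d+)"', 1)
    If @error Then Return SetError(1, 0, "") ; link not found
    Return "https://www.rightmove.co.uk" & $aMatch[0]
EndFunc   ;==>_ExtractPdfUrl

; Sample markup copied from the first post; a real script would pass _IEBodyReadHTML($oIE)
Global $sSample = '<div class="rmButton right"><a title="Generates a PDF version of the market report in a new window." href="/rmplus/generatePdf?mr_id=60991" target="_blank">print/save as pdf</a></div>'
ConsoleWrite(_ExtractPdfUrl($sSample) & @CRLF) ; prints https://www.rightmove.co.uk/rmplus/generatePdf?mr_id=60991
```

Matching the whole href rather than a fixed number of characters after "?mr_id=" keeps working if the id grows to more or fewer digits.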

Posted

The HREF does not point directly to a PDF file, so I'm not sure InetGet will work.

 


Posted

Thank-you both for your input.

j0kky, I was really hoping your first method would work - I don't know why I didn't think of this, but I have a habit of coding whilst tired.

So I used StringInStr and StringMid etc to obtain the actual link, here's the code...

$Page = _IEBodyReadHTML($oIE)
$PrintPDFStart = StringInStr($Page, "?mr_id=")
$PrintPDFEnd = StringMid($Page, $PrintPDFStart, 12)
$PrintPDF = ("https://www.rightmove.co.uk/rmplus/generatePdf" & $PrintPDFEnd)
$hDownload = InetGet($PrintPDF, @ScriptDir & "\Report.pdf")
Do
    Sleep(250)
Until InetGetInfo($hDownload, 2)
InetClose($hDownload)

I checked the result of $PrintPDF to be sure the URL is 100% correct, which it is, but unfortunately the script just hangs. mLipok you were right :(

I haven't tried your second method as I also don't see this working as I think the problem is with InetGet()

I think the answer may lie in trying to save the PDF from the newly created IE tab.

Posted

Q1

In this new tab, can you manually save the PDF file? I want to know the file name.

Q2

When this new tab is open, how do you close it?

 

PS.

A long time ago I had the same problem.

 


Posted

Hi mLipok and thanks for replying again.

Q1. In the new tab which just has the title 'rightmove.co.uk' I can manually save the pdf and the filename is 'generatePDF.pdf' but I haven't been able to save it using AutoIt yet.

Q2. I have only been able to close the tab manually. I tried _IEAttach on the new tab followed by _IEQuit, but it doesn't work. I think I would need to interact with the interface by simulating a mouse-click.

Posted

I think I have a solution to Q1, but I'm not able to find it quickly.

 


Posted

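I think part of the issue is that you didn't tell InetGet to download the file in the background. Without that flag InetGet waits for the download and returns a byte count rather than a handle, so InetGetInfo has no download handle to query. The revised code should look like this:

#include <InetConstants.au3> ; defines the $INET_* constants

$hDownload = InetGet($PrintPDF, @ScriptDir & "\Report.pdf", $INET_FORCERELOAD, $INET_DOWNLOADBACKGROUND)
Do
    Sleep(250)
Until InetGetInfo($hDownload, $INET_DOWNLOADCOMPLETE)
InetClose($hDownload) ; release the download handle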

The other issue you are likely running into is that InetGet will fail for security reasons. In the past, I have resolved this by using _IECreateEmbedded within a hidden GUI so that the web browser runs in the same context as the script.
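
To keep the script from waiting forever when the server never delivers the file, the polling loop can be given a timeout and an explicit success check. The following is a minimal sketch under stated assumptions: _DownloadWithTimeout and its 30-second default are made up for illustration, while the InetGet, InetGetInfo and InetClose calls and the $INET_* constants come from the standard library.

```autoit
#include <InetConstants.au3>

; Hypothetical helper (not from the thread): downloads $sURL to $sFile in the
; background and gives up after $iTimeoutMs milliseconds.
Func _DownloadWithTimeout($sURL, $sFile, $iTimeoutMs = 30000)
    Local $hDownload = InetGet($sURL, $sFile, $INET_FORCERELOAD, $INET_DOWNLOADBACKGROUND)
    Local $hTimer = TimerInit()
    ; Poll until the download reports completion or the timeout expires
    While Not InetGetInfo($hDownload, $INET_DOWNLOADCOMPLETE)
        If TimerDiff($hTimer) > $iTimeoutMs Then ExitLoop
        Sleep(250)
    WEnd
    ; "Complete" only means the transfer ended, so check success separately
    Local $bOK = InetGetInfo($hDownload, $INET_DOWNLOADSUCCESSFUL)
    Local $iErr = InetGetInfo($hDownload, $INET_DOWNLOADERROR)
    InetClose($hDownload)
    If Not $bOK Then Return SetError(1, $iErr, False)
    Return True
EndFunc   ;==>_DownloadWithTimeout
```

A call such as _DownloadWithTimeout($PrintPDF, @ScriptDir & "\Report.pdf") would then return False, with the error code in @extended, instead of hanging. It does not address the context problem described above.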

Posted
Thanks Danp2, it could be a context thing. InetGet doesn't return an error, it just hangs. I don't think I'll be able to sniff the connection either, as it's SSL.

I tried your variation above and the same thing happens. InetRead at least reports that 0 bytes were read, so I guess it is getting blocked.

Maybe IUIAutomation is the way to go, actually controlling the new tab that is created when the link is clicked, but I've never used that UDF.

Posted
  On 11/12/2016 at 12:48 PM, mLipok said:

I think I have solution to Q1.

But I'm not able to find it quickly .

 


Something like this:

#include <File.au3> ; needed for _FileListToArrayRec

_Example()
Func _Example()
    Local $sIECacheDir = _IE_GetCacheDir()
    ; Files only, recursive, unsorted, full paths; element [0] holds the match count
    Local $aKatalogi = _FileListToArrayRec($sIECacheDir, 'generatePDF*.pdf', 1, 1, 0, 2)
EndFunc   ;==>_Example

Func _IE_GetCacheDir()
    ; The registry value may contain an unexpanded %USERPROFILE% placeholder
    Local $sIECacheDir = RegRead('HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders\', 'Cache')
    $sIECacheDir = StringReplace($sIECacheDir, '%USERPROFILE%', @UserProfileDir)
    Return SetError(0, 0, $sIECacheDir)
EndFunc   ;==>_IE_GetCacheDir
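
The snippet above only builds the list of cached files. As a rough sketch of the remaining step, the newest match could be copied somewhere permanent. _CopyNewestMatch is a hypothetical helper, and it assumes IE really does leave a copy of the PDF in the cache under a name matching the mask, which may not hold on every set-up.

```autoit
#include <File.au3>

; Hypothetical helper (not from the thread): copies the most recently modified
; file matching $sMask under $sDir to $sDest. Returns True on success.
Func _CopyNewestMatch($sDir, $sMask, $sDest)
    ; Files only, recursive, unsorted, full paths
    Local $aFiles = _FileListToArrayRec($sDir, $sMask, 1, 1, 0, 2)
    If @error Then Return SetError(1, 0, False) ; nothing matched
    Local $sNewest = "", $sNewestTime = "", $sTime
    For $i = 1 To $aFiles[0]
        ; Option 0 = last modified, format 1 = sortable YYYYMMDDHHMMSS string
        $sTime = FileGetTime($aFiles[$i], 0, 1)
        If @error Then ContinueLoop
        If StringCompare($sTime, $sNewestTime) > 0 Then
            $sNewestTime = $sTime
            $sNewest = $aFiles[$i]
        EndIf
    Next
    If $sNewest = "" Then Return SetError(2, 0, False)
    ; Flag 1 overwrites an existing destination file
    Return FileCopy($sNewest, $sDest, 1) = 1
EndFunc   ;==>_CopyNewestMatch
```

It could be called as _CopyNewestMatch(_IE_GetCacheDir(), 'generatePDF*.pdf', @ScriptDir & '\Report.pdf') once the PDF tab has finished loading.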

 


Posted
  On 11/12/2016 at 11:53 AM, Dent said:

Q2. I have only been able to close the tab manually, I have tried _IEAttach to the new tab and then _IEQuit but it doesn't work. I think I would need to interact with the interface by simulating a mouse-click.


Can you post the HTML of this second IE tab here?
Does _IEAttach work?
Try changing the IE content before quitting, like this:

Local $oIE = _IEAttach(......) ; supply the arguments that attach to the PDF tab
$sHTML = _
        "<HTML>" & @CRLF & _
        "<HEAD>" & @CRLF & _
        "<TITLE>Empty Page</TITLE>" & @CRLF & _
        "</HEAD>" & @CRLF & _
        "</HTML>" & _
        ""
; Replace the tab's document with an empty page, then close it
_IEDocWriteHTML($oIE, $sHTML)
_IEQuit($oIE)

 


Posted

The new IE tab that is created just displays the pdf so I cannot control it with any _IE functions and there is no HTML source to view.

I think I'll have to control the window in some other way :(

Posted

 

  On 11/13/2016 at 8:21 PM, Dent said:

displays the pdf so I cannot control it with any _IE functions and there is no HTML source to view.


Are you sure?

Try this:

#include <IE.au3>

_Example()
Func _Example()
    Local $sURL = 'http://lipok.pl/test_pdf/nicniewidac.html'

    Local $oIE = _IECreate($sURL)
    Local $sHTML = _IEDocReadHTML($oIE)
    MsgBox(0, '', $sHTML)

    $sHTML = _
            "<HTML>" & @CRLF & _
            "<HEAD>" & @CRLF & _
            "<TITLE>Empty Page</TITLE>" & @CRLF & _
            "</HEAD>" & @CRLF & _
            "</HTML>" & _
            ""

    _IEDocWriteHTML($oIE, $sHTML)

EndFunc   ;==>_Example

 

When the MsgBox pops up, try pressing F12 (I know it does not work), then try Developer Tools from the IE menu >> Tools.

 

 

  On 11/13/2016 at 8:21 PM, Dent said:

The new IE tab that is created just displays the pdf so I cannot control it with any _IE functions


Here is another interesting example:

 


Posted

Did you try my snippet from #13?

By the way, if I'm thinking correctly, you can use one of these:
_WinINet_FindFirstUrlCacheEntryEx
_WinINet_FindNextUrlCacheEntryEx
_WinINet_FtpCommand
_WinINet_InternetFindNextFile

 


Posted

Would anyone be able to VNC into my machine and try to resolve this?

I would happily give out the login details, but there is an email verification process that means I would need to be involved. If you VNC to my machine, it is already configured correctly.

mLipok, yes I tried your snippet from #13 but it didn't work.

Posted (edited)

Quick example (from here):

#include "WinHttp.au3"
Global $hOpen = _WinHttpOpen()
Global $hConnect = _WinHttpConnect($hOpen, 'daveismyname.com')
Global $hRequest = _WinHttpOpenRequest($hConnect, 'POST', '/demos/formpdf/', Default, 'https://daveismyname.com/demos/formpdf/')
_WinHttpSendRequest($hRequest, 'Content-Type: application/x-www-form-urlencoded', 'name=Dent&email=dent%40wayfarer.com&submit=Submit')
_WinHttpReceiveResponse($hRequest)
If _WinHttpQueryDataAvailable($hRequest) Then
    Global $sHeaders = _WinHttpQueryHeaders($hRequest)
    Global $sFile = @ScriptDir & '\' & StringRegExpReplace($sHeaders, '(?si).*filename="(\S+)".*', '$1')
    Global $dChunk, $dData
    While 1
        $dChunk = _WinHttpReadData($hRequest, 2)
        If @error Then ExitLoop
        $dData = _WinHttpSimpleBinaryConcat($dData, $dChunk)
    WEnd
    Global $hFile = FileOpen($sFile, 26)
    FileWrite($hFile, $dData)
    FileClose($hFile)
    If Not FileExists($sFile) Then MsgBox(48, "Error occurred", "No file created." & @CRLF)
    ShellExecute($sFile)
Else
    MsgBox(48, "Error occurred", "No data available." & @CRLF)
EndIf
_WinHttpCloseHandle($hRequest)
_WinHttpCloseHandle($hConnect)
_WinHttpCloseHandle($hOpen)

 

Edited by GMK
Correct code; add link
