Dante_t

Downloading files from HTTPS sites using InetGet


#1 ·  Posted

Hi guys, I need help. I searched the forum before posting and couldn't find anything. The code below works fine when downloading files from "http" sites, but when trying to download from "https" sites, no files are downloaded. I tried different sites and I experience the same problem everywhere. Is there something I'm missing or doing wrong? Please note that I'm not a programmer and I'm new to this; I'm just using logic wherever I can to get things done. Your help will be highly appreciated.

 

#include <InetConstants.au3>
#include <MsgBoxConstants.au3>
#include <WinAPIFiles.au3>

; Download a file in the background.
; Wait for the download to complete.


Example()

Func Example()
    ; Save the downloaded file to the root of drive D:.
    Local $sFilePath = "d:\"

    ; Download the file in the background with the selected option of 'force a reload from the remote site.'
    Local $hDownload = InetGet("https://en.wikipedia.org/wiki/HTTPS#/media/File:Internet2.jpg", $sFilePath& "Internet2.jpg", $INET_FORCERELOAD, $INET_DOWNLOADBACKGROUND)

    ; Wait for the download to complete by monitoring when the 2nd index value of InetGetInfo returns True.
    Do
        Sleep(250)
    Until InetGetInfo($hDownload, $INET_DOWNLOADCOMPLETE)

    ; Retrieve the number of total bytes received and the filesize.
    Local $iBytesSize = InetGetInfo($hDownload, $INET_DOWNLOADREAD)
    Local $iFileSize = FileGetSize($sFilePath & "Internet2.jpg")

    ; Close the handle returned by InetGet.
    InetClose($hDownload)

    ; Display details about the total number of bytes read and the filesize.
    MsgBox($MB_SYSTEMMODAL, "", "The total download size: " & $iBytesSize & @CRLF & _
            "The total filesize: " & $iFileSize)

    ; Delete the file (note: delete the downloaded file, not the folder path).
    ;FileDelete($sFilePath & "Internet2.jpg")
EndFunc   ;==>Example
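
For reference, InetGetInfo can also report whether a background download actually succeeded and which error value WinINet stored for the handle. A minimal diagnostic sketch (not part of the original script; place it before the InetClose() call above):

    ; Query the success flag and the stored error value for this handle.
    Local $bSuccess = InetGetInfo($hDownload, $INET_DOWNLOADSUCCESS)
    Local $iInetError = InetGetInfo($hDownload, $INET_DOWNLOADERROR)
    If Not $bSuccess Then MsgBox($MB_SYSTEMMODAL, "Download failed", "InetGetInfo error value: " & $iInetError)

Note also that this example URL is the Wikipedia media-viewer page (everything after "#" is handled client-side by the browser), not the raw image file, so even a successful request would return the article's HTML rather than a JPG; the raw file itself is hosted on upload.wikimedia.org.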

 




#4 ·  Posted

I was just using Wikipedia as an example; to actually get the files from the site I want, I have to log in. The logging-in and navigating-to-the-correct-page part is sorted. I'm actually trying to download files from https://rda.ucar.edu/. File sizes are between 17 and 20 MB. I tried using your code but still no luck. I've seen people mentioning things like cookies for sites that require authentication, etc. I don't understand what they were saying, but maybe that's where my problem is.

My login code is attached; I will PM my credentials. The link to the file I want to download is in the code.

 

login&navigate.au3


#5 ·  Posted

Maybe this will work for you --

 


#6 ·  Posted

Hi again,

I just found this tip elsewhere in the forum...

InetGet("https://" & $user & ":" & $pass & "@www.mywebsite.com")

Can you try something like...

Local $hDownload = InetGet("https://" & $User & ":" & $Pwd & "@rda.ucar.edu/data/ds083.2/grib2/2017/2017.07/fnl_20170724_06_00.grib2", $sFilePath & "fnl_20170724_06_00.grib2", $INET_DOWNLOADWAIT);, $INET_FORCERELOAD)

...with your username and password?

good luck!


#7 ·  Posted (edited)

22 hours ago, Danp2 said:

Maybe this will work for you --

 

Hi Danp2, which method are you referring to? Can you please share your code if you still have it?



#8 ·  Posted

20 hours ago, MarxBros said:

Hi again,

I just found this tip elsewhere in the forum...

InetGet("https://" & $user & ":" & $pass & "@www.mywebsite.com")

Can you try something like...

Local $hDownload = InetGet("https://" & $User & ":" & $Pwd & "@rda.ucar.edu/data/ds083.2/grib2/2017/2017.07/fnl_20170724_06_00.grib2", $sFilePath & "fnl_20170724_06_00.grib2", $INET_DOWNLOADWAIT);, $INET_FORCERELOAD)

...with your username and password?

good luck!

Will try this when I get home. Thank you, MarxBros.


#9 ·  Posted

21 hours ago, MarxBros said:

Hi again,

I just found this tip elsewhere in the forum...

InetGet("https://" & $user & ":" & $pass & "@www.mywebsite.com")

Can you try something like...

Local $hDownload = InetGet("https://" & $User & ":" & $Pwd & "@rda.ucar.edu/data/ds083.2/grib2/2017/2017.07/fnl_20170724_06_00.grib2", $sFilePath & "fnl_20170724_06_00.grib2", $INET_DOWNLOADWAIT);, $INET_FORCERELOAD)

...with your username and password?

good luck!

Hi MarxBros,

I tried this method and it doesn't seem to work; the code just runs continually without any file being downloaded. How can I use the IE object as Danp2 suggested above?

Regards


#10 ·  Posted

1 hour ago, Dante_t said:

Hi Danp2, which method are you referring to? Can you please share your code if you still have it?

You create a hidden GUI containing an embedded IE object. Use this object to log into the website and navigate to the correct page. Retrieve the links to be downloaded and then use InetGet to perform each download. (InetGet uses WinINet under the hood, so it shares the session cookies created by the embedded IE login.)

Like this --

; *******************************************************
; Download files from website
; *******************************************************
#include <IE.au3>
#include <File.au3>
#include <GUIConstantsEx.au3>

; Assumes the globals $Url, $User, $Pass, $LogFile, $InDir and $ProcessedDir,
; plus an ErrNotify() helper, are defined elsewhere in the script.
Func RetrieveFiles()
    Local $oButton, $oElement, $oForm, $oFrame, $oTable, $aTableData, $oLinks, $oLink, $iNumLinks
    Local $filename, $i, $trs, $href, $created, $array

    $IsDownloaded = False

    _FileWriteLog($LogFile, "Logging into website")

    ; Create embedded IE object
    $oIE = _IECreateEmbedded()

    ; Dummy GUI
    GUICreate("Dummy", 640, 580, _
        (@DesktopWidth - 640) / 2, (@DesktopHeight - 580) / 2)
    GUICtrlCreateObj($oIE, 10, 40, 600, 360)
    GUISetState(@SW_SHOW)

    ; Open instance of website
    _IENavigate($oIE, $Url)

    If @error Then
        ErrNotify("_IENavigate error. Errorcode = " & @error)
        Return
    EndIf

    ; Log into website
    $oForm = _IEFormGetCollection($oIE, 0)

    If @error Then
        ErrNotify("_IEFormGetCollection error. Errorcode = " & @error)
        Return
    EndIf

    $oElement = _IEFormElementGetObjByName($oForm, "user")

    If @error Then
        ErrNotify("_IEFormElementGetObjByName (user) error. Errorcode = " & @error)
        Return
    EndIf

    _IEFormElementSetValue($oElement, $User)

    $oElement = _IEFormElementGetObjByName($oForm, "password")

    If @error Then
        ErrNotify("_IEFormElementGetObjByName (password) error. Errorcode = " & @error)
        Return
    EndIf

    _IEFormElementSetValue($oElement, $Pass)

    _IEFormSubmit($oForm)
    _IELoadWait($oIE)

    _FileWriteLog($LogFile, "Switching to Downloads tab")

    ; Switch to Downloads tab
    _IELinkClickByText($oIE, "Downloads")

    If @error Then
        ErrNotify("_IELinkClickByText (Downloads) error. Errorcode = " & @error)
        Return
    EndIf

    $oForm = _IEFormGetObjByName($oIE, "frmSearch")

    If @error Then
        ErrNotify("_IEFormGetObjByName (Downloads) error. Errorcode = " & @error)
        Return
    EndIf

    $oElement = _IEFormElementGetObjByName($oForm, "lstboxes")

    If @error Then
        ErrNotify("_IEFormElementGetObjByName (Downloads) error. Errorcode = " & @error)
        Return
    EndIf

    For $i = 0 To $oElement.Length - 1
        _IEFormElementOptionSelect($oElement, $i, 1, "byIndex", 1)
        _IELoadWait($oIE)

        _FileWriteLog($LogFile, "Retrieving links (" & $i & ")")

        $oLinks = _IEGetObjByName($oIE, "file_list_link", -1)

        If @error Then
            ErrNotify("_IEGetObjByName (Downloads) error. Errorcode = " & @error)
            Return
        EndIf

        $iNumLinks = @extended

        For $oLink In $oLinks
            $href = $oLink.href

            $filename = StringTrimLeft($href, StringInStr($href, "/", 0, -1)) ; keep everything after the last "/" in the URL

            If StringRegExp($filename, "(?i)\.(?:tif|txt)$") Then
                ; Make sure file wasn't already processed
                If Not FileExists($ProcessedDir & $filename) Then
                    $IsDownloaded = True

                    _FileWriteLog($LogFile, "Retrieving file: " & $filename)

                    ; Retrieve file

                    ConsoleWrite($InDir & $filename & @CRLF)

                    InetGet($href, $InDir & $filename, 1)

                    If @error Then
                        ErrNotify("InetGet (Downloads) error. Errorcode = " & @error)
                        Return
                    EndIf
                EndIf
            EndIf
        Next

        ; Reselect the desired objects due to prior page reload
        $oForm = _IEFormGetObjByName($oIE, "frmSearch")

        If @error Then
            ErrNotify("_IEFormGetObjByName (Downloads) error. Errorcode = " & @error)
            Return
        EndIf

        $oElement = _IEFormElementGetObjByName($oForm, "lstboxes")

        If @error Then
            ErrNotify("_IEFormElementGetObjByName (Downloads) error. Errorcode = " & @error)
            Return
        EndIf
    Next

    _IELinkClickByText($oIE, "Log off")

    ; Release GUI
    GUIDelete()

    _FileWriteLog($LogFile, "Completed File Retrieval")
EndFunc   ;==>RetrieveFiles

 


#11 ·  Posted

Huge credit to trancexx for the wonderful WinHttp.au3. This works nicely, using the *appropriate* credentials :)

#include "WinHttp.au3"

$sServerAddress = "https://rda.ucar.edu"
$sGeneratorLocation = "/cgi-bin/login"

; sorry, these are private
$sEmail = "xxx@xxx"
$sPassword = "xxxxxxxx"

$hOpen = _WinHttpOpen()

; get access cookie
$hConnect = _WinHttpConnect($hOpen, $sServerAddress)
_WinHttpSimpleSSLRequest($hConnect)
_WinHttpCloseHandle($hConnect) 

; build and fill the login form
$sForm = _
        '<form action="' & $sServerAddress & $sGeneratorLocation & '" method="post">' & _
        '   <input name="email" />' & _
        '   <input name="passwd" />' & _
        '   <input name="remember" />' & _
        '   <input name="do" />' & _
        '   <input name="url" />' & _
        '</form>'
$hConnect = $sForm ; pass the raw form HTML to _WinHttpSimpleFormFill instead of a connect handle

$sReturned = _WinHttpSimpleFormFill($hConnect, $hOpen, _
        Default, _
        "name:email", $sEmail, _
        "name:passwd", $sPassword, _ 
        "name:remember", "on", _
        "name:do", "login", _
        "name:url", "/")
If @error Then
    MsgBox(4096, "Error", @error) ; ref. WinHttp.chm
Else
    MsgBox(4096, "OK", "login successful, now download")

; download
  $sFile = "fnl_20170724_06_00.grib2"
  $sTarget = "data/ds083.2/grib2/2017/2017.07/" & $sFile

  $hConnect = _WinHttpConnect($hOpen, $sServerAddress)
  $hRequest = _WinHttpOpenRequest($hConnect, "GET", $sTarget)  
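  ; note: the request is opened here without the secure ($WINHTTP_FLAG_SECURE) flag,
  ; so this GET goes out as plain HTTP - see the follow-up posts below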
  _WinHttpSendRequest($hRequest)
  _WinHttpReceiveResponse($hRequest)
  If _WinHttpQueryDataAvailable($hRequest) Then

    $headers = _WinHttpQueryHeaders($hRequest)
    $Length = StringRegExpReplace($headers, '(?s).*Content-Length:\h*(\d+).*', "$1") 
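    ; assumes the response includes a Content-Length header ($Length drives the progress percentage below)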
    Local $bChunk, $bData = Binary("")

    ProgressOn($sFile, "Downloading", "0 %")
    While 1
         $bChunk = _WinHttpReadData($hRequest, 2, 8192) ; binary
         If @error Then ExitLoop
         $bData &= $bChunk
         $percent = Int((BinaryLen($bData)/$Length)*100)
         ProgressSet($percent, "Downloading", $percent & " %")
     WEnd
     ProgressSet(100, "Done !", "100 %")
     FileWrite(@ScriptDir & "\" & $sFile, $bData)
     Sleep(1000)
     ProgressOff()
  Else
      MsgBox(48, "", "connection error")
  EndIf

; close handles
_WinHttpCloseHandle($hRequest)
_WinHttpCloseHandle($hConnect)
_WinHttpCloseHandle($hOpen)
EndIf

 


#12 ·  Posted (edited)

Hi Danp2, I wanted to use your code, but I just didn't know where to edit it to make it do what I need. Thank you.

Hi Mikell, I used this code and the login was successful. The file I would like to download is 17 MB in size, but this code only downloads a 440-byte file. The content of the downloaded file is as follows:

"Your browser sent a request that this server could not understand.
Reason: You're speaking plain HTTP to an SSL-enabled server port.
 Instead use the HTTPS scheme to access this URL, please"

Is there something I'm not doing right?

 

Regards

Dante



#13 ·  Posted

#include "WinHttp.au3"

$sServerAddress = "https://rda.ucar.edu"
$sGeneratorLocation = "/cgi-bin/login"

$sEmail = "xxxxxxx@xxxxxx"
$sPassword = "xxxxxxx"

$hOpen = _WinHttpOpen()

; collect access cookie first
$hConnect = _WinHttpConnect($hOpen, $sServerAddress)
_WinHttpSimpleSSLRequest($hConnect)
_WinHttpCloseHandle($hConnect) 

; build and fill the login form
$sForm = _
        '<form action="' & $sServerAddress & $sGeneratorLocation & '" method="post">' & _
        '   <input name="email" />' & _
        '   <input name="passwd" />' & _
        '   <input name="remember" />' & _
        '   <input name="do" />' & _
        '   <input name="url" />' & _
        '</form>'
$hConnect = $sForm 

$sReturned = _WinHttpSimpleFormFill($hConnect, $hOpen, _
        Default, _
        "name:email", $sEmail, _
        "name:passwd", $sPassword, _ 
        "name:remember", "on", _
        "name:do", "login", _
        "name:url", "/")
If @error Then
    MsgBox(4096, "Error", @error)
Else
    MsgBox(4096, "OK", "login successful, now download")





; download
  $sFile = "fnl_20170724_06_00.grib2"
  $sTarget = "data/ds083.2/grib2/2017/2017.07/" & $sFile

  _WinHttpCloseHandle($hConnect)
  $hConnect = _WinHttpConnect($hOpen, $sServerAddress)
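  ; _WinHttpSimpleSendSSLRequest sends the request with the secure flag set,
  ; so this GET goes over TLS (this is the fix for the error quoted above)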
  $hRequest = _WinHttpSimpleSendSSLRequest($hConnect, "GET", $sTarget)
  If _WinHttpQueryDataAvailable($hRequest) Then

    $headers = _WinHttpQueryHeaders($hRequest)
    $Length = StringRegExpReplace($headers, '(?s).*Content-Length:\h*(\d+).*', "$1") 
    Local $bChunk, $bData = Binary("")

    ProgressOn($sFile, "Downloading", "0 %")
    While 1
         $bChunk = _WinHttpReadData($hRequest, 2, 8192) ; binary
         If @error Then ExitLoop
         $bData &= $bChunk
         $percent = Int((BinaryLen($bData)/$Length)*100)
         ProgressSet($percent, "Downloading", $percent & " %")
     WEnd
     ProgressSet(100, "Done !", "100 %")
     FileWrite(@ScriptDir & "\" & $sFile, $bData)
     Sleep(1000)
     ProgressOff()
  Else
      MsgBox(48, "", "connection error")
  EndIf

; close handles
_WinHttpCloseHandle($hRequest)
_WinHttpCloseHandle($hConnect)
_WinHttpCloseHandle($hOpen)
EndIf


#14 ·  Posted

Just now, Dante_t said:

(the full script from post #13, quoted)

This code worked for me. Thank you, Mikell, for the solution, and thank you everyone who tried to help. Much appreciated.



      $oLinks = _IELinkGetCollection($oIE) $iNumLinks = @extended $PrintPDF = _IELinkClickByIndex($oIE, ($iNumLinks - 10)) So, how to use InetGet to visit that link? Or is there a way to Save As the newly opened tab? I've tried _IEAction($oIE, "saveas") but it seems not to work in a tab containing only a PDF.