
Download a website's source with WinHttp


Go to solution Solved by TheXman,


In my first steps with WinHttp I tried to download the source code of several websites for testing. The code I used seemed to work everywhere until I found this site. Despite playing with several variations and commands, I didn't get it working and don't understand what may be wrong. Can someone help me with this?
It seems to hang for a while on the send request and then shows the connection error.
 

#include <WinHttp.au3>
#include <array.au3>

Local $url = "https://www.adobe.com/devnet-docs/acrobatetk/tools/ReleaseNotesDC/index.html"
$aUrlParts = _WinHttpCrackUrl($url)

Local $hInternet = _WinHttpOpen("")

Consolewrite($aUrlParts[2] & @LF)
Local $hConnect = _WinHttpConnect($hInternet, $aUrlParts[2])

Consolewrite($aUrlParts[6] & $aUrlParts[7] & @LF)
Local $hRequest = _WinHttpOpenRequest($hConnect, "GET", $aUrlParts[6] & $aUrlParts[7], Default, Default, Default, $WINHTTP_FLAG_SECURE)

_WinHttpSendRequest($hRequest)
consolewrite("Send request errorcode: " & @error & " --- " & @extended & @LF)
_WinHttpReceiveResponse($hRequest)
consolewrite("Receive response errorcode: " & @error & " --- " & @extended & @LF)

If _WinHttpQueryDataAvailable($hRequest) Then

    $headers = _WinHttpQueryHeaders($hRequest)
    consolewrite("Header response: " & $headers & @LF)
    Local $sSourceCode = ""
    Local $sData = ""
    While True
        $sData = _WinHttpReadData($hRequest,0,8192)
        If @error Or $sData = "" Then ExitLoop
        $sSourceCode &= $sData
    Wend
    consolewrite($sSourceCode & @LF)
Else
    MsgBox(48, "", "Connection error")
EndIf

_WinHttpCloseHandle($hRequest)
_WinHttpCloseHandle($hConnect)
_WinHttpCloseHandle($hInternet)

 

Best regards
Andi

 


Thanks for the hint. I will investigate this. But I tried the same site with the COM object and it works. If it were a TLS problem, shouldn't the COM version have the same issue?
Or am I misunderstanding the mechanisms completely?

 

Local $sURL = "https://www.adobe.com/devnet-docs/acrobatetk/tools/ReleaseNotesDC/index.html"
Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
$oHTTP.Open("GET", $sURL)
$oHTTP.Send()
$sHTML = $oHTTP.Responsetext
consolewrite($sHTML & @LF)

 


It seems this problem is more difficult than I thought, or there is less interest in it. Some may say: just use the working script version and don't think about the other one, but I like to understand things ☺️
I would also prefer to use the WinHttp UDF, and maybe I am only missing a parameter due to lack of knowledge.

So I would be very happy to get some more hints about this tricky little task...



@Danp2    Thanks for the links. I will take a deeper look at this 👍 Maybe that's the answer to my problems... I will report back.

Addendum: I tested both scripts beforehand on Windows 10, Server 2019 and Server 2022 with the same results...

Edited by agivx3

  • Solution
On 5/21/2024 at 8:46 PM, agivx3 said:

i didn't get it working and don't understand what may be wrong

Using the script in your original post, I got the same result as you.  However, I get a valid response when using an acceptable user agent string.  What counts as an acceptable user agent string, and how a website handles requests and responses based on it, is up to the website's developer or administrator.

 

Change: 

Local $hInternet = _WinHttpOpen("")

to
 
$hInternet = _WinHttpOpen("curl")
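
For reference, here is a minimal sketch of the script from the original post with that one change applied. It assumes the same WinHttp.au3 UDF and the same URL; the only functional difference is the non-empty user agent passed to _WinHttpOpen. The value "curl" is simply the one suggested above, and other strings may or may not be accepted by this particular site.

```autoit
#include <WinHttp.au3>

; Same URL as in the original post
Local $sUrl = "https://www.adobe.com/devnet-docs/acrobatetk/tools/ReleaseNotesDC/index.html"
Local $aUrlParts = _WinHttpCrackUrl($sUrl)

; Pass a non-empty user agent string instead of ""
Local $hInternet = _WinHttpOpen("curl")
Local $hConnect = _WinHttpConnect($hInternet, $aUrlParts[2])
Local $hRequest = _WinHttpOpenRequest($hConnect, "GET", $aUrlParts[6] & $aUrlParts[7], Default, Default, Default, $WINHTTP_FLAG_SECURE)

_WinHttpSendRequest($hRequest)
ConsoleWrite("Send request errorcode: " & @error & " --- " & @extended & @LF)
_WinHttpReceiveResponse($hRequest)
ConsoleWrite("Receive response errorcode: " & @error & " --- " & @extended & @LF)

Local $sSourceCode = ""
Local $sData = ""
If _WinHttpQueryDataAvailable($hRequest) Then
    ; Read the body in 8 KB chunks until nothing is left
    While True
        $sData = _WinHttpReadData($hRequest, 0, 8192)
        If @error Or $sData = "" Then ExitLoop
        $sSourceCode &= $sData
    WEnd
    ConsoleWrite($sSourceCode & @LF)
Else
    ConsoleWrite("Connection error" & @LF)
EndIf

; Release the handles in reverse order of creation
_WinHttpCloseHandle($hRequest)
_WinHttpCloseHandle($hConnect)
_WinHttpCloseHandle($hInternet)
```

If the request still fails, the two error lines written to the console should show whether the send or the receive step is the one that breaks.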

 

Edited by TheXman
