Posted

I am looking to create a script which refreshes/reads a webpage every few seconds. My goal is to detect when the page has changed and then send myself a notification that it has been updated.

 

However, rather than downloading the entire webpage every single time, is there a way to check when the webpage last updated?

 

If not, is there a way to partially download/read the HTML source until a specific tag is hit?

 

Goal: I would like to increase my poll rate and not excessively waste data.

Posted (edited)

Do you want to use IE or WebDriver? Which browser?

Edited by mLipok


Posted
  On 4/6/2021 at 12:45 PM, Pured said:

I'm not sure how to find a page's dom


Just right click on an element of the web page and select "Inspect" in the context menu.

If you are looking for any change in a page, it would be quite easy to do with a generic solution.  If you are looking for a specific change in a specific tag (or part of the web page), I am afraid it is going to be very hard to find a generic solution.  You will need to adapt your code to each individual page.

Posted

@Nine Going through this forum, I know I can get the size of a page, but it seems like I have to download the entire page rather than just fetch the size.

 

It would be nice if there is a way to fetch the date modified or size of a webpage without getting anything else, since I don't care too much about what has changed (I hope...). 

 

Otherwise, as you said, I could use the DOM. I have other scripts which traverse HTML and do tag-specific jobs, so I can do that myself, but it would require a full load of the page, which I don't want.

Posted (edited)
  On 4/6/2021 at 1:27 PM, Pured said:

It would be nice if there is a way to fetch the date modified or size of a webpage without getting anything else, since I don't care too much about what has changed


@Pured

If that is all you want, then just retrieve the web page's headers.  They only include information about the page, not the page itself.  The Content-Length header gives you the size of the content and, if it exists, the Last-Modified header gives you the time of the last change.  Depending on the framework used by the web page, those headers may not be a very accurate way to tell whether the content has changed.  In the example header below, for instance, last-modified is identical to date, which suggests that the page is generated fresh on every request.

There are several ways to get just the headers.  If I were looking for the headers, then an easy way would be to execute a simple cURL command or send an HTTP HEAD request using one of the numerous ways to do HTTP requests.
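
For illustration, here is a minimal sketch of the HEAD request approach in AutoIt, using the WinHttp.WinHttpRequest.5.1 COM object that ships with Windows. The URL is a placeholder, and the sketch assumes that the server answers HEAD requests and that a failed COM call sets @error once a COM error handler is registered. Not every server sends Last-Modified, so that lookup is checked separately.

```autoit
Opt("MustDeclareVars", 1)

; Placeholder URL (an assumption); replace it with the page you want to watch.
Global Const $sPageUrl = "https://www.example.com/"

; Register a COM error handler so that a failed COM call sets @error
; instead of stopping the script.
Global $oComErrorHandler = ObjEvent("AutoIt.Error", "_OnComError")

Global $oHttp = ObjCreate("WinHttp.WinHttpRequest.5.1")
If Not IsObj($oHttp) Then
    ConsoleWrite("Could not create the WinHttpRequest object." & @CRLF)
    Exit 1
EndIf

; HEAD asks the server for the response headers only, without the body.
$oHttp.Open("HEAD", $sPageUrl, False)
$oHttp.Send()
If @error Then
    ConsoleWrite("The request failed." & @CRLF)
    Exit 2
EndIf

ConsoleWrite("Status: " & $oHttp.Status & @CRLF)
ConsoleWrite($oHttp.GetAllResponseHeaders() & @CRLF)

; GetResponseHeader raises a COM error when the header is absent, so check @error.
Global $sLastModified = $oHttp.GetResponseHeader("Last-Modified")
If @error Then $sLastModified = "(not sent)"
ConsoleWrite("Last-Modified: " & $sLastModified & @CRLF)

Func _OnComError($oError)
    ConsoleWrite("COM error: " & $oError.description & @CRLF)
EndFunc
```

The cURL route mentioned above is shorter still: `curl -I` followed by the URL sends a HEAD request and prints the response headers.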

Example header:

HTTP/2 200
server: nginx
date: Tue, 06 Apr 2021 13:44:06 GMT
content-type: text/html;charset=UTF-8
content-length: 198416
vary: Accept-Encoding
x-powered-by: PHP/7.2.34
x-ips-loggedin: 0
vary: cookie,Accept-Encoding
x-xss-protection: 0
x-frame-options: sameorigin
expires: Tue, 06 Apr 2021 13:45:06 GMT
cache-control: max-age=60, public
pragma: public
set-cookie: ips4_IPSSessionFront=lfntg0ifj1a53lnh9s4vvvndpt; path=/; secure; HttpOnly
set-cookie: ips4_guestTime=1617716646; path=/; secure; HttpOnly
last-modified: Tue, 06 Apr 2021 13:44:06 GMT
x-powered-by: PleskLin

 

  On 4/6/2021 at 4:21 AM, Pured said:

Goal: I would like to increase my poll rate and not excessively waste data.


Be warned, continuously "banging" on a website, very frequently, will most likely get your IP address blocked if the host has any type of decent anti-hammering protection.  That type of behavior is usually frowned upon.  😉
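
As a rough sketch of a more polite polling loop, the AutoIt snippet below sends a HEAD request at a fixed interval and reports when the Last-Modified or Content-Length value differs from the previous poll. The URL, the 60-second interval, and the helper name `_GetSignature` are all assumptions made for illustration. On pages that regenerate these headers on every request, this comparison will report a change every time, so treat it as a starting point only.

```autoit
Opt("MustDeclareVars", 1)

Global Const $sPageUrl = "https://www.example.com/" ; placeholder URL (an assumption)
Global Const $iPollMs = 60 * 1000                   ; wait between polls; 60 s is an arbitrary choice

; A registered COM error handler makes failed COM calls set @error.
Global $oComErrorHandler = ObjEvent("AutoIt.Error", "_OnComError")
Global $sPrevious = "", $sCurrent = ""

While True
    $sCurrent = _GetSignature($sPageUrl)
    If @error Then
        ConsoleWrite("Poll failed; will retry after the next interval." & @CRLF)
    ElseIf $sPrevious <> "" And $sCurrent <> $sPrevious Then
        ConsoleWrite("Headers changed: " & $sCurrent & @CRLF)
    EndIf
    ; Remember the last successful result only.
    If $sCurrent <> "" Then $sPrevious = $sCurrent
    Sleep($iPollMs)
WEnd

; Hypothetical helper: returns "Last-Modified|Content-Length" from a HEAD request.
Func _GetSignature($sUrl)
    Local $oHttp = ObjCreate("WinHttp.WinHttpRequest.5.1")
    If Not IsObj($oHttp) Then Return SetError(1, 0, "")
    $oHttp.Open("HEAD", $sUrl, False)
    $oHttp.Send()
    If @error Then Return SetError(2, 0, "")
    ; Either header may be missing; a missing header raises a COM error.
    Local $sModified = $oHttp.GetResponseHeader("Last-Modified")
    If @error Then $sModified = ""
    Local $sLength = $oHttp.GetResponseHeader("Content-Length")
    If @error Then $sLength = ""
    If $sModified = "" And $sLength = "" Then Return SetError(3, 0, "")
    Return $sModified & "|" & $sLength
EndFunc

Func _OnComError($oError)
    ConsoleWrite("COM error: " & $oError.description & @CRLF)
EndFunc
```

The example header above advertises `cache-control: max-age=60`, so polling that particular page more often than once a minute would gain little anyway.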

 

Edited by TheXman
  • 2 weeks later...
Posted

Dear 

I am new to HTTP APIs and I need your support.

I tried this code, but it did not give me the header.

By the way, this page needs a user ID and password.

#include "WinHttp.au3"

Opt("MustDeclareVars", 1)

; Open needed handles
Global $hOpen = _WinHttpOpen()
Global $hConnect = _WinHttpConnect($hOpen, "https://intranet.airlineretailing.amadeus.com:55001")
; Specify the request:
MsgBox(0, " $hConnect",  $hConnect)    ;" works gave ID"
Global $hRequest = _WinHttpOpenRequest($hConnect, Default, "/1ASIHGAPSV/arp/index.html")
MsgBox(0, "$hRequest",  $hRequest)       ;" works gave ID"
; Send request
_WinHttpSendRequest($hRequest)

; Wait for the response
_WinHttpReceiveResponse($hRequest)

; Get full header
Global $sHeader = _WinHttpQueryHeaders($hRequest)

; Close handles
_WinHttpCloseHandle($hRequest)
_WinHttpCloseHandle($hConnect)
_WinHttpCloseHandle($hOpen)

; Display retrieved header
MsgBox(0, "Header", $sHeader)

 

Thanks

Posted (edited)

@samibb

You've been on this forum long enough to know better than to try to "hijack" a thread with your unrelated issues.  If you have questions about using the WinHTTP.au3 functions, then post your questions in the WinHTTP UDF topic or start a new one.

As it relates to the example that you posted, if you had added @error checking, then you would have seen that _WinHttpReceiveResponse failed.  The most likely reason that it failed is that you tried to send an https request without using the $WINHTTP_FLAG_SECURE flag in your _WinHttpOpenRequest.
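
A minimal sketch of those two points, @error checking plus the secure flag, might look like the following. The host, port, and path are placeholders rather than the values from the post above, and the sketch passes the host name and port to _WinHttpConnect as separate arguments. It does not deal with the user ID and password that the page requires.

```autoit
#include "WinHttp.au3"

Opt("MustDeclareVars", 1)

; Placeholder host, port and path (assumptions); substitute your own.
Global Const $sHost = "www.example.com"
Global Const $iPort = 443
Global Const $sPath = "/index.html"

Global $hOpen = _WinHttpOpen()
Global $hConnect = _WinHttpConnect($hOpen, $sHost, $iPort)

; The last argument marks the request as secure (https).
Global $hRequest = _WinHttpOpenRequest($hConnect, Default, $sPath, Default, Default, Default, $WINHTTP_FLAG_SECURE)

Global $sHeader = ""

; Check @error after each step so that a failure is reported where it happens.
_WinHttpSendRequest($hRequest)
If @error Then
    ConsoleWrite("_WinHttpSendRequest failed, @error = " & @error & @CRLF)
Else
    _WinHttpReceiveResponse($hRequest)
    If @error Then
        ConsoleWrite("_WinHttpReceiveResponse failed, @error = " & @error & @CRLF)
    Else
        $sHeader = _WinHttpQueryHeaders($hRequest)
    EndIf
EndIf

; Close handles
_WinHttpCloseHandle($hRequest)
_WinHttpCloseHandle($hConnect)
_WinHttpCloseHandle($hOpen)

; Display retrieved header (empty if a step failed)
MsgBox(0, "Header", $sHeader)
```

Passing "HEAD" instead of Default as the verb should fetch the headers without the page body, provided the server supports it.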

Look at the reply below for an example of a _WinHttpOpenRequest that is setting up a secure request using WinHTTP.  Note that you were a member of this topic.  :think:  There are numerous other examples of sending https requests using the WinHTTP UDF lib.

Here's another one that gives a brief explanation of why you need the flag for secure connections:

If you have additional questions, then post them in a proper topic, not one that is asking about how to "partially read a webpage".

Edited by TheXman
