Reading data from the website


Hello

The function below worked for a long time. Now it doesn't download anything and I don't know where the error is.

#include <InetConstants.au3>
#include <StringConstants.au3> ; defines $SB_UTF8

Local $Link = 'https://allegro.pl/kategoria/nasiona-warzywa-99776?bmatch=e2101-d3681-c3682-hou-1-4-0319'
Local $dStrona = InetRead($Link)
Local $sStrona = BinaryToString($dStrona, $SB_UTF8)
;GUICtrlSetData($myedit, $sStrona)
ConsoleWrite($sStrona)

The function was good because it read data in the background.

Thank you for any hints, or for other solutions or functions that read data in the background.

 

To start, after any function call, check the @error and @extended values to see what (if anything) went wrong.

#include <InetConstants.au3>
#include <StringConstants.au3> ; defines $SB_UTF8

Local $Link = 'https://allegro.pl/kategoria/nasiona-warzywa-99776?bmatch=e2101-d3681-c3682-hou-1-4-0319'
Local $dStrona = InetRead($Link)
If @error Then Exit ConsoleWrite("Inet Error: " & @error & " Extended: " & @extended & @CRLF)
Local $sStrona = BinaryToString($dStrona, $SB_UTF8)
If @error Then Exit ConsoleWrite("BinaryToString Error: " & @error & " Extended: " & @extended & @CRLF)
;GUICtrlSetData($myedit, $sStrona)
ConsoleWrite($sStrona)

Unsure if it's related, but I recently had a similar issue with an InetRead script that I had been using for a while. I shared it with a colleague and it failed with error 13. It turned out that it didn't like being run from an untrusted path (the shared drive). Once he copied it to a trusted local path, it worked fine.

I did not share the script with anyone. Everything worked fine. It wasn't until a few days ago that it suddenly stopped. Everything runs from the same laptop. InetRead is cool because it runs in the background. I don't know of any other similar function running in the background.

I tried using the WinHTTP UDF and I am getting a CAPTCHA warning that seems to disallow programmatic access to the site. But I am not versed enough to be sure about this. Maybe someone else could verify it.

WinHTTP using COM, WinHTTP UDF, and CURL all successfully retrieve the full web page when a recognized User-Agent is supplied.  I tested it using the current Firefox User-Agent.

 

1 hour ago, walec said:

InetRead is cool because it runs in the background.

I don't know what you mean when you say that InetRead runs in the background.  The script will wait until the InetRead finishes before continuing.  InetRead does NOT have a parameter that allows it to run in the background like InetGet.
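
To illustrate the difference, here is a minimal sketch of a background download with InetGet. The URL and the temp-file name are placeholders I made up for the example, not anything from this thread, and the polling loop is just one simple way to wait for completion.

```autoit
#include <InetConstants.au3>

; Assumptions: the URL and the temp-file name are placeholders for illustration only.
Local $sUrl = "https://www.example.com/"
Local $sFile = @TempDir & "\page.html"

; Start the download in the background; InetGet returns a handle immediately.
Local $hDownload = InetGet($sUrl, $sFile, $INET_FORCERELOAD, $INET_DOWNLOADBACKGROUND)

; The script is free to do other work here while the download runs.
Do
    Sleep(250)
Until InetGetInfo($hDownload, $INET_DOWNLOADCOMPLETE)

; Collect the result, then release the handle.
Local $bSuccess = InetGetInfo($hDownload, $INET_DOWNLOADSUCCESS)
Local $iBytes = InetGetInfo($hDownload, $INET_DOWNLOADREAD)
InetClose($hDownload)

If $bSuccess Then
    ConsoleWrite("Downloaded " & $iBytes & " bytes to " & $sFile & @CRLF)
    ConsoleWrite(FileRead($sFile) & @CRLF)
Else
    ConsoleWrite("Download failed." & @CRLF)
EndIf
```

Unlike InetRead, this writes the data to a file instead of returning it, so the content has to be read back with FileRead once the download has completed.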

20 minutes ago, TheXman said:

WinHTTP COM, WinHTTP UDF, and CURL all successfully retrieve the full web page when a recognized User-Agent is supplied.  When I tested it, I used a current Firefox User-Agent.

TheXman, if it worked for you, could you please post that piece of code?

Here's what I used:

#include <Constants.au3>

http_get_example()

Func http_get_example()
    Local $oHttp = Null, $oComErr = Null

    ;Register COM Error Handler
    $oComErr = ObjEvent("AutoIt.Error", com_error_handler)
    If @error Then Exit MsgBox($MB_ICONERROR + $MB_TOPMOST, "ERROR", "Unable to register COM error handler - @error = " & @error)

    ;Create HTTP COM object
    $oHttp = ObjCreate("winhttp.winhttprequest.5.1")
    If @error Then Exit MsgBox($MB_ICONERROR + $MB_TOPMOST, "ERROR", "Unable to create HTTP COM object - @error = " & @error)

    With $oHttp
        ;Open GET request
        .Open("GET", "https://allegro.pl/kategoria/nasiona-warzywa-99776?bmatch=e2101-d3681-c3682-hou-1-4-0319")
        If @error Then Exit MsgBox($MB_ICONERROR + $MB_TOPMOST, "ERROR", StringFormat("(0x%X) %s", $oComErr.RetCode, $oComErr.WinDescription))

        ;Set request header(s)
        .SetRequestHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0")

        ;Send request
        .Send()
        If @error Then Exit MsgBox($MB_ICONERROR + $MB_TOPMOST, "ERROR", StringFormat("(0x%X) %s", $oComErr.RetCode, $oComErr.Description))

        ConsoleWrite(StringFormat("HTTP Status: %s %s", .Status, .StatusText) & @CRLF)

        ;If http status code not 200
        If .Status <> 200 Then Exit MsgBox($MB_ICONERROR + $MB_TOPMOST, "ERROR", StringFormat("HTTP Status Code = %s %s", .Status, .StatusText))

        ;Display response
        ConsoleWrite(@CRLF & "HTTP Response:" & @CRLF)
        ConsoleWrite(.ResponseText    & @CRLF)
    EndWith
EndFunc

Func com_error_handler($oError)
    With $oError
        ConsoleWrite(@CRLF & "COM ERROR DETECTED!" & @CRLF)
        ConsoleWrite("  Error ScriptLine....... " & .scriptline & @CRLF)
        ConsoleWrite("  Error Number........... " & "0x" & Hex(.number) & " (" & .number & ")" & @CRLF)
        ConsoleWrite("  Error WinDescription... " & StringStripWS(.windescription, $STR_STRIPTRAILING) & @CRLF)
        ConsoleWrite("  Error Description...... " & StringStripWS(.description   , $STR_STRIPTRAILING) & @CRLF)
        ConsoleWrite("  Error RetCode.......... " & "0x" & Hex(Number(.retcode)) & " (" & Number(.retcode) & ")" & @CRLF)
    EndWith
    Return ; Return so @error can be trapped by the calling function
EndFunc

 

TheXman, it works for me. I can read the HTML code and extract the necessary information. Now I have to adapt it to my own needs and understand the commands.

Thank you very much.

The topic can be closed.

 

TheXman, your code has worked for me until today. Now the error below is displayed.

HTTP Status: 403 Forbidden

I suspect it is related to CAPTCHA. When I entered the site from a browser, I had to confirm that I was not a robot. Despite confirmation, the function still does not work.

You have made a few statements and have not asked a single question or provided any details other than an HTTP status code.  I try not to make assumptions.  However, one thing is almost certain: if it was working before and now it isn't, then your issue is most likely not related to the AutoIt script.  Unless, of course, you modified it in such a way that would cause the error.  But I have no way of knowing because you didn't provide the script that you're running.  So if you want assistance, you need to provide much more information, starting with the script that you are running -- not a few lines without context but a script that reproduces the issue.

If I were to give a wild guess, which I really hate doing, then I would guess that the site has anti-hammering and other protections that try to keep people from doing whatever it is you're trying to do.  So my guess is that you are trying to test your script over and over and have tripped one of their mitigation routines.

59 minutes ago, walec said:

I suspect it is related to CAPTCHA. When I entered the site from a browser, I had to confirm that I was not a robot.

Doesn't this implicitly mean that it would be against their TOS to have a script read their website?

Jos

PS: All others, please refrain from posting for the moment. Thanks

That's right, I have been testing the script non-stop on the same page.

#include <Constants.au3> ; defines the $MB_* and $STR_* constants used below

Local $Use_Agent = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0'
Local $Link = 'https://allegro.pl/kategoria/ogrod-1532?string=legutko&bmatch=e2101-d3681-c3682-hou-1-4-0319'


Local $sStrona = http_get_example($Link, $Use_Agent)
;GUICtrlSetData($myedit, $sStrona)


Func http_get_example($Link, $Use_Agent)
    Local $oHttp = Null, $oComErr = Null

    ;Register COM Error Handler
    $oComErr = ObjEvent("AutoIt.Error", com_error_handler)
    If @error Then Exit MsgBox($MB_ICONERROR + $MB_TOPMOST, "ERROR", "Unable to register COM error handler - @error = " & @error)

    ;Create HTTP COM object
    $oHttp = ObjCreate("winhttp.winhttprequest.5.1")
    If @error Then Exit MsgBox($MB_ICONERROR + $MB_TOPMOST, "ERROR", "Unable to create HTTP COM object - @error = " & @error)

    With $oHttp
        ;Open GET request
        .Open("GET", $Link)
        If @error Then Exit MsgBox($MB_ICONERROR + $MB_TOPMOST, "ERROR", StringFormat("(0x%X) %s", $oComErr.RetCode, $oComErr.WinDescription))

        ;Set request header(s)
        .SetRequestHeader("User-Agent", $Use_Agent)

        ;Send request
        .Send()
        If @error Then Exit MsgBox($MB_ICONERROR + $MB_TOPMOST, "ERROR", StringFormat("(0x%X) %s", $oComErr.RetCode, $oComErr.Description))

         ;If @error Then
            ;Send request
            ;.Send()
            ;If @error Then Exit MsgBox($MB_ICONERROR + $MB_TOPMOST, "ERROR", StringFormat("(0x%X) %s", $oComErr.RetCode, $oComErr.Description))
            ;GUICtrlSetData($myedit, .ResponseText)
            ;MsgBox($MB_ICONERROR + $MB_TOPMOST, "ERROR", StringFormat("(0x%X) %s", $oComErr.RetCode, $oComErr.Description))
            ;Exit
        ;EndIf

        ConsoleWrite(StringFormat("HTTP Status: %s %s", .Status, .StatusText) & @CRLF)

        ;If http status code not 200
        If .Status <> 200 Then Exit MsgBox($MB_ICONERROR + $MB_TOPMOST, "ERROR", StringFormat("HTTP Status Code = %s %s", .Status, .StatusText))

        ;Display response
        ;ConsoleWrite(@CRLF & "HTTP Response:" & @CRLF)
        ;ConsoleWrite(.ResponseText    & @CRLF)
        Return .ResponseText
    EndWith
EndFunc

Func com_error_handler($oError)
    With $oError
        ConsoleWrite(@CRLF & "COM ERROR DETECTED!" & @CRLF)
        ConsoleWrite("  Error ScriptLine....... " & .scriptline & @CRLF)
        ConsoleWrite("  Error Number........... " & "0x" & Hex(.number) & " (" & .number & ")" & @CRLF)
        ConsoleWrite("  Error WinDescription... " & StringStripWS(.windescription, $STR_STRIPTRAILING) & @CRLF)
        ConsoleWrite("  Error Description...... " & StringStripWS(.description   , $STR_STRIPTRAILING) & @CRLF)
        ConsoleWrite("  Error RetCode.......... " & "0x" & Hex(Number(.retcode)) & " (" & Number(.retcode) & ")" & @CRLF)
    EndWith
    Return ; Return so @error can be trapped by the calling function
EndFunc

>"C:\Program Files (x86)\AutoIt3\SciTE\..\autoit3.exe" /ErrorStdOut "D:\Adconeurope\Allegro AutoIt\Allegro.au3"    
HTTP Status: 403 Forbidden
>Exit code: 1    Time: 510.5

 

@walec,

Please answer my question!

1 hour ago, Jos said:

Doesn't this implicitly mean that it would be against their TOS to have a script read their website?

 

2 minutes ago, walec said:

I don't know if it would be against their TOS or not.

Then it is time to check now, as that CAPTCHA is there for a reason!

Thread closed unless we get proper information on whether this is allowed or not.

  • Jos locked this topic