Herb191

How to detect if a URL is a download link before using _IENavigate?

Is there a fast way to check whether a URL is a download link before I use _IENavigate?

Also, does anyone know how to disable all popup windows (like the "Are you sure you want to navigate away from this page?" prompts) without disabling scripting?

Thanks

Edited by Herb191

Hey,

What do the links look like? Are they just URIs pointing to a file? Do you know what types of files to expect?

If so, you could just use StringRegExp; something like this, for example:

$file_url = "http://www.example.com/path/to/file.zip"
If StringRegExp($file_url, "/[^/]+\.(rar|zip)\z") Then ; extend the (rar|zip) alternation with whatever extensions you expect
    ConsoleWrite("File URL: " & $file_url & @CRLF) ; do whatever you need with the file URL here
EndIf
Edited by Robjong



Hi Robjong,

Thanks for the response. I am actually trying to weed out any URLs that are download links. Unfortunately, I never know what the URLs are going to look like, because I am using a web crawler to collect them from completely random websites. I have tried something similar to what you have above, but there are thousands of possible download file types, and my script inevitably hits one and stops working.


What do you do with valid URLs? Show them in an IE window? Get the source?

If you want to parse the URLs in a crawler-like fashion, or show only HTML pages for example, you could check the content type of the URL by inspecting the response headers.

Here is a rough example:

Global $aAcceptedContentTypes = "text/\w+" ; allows any text content type
;~ Global $aAcceptedContentTypes[2] = ["text/html", "text/plain"] ; if the type string contains regex meta characters, escape it by wrapping it in \Q and \E (example: \Qtext/foo\E)
 
$url_text = "http://www.autoitscript.com"
$url_file = "http://www.autoitscript.com/cgi-bin/getfile.pl?autoit3/autoit-v3-setup.exe"
 
$result = _CheckContentType($url_text, $aAcceptedContentTypes)
ConsoleWrite("- " & $result & @CRLF)
 
$result = _CheckContentType($url_file, $aAcceptedContentTypes)
ConsoleWrite("- " & $result & @CRLF)
 
 
Func _CheckContentType($sURL, $mContentTypes)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    If @error Then Return SetError(2, 0, False)
    $oHTTP.Open("HEAD", $sURL, False) ; HEAD fetches only the headers, not the body
    $oHTTP.Send()
    Local $sHeaders = $oHTTP.GetAllResponseHeaders()
    ; (?im) = case-insensitive, multiline; the type may be followed by ";charset=..." or by end of line
    If IsArray($mContentTypes) Then
        For $i = 0 To UBound($mContentTypes) - 1
            If StringRegExp($sHeaders, "(?im)^content-type:\s*" & $mContentTypes[$i] & "\s*(;|$)") Then Return SetError(0, 0, True)
        Next
    ElseIf StringRegExp($sHeaders, "(?im)^content-type:\s*" & $mContentTypes & "\s*(;|$)") Then
        Return SetError(0, 0, True)
    EndIf
    Return SetError(1, 0, False)
EndFunc

If you want to download the files/source anyway, you could use 'GET' instead of 'HEAD' in the Open call and only keep the body if it passes the check; this would save you a request.
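A minimal sketch of that GET variant, using the same WinHttpRequest COM object as above (the URL is just an example; error handling is kept to a bare check):

```autoit
; Sketch only: fetch with GET and keep the body when the content type is textual,
; so the type check and the download happen in a single request.
Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
If Not @error Then
    $oHTTP.Open("GET", "http://www.autoitscript.com", False) ; False = synchronous
    $oHTTP.Send()
    Local $sType = $oHTTP.GetResponseHeader("Content-Type")
    If StringRegExp($sType, "(?i)^text/") Then
        Local $sSource = $oHTTP.ResponseText ; the page source, no second request needed
        ConsoleWrite("Got " & StringLen($sSource) & " chars of " & $sType & @CRLF)
    EndIf
EndIf
```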


That is a nice bit of coding, but I need to be able to show the page URLs in an IE window, because I am processing some data after the server-side scripts run. I also need it to work on just about any kind of page (except PDF).
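For that use case, the header check above can simply gate the call to _IENavigate. A hedged sketch; the `_IsNavigable` helper name is my own, and it rejects anything that is not text/* as well as PDFs before the page is ever opened in IE:

```autoit
#include <IE.au3>

; Hypothetical helper: HEAD the URL first, navigate only if it looks like a page.
Func _IsNavigable($sURL)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    If @error Then Return SetError(1, 0, False)
    $oHTTP.Open("HEAD", $sURL, False)
    $oHTTP.Send()
    Local $sType = $oHTTP.GetResponseHeader("Content-Type")
    ; accept text/* but explicitly refuse PDF, per the requirement above
    Return StringRegExp($sType, "(?i)^text/") And Not StringInStr($sType, "pdf")
EndFunc

Local $oIE = _IECreate()
Local $sURL = "http://www.autoitscript.com" ; example URL; in practice this comes from the crawler
If _IsNavigable($sURL) Then _IENavigate($oIE, $sURL)
```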

