Sign in to follow this  
Followers 0
Herb191

How to detect if a URL is a download link before using _IENavigate?

6 posts in this topic

#1 ·  Posted (edited)

Is there a fast way of check to see if a URL is a download link before I use _IENavigate?

Also, does anyone know how to disable all popup windows (like the are your sure you want to navigate away windows) without disabling scripting?

Thanks

Edited by Herb191

Share this post


Link to post
Share on other sites



#2 ·  Posted (edited)

Hey,

What do the links look like, are they just URI's to a file? Do you know what types of files to expect?

If so you could just use StringRegExp, something like this for example.

$file_url = "http://www.example.com/path/to/file.zip"
If StringRegExp($file_url, "/[^/]+\.(rar|zip)\z") Then
    ConsoleWrite("File URL: " & $file_url & @CRLF) ; do what you need want with the file URL here
EndIf
Edited by Robjong

Share this post


Link to post
Share on other sites

Hey,

What do the links look like, are they just URI's to a file? Do you know what types of files to expect?

If so you could just use StringRegExp, something like this for example.

$file_url = "http://www.example.com/path/to/file.zip"
If StringRegExp($file_url, "/[^/]+\.(rar|zip)\z") Then
    ConsoleWrite("File URL: " & $file_url & @CRLF) ; do what you need want with the file URL here
EndIf

Hi Robjong,

Thanks for the response. I am actually trying to weed out any URL's that are download links. Unfortunately I never know what the URL's are going to look like because I am using a web crawler to get them from completely random websites. I have tried something similar to what you have above but there are thousands of possible download files and my script inevitably finds one and stops working.

Share this post


Link to post
Share on other sites

#4 ·  Posted (edited)

What do you do with valid URL's? show them in an IE window? get the source?

If it is because you want to parse the URL's in a crawler like fashion, or show only html pages for example,

you could check the content-type of the URL by checking the headers...

Here is a rough example:

Global $aAcceptedContentTypes = "text/\w+" ; allows any text content type
;~ Global $aAcceptedContentTypes[2] = ["text/html", "text/plain"] ; if the type string  contains regex meta characters you can escape the string by putting it between \Q and \E(example: \Qtext/foo\E)
 
$url_text = "http://www.autoitscript.com"
$url_file = "http://www.autoitscript.com/cgi-bin/getfile.pl?autoit3/autoit-v3-setup.exe"
 
$result = _CheckContentType($url_text, $aAcceptedContentTypes)
ConsoleWrite("- " & $result & @CRLF)
 
$result = _CheckContentType($url_file, $aAcceptedContentTypes)
ConsoleWrite("- " & $result & @CRLF)
 
 
Func _CheckContentType($sURL, $mContentTypes)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    $oHTTP.open("HEAD", $sURL)
    $oHTTP.Send()
    If IsArray($mContentTypes) Then
        For $i = 0 To UBound($mContentTypes) - 1
            If StringRegExp($oHTTP.GetAllResponseHeaders, "(?i)content-type:\s*" & $mContentTypes[$i] & ";") Then Return SetError(0, 0, True)
        Next
    ElseIf StringRegExp($oHTTP.GetAllResponseHeaders, "(?i)content-type:\s*" & $mContentTypes & ";") Then
        Return SetError(0, 0, True)
    EndIf
    Return SetError(1, 0, False)
EndFunc

If you want download the files/source you could use 'GET' instead 'HEAD' for the open function and download if it passes the check, this would save you a request.

Edited by Robjong

Share this post


Link to post
Share on other sites

That is nice bit of coding but I need to be able to show the page URL's in an IE window because I am processing some date after the server side scripts run...also I need to be able to run on just about any kind of page (except PDF).

Share this post


Link to post
Share on other sites

In that case the example I provided should help you out since pages are served as HTML?!

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0

  • Similar Content

    • nacerbaaziz
      By nacerbaaziz
      hello
      please i need to link a progress bar with a time can you help me?
      e.g
      i want to set a progress bar for 10 sec
      am waiting for your answers
      thank you.
    • nassausky
      By nassausky
      Hi all,
       
      Anyone have any idea how to close all open tabs except a specific one I manually open.  Assuming I don't know what is open in all the tabs except just the one I want to keep open.
       
      I didn't want to use sendkeys and I was trying to use the following code to list the title (or url) of the 3 open tabs and  after I got that part working I would just close the other 2. This sample only displays the title of the first open tab
      #include <IE.au3> Const $ie_new_in_tab = 0x0800 $oIE = _IECreate("https://www.autoitscript.com") __IENavigate($oIE, "https://www.autoitscript.com/forum/", 1, $ie_new_in_tab) ;(obj,url,wait,param) __IENavigate($oIE, "https://www.google.com/", 1, $ie_new_in_tab) ;(obj,url,wait,param) Local $aIE[1] $aIE[0] = 0 Local $i = 1, $oIE While 1     $oIE = _IEAttach("", "instance", $i)     If @error = $_IEStatus_NoMatch Then ExitLoop     ConsoleWrite(_IEPropertyGet($oIE, "title") & @CRLF)     ReDim $aIE[$i + 1]     $aIE[$i] = $oIE ;each item holds object     $aIE[0] = $i ;first item holds count     $i += 1 WEnd MsgBox($MB_SYSTEMMODAL, "Browsers Found", "Number of browser instances in the array: " & $aIE[0]) ; This doesn't return the list of tabs in the console just the first tab  
      Thanks for any and all help
    • toto22
      By toto22
      I'm trying to click on Java Dropbox using IE. However, I'm running into problems. There is a Dropbox "Please Select" with two options "Buy" and "Sell".
      I'm able to click on a drop box (please see code below) but i'm unable to select "Buy" or "Sell"".
      Local $sMyString = "Please Select" ;############ ENTER ############# Local $oLinks = _IELinkGetCollection($oIE) For $oLink In $oLinks Local $sLinkText = _IEPropertyGet($oLink, "innerText") If StringInStr($sLinkText, $sMyString) Then _IEAction($oLink, "click") ExitLoop EndIf Next  
      Please help
       
         
    • Gowrisankar
      By Gowrisankar
      Hello everyone,
      I'm working on a task where, a PDF file is opened (in IE browser) when I click a link in a website.
      I have to read the first page of the PDF to find particular strings. Can you please share some ideas?
    • Seminko
      By Seminko
      Hey,
      i would like to set a value into an INPUT field.
      Checked the _IEFormElementSetValue function but that does require _IEFormGetObjByName and this is where the problem comes in. The input field I want to write to is not a part of a form tag. It is part of a table.
      <input type="text" class="w2" id="nabidka_vozidel_formular_tach_od" name="nabidka_vozidel_formular_tach_od" onchange="GLOBAL.pocetInzerceNZ(&quot;nabidka_vozidel_formular&quot;,&quot;tach_od&quot;,&quot;&quot;);" autocomplete="off"> I tried this but that didn't work:
      $oDownloadSamples = _IEGetObjById($oIE, "nabidka_vozidel_formular_tach_od") _IEFormElementSetValue($oDownloadSamples, "123") If you want to try the site I'm working with is https://www.tipcars.cz/. There is a menu on the top left hand side and if you click the "vyhledat" button the input fields will show up.
      Thanks