How to detect if a URL is a download link before using _IENavigate?

Herb191

Is there a fast way to check whether a URL is a download link before I use _IENavigate?

Also, does anyone know how to disable all popup windows (like the "Are you sure you want to navigate away?" dialogs) without disabling scripting?

Thanks

Robjong

Hey,

What do the links look like? Are they just URIs to a file? Do you know what types of files to expect?

If so, you could just use StringRegExp, something like this:

$file_url = "http://www.example.com/path/to/file.zip"
If StringRegExp($file_url, "(?i)/[^/]+\.(rar|zip)\z") Then ; (?i) also matches .ZIP, .Rar, ...
    ConsoleWrite("File URL: " & $file_url & @CRLF) ; do what you need with the file URL here
EndIf

Herb191

Hi Robjong,

Thanks for the response. I am actually trying to weed out any URLs that are download links. Unfortunately, I never know what the URLs are going to look like because I am using a web crawler to collect them from completely random websites. I have tried something similar to what you have above, but there are thousands of possible download file types, and my script inevitably hits one and stops working.

Robjong

What do you do with valid URLs? Show them in an IE window? Get the source?

If you want to parse the URLs in a crawler-like fashion, or show only HTML pages for example, you could check the content type of each URL by inspecting the response headers.

Here is a rough example:

Global $vAcceptedContentTypes = "text/\w+" ; regex pattern: allows any text content type
;~ Global $vAcceptedContentTypes[2] = ["text/html", "text/plain"] ; or an array of patterns; if a type contains regex metacharacters, escape it by putting it between \Q and \E (example: \Qtext/foo\E)

$url_text = "http://www.autoitscript.com"
$url_file = "http://www.autoitscript.com/cgi-bin/getfile.pl?autoit3/autoit-v3-setup.exe"

$result = _CheckContentType($url_text, $vAcceptedContentTypes)
ConsoleWrite("- " & $result & @CRLF)

$result = _CheckContentType($url_file, $vAcceptedContentTypes)
ConsoleWrite("- " & $result & @CRLF)


; Returns True if the Content-Type header of $sURL matches one of the given patterns.
; $mContentTypes is either a single regex pattern or an array of patterns.
Func _CheckContentType($sURL, $mContentTypes)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    If Not IsObj($oHTTP) Then Return SetError(2, 0, False)
    $oHTTP.Open("HEAD", $sURL) ; HEAD fetches only the headers, not the body
    $oHTTP.Send()
    Local $sHeaders = $oHTTP.GetAllResponseHeaders()
    ; The lookahead accepts the type with or without a trailing "; charset=..." part
    Local $sPrefix = "(?i)content-type:\s*(?:", $sSuffix = ")(?=[;\s]|\z)"
    If IsArray($mContentTypes) Then
        For $i = 0 To UBound($mContentTypes) - 1
            If StringRegExp($sHeaders, $sPrefix & $mContentTypes[$i] & $sSuffix) Then Return SetError(0, 0, True)
        Next
    ElseIf StringRegExp($sHeaders, $sPrefix & $mContentTypes & $sSuffix) Then
        Return SetError(0, 0, True)
    EndIf
    Return SetError(1, 0, False)
EndFunc

If you want to download the files/source, you could use 'GET' instead of 'HEAD' in the Open call and keep the response only if it passes the check; this would save you a second request.
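
A minimal sketch of that single-request variant, not a definitive implementation. `_ContentTypeMatches` and `_GetIfContentType` are hypothetical names made up for illustration; they are not part of any standard UDF. Note that a synchronous GET transfers the body whether or not the check passes, so the saving is the second request rather than the bandwidth. Network failures surface as COM errors, which you may want to catch with a handler registered through `ObjEvent("AutoIt.Error", ...)`.

```autoit
; Hypothetical helper: True if a raw header block declares a Content-Type
; that matches the regex pattern $sTypePattern (with or without "; charset=...").
Func _ContentTypeMatches($sHeaders, $sTypePattern)
    Return StringRegExp($sHeaders, "(?i)content-type:\s*(?:" & $sTypePattern & ")(?=[;\s]|\z)") = 1
EndFunc

; Hypothetical helper: fetches $sURL with a single GET and returns the body
; only when the Content-Type matches; otherwise sets @error and returns "".
Func _GetIfContentType($sURL, $sTypePattern)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    If Not IsObj($oHTTP) Then Return SetError(2, 0, "")
    $oHTTP.Open("GET", $sURL)
    $oHTTP.Send()
    If Not _ContentTypeMatches($oHTTP.GetAllResponseHeaders(), $sTypePattern) Then Return SetError(1, 0, "")
    Return $oHTTP.ResponseText
EndFunc

; Usage: keep the page source only if the server says it is text
Local $sSource = _GetIfContentType("http://www.autoitscript.com", "text/\w+")
If Not @error Then ConsoleWrite("Got " & StringLen($sSource) & " characters" & @CRLF)
```

Keeping the header check in its own function means the matching rule can be tried against literal header strings without touching the network.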

Herb191

That is a nice bit of coding, but I need to be able to show the pages in an IE window because I am processing some data after the server-side scripts run. I also need to be able to run on just about any kind of page (except PDF).

Robjong

In that case the example I provided should still help you out, since the pages you want to show are served as HTML?!
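
One way to read this suggestion is to run the header check first and call _IENavigate only for URLs that report an HTML content type. The sketch below assumes exactly that. `_IsHtmlContentType` and `_NavigateIfHtml` are hypothetical helper names, and the list of accepted types (text/html, application/xhtml+xml) is an assumption you would adjust. Some servers answer HEAD differently from GET or reject it outright, so treat a failed check as "skip", not as proof of a download.

```autoit
#include <IE.au3>

; Hypothetical helper: True if a Content-Type header value denotes an HTML page.
Func _IsHtmlContentType($sContentType)
    Return StringRegExp($sContentType, "(?i)^\s*(?:text/html|application/xhtml\+xml)(?=[;\s]|\z)") = 1
EndFunc

; Hypothetical helper: navigates only when a HEAD request reports an HTML page.
; Returns True if _IENavigate was called, False (with @error set) if the URL was skipped.
Func _NavigateIfHtml($oIE, $sURL)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    If Not IsObj($oHTTP) Then Return SetError(2, 0, False)
    $oHTTP.Open("HEAD", $sURL)
    $oHTTP.Send()
    ; Pull the Content-Type value out of the raw header block
    Local $aType = StringRegExp($oHTTP.GetAllResponseHeaders(), "(?i)content-type:\s*([^\r\n]*)", 1)
    If @error Then Return SetError(1, 0, False) ; no Content-Type header at all
    If Not _IsHtmlContentType($aType[0]) Then Return SetError(1, 0, False)
    _IENavigate($oIE, $sURL)
    Return True
EndFunc

; Usage: the installer URL from the earlier example should be skipped
Local $oIE = _IECreate("about:blank")
Local $sURL = "http://www.autoitscript.com/cgi-bin/getfile.pl?autoit3/autoit-v3-setup.exe"
If Not _NavigateIfHtml($oIE, $sURL) Then ConsoleWrite("Skipped: " & $sURL & @CRLF)
```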
