Jump to content

How to detect if a URL is a download link before using _IENavigate?


Recommended Posts

Is there a fast way of check to see if a URL is a download link before I use _IENavigate?

Also, does anyone know how to disable all popup windows (like the are your sure you want to navigate away windows) without disabling scripting?

Thanks

Edited by Herb191
Link to post
Share on other sites

Hey,

What do the links look like, are they just URI's to a file? Do you know what types of files to expect?

If so you could just use StringRegExp, something like this for example.

$file_url = "http://www.example.com/path/to/file.zip"
If StringRegExp($file_url, "/[^/]+\.(rar|zip)\z") Then
    ConsoleWrite("File URL: " & $file_url & @CRLF) ; do what you need want with the file URL here
EndIf
Edited by Robjong
Link to post
Share on other sites

Hey,

What do the links look like, are they just URI's to a file? Do you know what types of files to expect?

If so you could just use StringRegExp, something like this for example.

$file_url = "http://www.example.com/path/to/file.zip"
If StringRegExp($file_url, "/[^/]+\.(rar|zip)\z") Then
    ConsoleWrite("File URL: " & $file_url & @CRLF) ; do what you need want with the file URL here
EndIf

Hi Robjong,

Thanks for the response. I am actually trying to weed out any URL's that are download links. Unfortunately I never know what the URL's are going to look like because I am using a web crawler to get them from completely random websites. I have tried something similar to what you have above but there are thousands of possible download files and my script inevitably finds one and stops working.

Link to post
Share on other sites

What do you do with valid URL's? show them in an IE window? get the source?

If it is because you want to parse the URL's in a crawler like fashion, or show only html pages for example,

you could check the content-type of the URL by checking the headers...

Here is a rough example:

Global $aAcceptedContentTypes = "text/\w+" ; allows any text content type
;~ Global $aAcceptedContentTypes[2] = ["text/html", "text/plain"] ; if the type string  contains regex meta characters you can escape the string by putting it between \Q and \E(example: \Qtext/foo\E)
 
$url_text = "http://www.autoitscript.com"
$url_file = "http://www.autoitscript.com/cgi-bin/getfile.pl?autoit3/autoit-v3-setup.exe"
 
$result = _CheckContentType($url_text, $aAcceptedContentTypes)
ConsoleWrite("- " & $result & @CRLF)
 
$result = _CheckContentType($url_file, $aAcceptedContentTypes)
ConsoleWrite("- " & $result & @CRLF)
 
 
Func _CheckContentType($sURL, $mContentTypes)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    $oHTTP.open("HEAD", $sURL)
    $oHTTP.Send()
    If IsArray($mContentTypes) Then
        For $i = 0 To UBound($mContentTypes) - 1
            If StringRegExp($oHTTP.GetAllResponseHeaders, "(?i)content-type:\s*" & $mContentTypes[$i] & ";") Then Return SetError(0, 0, True)
        Next
    ElseIf StringRegExp($oHTTP.GetAllResponseHeaders, "(?i)content-type:\s*" & $mContentTypes & ";") Then
        Return SetError(0, 0, True)
    EndIf
    Return SetError(1, 0, False)
EndFunc

If you want download the files/source you could use 'GET' instead 'HEAD' for the open function and download if it passes the check, this would save you a request.

Edited by Robjong
Link to post
Share on other sites

That is nice bit of coding but I need to be able to show the page URL's in an IE window because I am processing some date after the server side scripts run...also I need to be able to run on just about any kind of page (except PDF).

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By Pured
      I am looking to create a script which refreshes/reads a webpage every few seconds. My goal is to see if the page has changed, then I will send myself a notification that the webpage has been updated.
       
      However, rather than downloading the entire webpage every single time, is there a way to check when the webpage last updated?
       
      If not, is there away to partially download/read html source until a specific tag is hit?
       
      Goal: I would like to increase my poll rate and not excessively waste data.
    • By mLipok
      In the past there was many questions about how to: "Automatic file upload using without user interaction"

      https://www.autoitscript.com/forum/topic/92907-ie-8-input-namenomfic-typefile-idnomfic/
      https://www.autoitscript.com/forum/topic/116899-cant-automate-input-typefile-tag-in-ie/?tab=comments#comment-815478
      https://www.autoitscript.com/forum/topic/14883-input-typefile/
      https://www.autoitscript.com/forum/topic/188708-how-to-set-the-value-of-an-input-typefile-element/
      https://www.autoitscript.com/forum/topic/91513-how-can-i-auto-set-file-path-for-input-file-in-ie/
      https://www.autoitscript.com/forum/topic/116899-cant-automate-input-typefile-tag-in-ie/
      https://www.autoitscript.com/forum/topic/169190-how-to-script-file-upload-button/
      https://www.autoitscript.com/forum/topic/145327-how-to-deal-with-ie-window-for-upload-a-fileinput-typefile/
      https://www.autoitscript.com/forum/topic/140482-internet-explorer-input-file-problem/
       
      I found solution here: 
      https://stackoverflow.com/questions/33253517/upload-a-file-via-input-input-in-html-form-with-vba
      and:
      https://www.motobit.com/tips/detpg_uploadvbsie/
      And I translate this code to AutoIt3 code:
      ; Upload file using http protocol And multipart/form-data ; v1.01 ; 2001 Antonin Foller, PSTRUH Software Global $oErrorHandler = ObjEvent("AutoIt.Error", _ErrFunc) do_vbsUpload() Func do_vbsUpload() #cs ; We need at least two arguments (File & URL) ConsoleWrite('- ' & @ScriptLineNumber & @CRLF) If $CmdLine[0] < 2 Then InfoEcho() ConsoleWrite('- ' & @ScriptLineNumber & @CRLF) ; Are some required objects missing? If StringInStr(CheckRequirements(), "Error") > 0 Then InfoEcho() ConsoleWrite('- ' & @ScriptLineNumber & @CRLF) Local $s_FileName, $s_DestURL, $s_FieldName $s_FieldName = "FileField" ; Default field name For $i_argCounter = 1 To $CmdLine[0] ConsoleWrite('+ '& $i_argCounter& ' >> ' & $CmdLine[$i_argCounter] & @CRLF) Select Case $i_argCounter = 1 ;~ $s_FileName = $CmdLine[$i_argCounter] $s_FileName = @ScriptFullPath Case $i_argCounter = 2 $s_DestURL = $CmdLine[$i_argCounter] Case $i_argCounter = 3 $s_FieldName = $CmdLine[$i_argCounter] EndSelect Next UploadFile($s_DestURL, $s_FileName, $s_FieldName) #ce UploadFile('http://www.dobeash.com/test.html', @ScriptFullPath, 'fileExample') EndFunc ;==>do_vbsUpload ; ******************* upload - begin ; Upload file using input type=file Func UploadFile($s_DestURL, $s_FileName, $s_FieldName) ; Boundary of fields. ; Be sure this string is Not In the source file Const $Boundary = "---------------------------0123456789012" ; Get source file As a binary data. Local $d_FileContents = GetFile($s_FileName) ; Build multipart/form-data document Local $s_FormData = BuildFormData($d_FileContents, $Boundary, $s_FileName, $s_FieldName) ; Post the data To the destination URL IEPostBinaryRequest($s_DestURL, $s_FormData, $Boundary) EndFunc ;==>UploadFile ; Build multipart/form-data document with file contents And header info Func BuildFormData($d_FileContents, $Boundary, $s_FileName, $s_FieldName) Const $s_ContentType = "application/upload" ; The two parts around file contents In the multipart-form data. Local $s_Pre = "--" & $Boundary & @CRLF & mpFields($s_FieldName, $s_FileName, $s_ContentType) Local $s_Po = @CRLF & "--" & $Boundary & "--" & @CRLF ; Build form data using recordset binary field Const $i_adLongVarBinary = 205 Local $oRS = ObjCreate("ADODB.Recordset") ; https://docs.microsoft.com/en-us/sql/ado/reference/ado-api/append-method-ado?view=sql-server-ver15 $oRS.Fields.Append("b", $i_adLongVarBinary, StringLen($s_Pre) + BinaryLen($d_FileContents) + StringLen($s_Po)) $oRS.Open() $oRS.AddNew() ; Convert Pre string value To a binary data Local $i_LenData = StringLen($s_Pre) $oRS("b").AppendChunk(StringToMB($s_Pre) & StringToBinary(Chr(0))) $s_Pre = $oRS("b").GetChunk($i_LenData) $oRS("b") = "" ; Convert Po string value To a binary data $i_LenData = StringLen($s_Po) $oRS("b").AppendChunk(StringToMB($s_Po) & StringToBinary(Chr(0))) $s_Po = $oRS("b").GetChunk($i_LenData) $oRS("b") = "" ; Join Pre & $d_FileContents & Po binary data $oRS("b").AppendChunk($s_Pre) $oRS("b").AppendChunk($d_FileContents) $oRS("b").AppendChunk($s_Po) $oRS.Update() Local $s_FormData = $oRS("b") $oRS.Close() Return $s_FormData EndFunc ;==>BuildFormData ; sends multipart/form-data To the URL using IE Func IEPostBinaryRequest($s_URL, $s_FormData, $Boundary) ; Create InternetExplorer Local $oIE = ObjCreate("InternetExplorer.Application") ; You can uncoment Next line To see form results $oIE.Visible = True ; Send the form data To $s_URL As POST multipart/form-data request $oIE.Navigate($s_URL, '', '', $s_FormData, _ "Content-Type: multipart/form-data; boundary=" & $Boundary & @CRLF) While $oIE.Busy Wait(1, "Upload To " & $s_URL) WEnd ; Get a result of the script which has received upload ;~ On Error Resume Next Local $s_IE_InnerHTML = $oIE.Document.body.innerHTML MsgBox(0, 'TEST #' & @CRLF & @ScriptLineNumber, $s_IE_InnerHTML) $oIE.Quit() Return $s_IE_InnerHTML EndFunc ;==>IEPostBinaryRequest ; Infrormations In form field header. Func mpFields($s_FieldName, $s_FileName, $s_ContentType) Local $s_MPTemplate = _ ; template For multipart header 'Content-Disposition: form-data; name="{field}";' & _ 'FileName="{file}"' & @CRLF & _ 'Content-Type: {ct}' & @CRLF & @CRLF & _ '' Local $s_Out $s_Out = StringReplace($s_MPTemplate, "{field}", $s_FieldName) $s_Out = StringReplace($s_Out, "{file}", $s_FileName) $s_Out = StringReplace($s_Out, "{ct}", $s_ContentType) Return $s_Out EndFunc ;==>mpFields Func Wait($i_Seconds, $s_Message) MsgBox(64, '', $s_Message, $i_Seconds) EndFunc ;==>Wait ; Returns file contents As a binary data Func GetFile($s_FileName) Local $oStream = ObjCreate("ADODB.Stream") $oStream.Type = 1 ; Binary $oStream.Open() $oStream.LoadFromFile($s_FileName) Local $d_GetFile = $oStream.Read() $oStream.Close() Return $d_GetFile EndFunc ;==>GetFile ; Converts OLE string To multibyte string Func StringToMB($S) Local $I, $B For $I = 1 To StringLen($S) $B &= StringToBinary(Asc(StringMid($S, $I, 1))) Next Return $B EndFunc ;==>StringToMB ; ******************* upload - end ; ******************* Support ; Basic script info Func InfoEcho() Local $sMsg = _ "Upload file using http And multipart/form-data" & @CRLF & _ "Copyright (C) 2001 Antonin Foller, PSTRUH Software" & @CRLF & _ "use" & @CRLF & _ "[cscript|wscript] fupload.vbs file $s_URL [fieldname]" & @CRLF & _ " file ... Local file To upload" & @CRLF & _ " $s_URL ... $s_URL which can accept uploaded data" & @CRLF & _ " fieldname ... Name of the source form field." & @CRLF & _ @CRLF & CheckRequirements() & @CRLF & _ "" ConsoleWrite('! ' & $sMsg & @CRLF) EndFunc ;==>InfoEcho ; Checks If all of required objects are installed Func CheckRequirements() Local $sMsg = _ "This script requires some objects installed To run properly." & @CRLF & _ CheckOneObject("ADODB.Recordset") & @CRLF & _ CheckOneObject("ADODB.Stream") & @CRLF & _ CheckOneObject("InternetExplorer.Application") & @CRLF & _ "" Return $sMsg ; $sMsgBox $sMsg EndFunc ;==>CheckRequirements ; Checks If the one object is installed. Func CheckOneObject($sClassName) Local $sMsg ObjCreate($sClassName) If @error = 0 Then $sMsg = "OK" Else $sMsg = "Error:" & @error EndIf Return $sClassName & " - " & $sMsg EndFunc ;==>CheckOneObject ; ******************* Support - end ; User's COM error function. Will be called if COM error occurs Func _ErrFunc(ByRef $oError) ; Do anything here. ConsoleWrite(@ScriptName & " (" & $oError.scriptline & ") : ==> COM Error intercepted !" & @CRLF & _ @TAB & "err.number is: " & @TAB & @TAB & "0x" & Hex($oError.number) & @CRLF & _ @TAB & "err.windescription:" & @TAB & $oError.windescription & @CRLF & _ @TAB & "err.description is: " & @TAB & $oError.description & @CRLF & _ @TAB & "err.source is: " & @TAB & @TAB & $oError.source & @CRLF & _ @TAB & "err.helpfile is: " & @TAB & $oError.helpfile & @CRLF & _ @TAB & "err.helpcontext is: " & @TAB & $oError.helpcontext & @CRLF & _ @TAB & "err.lastdllerror is: " & @TAB & $oError.lastdllerror & @CRLF & _ @TAB & "err.scriptline is: " & @TAB & $oError.scriptline & @CRLF & _ @TAB & "err.retcode is: " & @TAB & "0x" & Hex($oError.retcode) & @CRLF & @CRLF) EndFunc ;==>_ErrFunc  
      But I miss something and the code not works as intendend.
      Please join and contribute, in solving this issue, as this will be handy for entire community.
      @mLipok
       
      btw.
      I think that this may be realated to ChrB() which I simply translate to StringToBinary()
      Especialy this :
      StringToBinary(Chr(0))) could be the main issue.
      But for now I'm tired and going to sleep.
      Hope maybe tomorrow somebody solve this issue.
       
    • By Deshanur
      Am trying to automate injecting credential on the login form for all kind of Web application for IE. I know how to identify the form name by viewing the source code and using the method - _IEFormGetObjByName($ie, $form_Name).
      I would like to know how to identify or get the form object for the web app where there is no form name tag for example below, for the is I have used - _IEFormGetCollection($ie, 0) to get the form object.
      My Question is does it work for all kind of application "_IEFormGetCollection($ie, 0)" how to identify Index value? is it always 0? is there any better solution?
      The final solution am looking for is find out form object, get the username, password field and inject credential and submit the form.
      How to find out index value? for the forms which does not have form name field.
      $login_form = _IEFormGetCollection($ie, 0) $email_field = _IEFormElementGetObjByName($login_form, $form_UserName) $pass_field = _IEFormElementGetObjByName($login_form, $form_password) $login_button = _IEFormElementGetObjByName($login_form, $form_submitbutton) _IEFormElementSetValue($email_field, $CmdLine[2]) _IEFormElementSetValue($pass_field, $CmdLine[3]) ControlSend($hwnd, "", "[CLASS:Internet Explorer_Server; INSTANCE:1]","{Enter}") OR This works fine if the form has form name. $login_form = _IEFormGetObjByName($ie, $form_Name) $email_field = _IEFormElementGetObjByName($login_form, $form_UserName) $pass_field = _IEFormElementGetObjByName($login_form, $form_password) $login_button = _IEFormElementGetObjByName($login_form, $form_submitbutton) _IEFormElementSetValue($email_field, $CmdLine[2]) _IEFormElementSetValue($pass_field, $CmdLine[3]) ControlSend($hwnd, "", "[CLASS:Internet Explorer_Server; INSTANCE:1]","{Enter}")
    • By UEZ
      This project has been discontinued!
       
      Here a small tool I wrote to update my Sysinternal tools collection without the need to download always the whole package or visiting the site to check for updates. I know that there are several tools available (also some tools written in AutoIt) but here another one for the collection. It was good exercise for me to code it.
       
       
        
       
       
      Some files from the live web site cannot be downloaded although they are visible!
       
      Here the download link of the source code only: AutoIt Sysinternal Tools Synchronizer v0.99.6 build 2020-09-23 beta.7z  (1557 downloads previously)
      -=> Requires AutoIt version 3.3.13.20 or higher / tested on Win8.1 real machine and some VMs: Win7 / Vista / Win10
       
      Compiled exe only: @MediaFire
       
      Just select the Sysinternal Tools folder or create one and press the synchronize button to download the selected items. Click on AutoIt label (near to left upper corner) to open menu.
       
      Special thanks to LarsJ, Melba23 and mesale0077 for their help. 
       
      I've still some ideas to implement which are more gimmick related, so it is not finished yet...
      If you want to add your language please check out #Region Language. Thanks. 
       
      Please report any bug or if you have any suggestions.
       
      The language of the tool tip from each of the executable in the left list view were automatically created using Google translator and weren't checked for correctness.
       
      Br,
      UEZ
    • By devilspride
      The following code creates a IE blank window
      Local $oIE = _IECreate()  
      But when i use Navigate to the URL, it open the URL in Microsoft Edge instead if IE.
      _IENavigate($oIE,$url)  
      What should i do to navigate in IE.
      Complete code :
      #include <MsgBoxConstants.au3> #include <WinAPIFiles.au3> #include <IE.au3> #Region TESTING Local $url = 'https://www.youtube.com' Local $oIE = _IECreate() _IENavigate($oIE,$url) #EndRegion Console Output
      IE.au3 T3.0-2 Error from function _IELoadWait, $_IESTATUS_ClientDisconnected (-2147023174, Browser has been deleted prior to operation.)  
      I have searched the forums but did not find such kind of post.
      Other posts were describing How to use Edge using Web driver selenium.
       
      Edit: I am working in Windows10. Recently many changes have been done by Microsoft to IE and Microsoft Edge. (2020)
      Earlier in 2019 this was working fine.
×
×
  • Create New...