RegExp with URLs


erifash

Here is my situation:

I am given a string that might be a URL and may or may not be in the proper format (e.g., missing http://, using backslashes, etc.). How might I go about putting it into the proper format? How would I then check whether it is indeed a valid URL?

I then need to break that same URL up into four basic parts: protocol, domain, path, and file (if one exists). Examples of proper URL division:

$url = "http://www.google.com"
; break up the url
$protocol = "http"
$domain = "www.google.com"
$path = "/"
$file = ""

$url = "ftp://foo.com/somefolder/another/file.zip"
; break up the url
$protocol = "ftp"
$domain = "foo.com"
$path = "/somefolder/another"
$file = "file.zip"

If anyone is good with regexp can you please help me with this? Thanks. ;)

Edited by erifash

EHHHHH SMOKE HAS IT COMING

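#include <Array.au3>
Global $url[2]
$url[0] = 'ftp://foo.com/somefolder/another/file.zip'
$url[1] = 'http://www.google.com'
For $i = 0 To 1
    ; capture groups: protocol (including "://"), domain, path, file
    ; the last group is (.*) so that a trailing file name is actually captured
    $array = StringRegExp($url[$i], "(\w*\:*\/*\/*)(\w*\.*\w+\.\w+)(\/.*\/)*(.*)", 3)
    _ArrayDisplay($array, '')
Next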
Edited by thatsgreat2345

  • Moderators

If a part of the URL doesn't exist, its array element comes back blank.

#include <array.au3>
;protocol, domain, path, and file
$sURL = "http://www.google.com"
$sURL2 = "ftp://foo.com/somefolder/another/file.zip"
$sPattern = '^(?s)(?i)(http|https|ftp|file)://(.*?/|.*$)(.*/){0,}(.*)$'
$aArray = StringRegExp($sURL, $sPattern , 3)
$aArray2 = StringRegExp($sURL2, $sPattern , 3)
_ArrayDisplay($aArray, '')
_ArrayDisplay($aArray2, '')

Edit:

Oops, I forgot to make sure it was valid!
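
For the validity check itself, here is a minimal sketch that reuses the pattern above. The function name _IsValidURL is made up for illustration and is not part of any UDF. Note that this only tests the shape of the string against the pattern: it is a loose check (a bare "http://" with no host still passes), and it says nothing about whether the URL is reachable.

```autoit
; Hypothetical helper (name invented for this sketch): True when $sURL fits the pattern.
Func _IsValidURL($sURL)
    Local $sPattern = '^(?s)(?i)(http|https|ftp|file)://(.*?/|.*$)(.*/){0,}(.*)$'
    ; Flag 0 only tests for a match: StringRegExp returns 1 on a match, 0 otherwise.
    Return StringRegExp($sURL, $sPattern, 0) = 1
EndFunc   ;==>_IsValidURL

; Has a supported protocol, so it fits the pattern.
ConsoleWrite(_IsValidURL("http://www.google.com") & @CRLF)
; No protocol at all, so it does not fit the pattern.
ConsoleWrite(_IsValidURL("www.google.com") & @CRLF)
```

If the array of parts is needed anyway, checking @error right after the flag 3 call should serve the same purpose, since StringRegExp sets @error when nothing matches.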

Edited by SmOke_N

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.


  • Moderators

Ha, I win: he didn't want the colon.

#include <array.au3>
;protocol, domain, path, and file
$sURL = "http://www.google.com"
$sURL2 = "ftp://foo.com/somefolder/another/file.zip"
$sPattern = '^(.*?)\:*//(.*?/|.*$)(.*/){0,}(.*)$'
$aArray = StringRegExp($sURL, $sPattern , 3)
$aArray2 = StringRegExp($sURL2, $sPattern , 3)
_ArrayDisplay($aArray, '')
_ArrayDisplay($aArray2, '')

Errr... You'd better check my example again ;)


  • Moderators

Here, I modeled it after _PathSplit(). I haven't tested it much, obviously, so if someone sees a better way, then by all means post it.

#include <array.au3>
$sURL = "http://www.autoitscript.com/forum/index.php?showtopic=36679&st=0&gopid=271246&#entry271246"
Dim $szProtocol, $szDomain, $szPath, $szFile
$TestPath = _URLSplit($sURL, $szProtocol, $szDomain, $szPath, $szFile)
_ArrayDisplay($TestPath, 'Demo _UrlSplit()')

Func _URLSplit($szUrl, ByRef $szProtocol, ByRef $szDomain, ByRef $szPath, ByRef $szFile)
    Local $sSREPattern = '^(?s)(?i)(http|ftp|https|file)://(.*?/|.*$)(.*/){0,}(.*)$'
    Local $aUrlSRE = StringRegExp($szUrl, $sSREPattern, 2)
    If Not IsArray($aUrlSRE) Or UBound($aUrlSRE) - 1 <> 4 Then Return SetError(1, 0, 0)
    If StringRight($aUrlSRE[2], 1) = '/' Then
        $aUrlSRE[2] = StringTrimRight($aUrlSRE[2], 1)
        $aUrlSRE[3] = '/' & $aUrlSRE[3]
    EndIf
    $szProtocol = $aUrlSRE[1]
    $szDomain = $aUrlSRE[2]
    $szPath = $aUrlSRE[3]
    $szFile = $aUrlSRE[4]
    Return $aUrlSRE
EndFunc   ;==>_URLSplit
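
The original question also asked about strings that are not in proper format yet (missing http://, backslashes). A rough sketch of a clean-up step that could run before the split is below. The name _URLNormalize is hypothetical, and defaulting to http when no protocol is present is an assumption, not something the thread settles.

```autoit
; Hypothetical helper (name invented for this sketch): tidies a raw string before splitting it.
Func _URLNormalize($sRaw)
    ; Strip leading and trailing whitespace (flag 3 = 1 + 2).
    Local $sURL = StringStripWS($sRaw, 3)
    ; Turn backslashes into forward slashes.
    $sURL = StringReplace($sURL, "\", "/")
    ; Assumption: a string without a "scheme://" prefix is meant to be http.
    If Not StringRegExp($sURL, '^(?i)[a-z][a-z0-9+.\-]*://') Then $sURL = "http://" & $sURL
    Return $sURL
EndFunc   ;==>_URLNormalize

; A string with backslashes and no protocol.
ConsoleWrite(_URLNormalize("www.google.com\somefolder\file.zip") & @CRLF)
; A string that already has a protocol keeps it.
ConsoleWrite(_URLNormalize("ftp://foo.com/somefolder/another/file.zip") & @CRLF)
```

The normalized string can then be handed to _URLSplit(), checking @error afterwards, since _URLSplit() returns SetError(1, 0, 0) when the pattern does not match.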
