Absmiss

StringRegExp :)

10 posts in this topic

#1 ·  Posted (edited)

Dear users, 
 

I need to extract these parts:

Cayce+Family+Clinic
617+S+8th+St
Nashville
TN
37206

Cayce+Family+Clinic - May contain letters ([a-zA-Z]), numbers, and +.

617+S+8th+St - May contain letters ([a-zA-Z]), numbers, and +.

Nashville - May contain letters ([a-zA-Z]) and +. No numbers.

TN - May contain uppercase letters ([A-Z]) only. No numbers or +.

37206 - May contain numbers and -. No letters or +. (e.g. 37207-5408)

Thanks for the help.  :)

Edited by Absmiss




Are you trying to parse the URL? If so, this can easily be done with StringInStr, StringSplit, and StringReplace.


#3 ·  Posted (edited)

Yep. You can find the position of the '?', use StringRight to grab everything to the right of it, then split on %2C (which you might first want to replace with ',').
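A minimal sketch of those steps. The sample URL and the variable names are assumptions for illustration, and the "dirflg=d&daddr=" prefix is assumed to be fixed:

```autoit
#include <Array.au3>

; Sample URL (an assumption, built from the parts listed in the first post)
Local $sUrl = "http://maps.google.com/maps?dirflg=d&daddr=Cayce+Family+Clinic%2C+617+S+8th+St%2C+Nashville%2C+TN%2C+37206"

; 1-based position of the '?' that starts the query string (0 if absent)
Local $iPos = StringInStr($sUrl, "?")

; Everything to the right of the '?'
Local $sQuery = StringRight($sUrl, StringLen($sUrl) - $iPos)

; Drop the parameter names in front of the address (assumed fixed prefix)
$sQuery = StringReplace($sQuery, "dirflg=d&daddr=", "")

; Turn the encoded "comma + space" separator into a plain comma
$sQuery = StringReplace($sQuery, "%2C+", ",")

; Flag 2 = return a 0-based array without the element count in [0]
Local $aParts = StringSplit($sQuery, ",", 2)
_ArrayDisplay($aParts)
```
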

Try it out, post some code.

Edited by jdelaney

IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.


Unfortunately I've already tried that, without success:

Local $Nome_clinica = _StringBetween($Codice_pagina, "maps?dirflg=d&daddr=", "%2C+")
Local $Array = _StringBetween($Codice_pagina, "%2C+", "%2C+") ; Address, City, Region
Local $CAP = _StringBetween($Codice_pagina, "%2C+", '">')

Local $Indirizzo[0]
Local $City[0]
Local $Regione[0]

For $i = 1 To UBound($Array)
   $j = Round($i / 3, 1)
   $Valore_decimale = $j - Int($j)
   Switch $Valore_decimale
   Case Round(1/3, 1)
      _ArrayAdd($Indirizzo, $Array[$i])
   Case Round(2/3, 1)
      _ArrayAdd($City, $Array[$i])
   Case 0
      _ArrayAdd($Regione, $Array[$i])
   EndSwitch
Next

For $y = 0 To UBound($Nome_clinica) - 1
   GUICtrlSetData($Risultati, $Nome_clinica[$y] & @LF & "Street-address: " & $Indirizzo[$y] & @LF & "Locality: " & $City[$y] & @LF & "Region: " & $Regione[$y] & @LF & "Postal-code: " & $CAP[$y] & @LF & @LF, 1)
Next

$Codice_pagina holds the source code of the page.

With StringRegExp I could capture each value and put the address, city, etc. into separate arrays, without needing a loop with Round and Int.


#5 ·  Posted (edited)

I couldn't get a clean regexp for nested groups that repeat, but this will work for most instances:

#include <Array.au3>
$a = StringRegExp("http://maps.google.com/maps?dirflg=d&daddr=Cayce+Family+Clinic%2C+617+S+8th+St%2C+Nashville%2C+TN%2C+37206", ".*addr\=([\w\+]+)?(?:%2C)?([\w\+]+)?(?:%2C)?([\w\+]+)?(?:%2C)?([\w\+]+)?(?:%2C)?([\w\+]+)?(?:%2C)?([\w\+]+)?(?:%2C)?", 3)

_ArrayDisplay($a)

Output:

[0]|Cayce+Family+Clinic
[1]|+617+S+8th+St
[2]|+Nashville
[3]|+TN
[4]|+37206
 

I'd have liked to do something like this:

".*addr=(([\w+]+(?:%2C)?)?)+"

Edited by jdelaney


#6 ·  Posted (edited)

#include <Array.au3>

$str = "http://maps.google.com/maps?dirflg=d&daddr=Cayce+Family+Clinic%2C+617+S+8th+St%2C+Nashville%2C+TN%2C+37206"

$str = StringRegExpReplace($str, '(.+addr=)', "")
$a = StringSplit($str, '%2C+', 3)

_ArrayDisplay($a)

?

Edit

jdelaney,

$a = StringRegExp($str, "(?<=r=|%2C\+)([\w\+]+)", 3)

Edited by mikell


#7 ·  Posted (edited)

Terrific, I'm saving this for future templates. I was never able to figure out how to capture a variable number of groups like that.

Cheers

Edited by jdelaney


It's simply based on a StringSplit alternative  :)

$str = StringRegExpReplace($str, '(.+addr=)', "")
;$a = StringSplit($str, '%2C+', 3)
$a = StringRegExp($str, "(?<=^|%2C\+)([\w\+]+)", 3)
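
For what it's worth, the two calls can be compared side by side. A small sketch, assuming the same sample URL used earlier in the thread; the names $aSplit and $aRegex are only illustrative. Note that the flag 3 means something different in each function:

```autoit
#include <Array.au3>

Local $str = "http://maps.google.com/maps?dirflg=d&daddr=Cayce+Family+Clinic%2C+617+S+8th+St%2C+Nashville%2C+TN%2C+37206"

; Strip everything up to and including "addr="
$str = StringRegExpReplace($str, '(.+addr=)', "")

; StringSplit flag 3 = 1 (treat '%2C+' as one whole delimiter) + 2 (no count in element [0])
Local $aSplit = StringSplit($str, '%2C+', 3)

; StringRegExp flag 3 = return an array of all global matches.
; The lookbehind requires each match to start at the beginning of the
; string or right after a '%2C+' separator, without consuming the separator.
Local $aRegex = StringRegExp($str, "(?<=^|%2C\+)([\w\+]+)", 3)

; Print both results next to each other for comparison
For $i = 0 To UBound($aSplit) - 1
    ConsoleWrite($aSplit[$i] & " | " & $aRegex[$i] & @CRLF)
Next
```
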


I solved it with jdelaney's approach from post #5. Thank you all for the help!


Use mikell's regexp. It's cleaner and will return any number of captured groups. Mine is capped at six.

#include <Array.au3>
$a = StringRegExp("http://maps.google.com/maps?dirflg=d&daddr=Cayce+Family+Clinic%2C+617+S+8th+St%2C+Nashville%2C+TN%2C+37206", "(?<=r=|%2C\+)([\w\+]+)", 3)
_ArrayDisplay($a)
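
If the goal is separate arrays per field when the page source contains several links, one possible variation is a pattern with five capture groups, read back in steps of five. This is only a sketch: the page source below is made-up sample data, the pattern assumes every link has exactly five parts matching the character rules from the first post, and all variable names are hypothetical:

```autoit
#include <Array.au3>

; Made-up page source with two direction links (the second entry is invented sample data)
Local $sSource = '<a href="http://maps.google.com/maps?dirflg=d&daddr=Cayce+Family+Clinic%2C+617+S+8th+St%2C+Nashville%2C+TN%2C+37206">A</a>' & @CRLF & _
        '<a href="http://maps.google.com/maps?dirflg=d&daddr=Some+Other+Clinic%2C+12+Main+St%2C+Nashville%2C+TN%2C+37207-5408">B</a>'

; Five groups: name, street, city (no digits), region (uppercase only), postal code (digits and -)
Local $sPattern = "daddr=([\w+]+)%2C\+([\w+]+)%2C\+([A-Za-z+]+)%2C\+([A-Z]+)%2C\+([\d-]+)"

; Flag 3 returns every group of every match in one flat 0-based array
Local $aAll = StringRegExp($sSource, $sPattern, 3)
If @error Then
    ConsoleWrite("No match" & @CRLF)
    Exit
EndIf

; Each link contributes exactly five consecutive elements
Local $iCount = Int(UBound($aAll) / 5)
Local $aName[$iCount], $aStreet[$iCount], $aCity[$iCount], $aRegion[$iCount], $aZip[$iCount]

For $i = 0 To $iCount - 1
    $aName[$i] = $aAll[$i * 5]
    $aStreet[$i] = $aAll[$i * 5 + 1]
    $aCity[$i] = $aAll[$i * 5 + 2]
    $aRegion[$i] = $aAll[$i * 5 + 3]
    $aZip[$i] = $aAll[$i * 5 + 4]
Next

_ArrayDisplay($aName, "Names")
_ArrayDisplay($aZip, "Postal codes")
```

Unlike the lookbehind pattern, this one anchors on "daddr=" rather than "r=", which should avoid stray matches elsewhere in a full page, at the cost of skipping links that do not have all five parts.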

