Mecano

StringRegExpReplace and patterns

17 posts in this topic

Hello forum members,

I have a question about the example below: StringRegExpReplace with the (?i) pattern works well, but not in every case (see the info.xml examples).

#include <ButtonConstants.au3>
#include <EditConstants.au3>
#include <GUIConstantsEx.au3>
#include <WindowsConstants.au3>

Opt("MustDeclareVars", 1)

Global $info = @ScriptDir & "\info.xml"

Local $Gui = GUICreate("Info URL", 615, 229, 192, 124)
Local $Input1 = GUICtrlCreateInput("", 32, 48, 481, 21)
Local $Button1 = GUICtrlCreateButton("Ok", 544, 46, 33, 25)
GUISetState(@SW_SHOW)

While 1
    Local $nMsg = GUIGetMsg()
    Switch $nMsg
        Case $GUI_EVENT_CLOSE
            Exit
        Case $Button1
            changeUrl()
    EndSwitch
WEnd

Func changeUrl()
    Local $info_url = GUICtrlRead($Input1)
    Local $File = FileOpen($info, 0)
    If $File <> -1 Then
        Local $info_container = FileRead($File)
        FileClose($File)
        If StringRegExp($info_container, "(?i)<url>(.*)</url>") Then
            $info_container = StringRegExpReplace($info_container, "(?i)<url>(.*)</url>", "<url>" & $info_url & "</url>")
            Local $hFile = FileOpen($info, 130) ; Open file for writing in unicode UTF8 mode
            FileWrite($hFile, $info_container)
            FileClose($hFile)
        EndIf
    EndIf
EndFunc   ;==>changeUrl

info.xml (as created by a new installation): with (?i) the URL is not changed, with (?s) it is changed.

<?xml version="1.0" encoding="utf-8"?>
<document>
 <address>51.49079, -0.116844</address>
  <name>Info</name>
    <city>London</city>
    <style id="sn_subway">
        <hotSpot x="0" y="0" xunits="pixels" yunits="pixels" />
      <balloonStyle>
        <bgColor>FFFFFFFF</bgColor>
        <textColor>FF000000</textColor>
      </balloonStyle>
    </style>
  <url>
  </url>
</document>

info.xml (after editing): in this case both (?i) and (?s) work.

<?xml version="1.0" encoding="utf-8"?>
<document>
 <address>51.49079, -0.116844</address>
  <name>Info</name>
    <city>London</city>
    <style id="sn_subway">
        <hotSpot x="0" y="0" xunits="pixels" yunits="pixels" />
      <balloonStyle>
        <bgColor>FFFFFFFF</bgColor>
        <textColor>FF000000</textColor>
      </balloonStyle>
    </style>
  <url>http://www.justaurl.com/</url>
</document>

Is using (?s) the right way to do it?

Patterns still drive me mad.


#2 ·  Posted (edited)

It will have to be:

"(?i)<url>(.*?)</url>"
;or
"(?i)<url>([^\<\r\n]*)</url>"

Ciao.
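
A minimal sketch of why the lazy quantifier matters, assuming a made-up line that holds two <url> nodes (the sample string and the placeholder X are illustrative and do not come from the info.xml in this thread):

```autoit
; Two <url> nodes on one line make the difference between .* and .*? visible
Local $sLine = "<url>http://a.example/</url><url>http://b.example/</url>"

; Greedy: .* runs to the LAST </url>, so both nodes collapse into one
ConsoleWrite(StringRegExpReplace($sLine, "(?i)<url>(.*)</url>", "<url>X</url>") & @CRLF)
; prints <url>X</url>

; Lazy: .*? stops at the FIRST </url>, so each node is replaced on its own
ConsoleWrite(StringRegExpReplace($sLine, "(?i)<url>(.*?)</url>", "<url>X</url>") & @CRLF)
; prints <url>X</url><url>X</url>
```

Neither form crosses a line break, because without (?s) the dot does not match @CR or @LF.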

Edited by DXRW4E


#3 ·  Posted (edited)

XML DOM route:

$sXML = '<?xml version="1.0" encoding="utf-8"?>' & @CRLF & _
"<document>" & @CRLF & _
  "<url>http://www.justaurl.com/</url>" & @CRLF & _
"</document>"
$oXML = ObjCreate("Microsoft.XMLDOM")
$oXML.loadxml($sXML)
$oURL = $oXML.selectSingleNode("//url")
ConsoleWrite($oURL.text & @CRLF)

The output is your URL.

If the XML is in a file, load it with .load rather than .loadxml, passing the file path and name as the string.
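
A rough sketch of the file-based variant, assuming info.xml sits next to the script as in the first post and that the Microsoft.XMLDOM object can be created on the machine. The variable names are illustrative:

```autoit
Local $sFile = @ScriptDir & "\info.xml" ; assumed location, as in the first post
Local $oXML = ObjCreate("Microsoft.XMLDOM")

If IsObj($oXML) Then
    $oXML.async = False ; finish loading before the script continues
    If $oXML.load($sFile) Then ; .load takes a path, .loadxml takes the XML text
        Local $oURL = $oXML.selectSingleNode("//url")
        If IsObj($oURL) Then
            $oURL.text = "http://www.justaurl.com/" ; overwrites whatever was there
            $oXML.save($sFile) ; write the changed document back to disk
        EndIf
    EndIf
EndIf
```

Because the node text is assigned rather than pattern-matched, it does not matter whether <url> was empty, spread over two lines, or already held a URL.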

Edited by jdelaney

#4 · Posted

@DXRW4E, thanks for the correction

$info_container = StringRegExpReplace($info_container, "(?i)<url>(.*?)</url>", "<url>" & $info_url & "</url>")

If the node in the XML file looks like this:

<url>
  </url>

(?i) alone won't work.

(?s) does change the URL, but is it good practice to use (?s)?

(?i) works only if the node in the XML looks like this:

<url></url>  or <url>http://www.justaurl.com/</url>

@jdelaney,

Thanks for the answer, but I am trying to learn StringRegExpReplace.

Next I will look into the XML DOM route.


#5 ·  Posted (edited)


"(?si)<url>(?>\s*)(.*?)\s*</url>" 
;or 
"(?si)<url>(?>\s*)([^\<\r\n]*)\s*</url>"

Ciao.

Edited by DXRW4E

#6 · Posted

DXRW4E,

With (?s) the dot matches spaces/newlines so isn't this enough ?

"(?is)<url>(.*?)</url>"

or this ?

$info_container = StringRegExpReplace($info_container, "(?i)<url>([^<]*)", "<url>" & $info_url)
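
A small sketch comparing the variants on the two-line node from the fresh info.xml. The sample string is rebuilt by hand here and assumes @CRLF line endings:

```autoit
; The empty node as it appears in a freshly installed info.xml
Local $sNode = "  <url>" & @CRLF & "  </url>"

; Without (?s) the dot stops at the line break, so there is no match
ConsoleWrite(StringRegExp($sNode, "(?i)<url>(.*?)</url>") & @CRLF) ; prints 0

; With (?s) the dot also matches @CR and @LF
ConsoleWrite(StringRegExp($sNode, "(?is)<url>(.*?)</url>") & @CRLF) ; prints 1

; A negated class such as [^<] matches line breaks too, so (?s) is not needed
ConsoleWrite(StringRegExpReplace($sNode, "(?i)<url>([^<]*)", "<url>http://www.justaurl.com/") & @CRLF)
; prints   <url>http://www.justaurl.com/</url>
```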


#7 ·  Posted (edited)

Regarding "(?is)<url>(.*?)</url>": it is better to add \s* for safety, to cover all possible scenarios, for example:

;   <url>
;       http://www.justaurl.com/
;   </url>

Regarding the second expression,

$info_container = StringRegExpReplace($info_container, "(?i)<url>([^<]*)", "<url>" & $info_url)

that is normally OK.

Ciao.

Edited by DXRW4E

#8 · Posted

Can you please explain? I don't understand the safety reasons.

The expressions work with the case you provided.


#9 ·  Posted (edited)

Try it with:

<?xml version="1.0" encoding="utf-8"?>
<document>
    <url>
        http://www.justaurl.com/
    </url>
</document>

and see the differences
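
A hedged sketch of what the difference looks like when the URL is captured rather than replaced. The XML string is a hand-built copy of the <url> node above, and the names $aLoose and $aTight are illustrative:

```autoit
; The <url> node from the sample above, with the URL on its own indented line
Local $sXml = "<url>" & @CRLF & "        http://www.justaurl.com/" & @CRLF & "    </url>"

; Flag 3 returns an array holding every captured group
Local $aLoose = StringRegExp($sXml, "(?is)<url>(.*?)</url>", 3)
Local $aTight = StringRegExp($sXml, "(?si)<url>(?>\s*)(.*?)\s*</url>", 3)

; The brackets make leading and trailing whitespace visible
ConsoleWrite("loose=[" & $aLoose[0] & "]" & @CRLF) ; line breaks and indentation are part of the capture
ConsoleWrite("tight=[" & $aTight[0] & "]" & @CRLF) ; prints tight=[http://www.justaurl.com/]
```

For a pure replacement both patterns match the whole node, so the extra \s* only matters when the captured text itself is used.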

Ciao.

Edited by DXRW4E


#10 ·  Posted (edited)

If you only want to rewrite the line, then

$info_container = StringRegExpReplace($info_container, "(?i)<url>\K[^<]*", $info_url)

is OK,

but if you want to extract the URL (string) correctly, then

$aUrl = StringRegExp($sXmlData, '(?si)<url>(?>\s*"?)([^\<"\r\n]*)"?\s*</url>', 3)
;or
$aUrl = StringRegExp($sXmlData, "(?si)<url>(?>\s*)(.*?)\s*</url>", 3)

is Ok
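
A minimal sketch of the \K form on the empty two-line node, assuming a hand-built sample document. \K discards what was matched so far from the reported match, so the replacement string no longer has to repeat "<url>":

```autoit
; Hand-built sample with the empty two-line <url> node
Local $sXml = "<document>" & @CRLF & "  <url>" & @CRLF & "  </url>" & @CRLF & "</document>"

; Only the text after <url> (here a line break and two spaces) is replaced
Local $sNew = StringRegExpReplace($sXml, "(?i)<url>\K[^<]*", "http://www.justaurl.com/")
Local $iCount = @extended ; number of replacements, read before any other call resets it

ConsoleWrite($sNew & @CRLF)
ConsoleWrite("replacements: " & $iCount & @CRLF) ; prints replacements: 1
```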

Ciao.

Edited by DXRW4E

#11 · Posted

OK, here the expected result seems to be

<?xml version="1.0" encoding="utf-8"?>
<document>
    <url>http://www.justaurl.com/</url>
</document>

and for this, both expressions work.

Actually getting the URL is a different purpose.


#12 ·  Posted (edited)

This is overly complicated for simple XML manipulation; use XMLDOM :)

$sXML = '<?xml version="1.0" encoding="utf-8"?>' & @CRLF & _
"<document>" & @CRLF & _
  "<url>" & @CRLF & _
  "</url>" & @CRLF & _
"</document>"
$oXML = ObjCreate("Microsoft.XMLDOM")
$oXML.loadxml($sXML)
ConsoleWrite("original XML=[" & $oXML.xml & "]" & @CRLF)
ConsoleWrite("Original URL=[" & $oXML.selectSingleNode("//url").text & "]" & @CRLF)
$oXML.selectSingleNode("//url").text = "http://www.justaurl.com/" ; changes the url
ConsoleWrite("NEW URL=[" & $oXML.selectSingleNode("//url").text & "]" & @CRLF)
ConsoleWrite("New XML=[" & $oXML.xml & "]" & @CRLF)

output:

original XML=[<?xml version="1.0"?>
<document>
 <url>
 </url>
</document>
]
Original URL=[]
NEW URL=[http://www.justaurl.com/]
New XML=[<?xml version="1.0"?>
<document>
 <url>http://www.justaurl.com/</url>
</document>
]

Edited by jdelaney


#13 ·  Posted (edited)

@jdelaney

I like your script

It may seem strange, but in certain cases the regex is faster, see

Ciao.

Edited by DXRW4E

#14 · Posted

No arguments there.  But I'd rather have reliability over milliseconds of performance.



#15 ·  Posted (edited)

But the OP said before that he wants to learn regex ...

$aUrl = StringRegExp($info_container, "(?i)<url>\s*([^\s<]*)", 3)

:)

Edited by mikell


#16 ·  Posted (edited)

"But I'd rather have reliability over milliseconds of performance."

You may still find it strange, but we are not talking about differences of milliseconds; with large files the differences become big.

Ciao.

Edited by DXRW4E

#17 · Posted

"(?is)<url>(.*?)</url>"

works well.

If the XML file has never been edited, the node always looks like this:

<url>
  </url>

After editing, it looks like this:

<url>http://www.justaurl.com/</url>

or whatever you put in the input box.

@jdelaney, I like your script as well and will continue with it later.

Will it change the URL

<url>http://www.justaurl.com/</url>

if I put a new URL in the input box?

@all, thank you for the fast responses.
