Posted

If I had a string like the following:

<id>    -------------------------- #1
  <name>rbhkamal</name>
  <email></email>
</id>   -------------------------- #2
<session>
 <active>1</active>
 <id>rbhkamal</id> ---------- #3
</session>

How can I make the regular expression below match only from #1 to #2? Right now it runs all the way to #3.

<id>(?s)(.+)</id>

To be more precise: I need to know what I can use instead of "(?s)(.+)" to match anything but a specific word (in my case, </id>).

I made a few failed attempts:

(?:\s?[^<][^\/][^i][^d][^>])+ ---- This doesn't work and I don't know why!

([^(?:</id>)]*) ---- This one treats each character of "</id>" individually, not as a whole word

Any help is greatly appreciated!

Regards,

RK

"When the power of love overcomes the love of power, the world will know peace"-Jimi Hendrix

Posted

I want everything starting from "<id>" until you get to the first "</id>".

Posted (edited)

Your example looks like XML, though the tree looks off.

Look into the XMLDOM object, its methods and properties in particular.

This site has been a good resource for me.

You can start from scratch and build your own stuff like:

$xmlFile = @ScriptDir & "\xmlfile.xml"

$xmldoc = ObjCreate('Microsoft.XMLDOM') ; create an instance of the XMLDOM object
If Not IsObj($xmldoc) Then Exit ; stop if the object could not be created
$xmldoc.async = False ; load synchronously; set this before calling load()
If Not $xmldoc.load($xmlFile) Then Exit ; stop if the file is missing or not well-formed

$ROOT = $xmldoc.documentElement ; get the root element

ConsoleWrite($ROOT.tagName & " is the Root node of the xml file and has " & $ROOT.childNodes.length & " childNodes" & @CRLF & @CRLF)

; loop through the child nodes
For $i = 0 To $ROOT.childNodes.length - 1
    With $ROOT.childNodes($i)
        ConsoleWrite("child " & $i & " has the tagName: " & .tagName & " and has the text: " & .text & @CRLF)
    EndWith
Next

and your XML file would be, let's say:

<offspring>
    <youngest>Benjamin</youngest>
    <middleChild>the outkasted one</middleChild>
    <oldest>Big Brother</oldest>
</offspring>

Or you can look into eltorro's XML DOM Wrapper UDF.

Regarding your "I want everything starting from "<id>" until you get to the first "</id>"":

If you only want to get the content of a specific tag name, then I would go for the getElementsByTagName method.
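
As a rough illustration of the getElementsByTagName idea, here is a minimal sketch in Python's xml.dom.minidom, used only as a stand-in because it exposes a DOM method of the same name; it is not the Microsoft.XMLDOM object itself. The `<config>` root element and the `text_of` helper are assumptions added for the example: the thread's snippet has no single root, so it is wrapped to make it well-formed.

```python
from xml.dom import minidom

# Hypothetical config: the thread's snippet wrapped in a single <config> root
# (an assumption) so that it parses as well-formed XML.
XML = """<config>
<id>
  <name>rbhkamal</name>
  <email></email>
</id>
<session>
  <active>1</active>
  <id>rbhkamal</id>
</session>
</config>"""


def text_of(node):
    """Concatenate all text found beneath `node` (hypothetical helper)."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            parts.append(child.data)
        else:
            parts.append(text_of(child))
    return "".join(parts)


doc = minidom.parseString(XML)

# getElementsByTagName returns every <id> element, at any depth.
ids = doc.getElementsByTagName("id")
id_texts = [text_of(node).strip() for node in ids]

for index, value in enumerate(id_texts):
    print(index, value)
```

Note that the lookup finds both the top-level <id> and the one nested inside <session>, so a real script would still have to pick the one it wants, for example by checking the parent node.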

hope this helps...

cheers

Edited by Marcuzzo18

Posted

I think I got your RegExp, if you are still going that route...

Try this pattern:

(?s)(?i)<id>(.+?)</id>

#include <Array.au3>

$Test = "<id>"& _
"  <name>rbhkamal</name>"& _
"  <email></email>"& _
"</id>"& _
"<session>"& _
"<active>1</active>"& _
"<id>rbhkamal</id>"& _
"</session>"

$Options = "(?s)(?i)" ; "." match newlines / case-insensitive
$Start = "<id>"
$Mid = "(.+?)"
$End = "</id>"

$Expression = $Options&$Start&$Mid&$End

$Result = StringRegExp($Test, $Expression, 3)
_ArrayDisplay($Result)
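
The difference between the original greedy ".+" and the lazy ".+?" in this pattern can be checked with a small sketch. It uses Python's re module as a stand-in for AutoIt's StringRegExp (the quantifier syntax is the same), and it assumes the real data has one tag per line, which the thread does not state explicitly.

```python
import re

# The thread's sample data, written with real newlines (an assumption:
# the original config file presumably has one tag per line).
text = (
    "<id>\n"
    "  <name>rbhkamal</name>\n"
    "  <email></email>\n"
    "</id>\n"
    "<session>\n"
    " <active>1</active>\n"
    " <id>rbhkamal</id>\n"
    "</session>\n"
)

# Greedy: ".+" runs to the end of the input, then backtracks to the LAST "</id>".
greedy = re.findall(r"(?s)<id>(.+)</id>", text)

# Lazy: ".+?" grows one character at a time and stops at the FIRST "</id>".
lazy = re.findall(r"(?s)<id>(.+?)</id>", text)

print(len(greedy), "greedy match;", len(lazy), "lazy matches")
for block in lazy:
    print(repr(block))
```

The greedy version returns a single match that swallows the <session> block, which is the "it stops at #3" behaviour from the first post; the lazy version returns the two <id> blocks separately.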
Posted (edited)

Thanks Marcuzzo18, that really helped. I just need to adjust the config file to a valid XML format. However, I've come across this dilemma multiple times, and every time I end up finding a workaround.

I would still like to know how to match anything except a specific string using regular expressions.

Regards,

RK

Edit: Thanks Paulie, that actually works. I don't know how I didn't think of using mode 3... it must be the weekend effect. lol

Edited by rbhkamal

Posted (edited)

I would still like to know how to match for anything except a specific string using regular expressions.

The reg expression in my previous post matches everything between the "<id>" and "</id>" tags.

Were you expecting a different result?
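
On the general question quoted above (matching anything except a specific string), one common technique is a negative lookahead placed in front of each consumed character. The sketch below is an illustration in Python's re module rather than AutoIt, under the assumption that the data has one tag per line; the lookahead syntax "(?!...)" is the same in both. It also shows why the "[^(?:</id>)]" attempt treats the characters individually: inside square brackets, everything is just a set of single characters.

```python
import re

text = (
    "<id>\n"
    "  <name>rbhkamal</name>\n"
    "  <email></email>\n"
    "</id>\n"
    "<session>\n"
    " <active>1</active>\n"
    " <id>rbhkamal</id>\n"
    "</session>\n"
)

# (?!</id>) succeeds only where "</id>" does NOT start; the "." after it then
# consumes one character. Repeating the pair means "any text not containing </id>".
tempered = re.compile(r"(?s)<id>((?:(?!</id>).)*)</id>")
blocks = tempered.findall(text)

# A character class is a set of single characters, so this one simply
# rejects any of ( ? : < / i d > ) wherever they occur.
char_class = re.compile(r"[^(?:</id>)]")
rejects_i = char_class.match("i") is None      # "i" is in the excluded set
accepts_n = char_class.match("n") is not None  # "n" is not

print(blocks)
print(rejects_i, accepts_n)
```

For this particular job the lazy "(.+?)" gives the same result with less typing; the lookahead form is mainly useful when the "anything except this string" piece sits in the middle of a larger pattern.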

EDIT: lol, I replied before your edit.

Edited by Paulie
Posted

It's still kind of "mysterious" to me how this works; I've been testing different options without success (I know I have much to learn).

IMO the

<id>(.+?)(?s)</id>
should have returned a match for the first section (mode 1), but it didn't; it matched only "rbhkamal".

(?s)<id>(.+?)</id>
works as intended. I wonder why putting (?s) at the front of the expression makes such a difference: there is nothing before the first <id>, so it should have worked without it ...

... darn regular expressions ... it's always a matter of trial and error for me

Posted

You need to put the "(?s)" flag before the dot in "(.+?)" so that "." matches newlines as well as every other character; an inline flag only affects the part of the pattern that comes after it. The first <id>...</id> block contains newlines, which is why (.+?) failed there. The second one contains no newlines, so (.+?) matched "rbhkamal".
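
The effect of the dot-all flag on "(.+?)" can be reproduced with a short sketch. It is written in Python's re module as an illustration, not in AutoIt, and assumes the data has one tag per line. One difference to be aware of: recent Python versions reject a bare "(?s)" in the middle of a pattern, so the sketch uses the scoped form "(?s:...)" to switch the flag on around the dot only.

```python
import re

text = (
    "<id>\n"
    "  <name>rbhkamal</name>\n"
    "  <email></email>\n"
    "</id>\n"
    "<session>\n"
    " <active>1</active>\n"
    " <id>rbhkamal</id>\n"
    "</session>\n"
)

# No flag: "." cannot cross a newline, so the multi-line block is skipped.
no_flag = re.findall(r"<id>(.+?)</id>", text)

# Flag at the front: "." matches newlines everywhere in the pattern.
with_flag = re.findall(r"(?s)<id>(.+?)</id>", text)

# Scoped flag: dot-all is active only inside the (?s:...) group.
scoped = re.findall(r"<id>(?s:(.+?))</id>", text)

print(no_flag)
print(len(with_flag), len(scoped))
```

Without the flag only the single-line "rbhkamal" is found, which matches the behaviour described above; with the flag in effect for the dot, both blocks are returned.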
