ASut

Help needed in StringRegExp

14 posts in this topic

Hi, it's me again.

I want to extract the data between <test> and &t=a.

The result I want is

1</test>id=1
a</test>id=2
3</Test>id=3

Can someone help me modify the code to make it work?

#Include <Array.au3>
$array = StringRegExp("<test>1</test>id=1&t=a <test>X<test>a</test>id=2&t=a <test>3</Test>id=3&t=a", "(?i)(?:>)([^<]+)(?:&t=a)" ,3)
_ArrayDisplay($array)


Try this.

#include <Array.au3>

Global $str = "<test>1</test>id=1&t=a <test>X<test>a</test>id=2&t=a <test>3</Test>id=3&t=a"

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)", "\1") ; Remove double "<test>".
;ConsoleWrite($str & @LF)

Global $array = StringRegExp($str, "(?i)(?<=<test>)(.+?)(?=&t=a)", 3) ; Capture the text preceded by "<test>" (lookbehind) and followed by "&t=a" (lookahead).
_ArrayDisplay($array)
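
For readers new to lookarounds, here is a minimal sketch of what the two assertions in that pattern do. The names $sample and $hits and the shortened test string are illustrative assumptions, not part of the original post.

```autoit
#include <Array.au3>

; Sketch only: $sample and $hits are illustrative names, not from the thread.
Global $sample = "<test>1</test>id=1&t=a <test>3</Test>id=3&t=a"

; (?<=<test>) is a lookbehind: "<test>" must sit immediately before the match.
; (?=&t=a)    is a lookahead:  "&t=a" must come immediately after the match.
; Neither assertion consumes text, so only the part in between is captured.
Global $hits = StringRegExp($sample, "(?i)(?<=<test>)(.+?)(?=&t=a)", 3)
_ArrayDisplay($hits) ; expected rows: 1</test>id=1 and 3</Test>id=3
```

Because the assertions are zero-width, the delimiters never appear in the returned array, which is why no extra trimming step is needed.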

Thanks a lot, it works. But how can I remove a result if its id is not a number?

e.g. <test>1</test>id=c&t=a

#4 ·  Posted (edited)

Try this. It returns the results where id is a non-digit.

#include <Array.au3>

Global $str = "<test>1</test>id=c&t=a <test>X<test>a</test>id=2&t=a <test>3</Test>id=3&t=a"

Global $array = StringRegExp($str, "(?i)(?<=<test>)(.+</.*?id=\D+)(?=&t=a)", 3)
_ArrayDisplay($array)

Or keep only the results where id is a digit:

#include <Array.au3>

Global $str = "<test>1</test>id=c&t=a <test>X<test>a</test>id=2&t=a <test>3</Test>id=3&t=a"

Global $array = StringRegExp($str, "(?i)(?<=<test>)([^<&]+?</[^&]+?>id=\d+?)(?=&t=a)", 3) ; Capture the text preceded by "<test>" (lookbehind) and followed by "&t=a" (lookahead), with a numeric id.
_ArrayDisplay($array)
Edited by Malkey

Thanks, that saves me a lot of time.

It works fine for me.

P.S. RegExp is so hard to understand.

This is the most relevant reply to your question in post #3.

It filters the results by removing any result whose id is not a number.

#include <Array.au3>

Global $str = "<test>1</test>id=c&t=a <test>X<test>a</test>id=2&t=a <test>3</Test>id=3&t=a"

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)", "\1") ; Remove double "<test>".
;ConsoleWrite($str & @LF)

Global $array = StringRegExp($str, "(?i)(?<=<test>)(.+?)(?=&t=a)", 3) ; Capture the text preceded by "<test>" (lookbehind) and followed by "&t=a" (lookahead).
_ArrayDisplay($array) ;  <- The results

; Remove the result if the id is not a number.
For $i = UBound($array) - 1 To 0 Step -1
    If StringRegExp($array[$i], "(id=\D+?)") Then _ArrayDelete($array, $i)
Next
_ArrayDisplay($array)

Thanks for your time. RegExp is still a deep mystery to me.

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)", "\1") ; Remove double "<test>".

One more question: how can I modify this line so that it removes two or more extra "<test>"?

Can someone help me?

#10 ·  Posted (edited)

This appears to work.

Global $str = "<test>1</test>id=c&t=a <test><test><test>a</test>id=2&t=a <test>6<test>3</Test>id=3&t=a"

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1") ; Remove double "<test>".
ConsoleWrite($str & @LF)

Edit: Modified the test string in order to remove 2 or more "<test>" as required.

To see the more complicated test string that used to be here, see post #12.

Edited by Malkey

This appears to work.

Global $str = "<test>1</test>id=c&t=a <test><test>X<test>a</test>id=2&t=a <test>6<test>3</Test>id=3&t=a"

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1") ; Remove double "<test>".
ConsoleWrite($str & @LF)

Thanks for the help, but it doesn't seem to work with this string:

Global $str = "<test>1</test>id=c&t=a <test><test><test><test>X<test>a</test>id=2&t=a <test>6<test>3</Test>id=3&t=a"

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1") ; Remove double "<test>".
ConsoleWrite($str & @LF)

This may return your desired result.

Global $str = "<test>1</test>id=c&t=a <test><test>X<test><test><test>X<test>a</test>id=2&t=a <test>6<test>6<test>3</Test>id=3&t=a"

#cs
While StringRegExp($str, "(<.+>)\1")
    $str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1")
WEnd
#ce

; or

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1")
$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1")

ConsoleWrite($str & @LF)

Thanks for the help, but there is a new problem: if the ids in $str are in ascending order, e.g. id=1 to id=4, it only returns the last result.

Global $str = "<test>1</test>id=1&t=a <test>1</test>id=2&t=a <test>1</test>id=3&t=a <test>1</test>id=4&t=a"

#cs
While StringRegExp($str, "(<.+>)\1")
    $str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1")
WEnd
#ce

; or

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1")
$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1")

ConsoleWrite($str & @LF)

The result I want is

<test>1</test>id=1&t=a

<test>1</test>id=2&t=a

<test>1</test>id=3&t=a

<test>1</test>id=4&t=a

instead of

<test>1</test>id=4&t=a

I just tested it: this does not only happen with ascending ids, it also happens in this case:

Global $str = "<test>1</test>id=1&t=a <test>1</test>id=12&t=a <test>1</test>id=33&t=a <test>1</test>id=4&t=a"

The While loop version works fine, though.
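
A note on why the two-pass replace collapses this string: in (<.+>), the greedy .+ can span several tags, so \1 can become the whole "<test>1</test>". Every later copy of it, together with the id=... text in between, is then replaced by a single copy. Below is a minimal sketch of one possible alternative, assuming it is acceptable to restrict the captured group to a single tag with [^<>]+. The pattern and the names $sNested, $sFlat and $sPattern are illustrative and not taken from the posts above.

```autoit
; Sketch only: the [^<>]+ pattern and the names $sNested, $sFlat, $sPattern are
; assumptions for illustration, not code from the thread.
Global $sNested = "<test>1</test>id=c&t=a <test><test>X<test><test><test>X<test>a</test>id=2&t=a <test>6<test>6<test>3</Test>id=3&t=a"
Global $sFlat = "<test>1</test>id=1&t=a <test>1</test>id=12&t=a <test>1</test>id=33&t=a <test>1</test>id=4&t=a"

; (<[^<>]+>) can capture one tag only, so \1 is never longer than e.g. "<test>".
; (?:[^<>]*\1)+ removes each later copy of that tag and the text between copies.
Global $sPattern = "(<[^<>]+>)(?:[^<>]*\1)+"

ConsoleWrite(StringRegExpReplace($sNested, $sPattern, "\1") & @LF)
; prints: <test>1</test>id=c&t=a <test>a</test>id=2&t=a <test>3</Test>id=3&t=a

ConsoleWrite(StringRegExpReplace($sFlat, $sPattern, "\1") & @LF)
; prints the input unchanged, because no tag is followed by another copy of itself
```

Without (?i) the backreference comparison is case-sensitive, so "<Test>" would not count as a copy of "<test>". Add (?i) to the pattern if mixed-case opening tags should be collapsed too.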
