Help needed in StringRegExp


ASut

Hi, it's me again.

I want to extract the text between <test> and &t=a.

The result I want is

1</test>id=1
a</test>id=2
3</Test>id=3

Can someone help me modify the code to make it work?

#Include <Array.au3>
$array = StringRegExp("<test>1</test>id=1&t=a <test>X<test>a</test>id=2&t=a <test>3</Test>id=3&t=a", "(?i)(?:>)([^<]+)(?:&t=a)" ,3)
_ArrayDisplay($array)

Try this.

#include <Array.au3>

Global $str = "<test>1</test>id=1&t=a <test>X<test>a</test>id=2&t=a <test>3</Test>id=3&t=a"

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)", "\1") ; Remove double "<test>".
;ConsoleWrite($str & @LF)

Global $array = StringRegExp($str, "(?i)(?<=<test>)(.+?)(?=&t=a)", 3) ; Capture the text preceded by "<test>" (lookbehind) and followed by "&t=a" (lookahead).
_ArrayDisplay($array)
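
As a rough illustration of why the original pattern falls short (a sketch, not part of the posted solution): [^<]+ can never cross the "<" of "</test>", so a match can only begin after the closing tag and captures just the id. The lookbehind/lookahead version anchors on "<test>" and "&t=a" without consuming either of them. The variable names $sInput, $aOld and $aNew are made up for this example.

```autoit
#include <Array.au3>

; Input with the doubled "<test>" already removed.
Global $sInput = "<test>1</test>id=1&t=a <test>a</test>id=2&t=a <test>3</Test>id=3&t=a"

; Original attempt: [^<]+ stops at the "<" of "</test>", so only the id part is captured.
Global $aOld = StringRegExp($sInput, "(?i)(?:>)([^<]+)(?:&t=a)", 3)
ConsoleWrite(_ArrayToString($aOld, ", ") & @LF) ; id=1, id=2, id=3

; Lookbehind for "<test>" and lookahead for "&t=a": neither is part of the captured text.
Global $aNew = StringRegExp($sInput, "(?i)(?<=<test>)(.+?)(?=&t=a)", 3)
ConsoleWrite(_ArrayToString($aNew, ", ") & @LF) ; 1</test>id=1, a</test>id=2, 3</Test>id=3
```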

Thanks a lot, it works. But how can I remove a result if the id is not a number?

e.g. <test>1</test>id=c&t=a

Try this, where id=(a non-digit).

#include <Array.au3>

Global $str = "<test>1</test>id=c&t=a <test>X<test>a</test>id=2&t=a <test>3</Test>id=3&t=a"

Global $array = StringRegExp($str, "(?i)(?<=<test>)(.+</.*?id=\D+)(?=&t=a)", 3)
_ArrayDisplay($array)

Or, keep the results when id=(a digit)

#include <Array.au3>

Global $str = "<test>1</test>id=c&t=a <test>X<test>a</test>id=2&t=a <test>3</Test>id=3&t=a"

Global $array = StringRegExp($str, "(?i)(?<=<test>)([^<&]+?</[^&]+?>id=\d+?)(?=&t=a)", 3) ; Capture the text preceded by "<test>" (lookbehind) and followed by "&t=a" (lookahead), only when id= is followed by digits.
_ArrayDisplay($array)
Edited by Malkey

Thanks, that saves me a lot of time.

It works fine for me.

P.S. RegExp is so hard to understand.

This is the most relevant reply to your question in post #3.

The results are altered by removing any result whose id is not a number.

#include <Array.au3>

Global $str = "<test>1</test>id=c&t=a <test>X<test>a</test>id=2&t=a <test>3</Test>id=3&t=a"

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)", "\1") ; Remove double "<test>".
;ConsoleWrite($str & @LF)

Global $array = StringRegExp($str, "(?i)(?<=<test>)(.+?)(?=&t=a)", 3) ; Capture the text preceded by "<test>" (lookbehind) and followed by "&t=a" (lookahead).
_ArrayDisplay($array) ;  <- The results

; Remove the result if the id is not a number.
For $i = UBound($array) - 1 To 0 Step -1
    If StringRegExp($array[$i], "(id=\D+?)") Then _ArrayDelete($array, $i)
Next
_ArrayDisplay($array)

Thanks for your time. RegExp is still a deep mystery to me.

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)", "\1") ; Remove double "<test>".

One more question: how can I modify this line to remove two or more "<test>"?

This appears to work.

Global $str = "<test>1</test>id=c&t=a <test><test><test>a</test>id=2&t=a <test>6<test>3</Test>id=3&t=a"

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1") ; Remove double "<test>".
ConsoleWrite($str & @LF)

Edit: Modified the test string in order to remove 2 or more "<test>" as required.

To see the more complicated test string that used to be here see post #12.

Edited by Malkey

This appears to work.

Global $str = "<test>1</test>id=c&t=a <test><test>X<test>a</test>id=2&t=a <test>6<test>3</Test>id=3&t=a"

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1") ; Remove double "<test>".
ConsoleWrite($str & @LF)

Thanks for the help, but it doesn't seem to work.

Global $str = "<test>1</test>id=c&t=a <test><test><test><test>X<test>a</test>id=2&t=a <test>6<test>3</Test>id=3&t=a"

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1") ; Remove double "<test>".
ConsoleWrite($str & @LF)

This may return your desired result.

Global $str = "<test>1</test>id=c&t=a <test><test>X<test><test><test>X<test>a</test>id=2&t=a <test>6<test>6<test>3</Test>id=3&t=a"

#cs
While StringRegExp($str, "(<.+>)\1")
    $str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1")
WEnd
#ce

; or

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1")
$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1")

ConsoleWrite($str & @LF)

Thanks for the help, but there is a new problem: if the ids in $str are in ascending order, e.g. id=1 to id=4, it only returns the last result.

Global $str = "<test>1</test>id=1&t=a <test>1</test>id=2&t=a <test>1</test>id=3&t=a <test>1</test>id=4&t=a"

#cs
While StringRegExp($str, "(<.+>)\1")
    $str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1")
WEnd
#ce

; or

$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1")
$str = StringRegExpReplace($str, "(<.+>)([^<>]*\1)+", "\1")

ConsoleWrite($str & @LF)

The result I want is

<test>1</test>id=1&t=a

<test>1</test>id=2&t=a

<test>1</test>id=3&t=a

<test>1</test>id=4&t=a

instead of

<test>1</test>id=4&t=a

I just tested it: this does not only happen with ascending ids, it also happens in this case:

Global $str = "<test>1</test>id=1&t=a <test>1</test>id=12&t=a <test>1</test>id=33&t=a <test>1</test>id=4&t=a"

But with the While loop it works fine.
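
One likely explanation, offered as a sketch rather than a confirmed diagnosis: in "(<.+>)([^<>]*\1)+" the greedy ".+" lets group 1 grow to "<test>1</test>", and because every record starts with that same text, the repeated part swallows everything up to the last record. The While loop is unaffected because its condition "(<.+>)\1" needs two identical pieces back to back, which never occurs in this string. Restricting group 1 to a single tag with "[^<>]+" is one possible fix. The pattern and the names $sPattern, $sPlain and $sNested below are assumptions for illustration, not something posted in the thread.

```autoit
; Group 1 may only be one tag such as "<test>", so it can no longer span a whole record.
Global $sPattern = "(<[^<>]+>)(?:[^<>]*\1)+"

; No repeated opening tags: nothing should be replaced.
Global $sPlain = "<test>1</test>id=1&t=a <test>1</test>id=12&t=a <test>1</test>id=33&t=a <test>1</test>id=4&t=a"
ConsoleWrite(StringRegExpReplace($sPlain, $sPattern, "\1") & @LF) ; prints $sPlain unchanged

; Repeated opening tags: each run collapses to a single "<test>".
Global $sNested = "<test>1</test>id=c&t=a <test><test>X<test>a</test>id=2&t=a <test>6<test>3</Test>id=3&t=a"
ConsoleWrite(StringRegExpReplace($sNested, $sPattern, "\1") & @LF)
; <test>1</test>id=c&t=a <test>a</test>id=2&t=a <test>3</Test>id=3&t=a
```

With this pattern a single StringRegExpReplace call handles a whole run of repeated tags, so neither the loop nor the second call should be needed for these inputs.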
