Zedna

RegExp - return values in parentheses at end of string

9 posts in this topic

#1 ·  Posted (edited)

I have constants A, B and C. In reality they are not single characters but whole words,
and they can appear at the end of the test text in parentheses.
There can be just one of them, two of them, or all three, separated by commas in any order:
(A) or (A,B) or (C,A) or (A,B,C) or (A,C,B) ...
I need to get the text inside these parentheses.
 

; in comment at end of each line is what I want to get
Test1('some text') ; ''
Test1('some text (something)') ; ''
Test1('some (something) text') ; '' --> ignore other () not at the end
Test1('some (something) (A) text') ; '' --> I want only at end of string
Test1('some (something) text (A)') ; 'A'
Test1('some text (A)') ; A
Test1('some text (B)') ; B
Test1('some text (A,B,C)') ; 'A,B,C'
Test1('some text (A,C,B)') ; 'A,C,B'
Test1('some text (A,C,X)') ; 'A,C' --> not X
Test1('some text (ABC)') ; '' --> missing ,

Func Test1($text)
    $regexp = '.*? \(([A|B|C])\)'
    $ret = StringRegExpReplace($text, $regexp, '$1')
    ConsoleWrite('regexp: ' & $regexp & ' text: ' & $text & ' --> ' & $ret & @CRLF)

    $regexp = '\z\(([A|B|C])\)'
    $ret = StringRegExpReplace($text, $regexp, '$1')
    ConsoleWrite('regexp: ' & $regexp & ' text: ' & $text & ' --> ' & $ret & @CRLF)

    $regexp = '\z\(([A|B|C]{1,3})\)'
    $ret = StringRegExpReplace($text, $regexp, '$1')
    ConsoleWrite('regexp: ' & $regexp & ' text: ' & $text & ' --> ' & $ret & @CRLF)

    $regexp = '\z\(([A|B|C|,]{1,3})\)'
    $ret = StringRegExpReplace($text, $regexp, '$1')
    ConsoleWrite('regexp: ' & $regexp & ' text: ' & $text & ' --> ' & $ret & @CRLF)

    $regexp = '.*?\(([A|B|C]{1,3})\)'
    $ret = StringRegExp($text, $regexp, 3)
    If Not @error Then
        $ret = $ret[0]
    Else
        $ret = ''
    EndIf
    ConsoleWrite('not replace: regexp: ' & $regexp & ' text: ' & $text & ' --> ' & $ret & @CRLF)

    ConsoleWrite(@CRLF)
EndFunc

I don't know how to use \z (end of string) or how to incorporate the comma separator into my RegExp.
My RegExp attempts above do not work even for a single value.

Test1() should contain only one correct RegExp, but I left several in to show some of my attempts.

EDIT: fixed mistake in last RegExp

I hope that for RegExp gurus this will be very easy :-)

Edited by Zedna




A and B and C, in fact they are not one char but whole word

So you can't use a character class
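
To make the point concrete, here is a minimal sketch of the difference between a character class and an alternation. "Red" and "Blue" are hypothetical whole-word constants standing in for A, B and C; they are not from the original question.

```autoit
; Sketch only: "Red" and "Blue" are made-up stand-ins for the real words.
Local $aTests[3] = ["Red", "|", "e"]
Local $bClass, $bAlt
For $i = 0 To UBound($aTests) - 1
    ; [Red|Blue] is a set of single characters: R, e, d, |, B, l, u
    $bClass = StringRegExp($aTests[$i], '^[Red|Blue]$')
    ; (?:Red|Blue) matches exactly one of the two whole words
    $bAlt = StringRegExp($aTests[$i], '^(?:Red|Blue)$')
    ConsoleWrite($aTests[$i] & ' --> class: ' & $bClass & ', alternation: ' & $bAlt & @CRLF)
Next
```

The class accepts the single characters "|" and "e" but rejects the word "Red", while the alternation does the opposite. Inside square brackets the "|" is just another literal character, which is why the '[A|B|C]' attempts only ever matched one character.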

Try this :)

#Include <Array.au3>

; in comment at end of each line is what I want to get
Test1('some text') ; ''
Test1('some text (something)') ; ''
Test1('some (something) text') ; '' --> ignore other () not at the end
Test1('some (something) (A) text') ; '' --> I want only at end of string
Test1('some (something) text (A)') ; 'A'
Test1('some text (A)') ; A
Test1('some text (B)') ; B
Test1('some text (A,B,C)') ; 'A,B,C'
Test1('some text (A,C,B)') ; 'A,C,B'
Test1('some text (A,C,X)') ; 'A,C' --> not X
Test1('some text (ABC)') ; '' --> missing ,

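Func Test1($text)
   Local $ret, $aret
   ; Step 1: keep only a parenthesized group that ends the string
   $ret = StringRegExpReplace($text, '(?x).*(  \(.*?\)$  )', "$1")
   If @extended = 0 Then $ret = "" ; no replacement was made, so there is no trailing (...)
   ; Step 2: collect each A, B or C that sits between "(" or "," and "," or ")"
   $aret = StringRegExp($ret, '(?x)  (?<=\(|,)(A|B|C)(?=,|\))  ', 3)
   $ret = _ArrayToString($aret, ",")
   If @error Then $ret = "" ; $aret is not an array when nothing matched
   Msgbox(0,"", "text: " & $text & @crlf & "result: " & $ret)
EndFunc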


#3 ·  Posted (edited)

Sub-question about using \z (end of string) in StringRegExp()

 

I really don't know where and how to place \z in a RegExp pattern.

 

Here is a simpler version of my previous example:

I just want to get the text contained in parentheses, but only when the parentheses are at the end of the string.

Test1('some text') ; ''
Test1('some text (something)') ; 'something'
Test1('some (something) text') ; '' --> ignore other () not at the end ==> here my RegExp returns 'something' because lack of \z
Test1('some (something) (A) text') ; '' --> I want only at end of string
Test1('some (something) text (A)') ; 'A'
Test1('some text (A)') ; A
Test1('some text (B)') ; B
Test1('some text (A,B,C)') ; 'A,B,C'
Test1('some text (A,C,B)') ; 'A,C,B'
Test1('some text (A,C,X)') ; 'A,C,X'
Test1('some text (ABC)') ; 'ABC'

Func Test1($text)
    $regexp = '.*?\((.*?)\)'
    $ret = StringRegExp($text, $regexp, 3)
    If Not @error Then
        $j = UBound($ret) - 1
        $ret = $ret[$j]
    Else
        $ret = ''
    EndIf
    ConsoleWrite('regexp: ' & $regexp & ' text: ' & $text & ' --> ' & $ret & @CRLF)
EndFunc
Output:

regexp: .*?\((.*?)\) text: some text --> 
regexp: .*?\((.*?)\) text: some text (something) --> something
regexp: .*?\((.*?)\) text: some (something) text --> something
regexp: .*?\((.*?)\) text: some (something) (A) text --> A
regexp: .*?\((.*?)\) text: some (something) text (A) --> A
regexp: .*?\((.*?)\) text: some text (A) --> A
regexp: .*?\((.*?)\) text: some text (B) --> B
regexp: .*?\((.*?)\) text: some text (A,B,C) --> A,B,C
regexp: .*?\((.*?)\) text: some text (A,C,B) --> A,C,B
regexp: .*?\((.*?)\) text: some text (A,C,X) --> A,C,X
regexp: .*?\((.*?)\) text: some text (ABC) --> ABC
 

Please help me to put \z correctly into my RegExp so that I also get the correct result for the third test text.

Thanks.

Edited by Zedna


#4 ·  Posted (edited)

Didn't you look at my code?

In the first regex used, you can replace $ with \z and get nearly the same as yours

Example

#Include <Array.au3>

Global $words[] = [3, "A", "B", "C"]

Test1('some text', $words) ; ''
Test1('some text (something)', $words) ; ''
Test1('some (something) text', $words) ; '' --> ignore other () not at the end
Test1('some (something) (A) text', $words) ; '' --> I want only at end of string
Test1('some (something) text (A)', $words) ; 'A'
Test1('some text (A)', $words) ; A
Test1('some text (B)', $words) ; B
Test1('some text (A,B,C)', $words) ; 'A,B,C'
Test1('some text (A,C,B)', $words) ; 'A,C,B'
Test1('some text (A,C,X)', $words) ; 'A,C' --> not X
Test1('some text (ABC)', $words) ; '' --> missing ,

Func Test1($text, $wds)
   Local $ret, $aret, $pattern = '('
   ; Build the alternation "(A|B|C)" from the word list; $wds[0] holds the word count
   For $i = 1 to $wds[0]-1
      $pattern &= $wds[$i] & '|'
   Next
   $pattern &= $wds[$wds[0]] & ')'

   ; Keep only a parenthesized group that ends the string (\z)
   $ret = StringRegExpReplace($text, '(?x).*(  \(.*?\)\z  )', "$1")
   If @extended = 0 Then $ret = ""
   $aret = StringRegExp($ret, '(?x)  (?<=\(|,)' & $pattern & '(?=,|\))  ', 3)
   $ret = _ArrayToString($aret, ",")
   If @error Then $ret = "" ; $aret is not an array when nothing matched
   Msgbox(0,"", "text: " & $text & @crlf & "result: " & $ret)
EndFunc

Edit

In your code it should be like this (w/o the first question mark)

$regexp = '.*\((.*?)\)\z'
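
A small sketch of why the first question mark matters once \z is added, using one of the test strings from this thread. The variable names are only illustrative.

```autoit
; Sketch only: compare a lazy and a greedy prefix in front of the same \z-anchored group.
Local $sText = 'some (something) text (A)'

; Lazy prefix: the match starts at the FIRST "(" and is stretched until ")" meets \z
Local $aLazy = StringRegExp($sText, '.*?\((.*?)\)\z', 3)
If Not @error Then ConsoleWrite('lazy:   ' & $aLazy[0] & @CRLF)

; Greedy prefix: .* runs to the end and backtracks to the LAST "(" that still allows a match
Local $aGreedy = StringRegExp($sText, '.*\((.*?)\)\z', 3)
If Not @error Then ConsoleWrite('greedy: ' & $aGreedy[0] & @CRLF)
```

With the lazy prefix the captured text should be "something) text (A", because the group opens at the first parenthesis; with the greedy prefix it should be just "A".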
Edited by mikell


#6 ·  Posted (edited)

This one was not so difficult

The first regex grabs the final parenthesized group; you already know it because it's the same as yours

The 2nd one

(?<=\(|,)(A|B|C)(?=,|\))

means :

(?<=\(|,) preceded by an opening parenthesis or a comma

(A|B|C) alternation in a capturing group

(?=,|\)) followed by a comma or a closing parenthesis
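
The breakdown above can be tried out with whole words instead of single letters. This is a minimal sketch; "Red", "Green", "Blue" and "Pink" are hypothetical words, not the real constants from the question.

```autoit
#include <Array.au3>

; Sketch only: the words below are made-up stand-ins for A, B and C.
Local $sPattern = '(?<=\(|,)(Red|Green|Blue)(?=,|\))'

; "Pink" is not in the alternation, so it is skipped
Local $aHits = StringRegExp('(Red,Blue,Pink)', $sPattern, 3)
If Not @error Then ConsoleWrite(_ArrayToString($aHits, ',') & @CRLF)

; Without commas neither word has the required delimiter on both sides
$aHits = StringRegExp('(RedBlue)', $sPattern, 3)
If @error Then ConsoleWrite('no match' & @CRLF)
```

The first call should print "Red,Blue" and the second "no match", which mirrors the 'A,C' and '' results asked for in the test list.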

Edited by mikell


The RegExp patterns in this example use "\z", an end-of-string anchor.

Test1('some text') ; ''
Test1('some text (something)') ; 'something'
Test1('some (something) text') ; '' --> ignore other () not at the end ==> here my RegExp returns 'something' because lack of \z
Test1('some (something) (A) text') ; '' --> I want only at end of string
Test1('some (something) text (A)') ; 'A'
Test1('some text (A)') ; A
Test1('some text (B)') ; B
Test1('some text (A,B,C)') ; 'A,B,C'
Test1('some text (A,C,B)') ; 'A,C,B'
Test1('some text (A,C,X)') ; 'A,C,X'
Test1('some text (ABC)') ; 'ABC'

Func Test1($text)
    ;$regexp = '(?<=\()([^()]+)(?=\)\z)' ; A modified version of mikell's expression, using look-behind and look-ahead assertions, also works.
    
    $regexp = '\(([^()]+)\)\z' ; \z, \Z, or $ could be used. They all behave the same here because none of the test strings contain vertical
    ; whitespace (line feed characters).  Even if "(?m)" was present, the end of the string, "\z" or "\Z", would be the same as the end of line, "$".
    ; The \z, \Z, or $ anchors the closing parenthesis, and with it the corresponding opening parenthesis, to the end of the string.
    
    $ret = StringRegExp($text, $regexp, 3)
    If Not @error Then
        $ret = $ret[0] ; There can be only one element in the array: the characters (excluding any parentheses) inside the
        ;                parentheses at the end of the string.
    Else
        $ret = ''
    EndIf
    ConsoleWrite('regexp: ' & $regexp & ' text: ' & $text & ' --> ' & $ret & @CRLF)
EndFunc   ;==>Test1

#cs Returns:-
regexp: \(([^()]+)\)\z text: some text -->
regexp: \(([^()]+)\)\z text: some text (something) --> something
regexp: \(([^()]+)\)\z text: some (something) text -->
regexp: \(([^()]+)\)\z text: some (something) (A) text -->
regexp: \(([^()]+)\)\z text: some (something) text (A) --> A
regexp: \(([^()]+)\)\z text: some text (A) --> A
regexp: \(([^()]+)\)\z text: some text (B) --> B
regexp: \(([^()]+)\)\z text: some text (A,B,C) --> A,B,C
regexp: \(([^()]+)\)\z text: some text (A,C,B) --> A,C,B
regexp: \(([^()]+)\)\z text: some text (A,C,X) --> A,C,X
regexp: \(([^()]+)\)\z text: some text (ABC) --> ABC
#ce
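
The three anchors only behave differently when the string ends with a line break. The following is a minimal sketch of that case, assuming AutoIt's default newline handling (no (*CR), (*LF) or similar option set); the test string is one from this thread with @CRLF appended.

```autoit
; Sketch only: apply each end anchor to a string that ends with a line break.
Local $sText = 'some text (A)' & @CRLF
Local $aAnchors[3] = ['\z', '\Z', '$']
For $i = 0 To UBound($aAnchors) - 1
    ; StringRegExp without a flag returns 1 on a match and 0 otherwise
    ConsoleWrite($aAnchors[$i] & ' --> ' & StringRegExp($sText, '\(([^()]+)\)' & $aAnchors[$i]) & @CRLF)
Next
```

Under that assumption I would expect 0 for \z and 1 for both \Z and $, because \z insists on the very end of the string while \Z and $ also accept the position just before a final line break. So \z is the strict choice if a trailing newline should count as "text after the parentheses".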
