Regular expression - capture a character when "escaped", split otherwise

therks

I'm looking for a regex genius, because I'm stumped when it comes to assertions.

What I have now is this regular expression: ([^|=]+)=([^|]+)
It takes a string (user input) of key=value pairs separated by pipes (e.g. "param=value|param=value") and splits them into an array.

Example:

$vParamData = 'example=value|fruit=apple|phrase=Hello world'
$aRegEx = StringRegExp($vParamData, '([^|=]+)=([^|]+)', 3)

; Result
;   [0] => example
;   [1] => value
;   [2] => fruit
;   [3] => apple
;   [4] => phrase
;   [5] => Hello world

So that's working fine, but I'm wondering if there's also a way I could have this capture escaped pipes instead of splitting by them.

For example:

$vParamData = 'pipe test=this \| is a pipe|example=value'
$aRegEx = StringRegExp($vParamData, '([^|=]+)=([^|]+)', 3)

; I'm getting this:
;   [0] => pipe test
;   [1] => this \
;   [2] => example
;   [3] => value

; But I'd like a result like this:
;   [0] => pipe test
;   [1] => this \| is a pipe
;   [2] => example
;   [3] => value

Is there some pattern that would accomplish this, or am I better off parsing it some other way?

iamtheky

There are more efficient ways to do the individual pieces, but here is a replacement method that simply hides the escaped pipes, splits on the remaining ones, and then puts everything back together:

#include<array.au3>

$vParamData = 'pipe test=this \| is a pipe|example=value'
$aData = stringsplit(stringreplace(_ArrayToString(stringsplit(stringreplace($vParamData , "\|" , "\*") , "|" , 2) , "=") , "\*" , "\|") , "=" , 2)
_ArrayDisplay($aData)
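
Unpacked into separate steps, the one-liner above reads roughly as follows. This is only a sketch of the same idea: the intermediate variable names ($sMasked, $aPairs, $sJoined) are made up for illustration, and it assumes that neither "=" nor the stand-in sequence "\*" ever appears inside a key or value.

```autoit
#include <Array.au3>

Local $vParamData = 'pipe test=this \| is a pipe|example=value'

; 1. Hide the escaped pipes behind a stand-in sequence ("\*", as in the one-liner).
Local $sMasked = StringReplace($vParamData, "\|", "\*")

; 2. Split on the remaining (real) pipes; flag 2 leaves the count out of element 0.
Local $aPairs = StringSplit($sMasked, "|", 2)

; 3. Join the pairs with "=" so keys and values are all separated by the same character.
Local $sJoined = _ArrayToString($aPairs, "=")

; 4. Restore the escaped pipes, then split on "=" to get key, value, key, value, ...
Local $aData = StringSplit(StringReplace($sJoined, "\*", "\|"), "=", 2)

_ArrayDisplay($aData)
```

If a value can legitimately contain "=", step 4 would split it in two, so this approach probably only suits input where that cannot happen.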

 

BugFix
#include <Array.au3> ; needed for _ArrayDisplay

; Mask the pipe with a character combination that cannot occur in the data, e.g. "@#"
$vParamData = 'pipe test=this @# is a pipe|example=value'
$aRegEx = StringRegExp($vParamData, '([^|=]+)=([^|]+)', 3)
For $i = 0 To UBound($aRegEx) - 1
    $aRegEx[$i] = StringReplace($aRegEx[$i], '@#', '|') ; now put the pipe back
Next

_ArrayDisplay($aRegEx)
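
If the user input already arrives with "\|" as the escape, the same masking idea can be applied by swapping the escape for a placeholder before matching and swapping it back afterwards. A minimal sketch, assuming Chr(1) never appears in real input (the $sPlaceholder and $sMasked names are hypothetical):

```autoit
#include <Array.au3>

; Assumption: this control character never occurs in the user's input.
Local $sPlaceholder = Chr(1)
Local $vParamData = 'pipe test=this \| is a pipe|example=value'

; Hide every escaped pipe so the original pattern no longer sees it as a separator.
Local $sMasked = StringReplace($vParamData, '\|', $sPlaceholder)

Local $aRegEx = StringRegExp($sMasked, '([^|=]+)=([^|]+)', 3)
For $i = 0 To UBound($aRegEx) - 1
    ; Put the escaped pipe back exactly as the user typed it.
    $aRegEx[$i] = StringReplace($aRegEx[$i], $sPlaceholder, '\|')
Next

_ArrayDisplay($aRegEx)
```

Replacing with '|' instead of '\|' in the loop would hand back the unescaped value, if that is what the rest of the script expects.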

 

iamtheky

You might be able to pull it off by reusing the capture group:

$vParamData = 'pipe test=this \| is a pipe|example=value'
msgbox(0, '' , stringreplace(stringregexpreplace($vParamData , "(\\\|)|(=)|(\|)" , "$1" & @LF  ) , "\|" & @LF , "\|"))

 

And here it is without the additional StringReplace, although it becomes much more fragile (it works for this specific case):

$vParamData = 'pipe test=this \| is a pipe|example=value'

msgbox(0, '' , stringregexpreplace($vParamData , "(\\\|.*)\||=|(=)|(\|)" , "$1" & @LF ))

 

jchd

Another skin:

#include <Array.au3> ; needed for _ArrayDisplay

$vParamData = 'pipe test=this \| is a pipe|example=value|another\|example=five|last \|one = one\|two\|three'
$aRegEx = StringRegExp($vParamData, '(.+?)=((?:\\\||[^\\|])+)(?:\||$)', 3)
_ArrayDisplay($aRegEx)
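
For anyone trying to read that pattern, here it is assembled piece by piece with a comment on each part. The pieces concatenate to exactly the pattern above; the $sPattern variable is only there for illustration.

```autoit
#include <Array.au3>

Local $sPattern = ''
$sPattern &= '(.+?)='      ; key: as few characters as possible, up to an "="
$sPattern &= '('           ; value (captured), made of one or more of:
$sPattern &= '(?:\\\|'     ;   an escaped pipe, i.e. a backslash followed by "|"
$sPattern &= '|[^\\|])+'   ;   or any character that is neither a backslash nor a pipe
$sPattern &= ')'
$sPattern &= '(?:\||$)'    ; the pair ends at an unescaped pipe or at the end of the string

Local $vParamData = 'pipe test=this \| is a pipe|example=value'
Local $aRegEx = StringRegExp($vParamData, $sPattern, 3)
_ArrayDisplay($aRegEx)
```

Read this way, a backslash inside a value is only accepted when a pipe follows it. If other escapes need to pass through, swapping '\\\|' for '\\.' (a backslash followed by any character) might be one way to loosen it, though that variant is not from the original post.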

 


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

mikell

:)

#Include <Array.au3>
$vParamData = 'pipe test=this \| is a pipe|example=value|another\|example=five|last \|one = one\|two\|three'
$aRegEx = StringSplit(StringRegExpReplace($vParamData, '(?<!\\)\|', '='), "=", 3)
_ArrayDisplay($aRegEx)
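
The part doing the work here is the negative lookbehind (?<!\\), which lets a "|" match only when the character just before it is not a backslash. A small sketch that shows the intermediate string on a shorter input (the $sUnified name is hypothetical):

```autoit
#include <Array.au3>

Local $vParamData = 'pipe test=this \| is a pipe|example=value'

; Turn every unescaped "|" into "=", leaving "\|" untouched.
Local $sUnified = StringRegExpReplace($vParamData, '(?<!\\)\|', '=')
ConsoleWrite($sUnified & @CRLF) ; pipe test=this \| is a pipe=example=value

; With a single separator left, one plain split yields key, value, key, value, ...
Local $aRegEx = StringSplit($sUnified, '=', 2)
_ArrayDisplay($aRegEx)
```

As with any approach that ends in a split on "=", a value that itself contains "=" would be cut in two, so this assumes that case does not occur.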

 

jchd

:evil:

Real programmers use a single function call, quiche eaters use two.
© Ed Post. https://www.ee.ryerson.ca/~elf/hack/realmen.html

mikell

Real Programmers aren't afraid to use GOTOs. :P
So I assume that these 1982 scientists would actually use function calls in AutoIt instead of GOTOs.

BTW I'm a fake programmer indeed. Does \Q..\E mean literal quiche eater ? :huh2:

jchd

Computed indirect GOTOs, preferably.

