Split string with StringRegExpReplace


atvaxn
Hi everybody,

I am new here and I have a little question for a simple problem:

I would like to quickly access a specific token, without using stringsplit()

With this code, I can get the first token, [AA]:

$sTOKENFULL = "[AA]__(BB)__{CC}__#DD#"
$sTOKEN1 = StringRegExpReplace($sTOKENFULL, '__.*', '')
MsgBox(64, "", $sTOKEN1)

But how can I get the second, third, or fourth token?

I hope you understand what I mean.

Thank you all in advance

Link to comment
Share on other sites

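#include <Array.au3> ; needed for _ArrayDisplay

$sTOKENFULL = "[AA]__(BB)__{CC}__#DD#"
; Flag 3 ($STR_REGEXPARRAYGLOBALMATCH) returns every captured token in one array
$aTOKENs = StringRegExp($sTOKENFULL, '(.+?)(?:__|$)', 3)
_ArrayDisplay($aTOKENs)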

But why not StringSplit?

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

6 minutes ago, jchd said:

But why not StringSplit ?

I would also prefer StringSplit :

#include <Array.au3>
#include <StringConstants.au3>

Global $g_sTokenFull  = '[AA]__(BB)__{CC}__#DD#'
Global $g_sTokenDelim = '__'
Global $g_aTokenArr   = StringSplit ($g_sTokenFull, $g_sTokenDelim, $STR_ENTIRESPLIT)

; Display results :
_ArrayDisplay($g_aTokenArr, 'Token Array')

ConsoleWrite('> Token 1 = ' & $g_aTokenArr[1] & @CRLF)
ConsoleWrite('> Token 2 = ' & $g_aTokenArr[2] & @CRLF)
ConsoleWrite('> Token 3 = ' & $g_aTokenArr[3] & @CRLF)
ConsoleWrite('> Token 4 = ' & $g_aTokenArr[4] & @CRLF)

"In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move."

Link to comment
Share on other sites

Jeez, my attempt at a regexp generator is definitely a pile of sh*t: it would take a much more advanced approach than the one it currently uses to produce simple and fast patterns. Inferring a suitable pattern from a given {subject, expected result} pair requires too much high-level logic, close to what the current hype calls "AI". I'm giving up on this idea.

I didn't take the time to think about that pattern myself.

Link to comment
Share on other sites

4 hours ago, atvaxn said:

without using stringsplit()

Regular expressions are very powerful but can also be a burden if you don't use them regularly. As you can see, the solutions from @Marc and @jchd already differ from each other.

If you are new to this topic, you may be better off avoiding it; otherwise you may not be able to understand or extend your own code three months later. Don't use regular expressions as an end in themselves or because they look cool. There is nothing wrong with plain string operations when they serve the purpose.

"Just my 2 cents" ;)

Link to comment
Share on other sites

hey everybody,

wow, what a great forum. Thank you for so many answers :)
All of them work really great.

Still, is there any way to make it a one-liner?

for example:

MsgBox(64, "", StringRegExpReplace("[AA]__(BB)__{CC}__#DD#", '__.*', ''))

This compact oneliner displays "[AA]" with just one command.
Now I want it to display the second token "(BB)" or the third... and so on.

I actually use StringSplit a lot and love it too, but now I want to understand whether it is also possible to "regex" it without splitting the string into an array.
Just for the sake of knowledge :)

Thanks, all

Link to comment
Share on other sites

the last snippet does the job I want.

Unfortunately, it counts the tokens from the right to the left.

; shows [AA]__(BB)__{CC}
MsgBox(64, "", StringRegExpReplace("[AA]__(BB)__{CC}__#DD#", '__[^__]*$', ''))


; shows [AA]__(BB)
MsgBox(64, "", StringRegExpReplace(StringRegExpReplace("[AA]__(BB)__{CC}__#DD#", '__[^__]*$', ''), '__[^__]*$', ''))


; shows the THIRD token, from right to left (BB)
MsgBox(64, "", StringRegExpReplace(StringRegExpReplace(StringRegExpReplace("[AA]__(BB)__{CC}__#DD#", '__[^__]*$', ''), '__[^__]*$', ''), '.*__', ''))

What pattern do I have to use to make it count from left to right?
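One possible way to count from left to right with a single StringRegExpReplace call is to skip the first N-1 "token plus delimiter" groups, capture the next token, and replace the whole string with that capture. The sketch below is only an illustration: `_NthToken` is a hypothetical helper name, not a built-in function, and it assumes the delimiter does not itself contain `\E`.

```autoit
; Hypothetical helper (not part of AutoIt): returns the $iN-th token, counted from the left.
Func _NthToken($sSubject, $iN, $sDelim = "__")
    If $iN < 1 Then Return SetError(2, 0, "")
    ; \Q...\E makes the regex engine treat the delimiter as literal text.
    Local $sQ = "\Q" & $sDelim & "\E"
    ; Skip ($iN - 1) lazily matched "token + delimiter" groups, capture the next token,
    ; then swallow the remainder so the whole subject is replaced by group 1.
    Local $sPattern = "(?s)^(?:.*?" & $sQ & "){" & ($iN - 1) & "}(.*?)(?:" & $sQ & ".*)?$"
    Local $sToken = StringRegExpReplace($sSubject, $sPattern, "$1")
    ; @extended holds the number of replacements; 0 means there is no such token.
    If @error Or @extended = 0 Then Return SetError(1, 0, "")
    Return $sToken
EndFunc

ConsoleWrite(_NthToken("[AA]__(BB)__{CC}__#DD#", 2) & @CRLF) ; (BB)
ConsoleWrite(_NthToken("[AA]__(BB)__{CC}__#DD#", 4) & @CRLF) ; #DD#

; The same idea as a one-liner for the second token:
ConsoleWrite(StringRegExpReplace("[AA]__(BB)__{CC}__#DD#", '^(?:.*?__){1}(.*?)(?:__.*)?$', '$1') & @CRLF) ; (BB)
```

Without the @extended check, asking for a token that does not exist would silently return the original string, because StringRegExpReplace leaves the subject unchanged when the pattern does not match.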

Link to comment
Share on other sites

Just list the delimiter characters inside [^ ] like this:

#include <Array.au3> ; _ArrayToString
#include <MsgBoxConstants.au3> ; $MB_SYSTEMMODAL
MsgBox ($MB_SYSTEMMODAL,"",_ArrayToString (StringRegExp("[AA]__(BB)||{CC};;#DD#", '[^_|;]+',3),", ")) ; show all 4
MsgBox ($MB_SYSTEMMODAL,"",StringRegExp("[AA]__(BB)||{CC};;#DD#", '[^_|;]+',3)[2]) ; show third

 

Link to comment
Share on other sites

Thank you, that's also useful to know :)

But what I actually mean is how to split on an exact delimiter sequence.

For example:
String: "[AA]_(BB)__{CC}___#DD#"
Delimiter: "___" (exactly three underscores)
Token1: should be "[AA]_(BB)__{CC}"
Token2: should be "#DD#"

It must be something like [^__{3}]+ but it doesn't work.

Link to comment
Share on other sites

Use jchd's solution:

#include <Array.au3> ; _ArrayToString
#include <MsgBoxConstants.au3> ; $MB_SYSTEMMODAL
MsgBox ($MB_SYSTEMMODAL,"",_ArrayToString (StringRegExp("[AA]__(BB)||{CC};;#DD#", '(.+?)(?:$|_{2}|;{2}|\|{2})',3),", ")) ; show all 4
MsgBox ($MB_SYSTEMMODAL,"",_ArrayToString (StringRegExp("[AA]_(BB)__{CC}___#DD#", '(.+?)(?:$|_{3})',3),", ")) ; show both tokens
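For anyone wondering why [^__{3}]+ does not behave as hoped: everything inside [ ] is a character class, so the quantifier {3} is not applied there. The class simply excludes the individual characters "_", "{", "3" and "}". A small side-by-side sketch (the variable names are only for illustration, and output goes to the console instead of a MsgBox):

```autoit
#include <Array.au3>

Local $sSubject = "[AA]_(BB)__{CC}___#DD#"

; Inside brackets, "__{3}" is just a set of characters to exclude: _ { 3 }
Local $aClass = StringRegExp($sSubject, '[^__{3}]+', 3)
ConsoleWrite(_ArrayToString($aClass, " | ") & @CRLF) ; [AA] | (BB) | CC | #DD#

; Outside brackets, _{3} means "three underscores in a row"
Local $aExact = StringRegExp($sSubject, '(.+?)(?:_{3}|$)', 3)
ConsoleWrite(_ArrayToString($aExact, " | ") & @CRLF) ; [AA]_(BB)__{CC} | #DD#
```

Note how the first pattern also eats the braces around CC, since "{" and "}" ended up in the excluded set.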

 

Link to comment
Share on other sites

Thanks @Nine

It works perfectly. I am really happy that my code works like a charm now :)
This is a really great forum. I'm sure it won't be my last question :D

1 hour ago, mikell said:

For the fun :>

But helpful fun :) It also helps me understand regex a little better. Thanks

Link to comment
Share on other sites

Just a fair warning

Accessing arrays like this is nice and concise, and fine for fixed data, but it is a bad idea for anything dynamic or supplied from outside the script.

Quote

MsgBox ($MB_SYSTEMMODAL,"",StringRegExp("[AA]__(BB)||{CC};;#DD#", '[^_|;]+',3)[2]) ; show third

 

When the string is not found, it's going to give you an array error with very little context.

Local $aItem = StringRegExp("[AA]__(BB)||{CC};;#DD#", '[^_|;]+',3)
if Not @error then
    MsgBox ($MB_SYSTEMMODAL,"",$aItem[2]) ; show third
else
    ConsoleWrite("Item not found in foo" & @crlf)
endif
;OR
;Local $aItem = StringRegExp("[AA]__(BB)||{CC};;#DD#", '[^_|;]+',3)
;if UBound($aItem) >= 3 then
;    MsgBox ($MB_SYSTEMMODAL,"",$aItem[2]) ; show third
;else
;   ConsoleWrite("Item not found in foo" & @crlf)
;endif

Here we can catch the error,  give a detailed message, try again, etc.
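The two checks above can also be combined, since a successful match does not guarantee that the requested index exists. A minimal sketch, assuming a hypothetical wrapper named `_TokenAt` (not a built-in) that reports failures through @error instead of crashing:

```autoit
; Hypothetical helper: returns the zero-based $iIndex-th match of $sPattern.
; @error = 1: no match or invalid pattern, @error = 2: index out of range.
Func _TokenAt($sSubject, $sPattern, $iIndex)
    Local $aMatches = StringRegExp($sSubject, $sPattern, 3)
    If @error Then Return SetError(1, 0, "")
    If $iIndex < 0 Or $iIndex >= UBound($aMatches) Then Return SetError(2, 0, "")
    Return $aMatches[$iIndex]
EndFunc

Local $sThird = _TokenAt("[AA]__(BB)||{CC};;#DD#", '[^_|;]+', 2)
If @error Then
    ConsoleWrite("Token not available, @error = " & @error & @CRLF)
Else
    ConsoleWrite($sThird & @CRLF) ; {CC}
EndIf
```

The caller can then decide what to do for each error code, for example log the subject string or retry with a different pattern.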

Edited by Bilgus
Link to comment
Share on other sites

I guess this is the opposite of the SRER (StringRegExpReplace) approach, but all the others looked weird.

$str = "[AA]__(BB)_______{CC}__#DD#"
$n = 3

msgbox(0, '' , stringregexp($str , "([A-Z]+)" , 3)[$n -1])

 

Link to comment
Share on other sites
