supersonic

RegEx - Trying to 'split string in string'

21 posts in this topic

#1 ·  Posted (edited)

Hi!

RegEx! ... once again. I'm sorry about that, but I can't solve it on my own... :)

Here's the code I have so far:

#include <Array.au3>

Local $sTmp = ""

$sTmp &= "#4096750#" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=,FNGcJ7KMw5kCYNcZH24P9sdINiM=" & @LF
$sTmp &= "#4093600#,F47T1OveZnEgmjKE84Nfm+8bQQ4=;BundledBy,#4093601#" & @LF

Local $aTmp = StringRegExp($sTmp, ",(.*)\n|,|;", 3)

_ArrayDisplay($aTmp)

What I'd like to achieve:

- Read a '@LF'-delimited line, if it contains a ','.

- If a line contains a ',':

a] ... read it to the end of the line (= '\n')

b] ... or read the whole line and return all ','-delimited elements

c] ... or return the element between ',' and ';'.

The result should be an array containing this:

[0] DvIFPwMcjKaVa04PlLQxDg1AKf8=

[1] DvIFPwMcjKaVa04PlLQxDg1AKf8=

[2] EN28JlozIKrsBlTtM8ZKWw9auck=

[3] DvIFPwMcjKaVa04PlLQxDg1AKf8=

[4] EN28JlozIKrsBlTtM8ZKWw9auck=

[5] FNGcJ7KMw5kCYNcZH24P9sdINiM=

[6] F47T1OveZnEgmjKE84Nfm+8bQQ4=

Please, could somebody give me a clue?

Greets,

-supersonic.

Edited by supersonic




supersonic,

Don't know 'bout regexp but this is how I would do this (if I understand you correctly):

$arr1 = StringSplit($sTmp, @LF)                 ; each element is a line

For $i = 1 To $arr1[0]
    $arr2 = StringSplit($arr1[$i], ',')         ; each element is a comma-delimited value within the current line

    For $j = 1 To $arr2[0]
        ;
        ; do something with the segment...
        ;
    Next
Next
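
For what it's worth, here is a minimal runnable sketch of that nested-split idea applied to the sample data from the first post. It assumes a reasonably recent AutoIt version (zero-element arrays, current _ArrayAdd behaviour). Cutting each line at the first ';' is an assumption taken from the sample data, and $aLines, $aParts, $aFields and $aResult are just illustrative names.

```autoit
#include <Array.au3>

Local $sTmp = ""
$sTmp &= "#4096750#" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=,FNGcJ7KMw5kCYNcZH24P9sdINiM=" & @LF
$sTmp &= "#4093600#,F47T1OveZnEgmjKE84Nfm+8bQQ4=;BundledBy,#4093601#" & @LF

Local $aLines = StringSplit($sTmp, @LF)     ; $aLines[0] holds the number of lines
Local $aResult[0]                           ; empty array that collects the hits
Local $aParts, $aFields

For $i = 1 To $aLines[0]
    ; Assumption: anything after the first ';' on a line is not wanted
    $aParts = StringSplit($aLines[$i], ";")
    $aFields = StringSplit($aParts[1], ",")
    ; Field 1 is the "#...#" ID, so collecting starts at field 2
    For $j = 2 To $aFields[0]
        ; Note: by default _ArrayAdd splits values at "|"; the sample values contain none
        _ArrayAdd($aResult, $aFields[$j])
    Next
Next

_ArrayDisplay($aResult)
```

With the five sample lines this should collect the same seven elements listed in the first post. Lines without a ',' (including the empty string after the last @LF) simply contribute nothing, because the inner loop never runs for them.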

kylomas


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill


Try this.

#include <Array.au3>

Local $sTmp = ""

$sTmp &= "#4096750#" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=,FNGcJ7KMw5kCYNcZH24P9sdINiM=" & @LF
$sTmp &= "#4093600#,F47T1OveZnEgmjKE84Nfm+8bQQ4=;BundledBy,#4093601#" & @LF

Local $aTmp = StringRegExp($sTmp, ",(.+?=)", 3)

_ArrayDisplay($aTmp)


malkey,

what happens to the end of the string that is not delimited by a "=" sign?

kylomas



Malkey,

thank you for that... It works fine! :)

But as kylomas stated, how to handle items without a '=' at the end?


supersonic,

Try this one - it copes with elements without "=" at the end: :)

#include <Array.au3>

Local $sTmp = ""

$sTmp &= "#4096750#" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8,EN28JlozIKrsBlTtM8ZKWw9auck=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck,FNGcJ7KMw5kCYNcZH24P9sdINiM=" & @LF
$sTmp &= "#4093600#,F47T1OveZnEgmjKE84Nfm+8bQQ4=;BundledBy,#4093601#" & @LF

Local $aTmp = StringRegExp($sTmp, "(?U),(.*)(?=[\v,;^#])", 3)

_ArrayDisplay($aTmp)

The results match your requirements in the first post. :)

M23


Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind.

My UDFs:

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ---------- Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 


supersonic,

Let me borrow this thread for a minute...

@malkey and M23 - splitting a file at @CRLF and then splitting each record at some delimiter is a common deal. Do you guys always use regexp for this? I have a feeling that I've been doing this half-ass backward with the multiple splits and iterations.

kylomas



kylomas,

It all depends on the form of the information you wish to extract. By definition you need a pattern of some sort if you want to use an SRE - if it is not "regular" then you are wasting your time! But SREs are not the only way to go - as GEOSoft often remarks, one of the tricks of working with SREs is knowing when not to use them - although the little devils are really useful when you do! :)

So do not give up on iterative splits - they still have a place at times. But learning to use SREs (even to my amateurish level) is a valuable skill - and this is a good place to start your education. I do not want to put you off - but be warned, SREs are probably the most difficult things I have ever tried to master in coding. :)

M23



Thank you Melba! :)


M23,

Yes, of course, my question exactly. The pattern for all flat files is that each line is delimited somehow, normally by @CRLF, sometimes by "0x00".

I think I'm going to do some timings and see what I get.

Thanks for the advice!

kylomas

@supersonic - and thanks for the use of your thread, good luck!
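
A rough sketch of how such a timing could be set up with TimerInit and TimerDiff. The test data (one sample line repeated 1000 times) and the simplified pattern are assumptions for illustration only, and the two approaches do not do exactly the same work, so the numbers are only a ballpark comparison.

```autoit
; Build some test data: one sample line repeated 1000 times (arbitrary size)
Local $sData = ""
For $i = 1 To 1000
    $sData &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=" & @LF
Next

; Approach 1: split into lines, then split each line at the commas
Local $hTimer = TimerInit()
Local $aLines = StringSplit($sData, @LF)
Local $aFields
For $i = 1 To $aLines[0]
    $aFields = StringSplit($aLines[$i], ",")
Next
ConsoleWrite("Nested StringSplit: " & TimerDiff($hTimer) & " ms" & @CRLF)

; Approach 2: a single global regular expression match
$hTimer = TimerInit()
Local $aMatches = StringRegExp($sData, ",([^,;\r\n]*)", 3)
ConsoleWrite("StringRegExp: " & TimerDiff($hTimer) & " ms" & @CRLF)
```

Running each approach several times and averaging would probably give steadier figures than a single pass.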



#11 ·  Posted (edited)

Hi!

How is it possible to return the second token only? (Please see 1st post for example.)

Like this:

[0] DvIFPwMcjKaVa04PlLQxDg1AKf8=

[1] DvIFPwMcjKaVa04PlLQxDg1AKf8=

[2] DvIFPwMcjKaVa04PlLQxDg1AKf8=

[3] F47T1OveZnEgmjKE84Nfm+8bQQ4=

Is there an "escape sequence" like "if a string was found, go on with the next line"...???

Greets,

-supersonic.

Edited by supersonic


supersonic,

I see you have not yet started the tutorial on that website I linked to. :)

Try this: :)

Local $aTmp = StringRegExp($sTmp, "(?U),(.*)(?=[\v,;]).*?\v", 3)

M23



#13 ·  Posted (edited)

Melba,

shame on me - you are right! :)

But BIG thank you for helping me out again! :)

Edited by supersonic


Melba,

I had to modify the SRE statement to "(?U),(.*)(?=[\v,;]).*?" in order to return all four lines.

So, I removed the last "\v".

Have I done it right?


supersonic,

I get:

DvIFPwMcjKaVa04PlLQxDg1AKf8=
DvIFPwMcjKaVa04PlLQxDg1AKf8=
DvIFPwMcjKaVa04PlLQxDg1AKf8=
F47T1OveZnEgmjKE84Nfm+8bQQ4=

when I run the SRE I posted this morning with your initial data:

$sTmp &= "#4096750#" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=,FNGcJ7KMw5kCYNcZH24P9sdINiM=" & @LF
$sTmp &= "#4093600#,F47T1OveZnEgmjKE84Nfm+8bQQ4=;BundledBy,#4093601#" & @LF

Are you saying that you do not? :)

M23



#16 ·  Posted (edited)

Hello Melba,

I checked it several more times - I only get three lines... :)

BUT - I've found the reason for that:

When the last line doesn't end with "@LF", only three lines are returned...

With a trailing "@LF", four lines are returned...
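
One way to sidestep this, as a minimal sketch: append an @LF before matching if the data does not already end with one. The pattern is the one Melba posted above; the sample data is the data from the first post with the final @LF deliberately left off.

```autoit
#include <Array.au3>

Local $sTmp = ""
$sTmp &= "#4096750#" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=,FNGcJ7KMw5kCYNcZH24P9sdINiM=" & @LF
$sTmp &= "#4093600#,F47T1OveZnEgmjKE84Nfm+8bQQ4=;BundledBy,#4093601#" ; no trailing @LF here

; Append a line feed only when the data does not already end with one
If StringRight($sTmp, 1) <> @LF Then $sTmp &= @LF

; The pattern needs a vertical whitespace character at the end of every line it matches
Local $aTmp = StringRegExp($sTmp, "(?U),(.*)(?=[\v,;]).*?\v", 3)

_ArrayDisplay($aTmp)
```

With the extra @LF in place, the last line should be matched as well, giving four elements instead of three.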

Edited by supersonic


supersonic,

Try this one: :)

Local $aTmp = StringRegExp($sTmp, "(?U),(.*)(?=[\v,;]).*?[\v=#]", 3)

Tricky little devils, these SREs! :)

M23



M23

The secret here is <pattern>(?:\v|$)+

SmOke_N usually prefers <pattern>(?:\r\n|\r|\n|\z)
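
A minimal sketch of that idea applied to the data in this thread. Note that the leading part is an assumption of this example and not the pattern posted earlier: it swaps the (?U) lazy match for negated character classes, then ends with the (?:\v|$) alternation so the last line matches whether or not it has a trailing @LF.

```autoit
#include <Array.au3>

Local $sTmp = ""
$sTmp &= "#4096750#" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=" & @LF
$sTmp &= "#4096756#,DvIFPwMcjKaVa04PlLQxDg1AKf8=,EN28JlozIKrsBlTtM8ZKWw9auck=,FNGcJ7KMw5kCYNcZH24P9sdINiM=" & @LF
$sTmp &= "#4093600#,F47T1OveZnEgmjKE84Nfm+8bQQ4=;BundledBy,#4093601#" ; deliberately no trailing @LF

; ,              the first comma on a line
; ([^,;\r\n]*)   capture everything up to the next ',' or ';' or line break
; [^\r\n]*       skip whatever is left on the same line
; (?:\v|$)       finish at a line break or at the end of the string
Local $aTmp = StringRegExp($sTmp, ",([^,;\r\n]*)[^\r\n]*(?:\v|$)", 3)

_ArrayDisplay($aTmp)
```

For the sample data this should return the first token after the '#...#' ID of each of the four lines that contain a comma, with or without the final @LF.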


George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


George,

Thanks as always. :)

M23



NP

At some point in time, if and when I get some time, I plan on adding another toolbox Insert Menu category to auto-insert commonly used portions of expressions such as what I gave you. I just need to get the time to add them to the database and to add the menu items.


George

