[Solved] StringRegExp need a little help with some digits

Kyan

Hi

I need to read a txt file with data from an experiment and convert it so it can be read in Excel to build some graphs.

These are the first 3 lines of my txt:

0,"44,09154","0","0"
0,1,"45,71279","0,002229167","0,0006945928"
0,2,"47,23867","0,0055625","0,001749992"

i tried this without success :(

$reg = StringRegExp($read,'(\d*),"(\d*)","(\d*)","(\d*)"',3)

a comma is not recognized as part of a digit?

i also tried this but again, without any success :s

$reg = StringRegExp($read,'(\d*\,*\d*),"(\d*\,*\d*)","(\d*\,*\d*)","(\d*\,*\d*)"',3)

From what I saw in a hex editor, @CRLF is a line break (a new row in an Excel sheet) and 0x09 starts a new column,

so I just need to do this: FileWriteLine("export.txt",$reg[0]&BinaryToString(0x09,4)&$reg[1]&BinaryToString(0x09,4)&$reg[2]&BinaryToString(0x09,4)&$reg[3]&BinaryToString(0x09,4))

thanks in advance :)

Edited by DiOgO

PhoenixXL

so basically you just need the numbers separated out..!

Edited by PhoenixXL

PhoenixXL

does this help

$sString = '0,"44,09154","0","0"'& @CR & _
'0,1,"45,71279","0,002229167","0,0006945928"'& @CR & _
'0,2,"47,23867","0,0055625","0,001749992"'

$sPattern = '[,"'']'
;Replace the Commas and the Quotes
$sRegEx = StringRegExpReplace($sString,$sPattern,'')

ConsoleWrite($sRegEx&@CR)

Matching the digits directly will not be efficient, since we don't know how many commas there are.

Also, some of the numbers are enclosed within quotes.

Edited by PhoenixXL

PhoenixXL

This also works if you want to do with digits

$sString = '0,"44,09154","0","0",,' & @CR & _
'0,1,"45,71279","0,002229167","0,0006945928"' & @CR & _
'0,2,"47,23867","0,0055625","0,001749992"'

$sPattern = '(?:.*?)(\d*[\r\n]*)'
;Keep only the digits and line breaks (drops the commas and the quotes)
$sRegEx = StringRegExpReplace($sString,$sPattern,'$1')

ConsoleWrite($sRegEx&@CR)

Edited by PhoenixXL

dany

Maybe something like this:

#include <Array.au3>
; [^",]+  Anything excluding double quotes and comma.
; "[^"]+" Anything excluding double quotes within double quotes.
Global $aMatches = StringRegExp('0,"44,09154","0","0"', '([^",]+|"[^"]+")', 3)
_ArrayDisplay($aMatches)

Kyan

First of all, thanks for trying to help me out :)

PhoenixXL wrote: This also works if you want to do with digits [second snippet above, using StringRegExpReplace]

This is not working when I use it with FileRead on the txt file.

dany wrote: Maybe something like this: [snippet above, using the pattern '([^",]+|"[^"]+")']

Is it possible to exclude 0x09 characters? Do I need to match the line break in the pattern? ($, or \r\n?)

This is a selection of lines from the text:

[Posted image: selection of lines from the text file]

As you can see, there's something after the last double quote; in hex it is: 09 09 09 0D 0A (the last 2 bytes are @CRLF, if I'm not wrong)

How can I match it without capturing it? [^chr(09)+\r\n] ?


kylomas

DiOgO,

Run this and look at the console output. The regexp from malkey is giving you an array element for each comma-separated value. The \r's are retained. I do not see a tab (\t) anywhere.

#include<array.au3>
$sString = '0,"""44,09154","0","0"'& @CR & _
'0,1,"45,71279","0,002229167","0,0006945928"'& @CR & _
'0,2,"47,23867","0,0055625","0,001749992"'
$sPattern = '([^",]+|"[^"]+")'
$aRegEx = StringRegexp($sString,$sPattern,3)
for $i = 0 to ubound($aRegEx) - 1
 consolewrite(stringformat('%-20s',$aRegEx[$i]))
next
consolewrite(@lf)

Disclaimer - I am trying to learn regexp so take with a pound of salt!

kylomas


Kyan

kylomas wrote: Run this and look at the console output. [code as posted above]

I don't understand well how StringRegExp works, so don't worry that you're still learning regexp ;)

Your way of doing it works for the quoted numbers, but when the first number has a comma it fails, so to speak.

Here's a portion of the data: http://www33.zippyshare.com/v/49442372/file.html

And here's how I read the txt:

#include <array.au3>
$file = FileOpen("exp.txt")
$read = FileRead($file)
$reg = StringRegExp($read,String('([^",]+|"[^"]+")[^'&Chr(09)&'+]'),3)
_arraydisplay($reg)

The problem with that one is that the first unquoted number is split at its comma, and the last one loses its double quotes (I'd prefer none of them to have quotes :) )

EDIT: eheh, I managed this one

#include <Array.au3>
$file = FileOpen("exp.txt")
$read = FileRead($file)
$reg = StringRegExp($read,'(?:,")([^"]+|"[^"]+")[^\x09+\r\n+]',3)
_ArrayDisplay($reg)

but it still needs a "|" for the first case (e.g. 0,1 on the second line). I guess a function exists to add it to an Excel sheet, right?

Edited by DiOgO

jchd

This should work for you, returning a 1D array of all fields without quotes. Since your record format uses 4 fields per row, read the array in chunks of 4 elements to be inserted in Excel.

#include <Array.au3>
$sString = FileRead("exp.txt")
ConsoleWrite($sString & @LF)
$sPattern = '(\d+(?:,\d+)?),"(\d+(?:,\d+)?)","(\d+(?:,\d+)?)","(\d+(?:,\d+)?)"'
$aRegEx = StringRegexp($sString,$sPattern,3)
_ArrayDisplay($aRegEx)

BTW, your example file has no tabs (0x09) before the end of each line. That or they got eaten by cyberShrek.
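
For reading the result "in chunks of 4 elements", a minimal sketch could look like the following. It assumes the input file is exp.txt as above, reuses the export.txt name from the first post, and writes tab-separated lines that Excel can open; it is one possible way to do it, not the only one.

```autoit
; Sketch only: flat match array -> tab-separated lines, 4 fields per row.
Local $sString = FileRead("exp.txt")
Local $sPattern = '(\d+(?:,\d+)?),"(\d+(?:,\d+)?)","(\d+(?:,\d+)?)","(\d+(?:,\d+)?)"'
Local $aRegEx = StringRegExp($sString, $sPattern, 3)
If @error Then
    ConsoleWrite("No records matched" & @LF)
    Exit 1
EndIf

Local $sOut = ""
; Each record contributes 4 consecutive elements, so walk the array with a step of 4.
For $i = 0 To UBound($aRegEx) - 4 Step 4
    $sOut &= $aRegEx[$i] & @TAB & $aRegEx[$i + 1] & @TAB & _
            $aRegEx[$i + 2] & @TAB & $aRegEx[$i + 3] & @CRLF
Next

; FileWrite with a file name appends, so delete export.txt first if a fresh file is wanted.
FileWrite("export.txt", $sOut)
```

@TAB is the same character as 0x09, so no BinaryToString call is needed here.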


Kyan

jchd wrote: This should work for you, returning a 1D array of all fields without quotes. [...] BTW, your example file has no tabs (0x09) before the end of each line. That or they got eaten by cyberShrek.

this one works fine, thx :D

I don't know if cyberShrek ate them, but they were surely in my clipboard when I pasted them... btw I added a txt (on zippyshare) with the lines.

EDIT: can you explain the function of this: (?:,\d+)?

Edited by DiOgO

jchd

Tabs or no, it shouldn't make any difference AFAICT.
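
A quick way to check this, as a sketch: the pattern is not anchored to the end of the line, so anything between the closing quote of one record and the start of the next is simply skipped. The sample string below is built by hand from two of the posted lines, with the three trailing tabs described earlier added as an assumption.

```autoit
; Sketch only: two records, each followed by three tabs and a CRLF.
Local $sPattern = '(\d+(?:,\d+)?),"(\d+(?:,\d+)?)","(\d+(?:,\d+)?)","(\d+(?:,\d+)?)"'
Local $sData = '0,1,"45,71279","0,002229167","0,0006945928"' & @TAB & @TAB & @TAB & @CRLF & _
        '0,2,"47,23867","0,0055625","0,001749992"' & @TAB & @TAB & @TAB & @CRLF

Local $aRegEx = StringRegExp($sData, $sPattern, 3)
; Two records with 4 captured fields each are expected, tabs or not.
ConsoleWrite(UBound($aRegEx) & " fields captured" & @LF)
```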

Getting a 1D array as the result implies a little bit of work to get it into Excel. Indeed, it would be easier to directly obtain a 2D (* x 4) result array which you could insert all at once into Excel. But I doubt that writing an ad-hoc _StringRegExp2D function to this very effect would be beneficial.
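
For anyone who does want the 2D shape, a rough sketch of the reshaping step follows. _Flat2Rows is a hypothetical helper name made up for this example; it is not part of AutoIt or of any UDF mentioned in this thread.

```autoit
#include <Array.au3>

; Hypothetical helper: copy a flat array into rows of $iCols fields each.
Func _Flat2Rows(ByRef $aFlat, $iCols)
    Local $iRows = Int(UBound($aFlat) / $iCols)
    Local $aOut[$iRows][$iCols]
    For $r = 0 To $iRows - 1
        For $c = 0 To $iCols - 1
            $aOut[$r][$c] = $aFlat[$r * $iCols + $c]
        Next
    Next
    Return $aOut
EndFunc

Local $sPattern = '(\d+(?:,\d+)?),"(\d+(?:,\d+)?)","(\d+(?:,\d+)?)","(\d+(?:,\d+)?)"'
Local $aRegEx = StringRegExp(FileRead("exp.txt"), $sPattern, 3)
If Not @error Then
    ; At least one record matched, so there is at least one full row of 4 fields.
    Local $aRows = _Flat2Rows($aRegEx, 4)
    _ArrayDisplay($aRows)
EndIf
```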


Kyan

jchd wrote: Getting a 1D array as result implies a little bit of work to make it into Excel. [...]

I'll use a For loop with a step of 4 to add them; it's easy to work with them that way...

Can you answer the edit I added above, please?


kylomas

DiOgO.

It means: find a comma followed by one or more digits, but do not capture the group. This occurs zero or one times.

kylomas

edit: wore out my space bar

Edited by kylomas

jchd

Yes, sorry for missing it.

(?:whatever) is a non-capturing group

(?:whatever)? is an optional non-capturing group

(?:,\d+)? is an optional non-capturing group containing a comma followed by one or more digits.

(\d+(?:,\d+)?) is a capturing group matching a series of one or more digits followed by an optional non-capturing group containing a comma followed by one or more digits.


Kyan

kylomas wrote: It means find a comma, followed by one or more digits, but do not capture the group. This occurs zero or one times.

thx :)

jchd wrote: (\d+(?:,\d+)?) is a capturing group matching a series of one or more digits followed by an optional non-capturing group containing a comma followed by one or more digits.

And this non-capturing group is used to match anything inside the group without capturing it, right? (I just don't get it: if it is a non-capturing group, why does it capture a number with commas?)

Sorry if this is considered basic, but I don't understand much about regexp patterns :s


jchd

That's because the non-capturing group is itself included in a capturing group. There is no need to explicitly assign a capture group to the decimal part of those values.
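
A small comparison may make the difference visible. This is only a sketch using one of the values from the sample data: the two patterns differ solely in whether the inner group captures.

```autoit
Local $sValue = '45,71279'

; Inner group is non-capturing: only the outer group is returned,
; and its text includes the ",71279" that the inner group matched.
Local $aOuterOnly = StringRegExp($sValue, '(\d+(?:,\d+)?)', 3)

; Inner group is capturing: the decimal part comes back as an extra element.
Local $aBoth = StringRegExp($sValue, '(\d+(,\d+)?)', 3)

ConsoleWrite("Non-capturing inner group: " & UBound($aOuterOnly) & " element(s), first = " & $aOuterOnly[0] & @LF)
ConsoleWrite("Capturing inner group:     " & UBound($aBoth) & " element(s), second = " & $aBoth[1] & @LF)
```

With four fields per record, the extra elements from a capturing inner group would make the result array harder to walk in steps of 4, which is one practical reason to keep the decimal part non-capturing.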


Kyan

jchd wrote: That's because the non-capturing group is itself included in a capturing group. [...]

I understand that, for example, the first capturing group will capture everything before a comma followed by a double quote, but what is its function? (of a non-capturing group)

kylomas

DiOgO,

Kyan wrote: but what is its function? (of a non-capturing group)

To participate in the match, but not capture (return to you) the group.

kylomas

edit: If you look in the Help file under stringregexp there is a link to some PCRE doc. This has helped me, along with some excellent help from the forum.

Edited by kylomas

Kyan

kylomas wrote: To participate in the match, but not capture (return to you) the group.

OK, so... it's just to help in the match. Thanks again for your help, really :)
