StringRegExp capturing outside a group


Kyan

Go to solution Solved by czardas,

I'm trying to list a CSV export of Outlook contacts, but I'm doing something wrong, since the pattern sometimes captures a comma or two.

This is what I have done so far. I'm reading from a text file in which every line corresponds to a contact; $head is the header row from the exported CSV (to get one, go to mail.live.com, click "People" in the top Metro-style menu, and export through the [more] menu at the top).

#include <Array.au3>
$head = 'Title,"First Name","Middle Name","Last Name","Suffix","Given Name Yomi","Family Name Yomi","Home Street","Home City","Home State","Home Postal Code","Home Country","Company","Department","Job Title","Office Location","Business Street","Business City","Business State","Business Postal Code","Business Country","Other Street","Other City","Other State","Other Postal Code","Other Country","Assistants Phone","Business Fax","Business Phone","Business Phone 2","Callback","Car Phone","Company Main Phone","Home Fax","Home Phone","Home Phone 2","ISDN","Mobile Phone","Other Fax","Other Phone","Pager","Primary Phone","Radio Phone","TTY/TDD Phone","Telex","Anniversary","Birthday","E-mail Address","E-mail Type","E-mail 2 Address","E-mail 2 Type","E-mail 3 Address","E-mail 3 Type","Notes","Spouse","Web Page"'
$aHead = StringRegExp($head,',"?(\V+?)?"?,',3)
$l=1
Local $Mix[UBound($aHead)][1]
While 1
    $f = FileReadLine(@DesktopDir&"\lis.txt",$l)
    If @error Then ExitLoop
    $aData = StringRegExp($f,'\,(\V+?)?\,',3)
    ;ConsoleWrite("["&$l&"] "&UBound($aHead)&"  "&UBound($aData)&@LF)
    ReDim $Mix[UBound($Mix)][$l+1]
    For $x = 0 To UBound($aHead)-1
        If $l = 1 Then $Mix[$x][0] = $aHead[$x]
        ;ConsoleWrite("#"&$x&@TAB&"["&UBound($Mix)&"]["&UBound($Mix,2)&"]"&@LF)
        If $x <= UBound($aData)-1 Then $Mix[$x][$l] = $aData[$x]
    Next
    $l+=1
WEnd
_ArrayDisplay($Mix)
Exit

lis.txt (example):

,"words1",,"words2",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"some@hotmail.com","SMTP",,,,,,,
,"words3",,"words4",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"some2@hotmail.com","SMTP",,,,,,,
,"words5",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"+123123123123",,,,,,,,,,,,,,,,,,
,"words6",,"words7",,,,,,,,"France",,,,,,,,,,,,,,,,,,,,,,,,,,"+123123123123",,,,,,,,"01-01-1800","01-01-1800",,,,,,,,,

Besides the stray commas, is there any way to make StringRegExp return {null} or "" for empty fields, so that each value lines up with the matching row of $aHead?

Thanks in advance ;)

EDIT: Current output:

arraydisplayoutput_outlookcsv.png

Edited by Kyan
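
For what it's worth, the stray commas seem to come from two things in the pattern `\,(\V+?)?\,`: each match also consumes the trailing comma, so the next field has lost its leading delimiter, and `\V` matches any character that is not a vertical whitespace, commas included, so in a run like `,,,` the group can capture a comma. A minimal sketch of one way around both, assuming one record per line and no line breaks inside quoted fields. The helper name `_SplitCsvLineSketch` is made up for illustration and is not part of any UDF:

```autoit
; Sketch only: _SplitCsvLineSketch is a hypothetical helper, not a library function.
Func _SplitCsvLineSketch($sLine)
    ; Prepend a comma so every field, including the first, follows a delimiter.
    ; Each match consumes exactly one comma plus one field, so an empty field
    ; produces an empty capture instead of being skipped.
    Local $aFields = StringRegExp(',' & $sLine, ',("(?:[^"]|"")*"|[^,]*)', 3)
    If @error Then Return SetError(1, 0, 0)
    For $i = 0 To UBound($aFields) - 1
        If StringLen($aFields[$i]) >= 2 And StringLeft($aFields[$i], 1) = '"' And StringRight($aFields[$i], 1) = '"' Then
            ; Drop the enclosing quotes and collapse doubled quotes.
            $aFields[$i] = StringReplace(StringMid($aFields[$i], 2, StringLen($aFields[$i]) - 2), '""', '"')
        EndIf
    Next
    Return $aFields
EndFunc

Local $aDemo = _SplitCsvLineSketch(',"words1",,"words2",,"some@hotmail.com"')
For $i = 0 To UBound($aDemo) - 1
    ConsoleWrite($i & ': [' & $aDemo[$i] & ']' & @CRLF)
Next
```

Because the field count then equals the number of commas plus one, each element should land on the same index as its entry in $aHead, provided $aHead is split the same way.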

Heroes, there is no such thing

One day I'll discover what IE.au3 has of special for so many users using it.
C'mon there's InetRead and WinHTTP, way better

Kyan,

It appears that you are trying to create an array of contacts where:

  1. Headings are in column 0
  2. Each contact's details are in a column of their own

If that is true then this is how I would have done it...

#include <Array.au3>

; @scriptdir & '\lis.csv' was created by exporting Contacts from Win7 Live mail
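; flag 3 = split on the whole @crlf string (1) + no count in element 0 (2)
local $aContactsIn = stringsplit(fileread(@scriptdir & '\lis.csv'),@crlf,3)

; populate header column (rows are sized from the header rather than a hard-coded 29)
local $aHead = stringsplit($aContactsIn[0],',',2)
local $aContactsOut[ubound($aHead)][ubound($aContactsIn)]
for $1 = 0 to ubound($aHead) - 1
    $aContactsOut[$1][0] = $aHead[$1]
Next

; populate detail columns (one column for each contact)
local $aTmp
for $1 = 1 to ubound($aContactsIn) - 1
    $aTmp = stringsplit($aContactsIn[$1],',',2)
    for $2 = 0 to ubound($aTmp) - 1
        ; ignore extra pieces, e.g. from a comma inside a quoted field
        if $2 >= ubound($aContactsOut) then exitloop
        $aContactsOut[$2][$1] = $aTmp[$2]
    Next
Next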

_arraydisplay($aContactsOut)

kylomas

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill


  • Solution

Indeed that's easier. I'm always cautious with CSV data though. I have been using my own function, which strips the enclosing quotes around fields (which may contain commas or new lines within them). I've been using it for a while without encountering problems. It might be worth a try.

;

#include <Array.au3>
#include <CSVSplit.au3>

Local $sFilePath = "list.csv"

Local $hFile = FileOpen($sFilePath)
If $hFile = -1 Then
    MsgBox(0, "", "Unable to open file")
    Exit
EndIf

Local $sCSV = FileRead($hFile)
If @error Then
    MsgBox(0, "", "Unable to read file")
    FileClose($hFile)
    Exit
EndIf
FileClose($hFile)

$aArray = _CSVSplit($sCSV)
_ArrayDisplay($aArray)

_ArrayTranspose($aArray)
_ArrayDisplay($aArray)

;

The reverse function _ArrayToCSV() does not automatically enclose all fields in double quotes. I should add that as an option.
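
As a rough illustration of what quoting every field on output could look like, here is a minimal sketch. The names `_QuoteCsvFieldSketch` and `_JoinCsvRowSketch` are hypothetical and are not functions from CSVSplit.au3; the only assumption is the usual CSV convention that an embedded double quote is written as two double quotes.

```autoit
; Sketch only: both functions are hypothetical helpers, not part of CSVSplit.au3.
Func _QuoteCsvFieldSketch($sField)
    ; Double any embedded quotes, then wrap the whole field in quotes.
    Return '"' & StringReplace($sField, '"', '""') & '"'
EndFunc

Func _JoinCsvRowSketch(Const ByRef $aRow)
    Local $sLine = ''
    For $i = 0 To UBound($aRow) - 1
        If $i > 0 Then $sLine &= ','
        $sLine &= _QuoteCsvFieldSketch($aRow[$i])
    Next
    Return $sLine
EndFunc

Local $aSample[3] = ['words1', 'Doe, John', 'say "hi"']
ConsoleWrite(_JoinCsvRowSketch($aSample) & @CRLF)
; writes: "words1","Doe, John","say ""hi"""
```

Quoting every field is more verbose than quoting only when needed, but it keeps commas and quotes inside a field from being mistaken for delimiters when the file is read back.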

Edited by czardas

kylomas, your script is working nicely; it just needs quote removal and it will be 5*

mikell, my version of _FileReadToArray only accepts these parameters: _FileReadToArray ( $sFilePath, ByRef $aArray [, $iFlag = 1] ), so I can't specify a delimiter, but it looks awesome to do the same thing with so little code ;)

czardas, I had tried (?:,) before, but it kept capturing commas. Your CSVSplit example works like a charm, though :D

Thank you all for the help, much appreciated  :thumbsup:


Kyan,

Just to follow up: my listings do not contain any quotes, but this version should strip any quotes it finds.

#include <Array.au3>

; @scriptdir & '\lis.csv' was created by exporting Contacts from Win7 Live mail
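; flag 3 = split on the whole @crlf string (1) + no count in element 0 (2)
local $aContactsIn = stringsplit(fileread(@scriptdir & '\lis.csv'),@crlf,3)

; populate header column (rows are sized from the header rather than a hard-coded 29)
local $aHead = stringsplit($aContactsIn[0],',',2)
local $aContactsOut[ubound($aHead)][ubound($aContactsIn)]
for $1 = 0 to ubound($aHead) - 1
    $aContactsOut[$1][0] = $aHead[$1]
Next

; populate detail columns (one column for each contact)
local $aTmp
for $1 = 1 to ubound($aContactsIn) - 1
    if $aContactsIn[$1] = '' then continueloop ; skip blank lines
    $aTmp = stringsplit($aContactsIn[$1],',',2)
    for $2 = 0 to ubound($aTmp) - 1
        ; ignore extra pieces, e.g. from a comma inside a quoted field
        if $2 >= ubound($aContactsOut) then exitloop
        ; strip single or double quotes from the field
        $aContactsOut[$2][$1] = stringregexpreplace($aTmp[$2],'(?:''|")?([^''"])(?:''|")?','\1')
    Next
Next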

_arraydisplay($aContactsOut)

However, it probably makes more sense to use czardas's UDF as he has already visited all of the CSV issues...
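
To illustrate one of those issues: splitting on every comma breaks a quoted field that itself contains a comma. A small sketch, using a made-up record rather than anything from the exported file:

```autoit
; Illustrative record (not from the real export): the first field contains a comma inside its quotes.
Local $sLine = '"Doe, John","Paris"'
Local $aParts = StringSplit($sLine, ',', 2) ; flag 2: no count in element 0

; Two fields go in, but three pieces come out.
ConsoleWrite('Pieces: ' & UBound($aParts) & @CRLF)
For $i = 0 To UBound($aParts) - 1
    ConsoleWrite($i & ': ' & $aParts[$i] & @CRLF)
Next
```

With data like this, everything after the broken field shifts down by one row, which is the kind of case a quote-aware splitter is meant to handle.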

kylomas

