Sign in to follow this  
Followers 0
souldjer777

StringRegExpReplace - Help with CSV file and replacing a comma :)

16 posts in this topic

Good Afternoon Everyone,

I'm really, really, really not understanding how to replace a single comma , in a directory structure within a CSV file...

How do I replace the , within the directory path that I know begins with a ," and ends with a ",

Example:

3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"DeviceHarddiskVolumeShadowCopy3455ApplicationsAPPSTestshareddrivingpredrivingscreeningAArchiveShiller,Ron09INSTRUCTION.TXT",,_,10.110.69.16,C:Program FilesVeritasNetBackupbinbpbkar32.exe,'File' class or access,Notice,OAS,My OrganizationServersServers,1092

Here is the excerpt of the file path I want to remove any commas from in between the ," and the ",

,"DeviceHarddiskVolumeShadowCopy3455ApplicationsAPPSTestshareddrivingpredrivingscreeningAArchiveShiller,Ron09INSTRUCTION.TXT",

Could you please let me know how to write the StringRegExpReplace? I'm burning too much time trying it on my own...  I'll continue to look either way... but if someone out there feels like helping I'd very much appreciate it!

Thank you all!


"Maybe I'm on a road that ain't been paved yet. And maybe I see a sign that ain't been made yet"
Song Title: I guess you could say
Artist: Middle Class Rut

Share this post


Link to post
Share on other sites



$string = '3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,' & "'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"

msgbox(0, '' , stringreplace(stringregexp($string , ',"(.*?)",' , 3)[0] , "," , ""))


,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Share this post


Link to post
Share on other sites

With StringRegExpReplace :

$string = "3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,""\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron09\INSTRUCTION.TXT"",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"

$replace = StringRegExpReplace($string, '"[^,]+\K,([^"]+")', "$1")
ConsoleWrite($replace)

Share this post


Link to post
Share on other sites

Thank you so much! I greatly appreciate it!!!   :sorcerer:


"Maybe I'm on a road that ain't been paved yet. And maybe I see a sign that ain't been made yet"
Song Title: I guess you could say
Artist: Middle Class Rut

Share this post


Link to post
Share on other sites

#5 ·  Posted (edited)

jguinch, how do you do it if there are several commas ?

The only way I found is

$string = '3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron,09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,' & "'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"

Do
  $string = stringregexpreplace($string, '(,"[^,"]+),(?=[^"]+")', "$1") 
Until @extended = 0
msgbox(0, "", $string)

There is probably an Execute trick ... with many quotes inside  :sweating:

Edit

small correction

Edited by mikell

Share this post


Link to post
Share on other sites

#6 ·  Posted (edited)

The StringRegExp above has a flaw and I'm not sure why, although I'm sure a very good RegExp solution exists.

The way I would do this would be to parse the csv to an array, then you can do whatever you like to the data in any cell. There are a few CSV parsing algorithms on the forum, but the one I wrote a while ago still works. If it ain't broke etc...

;

Local $sFilePath = @ScriptDir & "\test.csv" ; Change this to your own csv file (in the same directory as the script)

Local $hFile = FileOpen($sFilePath)
If $hFile = -1 Then
    MsgBox(0, "", "Unable to open file")
    Exit
EndIf

Local $sCSV = FileRead($hFile)
If @error Then
    MsgBox(0, "", "Unable to read file")
    FileClose($hFile)
    Exit
EndIf
FileClose($hFile)

Local $aCSV = _CSVSplit($sCSV) ; Create the main array
If @error Then
    ConsoleWrite("Error = " & @error & @LF)
    Exit
EndIf

; Replace all commas in every cell.
If Ubound($aCSV, 0) = 2 Then
    For $i = 0 To UBound($aCSV) -1
        For $j = 0 To UBound($aCSV, 2) -1
            $aCSV[$i][$j] = StringReplace($aCSV[$i][$j], ",", "")
        Next
    Next
Else ; CSV only contains one column
    For $i = 0 To UBound($aCSV) -1
        $aCSV[$i] = StringReplace($aCSV[$i], ",", "")
    Next
EndIf

Local $sResult = _ArrayToCSV($aCSV, Default, Default, True, True)

ConsoleWrite($sResult & @LF) ; For testing on a small sample only.

; WRITE THE RESULT TO FILE HERE!


; #FUNCTION# ====================================================================================================================
; Name...........: _ArrayToCSV
; Description ...: Converts a two dimensional array to CSV format
; Syntax.........: _ArrayToCSV ( $aArray [, $sDelim [, $sNewLine [, $bFinalBreak [, $bEnclose ]]]] )
; Parameters ....: $aArray      - The array to convert
;                  $sDelim      - Optional - Delimiter set to comma by default (see comments)
;                  $sNewLine    - Optional - New Line set to @LF by default (see comments)
;                  $bFinalBreak - Set to true in accordance with common practice => CSV Line termination
;                  $bEnclose - Enclose all fields in double quotes. Default = False
; Return values .: Success  - Returns a string in CSV format
;                  Failure  - Sets @error to:
;                 |@error = 1 - First parameter is not a valid array
;                 |@error = 2 - Second parameter is not a valid string
;                 |@error = 3 - Third parameter is not a valid string
;                 |@error = 4 - 2nd and 3rd parameters must be different characters
; Author ........: czardas
; Comments ......; One dimensional arrays are returned as multiline text (without delimiters)
;                ; Some users may need to set the second parameter to semicolon to return the prefered CSV format
;                ; To convert to TSV use @TAB for the second parameter
;                ; Some users may wish to set the third parameter to @CRLF
; ===============================================================================================================================

Func _ArrayToCSV($aArray, $sDelim = Default, $sNewLine = Default, $bFinalBreak = True, $bEnclose = False)
    If Not IsArray($aArray) Or Ubound($aArray, 0) > 2 Or Ubound($aArray) = 0 Then Return SetError(1, 0 ,"")
    If $sDelim = Default Then $sDelim = ","
    If $sDelim = "" Then Return SetError(2, 0 ,"")
    If $sNewLine = Default Then $sNewLine = @LF
    If $sNewLine = "" Then Return SetError(3, 0 ,"")
    If $sDelim = $sNewLine Then Return SetError(4, 0, "")

    Local $iRows = UBound($aArray), $sString = ""
    If Ubound($aArray, 0) = 2 Then ; Check if the array has two dimensions
        Local $iCols = UBound($aArray, 2)
        For $i = 0 To $iRows -1
            For $j = 0 To $iCols -1
                If StringRegExp($aArray[$i][$j], '["\r\n' & $sDelim & ']') Or $bEnclose Then
                    $aArray[$i][$j] = '"' & StringReplace($aArray[$i][$j], '"', '""') & '"'
                EndIf
                $sString &= $aArray[$i][$j] & $sDelim
            Next
            $sString = StringTrimRight($sString, StringLen($sDelim)) & $sNewLine
        Next
    Else ; The delimiter is not needed
        For $i = 0 To $iRows -1
            If StringRegExp($aArray[$i], '["\r\n' & $sDelim & ']') Or $bEnclose Then
                $aArray[$i] = '"' & StringReplace($aArray[$i], '"', '""') & '"'
            EndIf
            $sString &= $aArray[$i] & $sNewLine
        Next
    EndIf
    If Not $bFinalBreak Then $sString = StringTrimRight($sString, StringLen($sNewLine)) ; Delete any newline characters added to the end of the string
    Return $sString
EndFunc ;==> _ArrayToCSV


; #FUNCTION# ====================================================================================================================
; Name...........: _CSVSplit
; Description ...: Converts a string in CSV format to a two dimensional array (see comments)
; Syntax.........: CSVSplit ( $aArray [, $sDelim ] )
; Parameters ....: $aArray  - The array to convert
;                  $sDelim  - Optional - Delimiter set to comma by default (see 2nd comment)
; Return values .: Success  - Returns a two dimensional array or a one dimensional array (see 1st comment)
;                  Failure  - Sets @error to:
;                 |@error = 1 - First parameter is not a valid string
;                 |@error = 2 - Second parameter is not a valid string
;                 |@error = 3 - Could not find suitable delimiter replacements
; Author ........: czardas
; Comments ......; Returns a one dimensional array if the input string does not contain the delimiter string
;                ; Some CSV formats use semicolon as a delimiter instead of a comma
;                ; Set the second parameter to @TAB To convert to TSV
; ===============================================================================================================================

Func _CSVSplit($string, $sDelim = ",") ; Parses csv string input and returns a one or two dimensional array
    If Not IsString($string) Or $string = "" Then Return SetError(1, 0, 0) ; Invalid string
    If Not IsString($sDelim) Or $sDelim = "" Then Return SetError(2, 0, 0) ; Invalid string

    $string = StringRegExpReplace($string, "[\r\n]+\z", "") ; [Line Added] Remove training breaks
    Local $iOverride = 63743, $asDelim[3] ; $asDelim => replacements for comma, new line and double quote
    For $i = 0 To 2
        $asDelim[$i] = __GetSubstitute($string, $iOverride) ; Choose a suitable substitution character
        If @error Then Return SetError(3, 0, 0) ; String contains too many unsuitable characters
    Next
    $iOverride = 0

    Local $aArray = StringRegExp($string, '\A[^"]+|("+[^"]+)|"+\z', 3) ; Split string using double quotes delim - largest match
    $string = ""

    Local $iBound = UBound($aArray)
    For $i = 0 To $iBound -1
        $iOverride += StringInStr($aArray[$i], '"', 0, -1) ; Increment by the number of adjacent double quotes per element
        If Mod ($iOverride +2, 2) = 0 Then ; Acts as an on/off switch
            $aArray[$i] = StringReplace($aArray[$i], $sDelim, $asDelim[0]) ; Replace comma delimeters
            $aArray[$i] = StringRegExpReplace($aArray[$i], "(\r\n)|[\r\n]", $asDelim[1]) ; Replace new line delimeters
        EndIf
        $aArray[$i] = StringReplace($aArray[$i], '""', $asDelim[2]) ; Replace double quote pairs
        $aArray[$i] = StringReplace($aArray[$i], '"', '') ; Delete enclosing double quotes - not paired
        $aArray[$i] = StringReplace($aArray[$i], $asDelim[2], '"') ; Reintroduce double quote pairs as single characters
        $string &= $aArray[$i] ; Rebuild the string, which includes two different delimiters
    Next
    $iOverride = 0

    $aArray = StringSplit($string, $asDelim[1], 2) ; Split to get rows
    $iBound = UBound($aArray)
    Local $aCSV[$iBound][2], $aTemp
    For $i = 0 To $iBound -1
        $aTemp = StringSplit($aArray[$i], $asDelim[0]) ; Split to get row items
        If Not @error Then
            If $aTemp[0] > $iOverride Then
                $iOverride = $aTemp[0]
                ReDim $aCSV[$iBound][$iOverride] ; Add columns to accomodate more items
            EndIf
        EndIf
        For $j = 1 To $aTemp[0]
            If StringLen($aTemp[$j]) Then
                If Not StringRegExp($aTemp[$j], '[^"]') Then ; Field only contains double quotes
                    $aTemp[$j] = StringTrimLeft($aTemp[$j], 1) ; Delete enclosing double quote single char
                EndIf
                $aCSV[$i][$j -1] = $aTemp[$j] ; Populate each row
            EndIf
        Next
    Next

    If $iOverride > 1 Then
        Return $aCSV ; Multiple Columns
    Else
        For $i = 0 To $iBound -1
            If StringLen($aArray[$i]) And (Not StringRegExp($aArray[$i], '[^"]')) Then ; Only contains double quotes
                $aArray[$i] = StringTrimLeft($aArray[$i], 1) ; Delete enclosing double quote single char
            EndIf
        Next
        Return $aArray ; Single column
    EndIf


EndFunc ;==> _CSVSplit

; #INTERNAL_USE_ONLY# ===========================================================================================================
; Name...........: __GetSubstitute
; Description ...: Searches for a character to be used for substitution, ie one not contained within the input string
; Syntax.........: __GetSubstitute($string, ByRef $iCountdown)
; Parameters ....: $string   - The string of characters to avoid
;                  $iCountdown - The first code point to begin checking
; Return values .: Success   - Returns a suitable substitution character not found within the first parameter
;                  Failure   - Sets @error to 1 => No substitution character available
; Author ........: czardas
; Comments ......; This function is connected to the function _CSVSplit and was not intended for general use
;                  $iCountdown is returned ByRef to avoid selecting the same character on subsequent calls to this function
;                  Initially $iCountown should be passed with a value = 63743
; ===============================================================================================================================

Func __GetSubstitute($string, ByRef $iCountdown)
    If $iCountdown < 57344 Then Return SetError(1, 0, "") ; Out of options
    Local $sTestChar
    For $i = $iCountdown To 57344 Step -1
        $sTestChar = ChrW($i)
        $iCountdown -= 1
        If Not StringInStr($string, $sTestChar) Then
            Return $sTestChar
        EndIf
    Next
    Return SetError(1, 0, "") ; Out of options
EndFunc ;==> __GetSubstitute

;

From the following input:

test,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron09\INSTRUCTION.TXT",123
next line,",","\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron09\INSTRUCTION.TXT"

;

I get (with the option set for all fields to be enclosed):

"test","\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\ShillerRon09\INSTRUCTION.TXT","123"
"next line","","\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\ShillerRon09\INSTRUCTION.TXT"

;

And from the RegExp, I get:

test,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\ShillerRon09\INSTRUCTION.TXT",123
next line,",","\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\ShillerRon09\INSTRUCTION.TXT"

I'm sure it's possible to adapt the RegExp.

Edited by czardas

Share this post


Link to post
Share on other sites

czardas, what about this one ?

$string = '3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron,09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,",",' & "'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"

Do
  $string = stringregexpreplace($string, '(,"[^,"]*),(?=[^"]*")', "$1") 
Until @extended = 0
msgbox(0, "", $string)

Share this post


Link to post
Share on other sites

mikell, it's very impressive. :) It's still not perfect though. It can't yet handle the first cell. I'm sure you will very quickly be able to amend this.

Share this post


Link to post
Share on other sites

Thanks  :D

$string = '"this,is,a,test",3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron,09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,",",' & "'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"

Do
  $string = stringregexpreplace($string, '((?:^|,)"[^,"]*),(?=[^"]*")', "$1") 
Until @extended = 0
msgbox(0, "", $string)
1 person likes this

Share this post


Link to post
Share on other sites

Another way to delete multiple comas within multiple double quotes.

$string = '"this,is,a,test",3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron,09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,",",' & "'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"
; Result=     "thisisatest",3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\ShillerRon09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,"",'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092

$string = Execute('"' & StringRegExpReplace($string, '(".+?")', '" & StringReplace(""\1"",",","") & "') & '"')

ConsoleWrite($string & @LF)
3 people like this

Share this post


Link to post
Share on other sites

#12 ·  Posted (edited)

A neat trick Malkey. I'm just wondering if using Execute() like this might be limited by the maximum length of a line of code.

Edit

It seems to work on large input, which I didn't expect. I'm still trying to figure out why Execute is able to handle such a long expression. :wacko:

Edited by czardas

Share this post


Link to post
Share on other sites

Thanks everyone for your input! I'm not sure which one I should mark as the answer now... because it appears that we had more questions and multiple solutions! Great work everyone!  :graduated:


"Maybe I'm on a road that ain't been paved yet. And maybe I see a sign that ain't been made yet"
Song Title: I guess you could say
Artist: Middle Class Rut

Share this post


Link to post
Share on other sites

Mark the one that you feel confident works and you most understand ;) ...


Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.

Share this post


Link to post
Share on other sites

I was sure that such a solution exist, but couldn't find it.... so much quotes  :D

Very nice !

Share this post


Link to post
Share on other sites

#16 ·  Posted (edited)

I don't know where the "Solved" part of this Q/A went... I know the forums are going through a modification - but this worked for me:

For $i = UBound($array01) - 1 To 0 Step -1
$array01[$i] = Execute('"' & StringRegExpReplace($array01[$i], '(".+?")', '" & StringReplace(""\1"",",","") & "') & '"')
Next

Sorry, don't see AutoIT formatting / colors either... forums still updating I'll bet.

Edited by souldjer777

"Maybe I'm on a road that ain't been paved yet. And maybe I see a sign that ain't been made yet"
Song Title: I guess you could say
Artist: Middle Class Rut

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0