Jump to content

StringRegExpReplace - Help with CSV file and replacing a comma :)


Go to solution Solved by Malkey,

Recommended Posts

Good Afternoon Everyone,

I'm really, really, really not understanding how to replace a single comma , in a directory structure within a CSV file...

How do I replace the , within the directory path that I know begins with a ," and ends with a ",

Example:

3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"DeviceHarddiskVolumeShadowCopy3455ApplicationsAPPSTestshareddrivingpredrivingscreeningAArchiveShiller,Ron09INSTRUCTION.TXT",,_,10.110.69.16,C:Program FilesVeritasNetBackupbinbpbkar32.exe,'File' class or access,Notice,OAS,My OrganizationServersServers,1092

Here is the excerpt of the file path I want to remove any commas from in between the ," and the ",

,"DeviceHarddiskVolumeShadowCopy3455ApplicationsAPPSTestshareddrivingpredrivingscreeningAArchiveShiller,Ron09INSTRUCTION.TXT",

Could you please let me know how to write the StringRegExpReplace? I'm burning too much time trying it on my own...  I'll continue to look either way... but if someone out there feels like helping I'd very much appreciate it!

Thank you all!

"Maybe I'm on a road that ain't been paved yet. And maybe I see a sign that ain't been made yet"
Song Title: I guess you could say
Artist: Middle Class Rut

Link to comment
Share on other sites

$string = '3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,' & "'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"

msgbox(0, '' , stringreplace(stringregexp($string , ',"(.*?)",' , 3)[0] , "," , ""))

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Link to comment
Share on other sites

With StringRegExpReplace :

$string = "3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,""\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron09\INSTRUCTION.TXT"",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"

$replace = StringRegExpReplace($string, '"[^,]+\K,([^"]+")', "$1")
ConsoleWrite($replace)
Link to comment
Share on other sites

jguinch, how do you do it if there are several commas ?

The only way I found is

$string = '3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron,09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,' & "'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"

Do
  $string = stringregexpreplace($string, '(,"[^,"]+),(?=[^"]+")', "$1") 
Until @extended = 0
msgbox(0, "", $string)

There is probably an Execute trick ... with many quotes inside  :sweating:

Edit

small correction

Edited by mikell
Link to comment
Share on other sites

The StringRegExp above has a flaw and I'm not sure why, although I'm sure a very good RegExp solution exists.

The way I would do this would be to parse the csv to an array, then you can do whatever you like to the data in any cell. There are a few CSV parsing algorithms on the forum, but the one I wrote a while ago still works. If it ain't broke etc...

;

Local $sFilePath = @ScriptDir & "\test.csv" ; Change this to your own csv file (in the same directory as the script)

Local $hFile = FileOpen($sFilePath)
If $hFile = -1 Then
    MsgBox(0, "", "Unable to open file")
    Exit
EndIf

Local $sCSV = FileRead($hFile)
If @error Then
    MsgBox(0, "", "Unable to read file")
    FileClose($hFile)
    Exit
EndIf
FileClose($hFile)

Local $aCSV = _CSVSplit($sCSV) ; Create the main array
If @error Then
    ConsoleWrite("Error = " & @error & @LF)
    Exit
EndIf

; Replace all commas in every cell.
If Ubound($aCSV, 0) = 2 Then
    For $i = 0 To UBound($aCSV) -1
        For $j = 0 To UBound($aCSV, 2) -1
            $aCSV[$i][$j] = StringReplace($aCSV[$i][$j], ",", "")
        Next
    Next
Else ; CSV only contains one column
    For $i = 0 To UBound($aCSV) -1
        $aCSV[$i] = StringReplace($aCSV[$i], ",", "")
    Next
EndIf

Local $sResult = _ArrayToCSV($aCSV, Default, Default, True, True)

ConsoleWrite($sResult & @LF) ; For testing on a small sample only.

; WRITE THE RESULT TO FILE HERE!


; #FUNCTION# ====================================================================================================================
; Name...........: _ArrayToCSV
; Description ...: Converts a two dimensional array to CSV format
; Syntax.........: _ArrayToCSV ( $aArray [, $sDelim [, $sNewLine [, $bFinalBreak [, $bEnclose ]]]] )
; Parameters ....: $aArray      - The array to convert
;                  $sDelim      - Optional - Delimiter set to comma by default (see comments)
;                  $sNewLine    - Optional - New Line set to @LF by default (see comments)
;                  $bFinalBreak - Set to true in accordance with common practice => CSV Line termination
;                  $bEnclose - Enclose all fields in double quotes. Default = False
; Return values .: Success  - Returns a string in CSV format
;                  Failure  - Sets @error to:
;                 |@error = 1 - First parameter is not a valid array
;                 |@error = 2 - Second parameter is not a valid string
;                 |@error = 3 - Third parameter is not a valid string
;                 |@error = 4 - 2nd and 3rd parameters must be different characters
; Author ........: czardas
; Comments ......; One dimensional arrays are returned as multiline text (without delimiters)
;                ; Some users may need to set the second parameter to semicolon to return the prefered CSV format
;                ; To convert to TSV use @TAB for the second parameter
;                ; Some users may wish to set the third parameter to @CRLF
; ===============================================================================================================================

Func _ArrayToCSV($aArray, $sDelim = Default, $sNewLine = Default, $bFinalBreak = True, $bEnclose = False)
    If Not IsArray($aArray) Or Ubound($aArray, 0) > 2 Or Ubound($aArray) = 0 Then Return SetError(1, 0 ,"")
    If $sDelim = Default Then $sDelim = ","
    If $sDelim = "" Then Return SetError(2, 0 ,"")
    If $sNewLine = Default Then $sNewLine = @LF
    If $sNewLine = "" Then Return SetError(3, 0 ,"")
    If $sDelim = $sNewLine Then Return SetError(4, 0, "")

    Local $iRows = UBound($aArray), $sString = ""
    If Ubound($aArray, 0) = 2 Then ; Check if the array has two dimensions
        Local $iCols = UBound($aArray, 2)
        For $i = 0 To $iRows -1
            For $j = 0 To $iCols -1
                If StringRegExp($aArray[$i][$j], '["\r\n' & $sDelim & ']') Or $bEnclose Then
                    $aArray[$i][$j] = '"' & StringReplace($aArray[$i][$j], '"', '""') & '"'
                EndIf
                $sString &= $aArray[$i][$j] & $sDelim
            Next
            $sString = StringTrimRight($sString, StringLen($sDelim)) & $sNewLine
        Next
    Else ; The delimiter is not needed
        For $i = 0 To $iRows -1
            If StringRegExp($aArray[$i], '["\r\n' & $sDelim & ']') Or $bEnclose Then
                $aArray[$i] = '"' & StringReplace($aArray[$i], '"', '""') & '"'
            EndIf
            $sString &= $aArray[$i] & $sNewLine
        Next
    EndIf
    If Not $bFinalBreak Then $sString = StringTrimRight($sString, StringLen($sNewLine)) ; Delete any newline characters added to the end of the string
    Return $sString
EndFunc ;==> _ArrayToCSV


; #FUNCTION# ====================================================================================================================
; Name...........: _CSVSplit
; Description ...: Converts a string in CSV format to a two dimensional array (see comments)
; Syntax.........: CSVSplit ( $aArray [, $sDelim ] )
; Parameters ....: $aArray  - The array to convert
;                  $sDelim  - Optional - Delimiter set to comma by default (see 2nd comment)
; Return values .: Success  - Returns a two dimensional array or a one dimensional array (see 1st comment)
;                  Failure  - Sets @error to:
;                 |@error = 1 - First parameter is not a valid string
;                 |@error = 2 - Second parameter is not a valid string
;                 |@error = 3 - Could not find suitable delimiter replacements
; Author ........: czardas
; Comments ......; Returns a one dimensional array if the input string does not contain the delimiter string
;                ; Some CSV formats use semicolon as a delimiter instead of a comma
;                ; Set the second parameter to @TAB To convert to TSV
; ===============================================================================================================================

Func _CSVSplit($string, $sDelim = ",") ; Parses csv string input and returns a one or two dimensional array
    If Not IsString($string) Or $string = "" Then Return SetError(1, 0, 0) ; Invalid string
    If Not IsString($sDelim) Or $sDelim = "" Then Return SetError(2, 0, 0) ; Invalid string

    $string = StringRegExpReplace($string, "[\r\n]+\z", "") ; [Line Added] Remove training breaks
    Local $iOverride = 63743, $asDelim[3] ; $asDelim => replacements for comma, new line and double quote
    For $i = 0 To 2
        $asDelim[$i] = __GetSubstitute($string, $iOverride) ; Choose a suitable substitution character
        If @error Then Return SetError(3, 0, 0) ; String contains too many unsuitable characters
    Next
    $iOverride = 0

    Local $aArray = StringRegExp($string, '\A[^"]+|("+[^"]+)|"+\z', 3) ; Split string using double quotes delim - largest match
    $string = ""

    Local $iBound = UBound($aArray)
    For $i = 0 To $iBound -1
        $iOverride += StringInStr($aArray[$i], '"', 0, -1) ; Increment by the number of adjacent double quotes per element
        If Mod ($iOverride +2, 2) = 0 Then ; Acts as an on/off switch
            $aArray[$i] = StringReplace($aArray[$i], $sDelim, $asDelim[0]) ; Replace comma delimeters
            $aArray[$i] = StringRegExpReplace($aArray[$i], "(\r\n)|[\r\n]", $asDelim[1]) ; Replace new line delimeters
        EndIf
        $aArray[$i] = StringReplace($aArray[$i], '""', $asDelim[2]) ; Replace double quote pairs
        $aArray[$i] = StringReplace($aArray[$i], '"', '') ; Delete enclosing double quotes - not paired
        $aArray[$i] = StringReplace($aArray[$i], $asDelim[2], '"') ; Reintroduce double quote pairs as single characters
        $string &= $aArray[$i] ; Rebuild the string, which includes two different delimiters
    Next
    $iOverride = 0

    $aArray = StringSplit($string, $asDelim[1], 2) ; Split to get rows
    $iBound = UBound($aArray)
    Local $aCSV[$iBound][2], $aTemp
    For $i = 0 To $iBound -1
        $aTemp = StringSplit($aArray[$i], $asDelim[0]) ; Split to get row items
        If Not @error Then
            If $aTemp[0] > $iOverride Then
                $iOverride = $aTemp[0]
                ReDim $aCSV[$iBound][$iOverride] ; Add columns to accomodate more items
            EndIf
        EndIf
        For $j = 1 To $aTemp[0]
            If StringLen($aTemp[$j]) Then
                If Not StringRegExp($aTemp[$j], '[^"]') Then ; Field only contains double quotes
                    $aTemp[$j] = StringTrimLeft($aTemp[$j], 1) ; Delete enclosing double quote single char
                EndIf
                $aCSV[$i][$j -1] = $aTemp[$j] ; Populate each row
            EndIf
        Next
    Next

    If $iOverride > 1 Then
        Return $aCSV ; Multiple Columns
    Else
        For $i = 0 To $iBound -1
            If StringLen($aArray[$i]) And (Not StringRegExp($aArray[$i], '[^"]')) Then ; Only contains double quotes
                $aArray[$i] = StringTrimLeft($aArray[$i], 1) ; Delete enclosing double quote single char
            EndIf
        Next
        Return $aArray ; Single column
    EndIf


EndFunc ;==> _CSVSplit

; #INTERNAL_USE_ONLY# ===========================================================================================================
; Name...........: __GetSubstitute
; Description ...: Searches for a character to be used for substitution, ie one not contained within the input string
; Syntax.........: __GetSubstitute($string, ByRef $iCountdown)
; Parameters ....: $string   - The string of characters to avoid
;                  $iCountdown - The first code point to begin checking
; Return values .: Success   - Returns a suitable substitution character not found within the first parameter
;                  Failure   - Sets @error to 1 => No substitution character available
; Author ........: czardas
; Comments ......; This function is connected to the function _CSVSplit and was not intended for general use
;                  $iCountdown is returned ByRef to avoid selecting the same character on subsequent calls to this function
;                  Initially $iCountown should be passed with a value = 63743
; ===============================================================================================================================

Func __GetSubstitute($string, ByRef $iCountdown)
    If $iCountdown < 57344 Then Return SetError(1, 0, "") ; Out of options
    Local $sTestChar
    For $i = $iCountdown To 57344 Step -1
        $sTestChar = ChrW($i)
        $iCountdown -= 1
        If Not StringInStr($string, $sTestChar) Then
            Return $sTestChar
        EndIf
    Next
    Return SetError(1, 0, "") ; Out of options
EndFunc ;==> __GetSubstitute

;

From the following input:

test,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron09\INSTRUCTION.TXT",123
next line,",","\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron09\INSTRUCTION.TXT"

;

I get (with the option set for all fields to be enclosed):

"test","\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\ShillerRon09\INSTRUCTION.TXT","123"
"next line","","\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\ShillerRon09\INSTRUCTION.TXT"

;

And from the RegExp, I get:

test,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\ShillerRon09\INSTRUCTION.TXT",123
next line,",","\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\ShillerRon09\INSTRUCTION.TXT"

I'm sure it's possible to adapt the RegExp.

Edited by czardas
Link to comment
Share on other sites

czardas, what about this one ?

$string = '3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron,09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,",",' & "'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"

Do
  $string = stringregexpreplace($string, '(,"[^,"]*),(?=[^"]*")', "$1") 
Until @extended = 0
msgbox(0, "", $string)
Link to comment
Share on other sites

Thanks  :D

$string = '"this,is,a,test",3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron,09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,",",' & "'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"

Do
  $string = stringregexpreplace($string, '((?:^|,)"[^,"]*),(?=[^"]*")', "$1") 
Until @extended = 0
msgbox(0, "", $string)
Link to comment
Share on other sites

  • Solution

Another way to delete multiple comas within multiple double quotes.

$string = '"this,is,a,test",3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron,09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,",",' & "'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"
; Result=     "thisisatest",3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\ShillerRon09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,"",'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092

$string = Execute('"' & StringRegExpReplace($string, '(".+?")', '" & StringReplace(""\1"",",","") & "') & '"')

ConsoleWrite($string & @LF)
Link to comment
Share on other sites

A neat trick Malkey. I'm just wondering if using Execute() like this might be limited by the maximum length of a line of code.

Edit

It seems to work on large input, which I didn't expect. I'm still trying to figure out why Execute is able to handle such a long expression. :wacko:

Edited by czardas
Link to comment
Share on other sites

Thanks everyone for your input! I'm not sure which one I should mark as the answer now... because it appears that we had more questions and multiple solutions! Great work everyone!  :graduated:

"Maybe I'm on a road that ain't been paved yet. And maybe I see a sign that ain't been made yet"
Song Title: I guess you could say
Artist: Middle Class Rut

Link to comment
Share on other sites

  • Moderators

Mark the one that you feel confident works and you most understand ;) ...

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.

Link to comment
Share on other sites

  • 1 month later...

I don't know where the "Solved" part of this Q/A went... I know the forums are going through a modification - but this worked for me:

For $i = UBound($array01) - 1 To 0 Step -1
$array01[$i] = Execute('"' & StringRegExpReplace($array01[$i], '(".+?")', '" & StringReplace(""\1"",",","") & "') & '"')
Next

Sorry, don't see AutoIT formatting / colors either... forums still updating I'll bet.

Edited by souldjer777

"Maybe I'm on a road that ain't been paved yet. And maybe I see a sign that ain't been made yet"
Song Title: I guess you could say
Artist: Middle Class Rut

Link to comment
Share on other sites

  • 3 years later...
On 3/2/2015 at 4:48 PM, mikell said:

Thanks  :D

$string = '"this,is,a,test",3/1/15 11:49:11 AM,COMPUTER01,10.0.0.6,User-defined Rules:test .10,deny read,"\Device\HarddiskVolumeShadowCopy3455\Applications\APPS\Test\shared\driving\predrivingscreening\AArchive\Shiller,Ron,09\INSTRUCTION.TXT",,_,10.110.69.16,C:\Program Files\Veritas\NetBackup\bin\bpbkar32.exe,",",' & "'File' class or access,Notice,OAS,My Organization\Servers\Servers\,1092"

Do
  $string = stringregexpreplace($string, '((?:^|,)"[^,"]*),(?=[^"]*")', "$1") 
Until @extended = 0
msgbox(0, "", $string)

Mikell,

I am using the following function with success with the exception of one instance. Anytime there is a period followed by a comma (.,) it is not stripped. Thank you in advance for your help.

$FF = stringregexpreplace($FF, '((?:^|,)"[^,"]*),(?=[^"]*")', "$1")

Link to comment
Share on other sites

I can't reproduce the issue you mention.

Post a non-working short example.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

The regexp posted which you seem to use is setup to replace commas in strings enclosed in double quotes within the whole subject.

Local $FF = 'abc,"SAPPS,DAVID T., DDS, PC",456,"1,2.,3",1,2.,3', $string

Do
    $FF = StringRegExpReplace($FF, '((?:^|,)"[^,"]*),(?=[^"]*")', "$1")
Until @extended = 0
MsgBox(0, "", $FF)

The string you posted is just a plain AutoIt string.

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...