RegExp - How To Replace Every Other Quote Found?

buymeapc · June 10, 2014

Hi all,

I have strings whose values are currently separated by spaces, and I'm trying to delimit each value with a pipe instead. Here are a few of the strings:

"webstratauthentication\enableuserpasswordencryption.sql"02/23/2011  01:17:22 PM         1401  "12312312312312312""A"           "b04ecbc4-e0e2-c14b-d780-656d348b0513"  "\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"
        "webstratauthentication\enableuserpasswordencryption.sql"02/23/2011  01:17:22 PM         1401  ""                 "A"           "b04ecbc4-e0e2-c14b-d780-656d348b0513"  "\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"
        "ikernel.dll"                                           04/26/2002  07:48:38 PM            0  ""                 "A"           "d98c1dd4-008f-04b2-e980-0998ecf8427e"  "c:\program files\installshield 10.5\support\build\ikernel.dll"

I'd like to return the strings looking like this:

"webstratauthentication\enableuserpasswordencryption.sql"|02/23/2011|01:17:22 PM|1401|"12312312312312312"|"A"|"b04ecbc4-e0e2-c14b-d780-656d348b0513"|"\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"
"webstratauthentication\enableuserpasswordencryption.sql"|02/23/2011|01:17:22 PM|1401|""|"A"|"b04ecbc4-e0e2-c14b-d780-656d348b0513"|"\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"
"ikernel.dll"|04/26/2002|07:48:38 PM|0|""|"A"|"d98c1dd4-008f-04b2-e980-0998ecf8427e"|"c:\program files\installshield 10.5\support\build\ikernel.dll"

I've tried replacing the double spaces with pipes, but that messes up the first string and leaves "12312312312312312""A" without adding the pipe between the two values.

$sLine = StringRegExpReplace($sLine, "[ ]{2,}", "|")

So, I concocted a way to do it with a string-based approach.

; Strip the leading and trailing ws
$sTempLine = StringStripWS($sTempLine, 1 + 2)
; Replace all double spaces with pipes
$sTempLine = StringRegExpReplace($sTempLine, "[ ]{2,}", "|")
If StringInStr($sTempLine, '""A"') > 0 Then
    $sTempLine = StringReplace($sTempLine, '""A"', '"|"A"')
EndIf
; And check if the file name needs a pipe after its second double quote
$iPos = StringInStr($sTempLine, '"', 0, 2)
If StringMid($sTempLine, $iPos, 2) <> '"|' Then
    $sTempLine1 = StringMid($sTempLine, 1, $iPos)
    $sTempLine1 &= "|" & StringMid($sTempLine, ($iPos + 1), (StringLen($sTempLine) - $iPos))
    $sTempLine1 = StringReplace($sTempLine1, '" "', '"|"')
    $sTempLine = $sTempLine1
EndIf
ConsoleWrite("-> " & $sTempLine & @CRLF)

Is there a better regex that I can use to separate these values in the strings with pipes? I thought that it might be easiest to just put a pipe after every second quote...

Thank you for the help.

Edited June 10, 2014 by buymeapc

BrewManNH · June 10, 2014

One of the problems with your text is that there aren't any spaces between "12312312312312312" and "A", or between "webstratauthentication\enableuserpasswordencryption.sql" and 02/23/2011. You might have to do it in two passes, as I'm not sure a single one will work.

czardas · June 10, 2014

To answer the question in the title: you don't need a regular expression for this. You can use StringSplit() with the quote character as the delimiter, then loop through the array and reinsert a quote at the start of every other element.
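A minimal sketch of that StringSplit() idea, under a few assumptions: `_PipeDelimit` is a hypothetical helper name, values never contain an embedded quote, and unquoted values are separated by at least two spaces. Instead of reinserting a single quote, it wraps every even-numbered piece in quotes again and joins the non-empty pieces with pipes.

```autoit
; Sample line taken from the first post (leading spaces included).
Local $sLine = '       "webstratauthentication\enableuserpasswordencryption.sql"02/23/2011  01:17:22 PM         1401  "12312312312312312""A"           "b04ecbc4-e0e2-c14b-d780-656d348b0513"  "\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"'
ConsoleWrite(_PipeDelimit($sLine) & @CRLF)

; Hypothetical helper (not from the thread): pipe-delimit one line by splitting on quotes.
Func _PipeDelimit($sText)
    ; With the default flag, $aParts[0] is the element count. Odd indexes hold the
    ; text outside quotes, even indexes hold the text that was inside quotes.
    Local $aParts = StringSplit($sText, '"')
    Local $sOut = "", $sField
    For $i = 1 To $aParts[0]
        If Mod($i, 2) = 0 Then
            ; Quoted value (possibly empty): put the quotes back.
            $sField = '"' & $aParts[$i] & '"'
        Else
            ; Unquoted run: trim it, then turn each gap of 2+ spaces into a pipe.
            $sField = StringRegExpReplace(StringStripWS($aParts[$i], 1 + 2), "[ ]{2,}", "|")
            ; Nothing but whitespace between two quoted values: no field to emit.
            If $sField = "" Then ContinueLoop
        EndIf
        If $sOut <> "" Then $sOut &= "|"
        $sOut &= $sField
    Next
    Return $sOut
EndFunc
```

Because the quoted pieces are never passed through the space replacement, a path such as "c:\program files\..." would keep its spaces even if it contained two in a row.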

buymeapc · June 10, 2014

OK, I found a way to do it with string functions. It works, but it's just not as magical as a regex.

I was always under the impression that a regex would run much faster than the code below.

#include <String.au3>
Dim $aLine[3], $aGood[3]
$aLine[0] = '       "webstratauthentication\enableuserpasswordencryption.sql"02/23/2011  01:17:22 PM         1401  "12312312312312312""A"           "b04ecbc4-e0e2-c14b-d780-656d348b0513"  "\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"'
$aLine[1] = '       "webstratauthentication\enableuserpasswordencryption.sql"02/23/2011  01:17:22 PM         1401  ""                 "A"           "b04ecbc4-e0e2-c14b-d780-656d348b0513"  "\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"'
$aLine[2] = '       "ikernel.dll"                                           04/26/2002  07:48:38 PM            0  ""                 "A"           "d98c1dd4-008f-04b2-e980-0998ecf8427e"  "c:\program files\installshield 10.5\support\build\ikernel.dll"'
$aGood[0] = '"webstratauthentication\enableuserpasswordencryption.sql"|02/23/2011|01:17:22 PM|1401|"12312312312312312"|"A"|"b04ecbc4-e0e2-c14b-d780-656d348b0513"|"\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"'
$aGood[1] = '"webstratauthentication\enableuserpasswordencryption.sql"|02/23/2011|01:17:22 PM|1401|""|"A"|"b04ecbc4-e0e2-c14b-d780-656d348b0513"|"\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"'
$aGood[2] = '"ikernel.dll"|04/26/2002|07:48:38 PM|0|""|"A"|"d98c1dd4-008f-04b2-e980-0998ecf8427e"|"c:\program files\installshield 10.5\support\build\ikernel.dll"'

For $i = 0 To UBound($aLine) - 1
    $b = False
    $aLine[$i] = StringStripWS($aLine[$i], 1 + 2)
    $aLine[$i] = StringRegExpReplace($aLine[$i], "[ ]{2,}", "|")
    For $j = 1 To StringLen($aLine[$i])
        If StringMid($aLine[$i], $j, 1) = '"' And $b = False Then
            $b = True
        ElseIf StringMid($aLine[$i], $j, 1) = '"' And $b = True Then
            $b = False
            If StringMid($aLine[$i], ($j + 1), 1) <> '|' Then
                ConsoleWrite("Before (" & $j &"): " & $aLine[$i] & @CRLF)
                If $j <> StringLen($aLine[$i]) Then $aLine[$i] = _StringInsert($aLine[$i], '|', $j)
                ConsoleWrite("After  (" & $j &"): " & $aLine[$i] & @CRLF)
            EndIf
        EndIf
    Next
    If $aLine[$i] <> $aGood[$i] Then
        ConsoleWrite("! " &$aLine[$i] & @CRLF)
    Else
        ConsoleWrite("+ " &$aLine[$i] & @CRLF)
    EndIf
Next

DXRW4E · June 10, 2014

$sData = StringTrimRight(StringRegExpReplace($sData, '\h*+((?:[^"\h]|"[^"]*")+)\h*+', "$1|"), 1)

This is just a quick example; the pattern can certainly be improved.

Ciao.
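For comparison, here is a hedged sketch of a single-pass variant. It is not the pattern from the post above: as far as I can tell, that one splits on every run of whitespace, so `01:17:22 PM` would be broken in two and adjacent quoted values such as "12312312312312312""A" would stay glued together. The variable names and the pattern below are illustrative assumptions, and the sketch assumes quotes are always balanced and that unquoted values are separated by at least two spaces.

```autoit
; Sample line taken from the thread (leading spaces included).
Local $sLine = '       "ikernel.dll"                                           04/26/2002  07:48:38 PM            0  ""                 "A"           "d98c1dd4-008f-04b2-e980-0998ecf8427e"  "c:\program files\installshield 10.5\support\build\ikernel.dll"'

; A field is either a quoted value ("[^"]*") or a run of unquoted words joined by
; single spaces ([^"\h]+(?:\h[^"\h]+)*). The whitespace around a field is matched
; outside the capture group, so it is dropped and each field gets one pipe appended.
Local $sPattern = '\h*("[^"]*"|[^"\h]+(?:\h[^"\h]+)*)\h*'

; The last field also receives a pipe, so trim that one off.
Local $sResult = StringTrimRight(StringRegExpReplace($sLine, $sPattern, "$1|"), 1)

ConsoleWrite($sResult & @CRLF)
```

Matching whole fields rather than separators is what lets this handle two quoted values with no space between them: each quoted value is its own match, so each one gets its own pipe.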

BrewManNH · June 10, 2014

RegExp is slow, slower than most string functions, but invaluable when you have something tricky to split up/find. Practically useless when the data isn't consistent though. Best advice, learn when NOT to use RegExp.

DXRW4E · June 10, 2014

Not true; it's quite the opposite. Today you can hardly do anything without RegExp. Or, better said, the speed ratio is 1 to 1000 when you process files of 1-10 MB, not to mention everything else. Even people working in C++11 (machine code, eh), who as far as speed goes could certainly do without RegEx, are grateful for it today, because thanks to RegEx performance has improved a lot in C++ as well.

#cs ----------------------------------------------------------------------------

 AutoIt Version: 3.3.12.0
 Author:         myName

 Script Function:
    Template AutoIt script.

#ce ----------------------------------------------------------------------------

Local $sData, $fTimerDiff, $iReturn
$sData = "shfjsdjhsdhdsdudddbfHJYSJ24472727637addddddjkdaaDJDDJDDGDGL|DEUEIEUY"

$fTimerDiff = TimerInit()
For $i = 1 To 1000000
    $iReturn = StringInStr($sData, "|")
Next
$fTimerDiff = TimerDiff($fTimerDiff)
ConsoleWrite("StringInStr TimerDiff - " & $fTimerDiff & @Lf)

$fTimerDiff = TimerInit()
For $i = 1 To 1000000
    $iReturn = StringRegExp($sData, "(?i)\|")
Next
$fTimerDiff = TimerDiff($fTimerDiff)
ConsoleWrite("StringRegExp TimerDiff - " & $fTimerDiff & @Lf)

;~ >Running AU3Check (3.3.13.0)  from:C:\Program Files (x86)\AutoIt3  input:C:\Users\DXRW4E\Desktop\New AutoIt v3 Script (2).au3
;~ +>21:47:54 AU3Check ended.rc:0
;~ >Running:(3.3.12.0):C:\Program Files (x86)\AutoIt3\autoit3.exe "C:\Users\DXRW4E\Desktop\New AutoIt v3 Script (2).au3"    
;~ --> Press Ctrl+Alt+F5 to Restart or Ctrl+Break to Stop
;~ StringInStr TimerDiff - 9762.88568999148
;~ StringRegExp TimerDiff - 6907.45375393011
;~ +>21:48:10 AutoIt3.exe ended.rc:0
;~ +>21:48:10 AutoIt3Wrapper Finished.
;~ >Exit code: 0    Time: 17.02

Edited June 10, 2014 by DXRW4E

mikell · June 10, 2014

Using 2 passes is much easier

#include <String.au3>
Dim $aLine[3], $aGood[3]
$aLine[0] = '       "webstratauthentication\enableuserpasswordencryption.sql"02/23/2011  01:17:22 PM         1401  "12312312312312312""A"           "b04ecbc4-e0e2-c14b-d780-656d348b0513"  "\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"'
$aLine[1] = '       "webstratauthentication\enableuserpasswordencryption.sql"02/23/2011  01:17:22 PM         1401  ""                 "A"           "b04ecbc4-e0e2-c14b-d780-656d348b0513"  "\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"'
$aLine[2] = '       "ikernel.dll"                                           04/26/2002  07:48:38 PM            0  ""                 "A"           "d98c1dd4-008f-04b2-e980-0998ecf8427e"  "c:\program files\installshield 10.5\support\build\ikernel.dll"'
$aGood[0] = '"webstratauthentication\enableuserpasswordencryption.sql"|02/23/2011|01:17:22 PM|1401|"12312312312312312"|"A"|"b04ecbc4-e0e2-c14b-d780-656d348b0513"|"\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"'
$aGood[1] = '"webstratauthentication\enableuserpasswordencryption.sql"|02/23/2011|01:17:22 PM|1401|""|"A"|"b04ecbc4-e0e2-c14b-d780-656d348b0513"|"\\nas123456n\ct40_packaging\builds\product\productname\hss\authentication\enableuserpasswordencryption.sql"'
$aGood[2] = '"ikernel.dll"|04/26/2002|07:48:38 PM|0|""|"A"|"d98c1dd4-008f-04b2-e980-0998ecf8427e"|"c:\program files\installshield 10.5\support\build\ikernel.dll"'

For $i = 0 To UBound($aLine) - 1
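    ; Pass 1: collapse every run of 2+ horizontal whitespace characters into a pipe.
    $aLine[$i] = StringRegExpReplace($aLine[$i], '\h{2,}', "|")
    ; Pass 2: append a pipe to each closing quote that is not already followed by one.
    ; The lookbehind skips quotes preceded by a pipe or by another quote (opening quotes and empty "" values).
    $aLine[$i] = StringRegExpReplace($aLine[$i], '(?<!["|])"(?!\|)', '"|')
    ; Trim the leading pipe (from the indentation) and the trailing pipe (after the last quote).
    $aLine[$i] = StringTrimRight(StringTrimLeft($aLine[$i], 1), 1)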

    If $aLine[$i] <> $aGood[$i] Then
        ConsoleWrite("! " &$aLine[$i] & @CRLF)
    Else
        ConsoleWrite("+ " &$aLine[$i] & @CRLF)
    EndIf
Next
