kylomas

Use of circumflex in regexp to remove blank lines

24 posts in this topic

Regexp Experts,

I found this regexp pattern in a snippet posted by either Malkey or AZJIO (sorry, I don't remember which). After reading the PCRE doc linked to in the help file, AGAIN, I still do not understand the use of the circumflex.

local $str1 = fileread($fl_name1)
$str1 = stringregexpreplace($str1,'(?m:^)\r\n',"")

Could someone break this pattern down (preferably in the format that M23 and jchd use)?

My understanding of the pattern at this point is:

( start group

?m: set group to non-capturing, multiline

^ Anchor line/string at start? (how does it know what start is?)

) close group

\r\n match on @cr @lf as first two chars after anchor

Thanks,

kylomas


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill


This SRE probably makes more sense to you >>

#cs
    (?m)      Multiline mode: ^ and $ also match at line breaks within the data.
    ^         Start matching at the beginning of a line.
    \r\n      Match @CRLF at the start of a line.
#ce
Local $sData = ''
For $i = 1 To 100
    If Random(0, 1, 1) Then
        $sData &= @CRLF
    Else
        $sData &= 'Random' & @CRLF
    EndIf
Next
ConsoleWrite(StringRegExpReplace($sData, '(?m)^\r\n', '')) ; Only works with CRLF line endings, not with a lone @CR or @LF.
You seem to have grasped the basics.
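
For anyone who wants a repeatable test instead of random data, here is a minimal sketch of the same pattern. The sample string is made up for illustration; only the pattern comes from the post above.

```autoit
; Assumed sample data: three text lines separated by blank lines, CRLF endings.
Local $sData = 'one' & @CRLF & @CRLF & 'two' & @CRLF & @CRLF & @CRLF & 'three' & @CRLF

; (?m) lets ^ match at the start of every line, so ^\r\n can only match
; a CRLF that is the whole content of a line, i.e. an empty line.
Local $sClean = StringRegExpReplace($sData, '(?m)^\r\n', '')

; Should print one, two and three on three consecutive lines.
ConsoleWrite($sClean)
```

Lines that end in a lone @CR or @LF are left alone by this pattern, as noted in the comment above.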

_AdapterConnections()_AlwaysRun()_AppMon()_AppMonEx()_BinaryBin()_CheckMsgBox()_CmdLineRaw()_ContextMenu()_ConvertLHWebColor()/_ConvertSHWebColor()_DesktopDimensions()_DisplayPassword()_DotNet_Load()/_DotNet_Unload()_Fibonacci()_FileCompare()_FileCompareContents()_FileNameByHandle()_FilePrefix/SRE()_FindInFile()_GetBackgroundColor()/_SetBackgroundColor()_GetConrolID()_GetCtrlClass()_GetDirectoryFormat()_GetDriveMediaType()_GetFilename()/_GetFilenameExt()_GetHardwareID()_GetIP()_GetIP_Country()_GetOSLanguage()_GetSavedSource()_GetStringSize()_GetSystemPaths()_GetURLImage()_GIFImage()_GoogleWeather()_GUICtrlCreateGroup()_GUICtrlListBox_CreateArray()_GUICtrlListView_CreateArray()_GUICtrlListView_SaveCSV()_GUICtrlListView_SaveHTML()_GUICtrlListView_SaveTxt()_GUICtrlListView_SaveXML()_GUICtrlMenu_Recent()_GUICtrlMenu_SetItemImage()_GUICtrlTreeView_CreateArray()_GUIDisable()_GUIImageList_SetIconFromHandle()_GUIRegisterMsg()_GUISetIcon()_Icon_Clear()/_Icon_Set()_IdleTime()_InetGet()_InetGetGUI()_InetGetProgress()_IPDetails()_IsFileOlder()_IsGUID()_IsHex()_IsPalindrome()_IsRegKey()_IsStringRegExp()_IsSystemDrive()_IsUPX()_IsValidType()_IsWebColor()_Language()_Log()_MicrosoftInternetConnectivity()_MSDNDataType()_PathFull/GetRelative/Split()_PathSplitEx()_PrintFromArray()_ProgressSetMarquee()_ReDim()_RockPaperScissors()/_RockPaperScissorsLizardSpock()_ScrollingCredits_SelfDelete()_SelfRename()_SelfUpdate()_SendTo()_ShellAll()_ShellFile()_ShellFolder()_SingletonHWID()_SingletonPID()_Startup()_StringCompact()_StringIsValid()_StringRegExpMetaCharacters()_StringReplaceWholeWord()_StringStripChars()_Temperature()_TrialPeriod()_UKToUSDate()/_USToUKDate()_WinAPI_Create_CTL_CODE()_WinAPI_CreateGUID()_WMIDateStringToDate()/_DateToWMIDateString()Au3 script parsingAutoIt SearchAutoIt3 PortableAutoIt3WrapperToPragmaAutoItWinGetTitle()/AutoItWinSetTitle()CodingDirToHTML5FileInstallrFileReadLastChars()GeoIP databaseGUI - Only Close ButtonGUI 
ExamplesGUICtrlDeleteImage()GUICtrlGetBkColor()GUICtrlGetStyle()GUIEventsGUIGetBkColor()Int_Parse() & Int_TryParse()IsISBN()LockFile()Mapping CtrlIDsOOP in AutoItParseHeadersToSciTE()PasswordValidPasteBinPosts Per DayPreExpandProtect GlobalsQueue()Resource UpdateResourcesExSciTE JumpSettings INISHELLHOOKShunting-YardSignature CreatorStack()Stopwatch()StringAddLF()/StringStripLF()StringEOLToCRLF()VSCROLLWM_COPYDATAMore Examples...

Updated: 04/09/2015


#3 ·  Posted (edited)

What about this instead?

#cs
    (?m)        Multiline mode: ^ and $ also match at line breaks within the data.
    ^           Start matching at the beginning of a line.
    [ \r\n]*    Match any run of spaces, @CR and @LF at the start of a line.
#ce
Local $sData = ''
For $i = 1 To 10
    If Random(0, 1, 1) Then
        If Random(0, 1, 1) Then
            $sData &= '                                    ' & @CRLF
        Else
            $sData &= @CR
        EndIf
    Else
        $sData &= 'Random' & @CRLF
    EndIf
Next
ConsoleWrite('Before: ' & @CRLF & $sData & @CRLF)
ConsoleWrite('After: ' & @CRLF & StringRegExpReplace($sData, '(?m)^[ \r\n]*', ''))

OR by the SRE master >>

ConsoleWrite(_RemoveBlankLines(FileRead(@ScriptFullPath)))

Func _RemoveBlankLines($sString) ; By Robjong.
    Return StringRegExpReplace($sString, '(?m:^\s*\r?\n)|\s*\z', '')
EndFunc   ;==>_RemoveBlankLines
Edited by guinness


guinness,

Thanks for the prompt reply. I'm starting to get it (with many thanks to Rob!!). I keep tripping up on stupid shit; for example, now I'm wondering exactly what the PCRE doc means by the phrase "new line or string".

Gotta keep this short...am using a keyboard that is a little like timing a race with a sun-dial!

kylomas



guinness

OK, I will re-think about adding it.


AZJIO,

Why does this not remove all EOL chars resulting in one long string?

'(\r\n|\r|\n){2,}'

kylomas



kylomas,

{2,} matches two or more repetitions of the group, so it cannot delete a single instance.


#10 ·  Posted (edited)

AZJIO,

When I run this

#include <array.au3>

local $str1 = @crlf & '1' & @crlf & '2' & @crlf & '3' & @crlf & '4' & @crlf & '5' & @crlf & @crlf
local $str2 = stringregexpreplace($str1,'(\r\n|\r|\n){2,}',"")
consolewrite($str1 & $str2 & @lf)
I get one continuous string...am I doing something wrong?

kylomas

edit: when I change the repetitions to 3 it seems to work...can you explain why?

Edited by kylomas


#11 ·  Posted (edited)

@CRLF equals \r\n
Match \r\n against your group, first alternative \r\n: a hit!
Next character (whatever follows the @CRLF): does the group repeat?
No, so the pattern doesn't match yet.
Back to the start of the string (backtracking).
Match \r\n against your group, second alternative \r: a hit!
Next character \n: does the group repeat?
Match \n against your group, third alternative \n: a hit!
Two hits satisfy the quantifier {2,}
...
Next character (whatever follows the @CRLF): does the group repeat?
No, doesn't satisfy the quantifier {3,}
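
The trace above can be checked with a few one-liners. This is only a sketch on a made-up subject with a single @CRLF between two letters; StringRegExp with its default flag returns 1 for a match and 0 for no match. The third line uses \R, which PCRE treats as an atomic group, so the engine cannot go back and split the @CRLF into \r plus \n.

```autoit
Local $sLine = 'a' & @CRLF & 'b' ; assumed sample: exactly one CRLF

; One CRLF can be counted as two repetitions (\r, then \n), so {2,} matches.
ConsoleWrite('{2,} : ' & StringRegExp($sLine, '(\r\n|\r|\n){2,}') & @LF) ; 1

; One CRLF can never give three repetitions, so {3,} does not match.
ConsoleWrite('{3,} : ' & StringRegExp($sLine, '(\r\n|\r|\n){3,}') & @LF) ; 0

; \R takes the CRLF as one unit and cannot be split by backtracking.
ConsoleWrite('\R{2,}: ' & StringRegExp($sLine, '\R{2,}') & @LF) ; 0
```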

Edited by dany

[center]Spiderskank Spiderskank[/center]GetOpt Parse command line options UDF | AU3Text Program internationalization UDF | Identicon visual hash UDF


#12 ·  Posted (edited)

@kylomas

Yes, you missed the replacement. This method, as shown by AZJIO in post #5, replaces multiple instances of the line ending with a single line ending.

#include <array.au3>

local $str1 = @crlf & '1' & @crlf & '2' & @crlf & '3' & @crlf & '4' & @crlf & '5' & @crlf & @crlf
local $str2 = stringregexpreplace($str1,'(\r\n|\r|\n){2,}',"\1")
consolewrite($str1 & $str2 & @lf)
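
One detail worth checking for yourself: a repeated capturing group keeps only its last repetition, so \1 is whatever the group matched last. The sketch below uses a made-up subject and swaps the invisible characters for visible tags; it suggests that a lone @CRLF (matched as \r, then \n) comes back as a bare @LF, while a real blank line collapses to one @CRLF.

```autoit
Local $sIn = 'a' & @CRLF & 'b' & @CRLF & @CRLF & 'c' ; assumed sample subject

Local $sOut = StringRegExpReplace($sIn, '(\r\n|\r|\n){2,}', '\1')

; Make the line-ending characters visible before printing.
$sOut = StringReplace(StringReplace($sOut, @CR, '<CR>'), @LF, '<LF>')
ConsoleWrite($sOut & @LF) ; a<LF>b<CR><LF>c
```

On a Windows console the difference is easy to miss, because a bare @LF usually still shows up as a line break.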
Edited by Bowmore

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to build bigger and better idiots. So far, the universe is winning."- Rick Cook


@Bowmore,

Duh!! Thanks

kylomas



Well this was an interesting discussion that caught the attention of many.



@all - thank you for your patience and Happy Thanksgiving to our U.S. contingent.

(automatically signed off due to stupid quota exceeded)



@all - thank you for your patience and Happy Thanksgiving to our U.S. contingent.

(automatically signed off due to stupid quota exceeded)

Quota exceeded? Not from the Forum I hope.


#17 ·  Posted (edited)

Hi,

I'm late to the party I see, but here is some additional information.

The circumflex/caret has 3 functions in a regular expression, depending on the mode and position in the pattern.

(1) By default it matches the beginning of the subject, same as \A. ($ matches at the end of the subject or just before a newline that ends it, same as \Z.)

(2) If multiline mode is enabled, by setting the m flag, it matches the beginning of the subject and the start of a new line ($ will match the end of subject and end of line).

(3) If it is used in a character class it only has a special meaning if it is the first character in the class. It will then negate the class, i.e. the class will match any character not in it.

If you want to match a literal circumflex in a pattern, it must be escaped with a backslash (\^). In a character class it only needs to be escaped if it is the first character, because it has no special meaning at any other position.

It is also a zero-width expression, meaning it will match the position of a character (start of subject or newline) but not the character itself, i.e. it will not include/return the character(s) in the match.
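
Here is a short sketch that puts the three roles and the escaped form side by side. The sample strings are made up; StringRegExp with its default flag returns 1 for a match and 0 for no match.

```autoit
Local $sSubject = 'one' & @LF & 'two' ; assumed two-line sample, LF line ending

; (1) Default mode: ^ only matches at the very start of the subject.
ConsoleWrite(StringRegExp($sSubject, '^two') & @LF) ; 0

; (2) Multiline mode: ^ also matches right after a newline.
ConsoleWrite(StringRegExp($sSubject, '(?m)^two') & @LF) ; 1

; (3) First character in a class: negation, here "anything that is not a digit".
ConsoleWrite(StringRegExpReplace('a1b2c3', '[^0-9]', '') & @LF) ; 123

; Escaped: a literal circumflex.
ConsoleWrite(StringRegExpReplace('2^10', '\^', ' to the power of ') & @LF) ; 2 to the power of 10
```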

; normal mode, beginning of subject anchor
#cs - Match a string of only word characters (word characters are A-Z, a-z, 0-9 and _ (underscore). Equivalent to class [A-Za-z0-9_])
    ^      the start of the subject
    \w+    one or more word characters
    $      the end of the subject
#ce
If StringRegExp("ABCD", '^\w+$') Then
    ConsoleWrite("The string consists only of ""word"" characters." & @LF)
Else
    ConsoleWrite("The string does NOT consist of only ""word"" characters." & @LF)
EndIf

; multiline mode, beginning of subject or newline
#cs - Match full line comments
    (?m)                 m flag, enables multiline mode
    ^                    the beginning of the subject or the beginning of a new line
    \h*;                 0 or more horizontal whitespaces followed by a semicolon
    (?:                  open non-capturing group
        [[:punct:]]\h*   non-alphanumeric printing character followed by 0 or more horizontal whitespaces
    )?                   close non-capturing group, match group 0 or 1 times (? makes the group optional)
    (                    open capturing group
        [^\r\n]*         match any character except for \r (CR) or \n (LF) 0 or more times (^ negates the class)
    )                    close capturing group
#ce
$aMatches = StringRegExp(FileRead(@ScriptFullPath), '(?m)^\h*;(?:[[:punct:]]\h*)?([^\r\n]*)', 3)
For $i = 0 To UBound($aMatches) - 1 Step 1
    ConsoleWrite("COMMENT: " & $aMatches[$i] & @LF)
Next

To answer your question more directly...

The beginning of a subject/string is always at position 0 (before the first character).

The beginning of a line is either the beginning of the subject or directly after a newline (CRLF/CR/LF).

Example:

#include <Array.au3>

; subject
$sString = "This is an example subject." & @LF & "Made up of two lines."

; ^ singleline mode, match the first 4 characters of the subject
#cs
    ^       start of the subject
    ....    followed by 4 characters
#ce
$aMatches = StringRegExp($sString, "^....", 3)
_ArrayDisplay($aMatches, "Singleline")

; ^ multiline mode, match the first 4 characters of a line
#cs
    (?m)    m flag, enables multiline mode
    ^       start of the subject or line
    ....    followed by 4 characters
#ce
$aMatches = StringRegExp($sString, "(?m)^....", 3)
_ArrayDisplay($aMatches, "Multiline")

; singleline mode reproducer, match the first 4 characters of a subject
#cs
    (....)      capture 4 characters that are not newline characters
    [\s\S]*     match any character (space and non-space) 0 or more times
#ce
$aMatches = StringRegExp($sString, "(....)[\s\S]*", 3)
_ArrayDisplay($aMatches, "Singleline Reproducer")

; multiline mode reproducer, match the first 4 characters of a line
#cs
    (?:                  open non-capturing group
        \A|\r\n|\r|\n    match the start of the subject or newline characters
    )                    close non-capturing group
    (....)               capture 4 characters that are not newline characters
#ce
$aMatches = StringRegExp($sString, "(?:\A|\r\n|\r|\n)(....)", 3)
_ArrayDisplay($aMatches, "Multiline Reproducer")

; advanced multiline mode reproducer, match the first 4 characters of a line
#cs
    (?<=                 open positive lookbehind
        \A|\r\n|\r|\n    match the start of the subject or newline characters
    )                    close positive lookbehind
    ....                 match (and capture as global match) 4 characters that are not newline characters
#ce
$aMatches = StringRegExp($sString, "(?<=\A|\r\n|\r|\n)....", 3)
_ArrayDisplay($aMatches, "Advanced Multiline Reproducer")

Edit: Fixed spacing of comments

Edit2: added "word" character description

Edit3: added more direct answer

Edited by Robjong

Robjong,

Thanks, this

To answer your question more directly...

The beginning of a subject/string is always at position 0 (before the first character).

The beginning of a line is either the beginning of the subject or directly after a newline (CRLF/CR/LF).

adds more clarity. I may get good at this yet, at least it is no longer voodoo, just Klingon.

kylomas



#19 ·  Posted (edited)

Let me resurrect this, even later than late, for a while, as there is something incorrect here.

In the example given:

local $str1 = @crlf & '1' & @crlf & '2' & @crlf & '3' & @crlf & '4' & @crlf & '5' & @crlf & @crlf
local $str2 = stringregexpreplace($str1,'(\r\n|\r|\n){2,}',"\1")
consolewrite($str1 & $str2 & @lf)

The issue I see is that the very first empty line is not removed (there is only one CRLF on its own).

Instead, this works to remove all empty lines:

local $str1 = @crlf & '1' & @crlf & '2' & @crlf & '3' & @crlf & '4' & @crlf & '5' & @crlf & @crlf
local $str2 = StringRegExpReplace($str1,'(*BSR_ANYCRLF)(^\R|\R(?=\R))',"")
consolewrite($str1 & @LF & '--------------' & @LF & $str2 & '------------' & @lf)

Here I've chosen to leave the last line '5' with its @CRLF (which is technically wrong!). If this is not what's wanted, then one can use:

local $str1 = @crlf & '1' & @crlf & @crlf & '2' & @crlf & @CRLF & @crlf & '3' & @crlf & '4' & @crlf & '5' & @crlf & @crlf
local $str2 = StringRegExpReplace($str1,'(*BSR_ANYCRLF)(^\R|\R(?=\R)|\R\z)', "")
ConsoleWrite($str1 & '--------------' & @LF & $str2 & '--------------' & @lf)

As a side note, it's worth remembering that ^ and $ (as well as \A, \Z and \z) are all assertions that "match" at specific positions. These sequences can never be used to capture anything.

\R is different from this point of view in that it actually matches newline sequences (depending on the *BSR_xxx option) and that you can capture \R, as the example shows. Final note: by default, \R matches any Unicode newline character or char sequence.
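
The difference between an assertion and \R can be seen in a small sketch like this one. The subject string is made up, and an LF line ending is assumed to keep the counts simple.

```autoit
#include <Array.au3>

Local $s = 'one' & @LF & 'two' ; assumed sample subject

; ^ only asserts a position: each match is just the first character of a line,
; the newline before it is never part of the match.
Local $aFirst = StringRegExp($s, '(?m)^.', 3)
ConsoleWrite(_ArrayToString($aFirst, '|') & @LF) ; o|t

; \R consumes the newline, so it can be captured like any other text.
Local $aNl = StringRegExp($s, '(\R)(.)', 1)
ConsoleWrite(StringLen($aNl[0]) & ' ' & $aNl[1] & @LF) ; 1 t
```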

Edited by jchd

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


#20 ·  Posted (edited)

Hi all, another quick way $iFlag = 2 or $iFlag = 4

or

$str = "ect ect ect"
$str = StringTrimLeft(StringRegExpReplace(@LF & StringRegExpReplace($str, '\r(?!\n)', @CRLF), '\n\s*(?=\n)', ""), 1)

About speed: RegExp does not like too many "|" (Or: the expression on one side or the other can be matched).

(\r\n|\r|\n)
;or
(*BSR_ANYCRLF)(^\R|\R(?=\R)|\R\z)
;etc. etc. etc.
;etc. etc. etc.

I mean: give the engine a fixed reference point to work from (it always needs a reference point to go fast), here precisely the \n:

; \n, not (\r\n|\n|\r), etc.
$str = StringRegExpReplace(@LF & StringRegExpReplace($str, '\r(?!\n)', @CRLF), '\n(\r?\n)+', "")
or
$str = StringRegExpReplace(@LF & StringRegExpReplace($str, '\r(?!\n)', @CRLF), '\n\s*(?=\n)', "")

almost 3 times faster than

local $str2 = StringRegExpReplace($str1,'(*BSR_ANYCRLF)(^\R|\R(?=\R))',"")
;or
local $str2 = stringregexpreplace($str1,'(\r\n|\r|\n){2,}',"\1")
;etc. etc. etc.
;etc. etc. etc.
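
A speed claim like this is easy to test on your own data. What follows is only a rough timing sketch: the test data, the run count and the variable names are assumptions, and the numbers will vary with machine, AutoIt version and input. The two variants also do not treat every edge case the same way, so compare the outputs as well as the times.

```autoit
; Hypothetical test data: 20000 text lines, with a blank line after every second one.
Local $sData = ''
For $i = 1 To 20000
    $sData &= 'line ' & $i & @CRLF
    If Mod($i, 2) = 0 Then $sData &= @CRLF
Next

Local Const $iRuns = 20 ; assumed repeat count, raise it for steadier numbers
Local $hTimer, $sOut

; Variant 1: turn lone CR into CRLF, then anchor the pattern on a literal \n.
$hTimer = TimerInit()
For $i = 1 To $iRuns
    $sOut = StringTrimLeft(StringRegExpReplace(@LF & StringRegExpReplace($sData, '\r(?!\n)', @CRLF), '\n\s*(?=\n)', ''), 1)
Next
ConsoleWrite('anchored on \n : ' & Round(TimerDiff($hTimer)) & ' ms' & @LF)

; Variant 2: the \R-based alternation from post #19.
$hTimer = TimerInit()
For $i = 1 To $iRuns
    $sOut = StringRegExpReplace($sData, '(*BSR_ANYCRLF)(^\R|\R(?=\R))', '')
Next
ConsoleWrite('\R alternation : ' & Round(TimerDiff($hTimer)) & ' ms' & @LF)
```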

Ciao.

Edited by DXRW4E
