develogy Posted October 18, 2016

I have created a function to remove a block of text from a file based on start and ending delimiters. It works, however it leaves extra line feeds in between the two sections that are then rejoined together. Here is what I have so far:

    #include <Date.au3>
    #include <Array.au3>
    #include <File.au3>
    #include <FileConstants.au3>
    #include <MsgBoxConstants.au3>
    #include <WinAPIFiles.au3>
    #include <StringConstants.au3>

    $file = @ScriptDir & "\test.txt"
    RemoveTxtBlock($file, "#<<Start>>", "#<<End>>")

    Func RemoveTxtBlock($file, $sectionStart, $sectionEnd)
        Local $hFileOpen = FileOpen($file, $FO_READ)
        If $hFileOpen = -1 Then
            MsgBox($MB_SYSTEMMODAL, "", "An error occurred when reading/writing the file.")
            Return False
        Else
            $currentContent = FileRead($hFileOpen)
            If StringInStr($currentContent, $sectionStart) Then
                $PreContent = StringSplit ( $currentContent, $sectionStart , $STR_NOCOUNT + $STR_ENTIRESPLIT)
                ;_ArrayDisplay($PreContent,"PreContent", Default, 8)
                $contentStart = $PreContent[0]
                $PostContent = StringSplit ( $currentContent, $sectionEnd , $STR_NOCOUNT + $STR_ENTIRESPLIT)
                ;_ArrayDisplay($PostContent,"PostContent", Default, 8)
                $contentEnd = $PostContent[1]
                ; Close the handle returned by FileOpen.
                FileClose($hFileOpen)
                ; Re-Open File in Overwrite Mode
                Local $hFileOpen = FileOpen($file, $FO_READ + $FO_OVERWRITE)
                If $hFileOpen = -1 Then
                    MsgBox($MB_SYSTEMMODAL, "", "An error occurred when writing the file.")
                    Return False
                Else
                    FileWrite($hFileOpen, $contentStart & @CRLF & $contentEnd)
                    FileClose($hFileOpen)
                EndIf
            Else
                ;; Start Delimiter Not Found, Nothing ToDo
                Return True
            EndIf
        EndIf
    EndFunc

Any ideas, how to get rid of the extra line breaks? Also if any of you "masters" have ideas for making this more efficient, I am definitely interested.

Anoop Posted October 19, 2016

7 hours ago, develogy said:
    FileWrite($hFileOpen, $contentStart & @CRLF & $contentEnd)

Is @CRLF causing the issue here? Please refer to https://www.autoitscript.com/autoit3/docs/macros.htm#%40CRLF

Malkey Posted October 19, 2016

Although Regular Expressions (RE) are much maligned by the novice to RE's, they are a useful tool for manipulating text.

    Local $file = "test.txt"
    Local $sStart = "#<<Start>>"
    Local $sEnd = "#<<End>>"

    Local $sModifiedText = StringRegExpReplace(FileRead($file), "(?is)(\s*" & $sStart & ".*?" & $sEnd & ")", "")
    ; The meaning of the RE pattern from left to right.
    ; "(?is)" means the rest of the RE pattern is case insensitive and the dot, ".", will match all characters in the test string, including newline characters.
    ; "(" is the start of the capture group. (In this case, explicitly using a capture group, "(...)", is not necessary; the pattern works without the brackets.)
    ; "\s*" will match all spaces and newlines (white space) if they are present. The "*" is a quantifier and stands for zero or more occurrences of the previous element.
    ; " & $sStart & ".*?" & $sEnd & ")" means match and capture everything between, and including, the values in $sStart and $sEnd, and close the group, ")".
    ; The question mark makes the preceding quantifier (or repetition specifier), "*", non-greedy, so that the first occurrence of the value in $sEnd will be matched.
    ; Without the question mark, "*" is greedy by default and the last occurrence of the value in $sEnd would be matched.
    ; So if there were two "#<<Start>> text #<<End>>" blocks in the test string, the greedy ".*" would match the first "#<<Start>>" and every character up to the last "#<<End>>".
    ; Everything matched within the brackets is replaced with "", nothing.
    ; For further reading see the StringRegExp function in the AutoIt help.

    FileDelete($file)
    FileWrite($file, $sModifiedText)
    ; If $sStart and $sEnd are not in the file, the file does not change.
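
The greedy versus non-greedy behaviour described in those comments can be checked without touching a file. The following is only a minimal sketch: the sample text and the names $sText, $sLazy, and $sGreedy are made up for illustration, and the pattern is the one from the post with the optional capture group left out.

```autoit
; Hypothetical sample text containing two delimited blocks.
Local $sText = "keep1" & @CRLF & "#<<Start>>" & @CRLF & "drop1" & @CRLF & "#<<End>>" & @CRLF & _
        "keep2" & @CRLF & "#<<Start>>" & @CRLF & "drop2" & @CRLF & "#<<End>>" & @CRLF & "keep3"

Local $sStart = "#<<Start>>"
Local $sEnd = "#<<End>>"

; Non-greedy ".*?": each block is matched and removed on its own,
; so the "keep2" line between the two blocks should survive.
Local $sLazy = StringRegExpReplace($sText, "(?is)\s*" & $sStart & ".*?" & $sEnd, "")

; Greedy ".*": a single match runs from the first start delimiter
; to the last end delimiter, so "keep2" should be removed as well.
Local $sGreedy = StringRegExpReplace($sText, "(?is)\s*" & $sStart & ".*" & $sEnd, "")

ConsoleWrite("--- non-greedy ---" & @CRLF & $sLazy & @CRLF)
ConsoleWrite("--- greedy ---" & @CRLF & $sGreedy & @CRLF)
```

Comparing the two console sections shows why the question mark matters as soon as a file contains more than one delimited block.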

develogy (Author) Posted October 20, 2016

22 hours ago, Anoop said:
    Is @CRLF making the issue here? Please refer https://www.autoitscript.com/autoit3/docs/macros.htm#%40CRLF

No, I tried removing that, and it still leaves 2 blank lines, presumably because the delimiters are on lines by themselves and likely followed by a line feed.

develogy (Author) Posted October 20, 2016

6 hours ago, Malkey said:
    Although Regular Expressions (RE) are much maligned by the novice to RE's, they are a useful tool for manipulating text. .....

I know next to nothing about RE's, but like the idea. Could this be extended to include whitespace (i.e. line feeds) as part of the delimiter if they exist?

Malkey Posted October 20, 2016

2 hours ago, develogy said:
    ..... Could this be extended to include whitespace (i.e. line feeds) as part of the delimiter if they exist.

Yes. I put "\s*" in the RE pattern, in front of the start delimiter. It could be put in the start delimiter or the end delimiter. You could also look at "\h", "\v", or "\R". Also, the replace parameter of the StringRegExpReplace function could be "" (nothing), " " (a space), or @CRLF, depending on how you want the finished text to look.
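
As one possible reading of the "\R" suggestion, the sketch below removes the block together with the single line break that follows the end delimiter and leaves any blank lines in front of the block alone. It is an assumption about what the finished text should look like, not the pattern from the thread; the sample text and the names $sText, $sPattern, and $sResult are hypothetical. "\Q...\E" is used so the delimiters are treated as literal text.

```autoit
; Hypothetical sample text: one delimited block with each delimiter on its own line.
Local $sText = "keep1" & @CRLF & "#<<Start>>" & @CRLF & "drop" & @CRLF & "#<<End>>" & @CRLF & "keep2"

Local $sStart = "#<<Start>>"
Local $sEnd = "#<<End>>"

; (?s)       : let "." match newline characters too
; \Q ... \E  : treat the delimiter text literally
; .*?        : non-greedy, stop at the first end delimiter
; \R?        : optionally consume one line break after the end delimiter
Local $sPattern = "(?s)\Q" & $sStart & "\E.*?\Q" & $sEnd & "\E\R?"

Local $sResult = StringRegExpReplace($sText, $sPattern, "")

; Expected: "keep1" and "keep2" on consecutive lines with no blank line between them.
ConsoleWrite($sResult & @CRLF)
```

Swapping "\R?" for "\s*" would also swallow blank lines and indentation after the block, so which one fits depends on how the file is laid out.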

MuffinMan Posted October 20, 2016

If your #<<End>> delimiter always has a CRLF after it, replacing your FileWrite line with this will remove it, although I am sure there are much better ways to accomplish this.

    FileWrite($hFileOpen, StringLeft($contentStart,StringLen($contentstart)-1) & $contentEnd)
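
Another way to stay with the non-regex approach is to strip the white space around the join point with StringStripWS instead of counting characters. This is only a sketch under assumptions: the two strings stand in for the $contentStart and $contentEnd values from the original function, and $sJoined is a made-up name.

```autoit
#include <StringConstants.au3>

; Stand-ins for the text before the start delimiter and after the end delimiter.
; Both carry the line break that sat next to a delimiter on its own line.
Local $contentStart = "line before" & @CRLF
Local $contentEnd = @CRLF & "line after"

; Strip trailing white space from the first part and leading white space
; from the second part, then join them with exactly one line break.
Local $sJoined = StringStripWS($contentStart, $STR_STRIPTRAILING) & @CRLF & _
        StringStripWS($contentEnd, $STR_STRIPLEADING)

; Expected: "line before" and "line after" on consecutive lines.
ConsoleWrite($sJoined & @CRLF)
```

Note that StringStripWS removes spaces and tabs as well as line breaks, so any indentation on the first line after the block would be lost too.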

develogy (Author) Posted October 21, 2016

Malkey,

Your RegEx works perfectly, though I have no idea how it works. I am going to study it to try to understand it.

Thanks