develogy

Assistance with removing extra line feeds

I have created a function that removes a block of text from a file, based on start and end delimiters. It works, but it leaves extra line feeds between the two sections that are rejoined.

Here is what I have so far:

#include <Date.au3>
#include <Array.au3>
#include <File.au3>
#include <FileConstants.au3>
#include <MsgBoxConstants.au3>
#include <WinAPIFiles.au3>
#include <StringConstants.au3>

$file = @ScriptDir & "\test.txt"
RemoveTxtBlock($file, "#<<Start>>", "#<<End>>")

Func RemoveTxtBlock($file, $sectionStart, $sectionEnd)
    Local $hFileOpen = FileOpen($file, $FO_READ)
    If $hFileOpen = -1 Then
        MsgBox($MB_SYSTEMMODAL, "", "An error occurred when reading the file.")
        Return False
    EndIf
    Local $currentContent = FileRead($hFileOpen)
    ; Close the handle returned by FileOpen.
    FileClose($hFileOpen)

    If Not StringInStr($currentContent, $sectionStart) Then
        ; Start delimiter not found, nothing to do.
        Return True
    EndIf

    ; Text before the start delimiter.
    Local $PreContent = StringSplit($currentContent, $sectionStart, $STR_NOCOUNT + $STR_ENTIRESPLIT)
    ;_ArrayDisplay($PreContent, "PreContent", Default, 8)
    Local $contentStart = $PreContent[0]
    ; Text after the end delimiter.
    Local $PostContent = StringSplit($currentContent, $sectionEnd, $STR_NOCOUNT + $STR_ENTIRESPLIT)
    ;_ArrayDisplay($PostContent, "PostContent", Default, 8)
    If UBound($PostContent) < 2 Then Return False ; End delimiter not found.
    Local $contentEnd = $PostContent[1]

    ; Re-open the file in overwrite mode.
    $hFileOpen = FileOpen($file, $FO_OVERWRITE)
    If $hFileOpen = -1 Then
        MsgBox($MB_SYSTEMMODAL, "", "An error occurred when writing the file.")
        Return False
    EndIf
    FileWrite($hFileOpen, $contentStart & @CRLF & $contentEnd)
    FileClose($hFileOpen)
    Return True
EndFunc

Any ideas on how to get rid of the extra line breaks?

Also, if any of you "masters" have ideas for making this more efficient, I am definitely interested.

Although regular expressions (REs) are much maligned by novices, they are a useful tool for manipulating text.

Local $file = "test.txt"
Local $sStart = "#<<Start>>"
Local $sEnd = "#<<End>>"

Local $sModifiedText = StringRegExpReplace(FileRead($file), "(?is)(\s*" & $sStart & ".*?" & $sEnd & ")", "")
; The meaning of the RE pattern, from left to right:
; "(?is)" makes the rest of the pattern case-insensitive ("i") and lets the dot, ".", match every character, including newlines ("s").
; "(" opens a capture group. The group is not needed here; the pattern works without the brackets.
; "\s*" matches any whitespace (spaces, tabs, and newlines) in front of the start delimiter. The "*" is a quantifier meaning zero or more of the preceding item.
; $sStart & ".*?" & $sEnd matches everything between, and including, the two delimiters; ")" closes the group.
;           The question mark makes the preceding quantifier, "*", non-greedy, so the match stops at the first occurrence of $sEnd. Without it, "*" is greedy by default and the match runs to the last occurrence of $sEnd.
;           So if the text held two "#<<Start>> text #<<End>>" blocks, the greedy ".*" would match from the first "#<<Start>>" through the last "#<<End>>".
; Everything matched is replaced with "", nothing.
; For further reading, see the StringRegExp function in the AutoIt help.

FileDelete($file)
FileWrite($file, $sModifiedText) ; If $sStart and $sEnd are not in the file, the file does not change.
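
To see the greedy versus non-greedy difference described in the comments, here is a small sketch on a made-up test string ($sText and its contents are purely illustrative, not taken from the original file):

```autoit
; Hypothetical test string holding two delimited blocks.
Local $sText = "a #<<Start>> x #<<End>> b #<<Start>> y #<<End>> c"

; Non-greedy ".*?" stops at the first "#<<End>>", so each block is removed separately.
Local $sLazy = StringRegExpReplace($sText, "(?s)\s*#<<Start>>.*?#<<End>>", "")

; Greedy ".*" runs from the first "#<<Start>>" to the last "#<<End>>".
Local $sGreedy = StringRegExpReplace($sText, "(?s)\s*#<<Start>>.*#<<End>>", "")

ConsoleWrite($sLazy & @CRLF)   ; a b c
ConsoleWrite($sGreedy & @CRLF) ; a c
```

With the greedy form, the text between the two blocks (" b" here) is lost, which is usually not what you want when a file can contain more than one block.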

 

22 hours ago, Anoop said:

Is @CRLF causing the issue here? Please refer to https://www.autoitscript.com/autoit3/docs/macros.htm#%40CRLF

 

No, I tried removing that, and it still leaves 2 blank lines, presumably because the delimiters are on lines by themselves and are each likely followed by a line feed.

6 hours ago, Malkey said:

Although regular expressions (REs) are much maligned by novices, they are a useful tool for manipulating text.

I know next to nothing about REs, but I like the idea. Could this be extended to include whitespace (i.e., line feeds) as part of the delimiter, if it exists?

2 hours ago, develogy said:

..... Could this be extended to include whitespace (i.e., line feeds) as part of the delimiter, if it exists?

Yes.

I put "\s*" in the RE pattern. It could be placed at the start delimiter or at the end delimiter. You could also look at "\h" (horizontal whitespace), "\v" (vertical whitespace), or "\R" (any newline sequence).
Also, the replace parameter of the StringRegExpReplace function could be "" (nothing), " " (a space), or @CRLF, depending on how you want the finished text to look.
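
As a rough sketch of that idea, the pattern below consumes trailing blanks and one line break after the end delimiter instead of the whitespace before the start delimiter. The sample text in $sText is made up for illustration, and wrapping the delimiters in \Q...\E is an extra assumption on my part, so that any regex metacharacters in them are taken literally:

```autoit
; Hypothetical sample text: a delimited block sitting between two ordinary lines.
Local $sText = "line1" & @CRLF & "#<<Start>>" & @CRLF & "block" & @CRLF & "#<<End>>" & @CRLF & "line2"
Local $sStart = "#<<Start>>"
Local $sEnd = "#<<End>>"

; "\Q...\E" treats the delimiter text literally.
; "\h*" matches any blanks after the end delimiter, and "\R?" matches at most one line break.
Local $sPattern = "(?s)\Q" & $sStart & "\E.*?\Q" & $sEnd & "\E\h*\R?"

Local $sResult = StringRegExpReplace($sText, $sPattern, "")
ConsoleWrite($sResult & @CRLF) ; "line1" and "line2" on consecutive lines, no blank line between
```

Because only one line break is consumed, any blank lines that were deliberately left around the block stay in place. Note that the \Q...\E quoting breaks down if a delimiter itself contains "\E".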

Share this post


Link to post
Share on other sites

If your delimiters always sit on their own lines with a CRLF after them, replacing your FileWrite line with this will remove the extra line break, although I am sure there are much better ways to accomplish this. A CRLF is two characters, so two characters are trimmed from the end of $contentStart.

FileWrite($hFileOpen, StringLeft($contentStart, StringLen($contentStart) - 2) & $contentEnd)

 


Malkey,

Your RegEx works perfectly, though I have no idea how it works. I am going to study it to try to understand it ;)

Thanks
