
Assistance with removing extra line feeds



I have created a function to remove a block of text from a file based on start and end delimiters. It works; however, it leaves extra line feeds between the two remaining sections when they are rejoined.

Here is what I have so far:

#include <Date.au3>
#include <Array.au3>
#include <File.au3>
#include <FileConstants.au3>
#include <MsgBoxConstants.au3>
#include <WinAPIFiles.au3>
#include <StringConstants.au3>

$file = @ScriptDir & "\test.txt"
RemoveTxtBlock($file, "#<<Start>>", "#<<End>>")

Func RemoveTxtBlock($file, $sectionStart, $sectionEnd)
    Local $hFileOpen = FileOpen($file, $FO_READ)
    If $hFileOpen = -1 Then
        MsgBox($MB_SYSTEMMODAL, "", "An error occurred when reading/writing the file.")
        Return False
    Else
        $currentContent = FileRead($hFileOpen)
        If StringInStr($currentContent, $sectionStart) Then
            $PreContent = StringSplit ( $currentContent, $sectionStart , $STR_NOCOUNT + $STR_ENTIRESPLIT)
                ;_ArrayDisplay($PreContent,"PreContent", Default, 8)
            $contentStart = $PreContent[0]
            $PostContent = StringSplit ( $currentContent, $sectionEnd , $STR_NOCOUNT + $STR_ENTIRESPLIT)
                ;_ArrayDisplay($PostContent,"PostContent", Default, 8)
            $contentEnd = $PostContent[1]
            ; Close the handle returned by FileOpen.
            FileClose($hFileOpen)
            ; Re-open the file in overwrite mode.
            $hFileOpen = FileOpen($file, $FO_OVERWRITE)
            If $hFileOpen = -1 Then
                MsgBox($MB_SYSTEMMODAL, "", "An error occurred when writing the file.")
                Return False
            Else
                FileWrite($hFileOpen, $contentStart & @CRLF & $contentEnd)
                FileClose($hFileOpen)
                Return True
            EndIf
        Else
            ; Start delimiter not found, nothing to do.
            FileClose($hFileOpen) ; Close the read handle before returning.
            Return True
        EndIf
    EndIf
EndFunc

Any ideas on how to get rid of the extra line breaks?

Also, if any of you "masters" have ideas for making this more efficient, I am definitely interested.


Although regular expressions (REs) are much maligned by novices, they are a useful tool for manipulating text.

Local $file = "test.txt"
Local $sStart = "#<<Start>>"
Local $sEnd = "#<<End>>"

Local $sModifiedText = StringRegExpReplace(FileRead($file), "(?is)(\s*" & $sStart & ".*?" & $sEnd & ")", "")
; The meaning of the RE pattern, from left to right:
; "(?is)" makes the rest of the RE pattern case-insensitive (i) and lets the dot, ".", match every character in the test string, including newline characters (s).
; "(" is the start of the capture group. (In this case the capture group, "(...)", is not necessary; the pattern works without the brackets.)
; "\s*" matches any whitespace (spaces, tabs, and newlines) if present. The "*" is a quantifier meaning zero or more of the preceding element.
; $sStart & ".*?" & $sEnd & ")" matches and captures everything between, and including, the delimiters held in $sStart and $sEnd, then closes the group with ")".
;           The question mark makes the preceding quantifier (or repetition specifier), "*", non-greedy, so the match stops at the first occurrence of $sEnd. Without the question mark, "*" is greedy by default and the match would run to the last occurrence of $sEnd.
;           So if there were two "#<<Start>> text #<<End>>" blocks in the test string, the greedy ".*" would match from the first "#<<Start>>" through every character up to the last "#<<End>>".
; Everything matched is replaced with "", nothing.
; For further reading, see the StringRegExp function in the AutoIt help.

FileDelete($file)
FileWrite($file, $sModifiedText) ; If $sStart and $sEnd are not in the file, the file does not change.
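
The same idea could be wrapped in a function, roughly as sketched below. This is only a sketch under some assumptions: the name _RemoveTxtBlockRE is made up for illustration, the delimiters are wrapped in \Q...\E so that any regex metacharacters in them are matched literally (which assumes a delimiter never contains "\E" itself), and the file is rewritten through FileOpen with $FO_OVERWRITE instead of FileDelete.

```autoit
#include <FileConstants.au3>

; Hypothetical helper (not from the posts above): removes every block running from
; $sStart to $sEnd, together with any whitespace directly in front of the block.
; Returns True on success (including "nothing to remove"), False on a file or pattern error.
Func _RemoveTxtBlockRE($sFile, $sStart, $sEnd)
    Local $hFile = FileOpen($sFile, $FO_READ)
    If $hFile = -1 Then Return False
    Local $sText = FileRead($hFile)
    FileClose($hFile)

    ; \Q...\E quotes the delimiters so characters such as "." or "(" are taken literally.
    Local $sPattern = "(?is)\s*\Q" & $sStart & "\E.*?\Q" & $sEnd & "\E"
    Local $sModified = StringRegExpReplace($sText, $sPattern, "")
    ; Capture both macros immediately; @extended holds the number of replacements made.
    Local $iError = @error, $iCount = @extended
    If $iError Then Return False ; the pattern was rejected
    If $iCount = 0 Then Return True ; no block found, leave the file untouched

    $hFile = FileOpen($sFile, $FO_OVERWRITE)
    If $hFile = -1 Then Return False
    Local $bOk = (FileWrite($hFile, $sModified) = 1)
    FileClose($hFile)
    Return $bOk
EndFunc
```

It would be called the same way as the original function, for example _RemoveTxtBlockRE(@ScriptDir & "\test.txt", "#<<Start>>", "#<<End>>"). Skipping the write when nothing matched keeps the file's timestamp unchanged in that case.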

 


6 hours ago, Malkey said:

Although regular expressions (REs) are much maligned by novices, they are a useful tool for manipulating text.


I know next to nothing about REs, but I like the idea. Could this be extended to include whitespace (i.e., line feeds) as part of the delimiter, if it exists?


2 hours ago, develogy said:

..... Could this be extended to include whitespace (i.e., line feeds) as part of the delimiter, if it exists?

Yes.

I put "\s*" in the RE pattern. It can be placed at the start delimiter or at the end delimiter. You could also look at "\h" (horizontal whitespace), "\v" (vertical whitespace), or "\R" (any newline sequence).
Also, the replace parameter of the StringRegExpReplace function could be "" (nothing), " " (a space), or @CRLF, depending on how you want the finished text to look.
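
To make those options concrete, here is a small sketch that runs three variants against a string held in memory rather than a file. The sample text and the variable names are assumptions for illustration only; the delimiters are the ones used in this thread, quoted with \Q...\E so they are matched literally.

```autoit
; Sample text: two lines separated by a delimited block with blank lines around it.
Local $sText = "Line 1" & @CRLF & @CRLF & "#<<Start>>" & @CRLF & "block" & @CRLF & "#<<End>>" & @CRLF & @CRLF & "Line 2"
Local $sBlock = "\Q#<<Start>>\E.*?\Q#<<End>>\E"

; 1) Whitespace in front of the block is treated as part of the start delimiter.
;    The whitespace after "#<<End>>" stays, so one empty line remains between the two lines.
Local $sA = StringRegExpReplace($sText, "(?is)\s*" & $sBlock, "")

; 2) Whitespace on both sides is removed and replaced with a single line break,
;    so "Line 1" and "Line 2" end up on consecutive lines.
Local $sB = StringRegExpReplace($sText, "(?is)\s*" & $sBlock & "\s*", @CRLF)

; 3) Only the block and the one newline sequence that ends the "#<<End>>" line are removed;
;    "\R?" matches at most one newline (CRLF counts as one), so the other blank lines remain.
Local $sC = StringRegExpReplace($sText, "(?is)" & $sBlock & "\R?", "")

; Brackets make leading and trailing whitespace visible in the console output.
ConsoleWrite("[" & $sA & "]" & @CRLF & "[" & $sB & "]" & @CRLF & "[" & $sC & "]" & @CRLF)
```

Which variant fits depends on how the delimiters sit in the real file, so it is worth trying them on a copy first.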
