AutoIt delete all lines from a *.txt file except some


You can do these things in this order, at a frequency befitting your project:

_FileReadToArray
_ArrayFindAll
_ArrayDelete
_FileWriteFromArray
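
A minimal sketch of those four steps, assuming the goal is to keep only the lines that contain one search string. The file name and search string are placeholders, and edge cases (for example, no line surviving the filter) are not handled here.

```autoit
#include <Array.au3>
#include <File.au3>

; Placeholder values for illustration only.
Local $sFile = "NewMailTest.txt"
Local $sKeep = "new mail"

; 1. Read the file into a 0-based array, one element per line.
Local $aLines
If Not _FileReadToArray($sFile, $aLines, $FRTA_NOCOUNT) Then Exit 1

; 2. Find the indices of the lines that do NOT contain $sKeep.
;    $iCompare = 3 treats the search value as a regular expression.
Local $aDrop = _ArrayFindAll($aLines, "^(?!.*\Q" & $sKeep & "\E)", 0, 0, 0, 3)

If IsArray($aDrop) Then
    ; 3. Delete those lines; _ArrayDelete accepts a "1;4;7" style range string.
    _ArrayDelete($aLines, _ArrayToString($aDrop, ";"))

    ; 4. Write the remaining lines back, overwriting the file.
    _FileWriteFromArray($sFile, $aLines)
EndIf
```

If `_ArrayFindAll` finds nothing to drop it does not return an array, so the file is left untouched in that case.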

 



This working example checks a specified .txt file every 30 secs. Any line that does not contain the search string is deleted, so the only lines remaining in the file all contain the "searched for" string.

HotKeySet("{ESC}", "Terminate") ; Press Esc key to exit script.

Global $sFileName = "NewMailTest.txt"
Global $sSearch4 = "new mail"
_FileDeleteLines()
AdlibRegister("_FileDeleteLines", 30 * 1000) ; Run the function, _FileDeleteLines(), every 30 secs

While Sleep(100)
WEnd


Func _FileDeleteLines()
    Local $sFileContents = FileRead($sFileName)
    Local $hFileOpen = FileOpen($sFileName, 2) ; $FO_OVERWRITE (2) = Write mode (erase previous contents)
    
    ; RE Replace pattern also used here: https://www.autoitscript.com/forum/topic/191336-split-csv/?do=findComment&comment=1372474
    FileWrite($hFileOpen, StringRegExpReplace($sFileContents, '(?m)^.*\Q' & $sSearch4 & '\E.*\R?(*SKIP)(?!)|^.*\R*', "")) ; Delete all lines that do not have the contents of $sSearch4 present.
    ; For an explanation of the RE pattern, see the StringRegExp() function in the AutoIt help file, or
    ; http://www.pcre.org/original/doc/html/pcrepattern.html#SEC24
    
    FileClose($hFileOpen)
EndFunc   ;==>_FileDeleteLines

Func Terminate()
    AdlibUnRegister("_FileDeleteLines")
    Exit
EndFunc   ;==>Terminate

Here is the "NewMailTest.txt" test file used. After the script runs, only lines 1, 4, and 6 remain in the file.

The above script searches for "new mail", and not "new" and "mail" separately in the same line.

Line 1 new mail xxx.
Line 2 new space mail
Line 3 new line
Line 4 new mail
Line 5 another line
Line 6 new mail yyy
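
To try the pattern without touching a file, the same replacement can be run on a string. This is only a sketch: the sample lines are copied from the test file above, and joining them with @CRLF is an assumption about how the file is stored. Per the post, lines 1, 4, and 6 should be the ones left.

```autoit
; The six sample lines, joined with @CRLF (assumed line ending).
Local $sSample = "Line 1 new mail xxx." & @CRLF & _
        "Line 2 new space mail" & @CRLF & _
        "Line 3 new line" & @CRLF & _
        "Line 4 new mail" & @CRLF & _
        "Line 5 another line" & @CRLF & _
        "Line 6 new mail yyy"
Local $sSearch = "new mail"

; Same pattern as in _FileDeleteLines(): lines containing $sSearch are skipped,
; every other line (plus its trailing line breaks) is replaced with nothing.
Local $sKept = StringRegExpReplace($sSample, '(?m)^.*\Q' & $sSearch & '\E.*\R?(*SKIP)(?!)|^.*\R*', "")

ConsoleWrite($sKept & @CRLF)
```

Note that the `\R*` in the second branch also swallows any blank lines that directly follow a deleted line.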

 


That post was lacking the niceties; he didn't even offer to paste it into the OP's SciTE and push F5 for them?



Alright, seems logical.

Now, what if I want to match multiple search strings in a txt file and delete all the other lines? I tried this:

HotKeySet("{ESC}", "Terminate") ; Press Esc key to exit script.

Global $sFileName = "281478146260429_chatlog.txt"
Global $sSearch4 = "tips you"
Global $sSearch5 = "bank credits"
Global $sSearch6 = "trade with you"
Global $sSearch7 = "you tell"
Global $sSearch8 = "to a duel"
_FileDeleteLines()
AdlibRegister("_FileDeleteLines", 10 * 1000) ; Run the function, _FileDeleteLines(), every 10 secs

While Sleep(100)
WEnd


Func _FileDeleteLines()
    Local $sFileContents = FileRead($sFileName)
    Local $hFileOpen = FileOpen($sFileName, 2) ; $FO_OVERWRITE (2) = Write mode (erase previous contents)

    ; RE Replace pattern also used here: https://www.autoitscript.com/forum/topic/191336-split-csv/?do=findComment&comment=1372474
    FileWrite($hFileOpen, StringRegExpReplace($sFileContents, '(?m)^.*\Q' & $sSearch4 & $sSearch5 & $sSearch6 & $sSearch7 & $sSearch8 &  '\E.*\R?(*SKIP)(?!)|^.*\R*', "")) ; Delete all lines that do not have the contents of $sSearch4 present.
    ; For an explanation of the RE pattern, see the StringRegExp() function in the AutoIt help file, or
    ; http://www.pcre.org/original/doc/html/pcrepattern.html#SEC24

    FileClose($hFileOpen)
EndFunc   ;==>_FileDeleteLines

Func Terminate()
    AdlibUnRegister("_FileDeleteLines")
    Exit
EndFunc   ;==>Terminate

Doesn't seem to work. Can anyone point out my mistakes? I must certainly have a few errors in there.


PD,

Try it like this...(not tested)

HotKeySet("{ESC}", "Terminate") ; Press Esc key to exit script.

Global $sFileName = "281478146260429_chatlog.txt"

Global $asearch[5]

$asearch[0] = "tips you"
$asearch[1] = "bank credits"
$asearch[2] = "trade with you"
$asearch[3] = "you tell"
$asearch[4] = "to a duel"

_FileDeleteLines()

AdlibRegister("_FileDeleteLines", 10 * 1000) ; Run the function, _FileDeleteLines(), every 10 secs

While Sleep(100)
WEnd


Func _FileDeleteLines()
    Local $sFileContents = FileRead($sFileName)
    Local $hFileOpen = FileOpen($sFileName, 2) ; $FO_OVERWRITE (2) = Write mode (erase previous contents)

    For $1 = 0 To UBound($asearch) - 1
        $sFileContents = StringRegExpReplace($sFileContents, '(?m)^.*\Q' & $asearch[$1] & '\E.*\R?(*SKIP)(?!)|^.*\R*', ""))
    Next

    FileWrite($hFileOpen, $sFileContents)
    FileClose($hFileOpen)
EndFunc   ;==>_FileDeleteLines

Func Terminate()
    AdlibUnRegister("_FileDeleteLines")
    Exit
EndFunc   ;==>Terminate

kylomas

Edited by kylomas

Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill


8 minutes ago, kylomas said:

PD,

Try it like this...


kylomas

Doesn't seem to do anything at all...


Here is another method for multiple searches.

HotKeySet("{ESC}", "Terminate") ; Press Esc key to exit script.

Global $sFileName = "281478146260429_chatlog.txt"
Global $sSearch4 = "tips you"
Global $sSearch5 = "bank credits"
Global $sSearch6 = "trade with you"
Global $sSearch7 = "you tell"
Global $sSearch8 = "to a duel"
_FileDeleteLines()
AdlibRegister("_FileDeleteLines", 10 * 1000) ; Run the function, _FileDeleteLines(), every 10 secs

While Sleep(100)
WEnd


Func _FileDeleteLines()
    Local $sFileContents = FileRead($sFileName)
    Local $hFileOpen = FileOpen($sFileName, 2) ; $FO_OVERWRITE (2) = Write mode (erase previous contents)

    ; RE Replace pattern is brilliantly created here: https://www.autoitscript.com/forum/topic/191336-split-csv/?do=findComment&comment=1372474
    FileWrite($hFileOpen, StringRegExpReplace($sFileContents, _
            '(?m)^.*(\Q' & $sSearch4 & '\E|\Q' & $sSearch5 & '\E|\Q' & $sSearch6 & '\E|\Q' & $sSearch7 & '\E|\Q' & $sSearch8 & _
            '\E).*\R?(*SKIP)(?!)|^.*\R*', ""))
    ; This part of the RE pattern :-
    ; '(\Q' & $sSearch4 & '\E|\Q' & $sSearch5 & '\E|\Q' & $sSearch6 & '\E|\Q' & $sSearch7 & '\E|\Q' & $sSearch8 & '\E)'
    ; "(....)" The opening bracket and the closing bracket enclose all the searches in one group.
    ; "|" means "or". So each search is separated from, and connected to, the next with an "or".
    ; "\Q....\E" means all characters between "\Q" and "\E" are taken as literal characters.
    ; Meaning of this part of the RE pattern :-
    ; Find (match) the literal characters in $sSearch4 or $sSearch5 or $sSearch6 or $sSearch7 or $sSearch8.

    FileClose($hFileOpen)
EndFunc   ;==>_FileDeleteLines

Func Terminate()
    AdlibUnRegister("_FileDeleteLines")
    Exit
EndFunc   ;==>Terminate

 


PD,

You couldn't fix that?   Try this...

HotKeySet("{ESC}", "Terminate") ; Press Esc key to exit script.

Global $sFileName = "281478146260429_chatlog.txt"

Global $asearch[5]

$asearch[0] = "tips you"
$asearch[1] = "bank credits"
$asearch[2] = "trade with you"
$asearch[3] = "you tell"
$asearch[4] = "to a duel"

_FileDeleteLines()

AdlibRegister("_FileDeleteLines", 10 * 1000) ; Run the function, _FileDeleteLines(), every 10 secs

While Sleep(100)
WEnd


Func _FileDeleteLines()
    Local $sFileContents = FileRead($sFileName)
    Local $hFileOpen = FileOpen($sFileName, 2) ; $FO_OVERWRITE (2) = Write mode (erase previous contents)

    For $1 = 0 To UBound($asearch) - 1
        $sFileContents = StringRegExpReplace($sFileContents, '(?m)^.*\Q' & $asearch[$1] & '\E.*\R?(*SKIP)(?!)|^.*\R*', "")
    Next

    FileWrite($hFileOpen, $sFileContents)
    FileClose($hFileOpen)
EndFunc   ;==>_FileDeleteLines

Func Terminate()
    AdlibUnRegister("_FileDeleteLines")
    Exit
EndFunc   ;==>Terminate

kylomas

 


@kylomas

A little explanation about (*SKIP)(*FAIL) - same as (*SKIP)(*F) or (*SKIP)(?!) - in this post
:)

 

Edit
For this I usually use a compact version of Malkey's script, something like this

$filter = "tips you;bank credits;trade with you;you tell;to a duel"

Func _DeleteLinesNotContaining($sFileName, $filter)
   Local $sFileContents = FileRead($sFileName)
   $filter = "\Q" & StringReplace($filter, ";", "\E|\Q") & "\E"
   Local $res = StringRegExpReplace($sFileContents, '(?m)^.*(' & $filter & ').*\R?(*SKIP)(*F)|^.*\R?', "")
   ; etc
   ; other stuff
EndFunc

 

Edited by mikell

The two following quotes are from here.

In the "Verbs that act after backtracking " section, we have:-
"....where (*SKIP) was encountered. (*SKIP) signifies that whatever text was matched leading up to it cannot be part of a successful match."

In the "Lookahead assertions" section we have :-
"If you want to force a matching failure at some point in a pattern, the most convenient way to do it is with (?!) because an empty string always matches, so an assertion that requires there not to be an empty string must always fail. The backtracking control verb (*FAIL) or (*F) is a synonym for (?!). "

The text successfully matched before the backtracking control verb "(*SKIP)" is forced into a matching failure by "(*FAIL)", "(*F)", or "(?!)".
As the RE pattern is applied to the rest of the test string, from left to right and top to bottom, all the text that does not match the part of the RE pattern before "(*SKIP)" (the part before the "|") is matched by the part after the "|".
With the StringRegExpReplace() function, all text that is matched is replaced with "" (nothing). That is, the matched text is deleted, and the text forced to fail is not deleted.
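
One way to see this is to list what the pattern actually matches instead of replacing it. The following is a small sketch with invented sample text; the "keep:" and "drop:" prefixes are only labels for the demonstration. Only the lines without "new mail" should show up as matches, because the lines that contain it are consumed and then failed by (*SKIP)(*F).

```autoit
#include <StringConstants.au3>

; Invented sample text for illustration.
Local $sText = "keep: new mail here" & @CRLF & _
        "drop: something else" & @CRLF & _
        "keep: more new mail" & @CRLF & _
        "drop: last line"

Local $sPattern = '(?m)^.*\Qnew mail\E.*\R?(*SKIP)(*F)|^.*\R?'

; Return every overall match of the pattern as an array element.
Local $aMatched = StringRegExp($sText, $sPattern, $STR_REGEXPARRAYGLOBALMATCH)

If IsArray($aMatched) Then
    For $i = 0 To UBound($aMatched) - 1
        ; Strip the trailing line break so each match prints on one line.
        ConsoleWrite("matched (would be deleted): " & StringStripWS($aMatched[$i], 3) & @CRLF)
    Next
EndIf
```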

The only reason I answered this post was to further investigate mikell's "brilliantly created" RE pattern from here.  Thanks mikell.

 

@kylomas

The problem with the For-Next loop is that after the first search (the first pass of the loop), the lines that do not contain the first search string have already been deleted.

For a line to survive all the searches in the loop, that line would need to contain every one of the search strings.
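
An array of search strings can still be used if the loop only builds the alternation and the replacement runs once. A minimal sketch, assuming the same five search strings; the sample log text and the variable names are invented for illustration.

```autoit
Local $aSearch[5] = ["tips you", "bank credits", "trade with you", "you tell", "to a duel"]

; Build "\Qtips you\E|\Qbank credits\E|..." from the array.
Local $sAlternation = ""
For $i = 0 To UBound($aSearch) - 1
    If $i > 0 Then $sAlternation &= "|"
    $sAlternation &= "\Q" & $aSearch[$i] & "\E"
Next

; One pass: a line survives if it contains ANY of the search strings.
Local $sPattern = '(?m)^.*(?:' & $sAlternation & ').*\R?(*SKIP)(*F)|^.*\R*'

; Invented sample text standing in for the chat log.
Local $sLog = "Bob tips you 5 credits" & @CRLF & _
        "Someone says hello" & @CRLF & _
        "Ann challenges you to a duel" & @CRLF & _
        "Server restart in 5 minutes"

ConsoleWrite(StringRegExpReplace($sLog, $sPattern, "") & @CRLF)
```

Because the replacement runs only once, no line is thrown away for lacking one particular search string.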

 


@Malkey...mikell...thanks for the explanations...
 


 

Yes, not real bright, responding to a post without understanding the SRE!!!

Edited by kylomas


Alright, I was wondering: is it possible to add code in there to put /: on the clipboard every time it loops?

On 12/1/2017 at 11:27 PM, Malkey said:

Here is another method for multiple searches.


 

 
