AutoIt delete all lines from a *.txt file except some

PlatinumDruggie

Hi, I am new to this app, and I would like some pointers.

I'd like AutoIt to constantly scan a *.txt file and delete every line that doesn't contain the words "new mail". How would I go about doing that?

 

iamtheky

You can do these things in this order, at a frequency befitting your project:

_FileReadToArray
_ArrayFindAll
_ArrayDelete
_FileWriteFromArray
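
A minimal sketch of the array-based idea, offered as an illustration rather than as iamtheky's exact method. It reads the file with _FileReadToArray, but swaps _ArrayFindAll and _ArrayDelete for a plain loop that collects the lines to keep. The function name _KeepLinesContaining is hypothetical.

```autoit
#include <File.au3>
#include <FileConstants.au3>

; Hypothetical helper: rewrites $sFile so that only lines containing $sNeedle remain.
Func _KeepLinesContaining($sFile, $sNeedle)
    Local $aLines
    ; $FRTA_NOCOUNT: element [0] holds the first line instead of a line count.
    If Not _FileReadToArray($sFile, $aLines, $FRTA_NOCOUNT) Then Return SetError(1, 0, False)

    Local $sKept = ""
    For $i = 0 To UBound($aLines) - 1
        ; StringInStr is case-insensitive by default.
        If StringInStr($aLines[$i], $sNeedle) Then $sKept &= $aLines[$i] & @CRLF
    Next

    ; Open for writing only after the read succeeded, so a failed read cannot erase the file.
    Local $hFile = FileOpen($sFile, $FO_OVERWRITE)
    If $hFile = -1 Then Return SetError(2, 0, False)
    FileWrite($hFile, $sKept)
    FileClose($hFile)
    Return True
EndFunc   ;==>_KeepLinesContaining
```

It could be called as _KeepLinesContaining("NewMailTest.txt", "new mail"), either once or from a timer such as AdlibRegister.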

 


,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Malkey

This working example checks a specified .txt file every 30 seconds. Every line that does not contain the search string is deleted, so the only lines remaining in the file are those that contain the "searched for" string.

HotKeySet("{ESC}", "Terminate") ; Press Esc key to exit script.

Global $sFileName = "NewMailTest.txt"
Global $sSearch4 = "new mail"
_FileDeleteLines()
AdlibRegister("_FileDeleteLines", 30 * 1000) ; Run the function, _FileDeleteLines(), every 30 secs

While Sleep(100)
WEnd


Func _FileDeleteLines()
    Local $sFileContents = FileRead($sFileName)
    Local $hFileOpen = FileOpen($sFileName, 2) ; $FO_OVERWRITE (2) = Write mode (erase previous contents)
    
    ; RE Replace pattern also used here: https://www.autoitscript.com/forum/topic/191336-split-csv/?do=findComment&comment=1372474
    FileWrite($hFileOpen, StringRegExpReplace($sFileContents, '(?m)^.*\Q' & $sSearch4 & '\E.*\R?(*SKIP)(?!)|^.*\R*', "")) ; Delete all lines that do not have the contents of $sSearch4 present.
    ; For an explanation of the RE pattern, see the StringRegExp() function in the AutoIt help file, or
    ; http://www.pcre.org/original/doc/html/pcrepattern.html#SEC24
    
    FileClose($hFileOpen)
EndFunc   ;==>_FileDeleteLines

Func Terminate()
    AdlibUnRegister("_FileDeleteLines")
    Exit
EndFunc   ;==>Terminate

Here is the "NewMailTest.txt" test file used. After the script runs, only lines 1, 4, and 6 remain in the file.

The above script searches for "new mail", and not "new" and "mail" separately in the same line.

Line 1 new mail xxx.
Line 2 new space mail
Line 3 new line
Line 4 new mail
Line 5 another line
Line 6 new mail yyy

 

mikell
1 hour ago, Malkey said:
RE Replace pattern also used here: https://www.autoitscript.com/forum/topic/191336-split-csv/?do=findComment&comment=1372474

"brilliantly created here" would have been nicer :P:D

iamtheky

That post was lacking the niceties. He didn't even offer to paste it into the OP's SciTE and push F5 for them?


PlatinumDruggie

Alright, seems logical.

Now, what if I want to search for multiple strings in a txt file and delete all the other lines? I tried this:

HotKeySet("{ESC}", "Terminate") ; Press Esc key to exit script.

Global $sFileName = "281478146260429_chatlog.txt"
Global $sSearch4 = "tips you"
Global $sSearch5 = "bank credits"
Global $sSearch6 = "trade with you"
Global $sSearch7 = "you tell"
Global $sSearch8 = "to a duel"
_FileDeleteLines()
AdlibRegister("_FileDeleteLines", 10 * 1000) ; Run the function, _FileDeleteLines(), every 10 secs

While Sleep(100)
WEnd


Func _FileDeleteLines()
    Local $sFileContents = FileRead($sFileName)
    Local $hFileOpen = FileOpen($sFileName, 2) ; $FO_OVERWRITE (2) = Write mode (erase previous contents)

    ; RE Replace pattern also used here: https://www.autoitscript.com/forum/topic/191336-split-csv/?do=findComment&comment=1372474
    FileWrite($hFileOpen, StringRegExpReplace($sFileContents, '(?m)^.*\Q' & $sSearch4 & $sSearch5 & $sSearch6 & $sSearch7 & $sSearch8 &  '\E.*\R?(*SKIP)(?!)|^.*\R*', "")) ; Delete all lines that do not have the contents of $sSearch4 present.
    ; For explanation of the RE pattern, see the StringRegExp() function in the AutoIt help file.:, or,
    ; http://www.pcre.org/original/doc/html/pcrepattern.html#SEC24

    FileClose($hFileOpen)
EndFunc   ;==>_FileDeleteLines

Func Terminate()
    AdlibUnRegister("_FileDeleteLines")
    Exit
EndFunc   ;==>Terminate

Doesn't seem to work. Can anyone point out my mistakes? I must certainly have a few errors in there.

kylomas

PD,

Try it like this...(not tested)

HotKeySet("{ESC}", "Terminate") ; Press Esc key to exit script.

Global $sFileName = "281478146260429_chatlog.txt"

Global $asearch[5]

$asearch[0] = "tips you"
$asearch[1] = "bank credits"
$asearch[2] = "trade with you"
$asearch[3] = "you tell"
$asearch[4] = "to a duel"

_FileDeleteLines()

AdlibRegister("_FileDeleteLines", 10 * 1000) ; Run the function, _FileDeleteLines(), every 10 secs

While Sleep(100)
WEnd


Func _FileDeleteLines()
    Local $sFileContents = FileRead($sFileName)
    Local $hFileOpen = FileOpen($sFileName, 2) ; $FO_OVERWRITE (2) = Write mode (erase previous contents)

    For $1 = 0 To UBound($asearch) - 1
        $sFileContents = StringRegExpReplace($sFileContents, '(?m)^.*\Q' & $asearch[$1] & '\E.*\R?(*SKIP)(?!)|^.*\R*', ""))
    Next

    FileWrite($hFileOpen, $sFileContents)
    FileClose($hFileOpen)
EndFunc   ;==>_FileDeleteLines

Func Terminate()
    AdlibUnRegister("_FileDeleteLines")
    Exit
EndFunc   ;==>Terminate

kylomas


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

PlatinumDruggie
8 minutes ago, kylomas said:

PD,

Try it like this...

HotKeySet("{ESC}", "Terminate") ; Press Esc key to exit script.

Global $sFileName = "281478146260429_chatlog.txt"
Global $aSearch[0] = "tips you"
Global $aSearch[1] = "bank credits"
Global $aSearch[2] = "trade with you"
Global $aSearch[3] = "you tell"
Global $aSearch[4] = "to a duel"
_FileDeleteLines()
AdlibRegister("_FileDeleteLines", 10 * 1000) ; Run the function, _FileDeleteLines(), every 30 secs

While Sleep(100)
WEnd


Func _FileDeleteLines()
    Local $sFileContents = FileRead($sFileName)
    Local $hFileOpen = FileOpen($sFileName, 2) ; $FO_OVERWRITE (2) = Write mode (erase previous contents)

    for $1 = 0 to ubound($aSearch) - 1
        $sFileContents = StringRegExpReplace($sFileContents, '(?m)^.*\Q' & $aSearch[$1] &  '\E.*\R?(*SKIP)(?!)|^.*\R*', ""))
    next

    filewrite($hFileOpen,$sFileContents)
    FileClose($hFileOpen)
EndFunc   ;==>_FileDeleteLines

Func Terminate()
    AdlibUnRegister("_FileDeleteLines")
    Exit
EndFunc   ;==>Terminate

kylomas

Doesn't seem to do anything at all...

kylomas

PD,

Refresh my post....

kylomas


PlatinumDruggie
3 minutes ago, kylomas said:

PD,

Refresh my post....

kylomas

The updated version says unbalanced brackets in expression.

Malkey

Here is another method for multiple searches.

HotKeySet("{ESC}", "Terminate") ; Press Esc key to exit script.

Global $sFileName = "281478146260429_chatlog.txt"
Global $sSearch4 = "tips you"
Global $sSearch5 = "bank credits"
Global $sSearch6 = "trade with you"
Global $sSearch7 = "you tell"
Global $sSearch8 = "to a duel"
_FileDeleteLines()
AdlibRegister("_FileDeleteLines", 10 * 1000) ; Run the function, _FileDeleteLines(), every 10 secs

While Sleep(100)
WEnd


Func _FileDeleteLines()
    Local $sFileContents = FileRead($sFileName)
    Local $hFileOpen = FileOpen($sFileName, 2) ; $FO_OVERWRITE (2) = Write mode (erase previous contents)

    ; RE Replace pattern is brilliantly created here: https://www.autoitscript.com/forum/topic/191336-split-csv/?do=findComment&comment=1372474
    FileWrite($hFileOpen, StringRegExpReplace($sFileContents, _
            '(?m)^.*(\Q' & $sSearch4 & "\E|\Q" & $sSearch5 & "\E|\Q" & $sSearch6 & "\E|\Q" & $sSearch7 & "\E|\Q" & $sSearch8 & _
            '\E.*\R?)(*SKIP)(?!)|^.*\R*', ""))
    ; This part of the RE pattern :-
    ; "(\Q' & $sSearch4 & "\E|\Q" & $sSearch5 & "\E|\Q" & $sSearch6 & "\E|\Q" & $sSearch7 & "\E|\Q" & $sSearch8 & '\E.*\R?)"
    ; "(....)" The first open bracket and the last closing bracket encase all the searches into one group.
    ; "|" means "or". So each search is separated and connected with an "or".
    ; "\Q'....\E" means all characters between "\Q" and "\E" are taken as literal characters.
    ; Meaning of this part of the RE pattern :-
    ; Find (match) the literal characters in $sSearch4 or $sSearch5 or $sSearch6 or $sSearch7 or $sSearch8.

    FileClose($hFileOpen)
EndFunc   ;==>_FileDeleteLines

Func Terminate()
    AdlibUnRegister("_FileDeleteLines")
    Exit
EndFunc   ;==>Terminate

 

kylomas

PD,

You couldn't fix that?   Try this...

HotKeySet("{ESC}", "Terminate") ; Press Esc key to exit script.

Global $sFileName = "281478146260429_chatlog.txt"

Global $asearch[5]

$asearch[0] = "tips you"
$asearch[1] = "bank credits"
$asearch[2] = "trade with you"
$asearch[3] = "you tell"
$asearch[4] = "to a duel"

_FileDeleteLines()

AdlibRegister("_FileDeleteLines", 10 * 1000) ; Run the function, _FileDeleteLines(), every 10 secs

While Sleep(100)
WEnd


Func _FileDeleteLines()
    Local $sFileContents = FileRead($sFileName)
    Local $hFileOpen = FileOpen($sFileName, 2) ; $FO_OVERWRITE (2) = Write mode (erase previous contents)

    For $1 = 0 To UBound($asearch) - 1
        $sFileContents = StringRegExpReplace($sFileContents, '(?m)^.*\Q' & $asearch[$1] & '\E.*\R?(*SKIP)(?!)|^.*\R*', "")
    Next

    FileWrite($hFileOpen, $sFileContents)
    FileClose($hFileOpen)
EndFunc   ;==>_FileDeleteLines

Func Terminate()
    AdlibUnRegister("_FileDeleteLines")
    Exit
EndFunc   ;==>Terminate

kylomas

 


kylomas

@Malkey - Just read the doc for *SKIP.  What little I thought I knew of regex has been blown to shit.  Can you give a brief explanation of the pattern you are using?

kylomas


PlatinumDruggie

Malkey's script works fine! Thanks for the replies guys, much appreciated.

kylomas

PD,

NP, glad you got what you need.  Just curious if you tried the shit I cobbled together?

kylomas


mikell

@kylomas

A little explanation about (*SKIP)(*FAIL) - same as (*SKIP)(*F) or (*SKIP)(?!) - in this post
:)

 

Edit
For this I usually use a compact version of Malkey's script, something like this

$filter = "tips you;bank credits;trade with you;you tell;to a duel"

Func _DeleteLinesNotContaining($sFileName, $filter)
   Local $sFileContents = FileRead($sFileName)
   $filter = "\Q" & StringReplace($filter, ";", "\E|\Q") & "\E"
   $res = StringRegExpReplace($sFileContents, '(?m)^.*(' & $filter & ').*\R?(*SKIP)(*F)|^.*\R?', "")
   ; etc
   ; other stuff
EndFunc

 

Malkey

The two following quotes are from here.

In the "Verbs that act after backtracking " section, we have:-
"....where (*SKIP) was encountered. (*SKIP) signifies that whatever text was matched leading up to it cannot be part of a successful match."

In the "Lookahead assertions" section we have :-
"If you want to force a matching failure at some point in a pattern, the most convenient way to do it is with (?!) because an empty string always matches, so an assertion that requires there not to be an empty string must always fail. The backtracking control verb (*FAIL) or (*F) is a synonym for (?!). "

The text that is successfully matched before the backtracking control verb "(*SKIP)" is then forced into a matching failure by "(*FAIL)", "(*F)", or "(?!)".
As the RE pattern is applied to the rest of the test string from left to right and top to bottom, all the text that does not match the part of the pattern before "(*SKIP)" (the part before the "|") is matched by the part after the "|".
With the StringRegExpReplace() function, all matched text is replaced with "" (nothing). That is, the matched text is deleted, and the text whose match was forced to fail is not deleted.
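
As a small worked illustration of that explanation, here is a sketch that applies the same pattern to an in-memory copy of the six test lines quoted earlier in the thread instead of to a file. The variable names are mine, and the line breaks are assumed to be @CRLF.

```autoit
; Sample text mirrors the NewMailTest.txt lines from earlier in the thread.
Global $sText = "Line 1 new mail xxx." & @CRLF & _
        "Line 2 new space mail" & @CRLF & _
        "Line 3 new line" & @CRLF & _
        "Line 4 new mail" & @CRLF & _
        "Line 5 another line" & @CRLF & _
        "Line 6 new mail yyy"
Global $sSearch = "new mail"

; Left of "|": a whole line containing the phrase is matched, then (*SKIP)(?!)
; rejects that match and resumes after the line, so the line is never replaced.
; Right of "|": any other line (plus its trailing line breaks) is matched and replaced with "".
Global $sPattern = '(?m)^.*\Q' & $sSearch & '\E.*\R?(*SKIP)(?!)|^.*\R*'
Global $sKept = StringRegExpReplace($sText, $sPattern, "")

ConsoleWrite($sKept & @CRLF)
```

Going by the description of the test file above, the console output should contain only lines 1, 4, and 6.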

The only reason I answered this post was to further investigate mikell's "brilliantly created" RE pattern from here.  Thanks mikell.

 

@kylomas

The problem with the For-Next loop is after the first search, or loop, the lines that do not have the first search string in them are deleted.

For a line to survive all the searches in the loop, that line would need to have all search strings in that particular line.
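
One way to keep the array of search strings while avoiding that problem is to join all entries into a single alternation and run the replace once. This is only a sketch: the helper names _BuildAlternation and _KeepMatchingLines are hypothetical, and it assumes no search string contains the literal sequence \E.

```autoit
Global $aSearch[5] = ["tips you", "bank credits", "trade with you", "you tell", "to a duel"]

; Hypothetical helper: joins every search string into one \Q...\E|\Q...\E alternation.
Func _BuildAlternation(Const ByRef $aTerms)
    Local $sAlt = ""
    For $i = 0 To UBound($aTerms) - 1
        If $i > 0 Then $sAlt &= "|"
        $sAlt &= "\Q" & $aTerms[$i] & "\E"
    Next
    Return $sAlt
EndFunc   ;==>_BuildAlternation

; Hypothetical helper: returns $sText with every line removed that contains none of the terms.
Func _KeepMatchingLines($sText, Const ByRef $aTerms)
    ; A single pass: a line survives if it contains ANY term, not all of them.
    Local $sPattern = '(?m)^.*(?:' & _BuildAlternation($aTerms) & ').*\R?(*SKIP)(?!)|^.*\R*'
    Return StringRegExpReplace($sText, $sPattern, "")
EndFunc   ;==>_KeepMatchingLines
```

Inside _FileDeleteLines() this could replace the For-Next loop with $sFileContents = _KeepMatchingLines($sFileContents, $asearch).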

 

mikell
7 hours ago, Malkey said:

The only reason I answered this post was to further investigate

It was a good idea; there are not so many ways to "match if not like this".
BTW your comments are much more complete than mine  :)
 

kylomas

@Malkey...mikell...thanks for the explanations...
 

Quote

 

@kylomas

The problem with the For-Next loop is after the first search, or loop, the lines that do not have the first search string in them are deleted.

For a line to survive all the searches in the loop, that line would need to have all search strings in that particular line.

 

Yes, not real bright responding to a post without understanding the sre!!!:drool:

PlatinumDruggie

Alright, I was wondering: is it possible to add code in there to put /: in the clipboard every time it loops?

On 12/1/2017 at 11:27 PM, Malkey said:

Here is another method for multiple searches.

