
delete lines containing


way1000


I have a text file with 1 million lines, of which I have to remove 800k. I want to keep the other 200k lines, and the line order has to remain the same. Removed lines should become empty lines, so there should still be 1 million lines but only 200k with text.


49 minutes ago, gruntydatsun said:

you're going to go way over the size limit for variables in autoit

It depends on the size of the lines ;)

;#cs
$txt = ""
For $i = 0 to 1500000
   $txt &= "this is the line of text #" & $i & @crlf
Next
FileWrite("1.txt", $txt)
;#ce

$txt = FileRead("1.txt")
; remove text from lines ending with 12, 14, 16
$new = StringRegExpReplace($txt, '(?m)^.*1[246]$', "")
FileWrite("new.txt", $new)
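If the file ever were too large to handle comfortably as one string, a line-by-line variant is one alternative. This is only a sketch and not part of the post above: it assumes the same 1.txt input and the same "line ends with 12, 14 or 16" rule, and it swaps the single StringRegExpReplace for FileReadToArray plus _FileWriteFromArray (from File.au3).

```autoit
#include <File.au3>

; Read every line of the input file into a 0-based array
Local $aLines = FileReadToArray("1.txt")
If @error Then
    MsgBox(16, "Error", "Could not read 1.txt")
    Exit
EndIf

; Blank the lines ending with 12, 14 or 16; the array size never changes,
; so the line count and the line order stay the same
For $i = 0 To UBound($aLines) - 1
    If StringRegExp($aLines[$i], '1[246]$') Then $aLines[$i] = ""
Next

; Write one array element per line to the output file
_FileWriteFromArray("new.txt", $aLines)
```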

 


  • 1 year later...

If you don't want to grab the matches into an array with the usual StringRegExp and would rather get the result as a string, you have to introduce a kind of negation into the StringRegExpReplace, to say: if a line does NOT contain "items4", then discard it.
Here is a way:

$txt = "line1,items1,testtext1" & @crlf & _ 
    "line2,items4,testtext2" & @crlf & _ 
    "line3,items3,testtext3" & @crlf & _ 
    "line4,items4,testtext4" & @crlf & _ 
    "line5,items5,testtext5" & @crlf & _ 
    "line6,items6,testtext6" & @crlf

$in = "items4"

$res = StringRegExpReplace($txt, '(?m)^(.*\Q' & $in & '\E.*(*SKIP)(*F)|.*)$\R?', "")
Msgbox(0,"", $res)

In this alternation, the left side first matches the lines containing "items4", then (*SKIP)(*F) says "No no, I don't want this". All the other lines (not containing "items4") are then matched by the right side of the alternation and replaced by "".

Edit
This example doesn't replace the discarded lines with blank lines, it removes them completely. To get blank lines instead, just remove \R? (which means "optional newline sequence").

:)

Edited by mikell
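To illustrate the edit note above, here is a minimal sketch of the blank-line variant. It reuses the pattern from the post with only the trailing \R? dropped; the shortened test string and the ConsoleWrite check are additions for illustration.

```autoit
$txt = "line1,items1,testtext1" & @CRLF & _
    "line2,items4,testtext2" & @CRLF & _
    "line3,items3,testtext3" & @CRLF & _
    "line4,items4,testtext4" & @CRLF

$in = "items4"

; Same pattern as in the post, minus the trailing \R?, so the newline after
; each discarded line is left in place and the line count does not change
$res = StringRegExpReplace($txt, '(?m)^(.*\Q' & $in & '\E.*(*SKIP)(*F)|.*)$', "")

; Make the line breaks visible in the console to check the result
ConsoleWrite(StringReplace($res, @CRLF, "<CRLF>" & @CRLF))
```

Lines 1 and 3 should come out empty while lines 2 and 4 keep their text.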

Hi to both of you :)
Mikell, to blank all lines except the "items4" ones, I just tried a "negative lookahead" (my 1st one!). Do you think it's correct?
Based on your example :

$txt = "line1,items1,testtext1" & @crlf & _
    "line2,items4,testtext2" & @crlf & _
    "line3,items3,testtext3" & @crlf & _
    "line4,items4,testtext4" & @crlf & _
    "line5,items5,testtext5" & @crlf & _
    "line6,items6,testtext6" & @crlf

$res = StringRegExpReplace($txt, '(?m)^(?!.*items4).*$', "...")
Msgbox(0, "Dots dots", $res)

[Image: dotsdots.jpg]

The 3 dots "..." are here just to make blank lines clearly visible in the image. Replace with "" when desired


Hi Nine :)
Let's hope Mikell, Jchd or another RegExp guru will bring you the explanation you desire
I was lucky enough to have this "negative lookahead" working, after I read that "you can use any regular expression inside the lookahead (note that this is not the case with lookbehind)"

A complementary question could be, in the preceding example:
Why doesn't a negative lookahead (?!.* return the same results as .*(?!
when a positive lookahead (?=.* returns the same results as .*(?= ?

 

Edited by pixelsearch
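On the lookahead question above, a small sketch may help. The patterns in the question are truncated, so the four full patterns below are assumed completions and not the exact ones that were tested. A lookahead is checked at the position where it sits. Placed at the start of the line with .* inside, it inspects the whole line. Placed after a greedy .*, it is checked wherever the .* stops: a positive lookahead forces .* to backtrack until "items4" really follows, while a negative one is already satisfied at the end of the line, where nothing follows at all.

```autoit
$line = "line2,items4,testtext2"

; Negative lookahead first: its .* scans the whole line, finds "items4", so no match
ConsoleWrite(StringRegExp($line, '^(?!.*items4).*$') & @CRLF) ; prints 0

; Negative lookahead after .*: the greedy .* consumes the whole line, and at the end
; of the line "not followed by items4" is true, so the line matches anyway
ConsoleWrite(StringRegExp($line, '^.*(?!items4).*$') & @CRLF) ; prints 1

; Positive lookahead first: "items4" occurs somewhere ahead, so the line matches
ConsoleWrite(StringRegExp($line, '^(?=.*items4).*$') & @CRLF) ; prints 1

; Positive lookahead after .*: the .* backtracks to just before "items4", same result
ConsoleWrite(StringRegExp($line, '^.*(?=items4).*$') & @CRLF) ; prints 1
```

So the positive form only needs one position where "items4" follows, wherever the lookahead is written, whereas the negative form has to be anchored at the start of the line to say "nowhere in this line".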

Fun for adding multiple criteria as well (I think; this could be all wrong).

$txt = "line1,items1,testtext1" & @crlf & _
    "line2,items4,testtext2" & @crlf & _
    "line3,items3,testtext3" & @crlf & _
    "line4,items4,testtext4" & @crlf & _
    "line5,items5,testtext5" & @crlf & _
    "line6,items6,testtext6" & @crlf

;~ $in = "(items4)"
$in = "(items4|items5)"

$s = StringRegExpReplace($txt , "(line.*" & $in & ".*?)\s" , " ")

msgbox(0, '' , $s)
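For comparison, multiple criteria can also be plugged into the negative lookahead shown earlier in the thread. This is only a sketch under assumptions: it keeps the lines containing any of the alternatives and blanks the rest, and since \Q...\E is not used here, the alternatives in $in must not contain regex metacharacters.

```autoit
$txt = "line1,items1,testtext1" & @CRLF & _
    "line2,items4,testtext2" & @CRLF & _
    "line3,items3,testtext3" & @CRLF & _
    "line4,items4,testtext4" & @CRLF & _
    "line5,items5,testtext5" & @CRLF & _
    "line6,items6,testtext6" & @CRLF

; Alternatives to keep, separated by | (illustrative values)
$in = "items4|items5"

; Blank every line that contains none of the alternatives;
; (?:...) groups the alternation inside the lookahead
$res = StringRegExpReplace($txt, '(?m)^(?!.*(?:' & $in & ')).*$', "")

MsgBox(0, "", $res)
```

With this input, lines 2, 4 and 5 keep their text and the other three become empty lines.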

 



@mikell Thank you!!! Exactly what I was trying to do. I appreciate the explanation too.

Edit: you may be interested to know that, with your help, I was able to create a script that runs through 365 files, each with over 200,000 lines of data, and extracts all the key information into a separate file (about 300,000 lines long), with a program execution time of about 5 minutes. Man, I love AutoIt and its community!

Thanks everyone else as well :)

Edited by AnonymousX