Aeterna

How to remove everything but "x"

17 posts in this topic

I have a .txt file

B9191

P274852

B6262

P66

B7142325

P9649

B862615

P42813

P379443

Pretty much looks like this for 80,000 lines. I want to remove everything that isn't a P or a B while keeping the 1 letter per line structure. How can I do this? Thx in advance!


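Say your text is in a file called testreppb.txt; then

$s = FileRead("testreppb.txt")
; keep only P, B and the line-break characters
; (inside [...] a comma is a literal comma, not a separator, so none are used here)
$s = StringRegExpReplace($s, "[^PB\r\n]", "")
FileWrite("converted.txt", $s)

will write the result to converted.txt.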



Be more precise; the question is not clear. Do you want to keep all lines beginning with P or B, or do you want to keep only the letters B and P? Do you want to keep the line structure (CR/LF)?

Try something like

While 1
    $line = FileReadLine($file)
    If @error = -1 Then ExitLoop
    If $line[1]='B' or $line[1]='P' then FileWriteLine($g,$line)
Wend

It should produce a new file containing only the lines beginning with P or B.


Well, the OP seemed pretty precise to me: remove everything except P or B, but keep one letter per line. But if we're talking precise, then what is $file and what is $g? And since FileReadLine returns a line of text, $line[1] will cause an error; you mean StringLeft($line, 1).
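
For reference, here is a minimal sketch of how that loop might look with those points addressed. The file names are placeholders I picked (nothing in the post says what $file and $g were), and the handle variable names are mine. It keeps the whole line, as the loop above intended; write $first instead of $line if only the letter is wanted.

```autoit
; Sketch only: file names are assumed, adjust them to your own paths.
$file = FileOpen("testreppb.txt", 0)   ; 0 = open for reading
$g = FileOpen("converted.txt", 2)      ; 2 = open for writing, erasing old contents
If $file = -1 Or $g = -1 Then Exit 1   ; FileOpen returns -1 on failure

While 1
    $line = FileReadLine($file)        ; with a handle, each call reads the next line
    If @error Then ExitLoop            ; @error is -1 at end of file
    $first = StringLeft($line, 1)      ; first character of the line
    ; == is a case-sensitive string comparison
    If $first == "B" Or $first == "P" Then FileWriteLine($g, $line)
WEnd

FileClose($file)
FileClose($g)
```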


To me it read as "erase lines that don't have a P or B, and make sure each line has only one letter, with the rest being numbers."

I think clarification is important.


Sorry if there's any confusion.

The resulting file should look like this:

B

P

B

B

B

P

P

P

and so on. There are a few lines with just numbers, those lines need to be removed completely. And I think because I'm going to try to parse this text file later, that all spaces should be removed too. Hope this clarified!


OK, this should work:

$s = "B9191" & @CRLF & _
        "P274852" & @CRLF & _
        "B6262" & @CRLF & _
        "P66" & @CRLF & _
        "B7142325" & @CRLF & _
        "P9649" & @CRLF & _
        "B862615" & @CRLF & _
        "s862615" & @CRLF & _
        "P42813" & @CRLF & _
        "P379443" & @CRLF

$s = StringReplace(StringRegExpReplace($s,"(?s)([^PB\r\n])",""),@CRLF&@CRLF,@CRLF)

MsgBox(0, '', $s)



If you read the first post, I said this goes on for about 80,000 lines. So setting the variable like that wouldn't help, right?


Well, this is just for testing. Just replace the $s = ... with $s=FileRead("file").

$s=FileRead("file")
$s = StringReplace(StringRegExpReplace($s,"(?s)([^PB\r\n])",""),@CRLF&@CRLF,@CRLF)
MsgBox(0, '', $s)

Now the only problem could be the maximum string length of 2147483647 characters :(


This doesn't work with his additional information:

"There are a few lines with just numbers, those lines need to be removed completely."
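
The reason is that StringReplace makes a single pass, so a run of several emptied lines in a row still leaves blank lines behind. One way around it, as an untested sketch rather than a definitive fix, is to collapse every run of line breaks with a second regular expression. The file names "file" and "converted.txt" are placeholders, and a line containing both letters would still come out as two letters on one line.

```autoit
; Sketch only: "file" and "converted.txt" are placeholder names.
$s = FileRead("file")

; 1) Drop everything that is not P, B or a line-break character
;    (this also removes spaces, digits and commas).
$s = StringRegExpReplace($s, "[^PB\r\n]", "")

; 2) Collapse any run of line breaks into a single CRLF,
;    which removes the lines that step 1 emptied.
$s = StringRegExpReplace($s, "[\r\n]+", @CRLF)

; 3) Remove a leading line break left over if the file started with a numbers-only line.
$s = StringRegExpReplace($s, "\A[\r\n]+", "")

; Open in mode 2 so an existing converted.txt is overwritten rather than appended to.
$out = FileOpen("converted.txt", 2)
FileWrite($out, $s)
FileClose($out)
```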



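$File1 = @ScriptDir & "\Log.txt"
$File2 = @ScriptDir & "\New_Log.txt"

; Use file handles: with a file name, FileReadLine reopens the file on every call
; and would keep returning the first line
$hIn = FileOpen($File1, 0)  ; 0 = read
$hOut = FileOpen($File2, 2) ; 2 = write, erasing old contents

While 1
 $Line = FileReadLine($hIn)
 If @error Then ExitLoop
 $Line = StringLeft($Line, 1)
 ; IsNumber/IsString check the variable type, not the text, so compare the character itself
 If $Line == "B" Or $Line == "P" Then FileWriteLine($hOut, $Line)
WEnd

FileClose($hIn)
FileClose($hOut)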

Made a small edit, not sure if it's any better or worse, but it looks like it would work better. Looked like yours would be writing over itself and only adding new lines if there was a number in between.



Thank you guys for all your help, we're getting close hehe!

Some segments look like this

P,1,5,2,0,9,6,9,0

24

31

6

45

P,4,9,3,1,x,5,4,x

P,5,9,9,4,2,1,3,5

B,8,1,5,3,x,0,1,x

the end result should be

1 Letter Per Line

Only B's and P's remain

T's, #'s, and ","s should be removed.

There should be no blank lines.


This will do what you want I think, but if there is a line with both P and B it will be replaced by P.

#include <file.au3>
Dim $array
$s = _FileReadToArray("testreppb.txt", $array);read the text file to an array of lines


$file = FileOpen("converted.txt", 2);open the file to write the results to
For $n = 1 To $array[0]
    If StringInStr($array[$n], "P") Then
        FileWriteLine($file, "P")
    Else
        If StringInStr($array[$n], "B") Then FileWriteLine($file, "B")
    EndIf

Next
FileClose($file)
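
If a line holding both letters ever matters, a possible variation (a sketch under my own assumptions, not something tested against the real file) is to write whichever of P or B appears first on the line, using StringRegExp to grab the first match. The variable names are mine; the file names are the same as above.

```autoit
#include <File.au3>

; Sketch only: writes the first P or B found on each line, skips lines with neither.
Dim $lines
If Not _FileReadToArray("testreppb.txt", $lines) Then Exit 1 ; returns 0 on failure

$out = FileOpen("converted.txt", 2) ; 2 = write, erasing old contents
For $n = 1 To $lines[0]             ; element 0 holds the number of lines
    ; flag 1 returns an array of matches; @error is set when nothing matches
    $match = StringRegExp($lines[$n], "[PB]", 1)
    If Not @error Then FileWriteLine($out, $match[0])
Next
FileClose($out)
```

Unlike StringInStr with its default settings, the pattern is case sensitive, so a lowercase p or b is ignored.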


perfect!

Thank you very much!
