Function to Read through and Delete Lines in an Output File


Go to solution Solved by SOLVE-SMART,

Hello to All!,

 

I am new to this Forum, and still somewhat new to AutoIt. I just would like to know how to make a function I can put in my script that can do the following:

I have an output file that is generated (.dat) with six columns, and an unspecified number of rows. Sometimes I get a zero and/or blank columns (see attached). 

I would like this function to read through the file and delete the rows with zeros and the blank lines after my script finishes running.

Any help would be appreciated!

OutputData.dat


You can do that with FileRead, StringRegexReplace and FileWrite. See help.
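A minimal sketch of that FileRead / StringRegExpReplace / FileWrite route, assuming the rule is "drop every line in which any Tab-separated field is empty or exactly 0". The function name _StripBadLines, the pattern and the file names are illustrative assumptions, not something taken from the help file. Note that a header line with blank columns would be removed as well.

```autoit
; Hypothetical helper: removes every line that has an empty or "0" field.
Func _StripBadLines($sText)
    ; (?m)                 ^ and $ anchor at every line, not only at the string ends
    ; (?:[^\t\r\n]*\t)*?   skip zero or more leading fields (lazy)
    ; 0?                   the offending field: empty or exactly "0"
    ; (?:\t[^\r\n]*)?$     it must be followed by a Tab or by the end of the line
    ; \R?                  also swallow the line break of the removed line
    Local Const $sPattern = '(?m)^(?:[^\t\r\n]*\t)*?0?(?:\t[^\r\n]*)?$\R?'
    Return StringRegExpReplace($sText, $sPattern, '')
EndFunc

; Usage on a file (paths are placeholders):
; Local $sClean = _StripBadLines(FileRead('OutputData.dat'))
; Local $hOut = FileOpen('OutputData_clean.dat', 2) ; 2 = $FO_OVERWRITE
; FileWrite($hOut, $sClean)
; FileClose($hOut)

Local $sDemo = '7:14:34' & @TAB & '1.05' & @TAB & '250' & @CRLF & _
        '7:15:11' & @TAB & '0' & @TAB & '250.1' & @CRLF & _
        '7:15:47' & @TAB & '1.53' & @TAB & '' & @CRLF
ConsoleWrite(_StripBadLines($sDemo)) ; only the first line should remain
```

The pattern only treats a field as bad when it is exactly 0, so values such as 250 or 0.011% are left alone.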

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


You could use FileReadToArray to get all lines into an array, then use StringSplit with @TAB to split each line into its fields. From there you only have to delete the undesired rows and save the result to a new file with _FileWriteFromArray.
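The array route could look roughly like this. It is a sketch under stated assumptions: the names _RowHasZeroOrBlank and _CleanDataFile are made up for illustration, a field counts as "zero" when it is 0, 0.0, 00 and so on, and the first two rows are treated as headers and always kept. For brevity it writes the kept rows with FileWriteLine instead of building a second array for _FileWriteFromArray.

```autoit
#include <StringConstants.au3> ; $STR_NOCOUNT, $STR_STRIPLEADING, $STR_STRIPTRAILING
#include <FileConstants.au3>   ; $FO_OVERWRITE

; Hypothetical helper: True if any Tab-separated field is blank or numerically zero.
Func _RowHasZeroOrBlank($sLine)
    Local $aFields = StringSplit($sLine, @TAB, $STR_NOCOUNT)
    Local $sValue
    For $i = 0 To UBound($aFields) - 1
        $sValue = StringStripWS($aFields[$i], $STR_STRIPLEADING + $STR_STRIPTRAILING)
        If StringLen($sValue) = 0 Then Return True               ; blank field
        If StringRegExp($sValue, '^0+(\.0+)?$') Then Return True ; 0, 0.0, 00 ...
    Next
    Return False
EndFunc

; Hypothetical wrapper: copies the good rows of $sInPath to $sOutPath.
; The first $iHeaderLines rows are always kept (assumption: 2 header rows).
; Returns the number of lines written, or sets @error on failure.
Func _CleanDataFile($sInPath, $sOutPath, $iHeaderLines = 2)
    Local $aLines = FileReadToArray($sInPath)
    If @error Then Return SetError(1, 0, 0)

    Local $hOut = FileOpen($sOutPath, $FO_OVERWRITE)
    If $hOut = -1 Then Return SetError(2, 0, 0)

    Local $iKept = 0
    For $i = 0 To UBound($aLines) - 1
        If $i < $iHeaderLines Or Not _RowHasZeroOrBlank($aLines[$i]) Then
            FileWriteLine($hOut, $aLines[$i])
            $iKept += 1
        EndIf
    Next
    FileClose($hOut)
    Return $iKept
EndFunc
```

It could be called at the end of the script, for example as _CleanDataFile(@ScriptDir & '\OutputData.dat', @ScriptDir & '\OutputData_clean.dat'). Writing to a second file keeps the raw data intact in case the rule turns out to be too strict.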


Hi @SoftWearInGinEar, welcome to the forum 👋 .

These are the first five rows of your attached OutputData.dat file:

Run 2023/03/09                  
Column 1    Column 2    Column 3    Column 4    Column 5    Column 6
7:14:34 1.05    250 25  0.011%  162.7
7:15:11 1.29    250.1   24.98   0.011%  162.7
7:15:47 1.53    250.2   24.96   0.011%  162.7

Is the first line intended, or did you add it manually? I ask because it's not a valid part of a CSV format like this.
Please keep this in mind when you start to try out a CSV UDF (like "parseCSV.au3" or others).

Or just stick to the suggestions of the previous speakers (@jchd and @Nine).

Best regards
Sven

Stay innovative!

Spoiler

🌍 Au3Forums

🎲 AutoIt (en) Cheat Sheet

📊 AutoIt limits/defaults

💎 Code Katas: [...] (coming soon)

🎭 Collection of GitHub users with AutoIt projects

🐞 False-Positives

🔮 Me on GitHub

💬 Opinion about new forum sub category

📑 UDF wiki list

✂ VSCode-AutoItSnippets

📑 WebDriver FAQs

👨‍🏫 WebDriver Tutorial (coming soon)


Just now, ioa747 said:

How many blank lines does OutputData.dat have?

In this specific file, only seven lines match: either on a zero (pattern: [TAB]0[TAB]) or on an empty last column (pattern: [TAB][LineEnd]) 😉 .
I hope @SoftWearInGinEar will find a way to do this on his own. That's why I didn't come up with a concrete code snippet.

A small but valid RegEx pattern for the file would be:

'(\t\t|\t$)'

Best regards
Sven


1 hour ago, SOLVE-SMART said:

Is the first line intended or did you add this line manually? I ask because it's not a valid part of such CSV format.

Why is that ?
If not mistaken, there are 5 Tabs per line in any lines of OP's output, no matter the 1st line got only its 1st column filled. The problem would be if some lines had not 5 Tabs in them, but I didn't find any.

1391312099_5Tabsperline.png.68e2d857605410c477e180021e95b651.png

2 hours ago, SoftWearInGinEar said:

I would like this function to read through and delete the rows with zeros and blank lines (...)

I don't see any blank line in OP's output (e.g totally empty) but maybe there could be (?)
The question is : what should be done with the 2nd line below ?

956784662_blanklastcolumn.png.244d364182e5b3b93354919389b7ba11.png

Will you delete it because the last column is empty ?
I guess you will because you don't want any column to be empty (or 0)

2 hours ago, SoftWearInGinEar said:

Sometimes I get a zero and/or blank columns (see attached). 

 but it's always good to have OP's confirmation :)

Edit: and we should ask too, will you delete it if  the value in last column was... 0, as in this altered line ?

56077748_alteredlineadded0inlastcolumn.png.8ebba2ff8443cfa9f02372f5c7858735.png


10 minutes ago, pixelsearch said:

Why is that ?
If not mistaken, there are 5 Tabs per line in any lines of OP's output, no matter the 1st line got only its 1st column filled. The problem would be if some lines had not 5

Good catch @pixelsearch, thanks, I missed this.
I saw the second line which seems to be the CSV header and I thought directly what about the first one 😅 .

My understanding is/was:

  • when a line contains a zero as a column value, delete this line
  • when not all columns of a line contain values, delete it too

Anyway, you're absolutely right about your last question 👍 .
 

14 minutes ago, pixelsearch said:

[...] will you delete it if  the value in last column was... 0, as in this added line ?

Best regards
Sven


You're welcome @SOLVE-SMART

imho checking for 0 in any column seems a bit radical to eliminate a row, but sure OP knows better than us :)

For example, If you look at column 4, its values start from 25 to 20.78 in the last row, a slow decreasing process that could reach 0 in a few hours.

If we look at column 1 (which seems to indicate time) it increases from 7:14:34 to 9:59:21 in the last row . If this is a continuous process running 24 hours, then Column 4 could indicate a value of 0 in a few hours ... and the concerned line will be deleted, when it shouldn't.

End of "script" :D

For the record, this post was just for fun, as I got no idea of the meaning of any column or the time spent before the output is reset etc... today I'm in a funny mood, for a change !


Even though this might not be relevant for the author of the thread (we will see), I am struggling with the correct RegEx pattern.
I built some test data lines to prove my RegEx approach, but it's not correct. I assume that the red rectangle matches should not be matched, so how can I exclude these matches?

(\t?0$|\t?0\t)

7:32:13 7.36        0   0.011%  162.7
7:32:13 7.36    d   0   0.011%      0
0   7.36    d   0   0.011%      0
0   7.36    d   0   0.011%  55  0
51:0    7.36    d   0   0.011%  55  0
7:30:20 0   7.36    d   0   0.011%  55  0

regex-question.png

I guess you @pixelsearch can help me out with it 😊 ?

Also for the record:
If this shouldn't be part of the thread because it could be a different thread, then please excuse me.
I didn't want to get out of context too much.

Best regards
Sven


I always wished we opened 1 thread specially dedicated to any RegEx question, where everybody could ask a question (in the same thread) and be answered there. One day maybe... and I assure you it will be crowded :D

Meanwhile, SOLVE-SMART, I'm struggling with the same issue as yours because my RegEx knowledge is poor.
For example, with this subject closely related to yours, 6 lines, 5 Tabs per line (30 Tabs total) :

7:32:13 7.36    0   0.011%  162.7   0
7:32:13 7.36    d   0   0.011%  0
0   7.36    d   0   0.011%  0
0   7.36    d   0   0.011%  55
51:0    7.36    d   0   0.011%  0.001
7:30:20 0   7.36    d   0   0

First of all, I'm not even sure Tabs can be pasted usefully in The Forum code (aren't they transformed to spaces when other users try to copy paste the code ?) which makes it difficult to test for other users.

If I apply this pattern :

(?m)(^0)\t

It returns 2 matches (the 2 0's at beginning of lines, followed by a tab), great !

0
0

If I apply that pattern :

(?m)\t(0$)

It returns 4 matches (the 4 0's at the end of lines, preceded by a tab), fantastic !

0
0
0
0

Why does it return 10 matches when I try (certainly wrongly) to combine both patterns ?

(?m)(^0)\t|\t(0$)

869806963_why10matchesandnot6.png.a34517c19be02e60836be77feea9e19b.png

Row 0 : 
Row 1 : Chr(48) 
Row 2 : 
Row 3 : Chr(48) 
Row 4 : Chr(48) 
Row 5 : 
Row 6 : Chr(48) 
Row 7 : Chr(48) 
Row 8 : 
Row 9 : Chr(48) 

What should be done in this last pattern to return only 6 matches (2 + 4) ?

When this question is solved, then we could add an alternation (OR e.g. |) to retrieve 0's in the middle of each line, when they're preceded by a Tab AND followed by a Tab, something like |\t0\t

Just my 2 poor cts...
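One possible explanation for the 10 rows, offered as a reading of the output above rather than a confirmed answer: with flag 3, StringRegExp returns the capture groups of every match, not the matches themselves. When the second alternative matches, group 1 took no part and shows up as an empty element in front of group 2. Each end-of-line hit therefore contributes two rows (4 x 2 = 8) and each start-of-line hit one row (2 x 1 = 2). The sketch rebuilds the six test lines with @TAB and @LF (an assumption, since the forum may have converted the Tabs) and compares the grouped pattern with a group-free variant that uses lookarounds.

```autoit
#include <StringConstants.au3> ; $STR_REGEXPARRAYGLOBALMATCH

; The six test lines; "|" stands in for a Tab and is replaced afterwards.
Local $sSubject = StringReplace( _
        '7:32:13|7.36|0|0.011%|162.7|0' & @LF & _
        '7:32:13|7.36|d|0|0.011%|0' & @LF & _
        '0|7.36|d|0|0.011%|0' & @LF & _
        '0|7.36|d|0|0.011%|55' & @LF & _
        '51:0|7.36|d|0|0.011%|0.001' & @LF & _
        '7:30:20|0|7.36|d|0|0', '|', @TAB)

; Two capture groups: every match of the 2nd alternative also returns an empty group 1.
Local $aGrouped = StringRegExp($sSubject, '(?m)(^0)\t|\t(0$)', $STR_REGEXPARRAYGLOBALMATCH)

; No capture groups: the array holds exactly one element per match.
Local $aPlain = StringRegExp($sSubject, '(?m)^0(?=\t)|(?<=\t)0$', $STR_REGEXPARRAYGLOBALMATCH)

ConsoleWrite('grouped: ' & UBound($aGrouped) & @CRLF) ; 10, as in the post above
ConsoleWrite('plain:   ' & UBound($aPlain) & @CRLF)   ; 6 (2 at line start + 4 at line end)
```

Dropping the parentheses, or moving the Tab into a lookahead / lookbehind, keeps the result array aligned with the matches.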


This should do it :

(?m)^0\t|(?<=\t)0(?=\t)|\t0$

I tried it first without the positive lookahead / positive lookbehind, with this pattern :

(?m)^0\t|\t0\t|\t0$

But it failed on the last line, where two 0's are separated by a Tab character (the last two columns of the last line of the subject I indicated in my preceding post).

With this "wrong" pattern \t0\t it is difficult to grab the last 0 (though it's preceded by a Tab) because the previously grabbed 0 "ate" the Tab following it. The offset is now placed just before the last 0 and not before the last Tab, which is why the last 0 isn't grabbed.

The advantage of positive lookahead / positive lookbehind is this :

"They do not consume characters in the string, but only assert whether a match is possible or not."

Jan Goyvaerts (Regular-Expressions)
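The difference can be reproduced on the last test line alone. This is a small sketch, assuming the line really contains Tabs (rebuilt here with StringReplace, since pasted forum text may contain spaces instead):

```autoit
#include <StringConstants.au3> ; $STR_REGEXPARRAYGLOBALMATCH

; Last test line: it ends with Tab 0 Tab 0. "|" stands in for a Tab.
Local $sLine = StringReplace('7:30:20|0|7.36|d|0|0', '|', @TAB)

; \t0\t consumes the Tab that the final \t0$ would need.
Local $aConsuming = StringRegExp($sLine, '(?m)^0\t|\t0\t|\t0$', $STR_REGEXPARRAYGLOBALMATCH)

; The lookarounds only assert the Tabs, so the last Tab is still available.
Local $aLookaround = StringRegExp($sLine, '(?m)^0\t|(?<=\t)0(?=\t)|\t0$', $STR_REGEXPARRAYGLOBALMATCH)

ConsoleWrite(UBound($aConsuming) & @CRLF)  ; 2: the final 0 is missed
ConsoleWrite(UBound($aLookaround) & @CRLF) ; 3: all three 0's are found
```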

Edit: @SOLVE-SMART did it solve your example too ?
Fingers crossed :)


Thank you very much @pixelsearch for your engagement 🤝 .

Unfortunately, I believe it's still not enough. But this depends absolutely on the requirements of @SoftWearInGinEar. Please see the following test data and the matches of your RegEx patterns and of one pattern adjusted by me.

💡 Please notice, I had to replace (?m) with (?:) in VSCode to get the RegEx pattern work. But in the AutoIt code it doesn't matter => both variants lead to the same result.
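A quick way to check whether the flag matters in AutoIt is sketched here. As far as I can tell it does matter as soon as the pattern runs against text that contains several lines; when each line is tested separately (for example after FileReadToArray), both variants behave alike, which may explain the observation. The two-line test string is made up for illustration.

```autoit
; Two lines: the first one ends with a Tab, the second one does not.
Local $sTwoLines = 'a' & @TAB & @LF & 'b' & @TAB & 'c'

ConsoleWrite(StringRegExp($sTwoLines, '\t$') & @CRLF)     ; 0: $ only anchors at the end of the string
ConsoleWrite(StringRegExp($sTwoLines, '(?m)\t$') & @CRLF) ; 1: $ anchors at the end of each line
```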


First screenshot, 8 matches => without the positive lookahead/positive lookbehind:
regex-matches-8.png

Second screenshot, 9 matches => with the positive lookahead/positive lookbehind:
regex-matches-9.png

Third screenshot, 13 matches => with additional check for "double TAB":
regex-matches-13.png

 

I marked the missing cases by red rectangles. But this is only relevant when this assumption is true:

  • when a line contains a zero as a column value, then match and delete this line
  • when not all columns of a line contain values (some are empty), then match and delete it too

I guess this is just for fun, because we are trying to do it in a robust way with several combinations and so on. For the OP's specific file, the pattern (\t\t|\t$) is simply enough 😂 .

Best regards
Sven

test-data.csv


This would be the final, but very ugly RegEx pattern, to match the assumed criteria (see post above) 😀 .

(?m)^0\t|^\t|(?<=\t)0(?=\t)|\t0$|\t$|\t\t

regex-matches-16.png
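To turn this pattern into a keep / delete decision, it can be tested against one line at a time. This is only a sketch: the three sample lines are made up (the first one is taken from the OP's file), and "|" stands in for a Tab. A completely empty line contains no Tab and would not be flagged by this pattern, so it needs a separate check.

```autoit
Local Const $sBadField = '(?m)^0\t|^\t|(?<=\t)0(?=\t)|\t0$|\t$|\t\t'

Local $sGood = StringReplace('7:15:11|1.29|250.1|24.98|0.011%|162.7', '|', @TAB)
Local $sBlank = StringReplace('7:32:13|7.36||0.5|0.011%|162.7', '|', @TAB) ; empty 3rd column
Local $sZero = StringReplace('7:32:13|7.36|250|0|0.011%|162.7', '|', @TAB) ; 0 in 4th column

ConsoleWrite(StringRegExp($sGood, $sBadField) & @CRLF)  ; 0 -> keep the line
ConsoleWrite(StringRegExp($sBlank, $sBadField) & @CRLF) ; 1 -> delete the line
ConsoleWrite(StringRegExp($sZero, $sBadField) & @CRLF)  ; 1 -> delete the line
```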

Best regards
Sven


1 hour ago, SOLVE-SMART said:

Unfortunately I believe, it's still not enough.

If I'm not mistaken, it is enough... for the subject you provided in the 1st place in this post 

7:32:13 7.36        0   0.011%  162.7
7:32:13 7.36    d   0   0.011%      0
0   7.36    d   0   0.011%      0
0   7.36    d   0   0.011%  55  0
51:0    7.36    d   0   0.011%  55  0
7:30:20 0   7.36    d   0   0.011%  55  0

14 0's are now correctly retrieved when using the positive lookahead / positive lookbehind pattern :

(?m)^0\t|(?<=\t)0(?=\t)|\t0$

Now, solving OP's issue certainly requires other patterns (e.g. for empty lines or multiple consecutive Tabs) and you provided the solution, bravo :)

I'm very happy you started this conversation because it allowed us to discover the power of positive lookahead / positive lookbehind. Please just let me add a few notes from the author, which are interesting too :

18. Testing The Same Part of a String for More Than One Requirement 

Lookaround, which I [Jan Goyvaerts, the author] introduced in detail in the previous topic, is a very powerful concept. Unfortunately, it is often underused by people new to regular expressions, because lookaround  is a bit confusing. The confusing part is that the lookaround is zero-width. So if you have a regex in which a lookahead is followed by another piece of regex, or a lookbehind is preceded by another piece of regex, then the regex will traverse part of the string twice.

If I'm not mistaken, it's exactly what we did, "traversing part of the string twice."

For example, with the line that we worked on, which ends with these 4 characters Tab 0 Tab 0

* The lookaround part checks for Tab 0 Tab, without consuming (eating) any Tab, and grabs the penultimate 0
* Then the \t0$ checks again the last Tab ("traversing part of the string twice") and grabs the last 0 preceded by a Tab . So the last Tab has been checked twice, consumed only once, and now we know why :D

To our expert gurus: please be kind enough to correct anything wrong or badly expressed in our last RegEx posts
Thanks !


3 minutes ago, pixelsearch said:

If I'm not mistaken, it is enough... for the subject you provided in the 1st place in this post 

You're right about this.

4 minutes ago, pixelsearch said:

Now, to solve OP's issue certainly requires other patterns (e.g empty lines, multiple followed Tabs for example) and you provided the solution, bravo :)

Thanks. In case it is the solution which the OP is looking for 😂 ?!

5 minutes ago, pixelsearch said:

I'm very happy you started this conversation because it allowed us to discover the power of positive lookahead / positive lookbehind. Please just let me add a few notes from the author, which are interesting too : [...]

I am happy about this too. Thanks for the additional notes and explanations which are very educational 👍 .

7 minutes ago, pixelsearch said:

To our expert gurus: please be kind enough to correct anything wrong or badly expressed in our last RegEx posts

Exactly, this would be great and possibly necessary 😅😇 ?!
Thanks again for the good insights @pixelsearch 🤝 .

Best regards
Sven


Hey Everyone,

Sorry to get back so late and thank you all for the help! 

I haven't finished reading all the other comments yet, but I feel I should address the one about the six columns and the first column first.

There are six columns, and the first was not put in manually. I am making a script that reads data off a program (let's call it program 'X' to keep it simple) and writes it to an output file. The function below writes the local time of my computer and the data read from the GUI of X:

; Appends one Tab-separated row to the output file:
; local time, the Clock() value (function not shown here), then four text boxes of X.
Func WaitForStep($hX, $hFile)
    ;Do
    FileWriteLine($hFile, _
        @HOUR & ':' & @MIN & ':' & @SEC & @TAB & _
        Clock($hX) & @TAB & _
        ControlGetText($hX, '', "[NAME:textBox29]") & @TAB & _
        ControlGetText($hX, '', "[NAME:textBox20]") & @TAB & _
        ControlGetText($hX, '', "[NAME:textBox10]") & @TAB & _
        ControlGetText($hX, '', "[NAME:textBox32]"))
    ;$fSecsLast = $fSecs
    ;$fSecs = ControlGetText($hX, '', "[NAME:textBox83]")
    ;Until $fSecs == $fSecsLast
EndFunc


20 hours ago, pixelsearch said:

[...]
Will you delete it because the last column is empty ?
I guess you will because you don't want any column to be empty (or 0)

[...] will you delete it if the value in last column was... 0, as in this altered line ?

Yes, the whole row should be deleted if the last column is 0 or blank. It should be deleted if any of the columns are 0 or blank.


15 hours ago, SOLVE-SMART said:

[...] this is only relevant when this assumption is true:

  • when a line contains a zero as a column value, then match and delete this line
  • when not all columns of a line contain values (some are empty), then match and delete it too

[...]

Yes, that is what I am looking for: delete the line if any of the columns are 0 or blank.
