foster74

Deleting lines from large files and arrays takes too long, any other ways/options?

11 posts in this topic

Deleting a line from a file of nearly 1,000,000 lines takes close to a second. Reading the file into an array and deleting an element also takes too long. My program needs to go through a large file line by line and occasionally delete a line, but for my purposes anything near a second is too long. =( What other options are quicker? Any suggestions are very much appreciated! Thank you.


Have you tried using FileOpen and then going through with FileReadLine?

Edit: To delete the lines try _FileWriteToLine... I'm not sure if that will work but it's worth a try.

Edited by Achilles


Thank you very much for the suggestion, but deleting a line using _FileWriteToLine takes a very long time =(


On my system, deleting 2 lines from the start or end of 1,000,000 lines using StringTrimLeft/StringTrimRight takes about 0.14 sec. Reading the lines from a file and writing them back each take 0.8 to 0.9 sec; disk I/O is the bottleneck.
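A rough sketch of what trimming lines off the front of an in-memory string could look like. The sample data is made up, and it assumes the lines are separated by @CRLF; it is only meant to illustrate the StringTrimLeft idea, not the exact code that was timed.

```autoit
; Made-up sample data; in practice this would be the contents of the file.
Local $sData = "line 1" & @CRLF & "line 2" & @CRLF & "line 3" & @CRLF & "line 4"

; Position (1-based) of the second line break, or 0 if there are fewer than two.
Local $iPos = StringInStr($sData, @CRLF, 1, 2)

If $iPos > 0 Then
    ; Trim everything up to and including that line break,
    ; which removes the first two lines in a single operation.
    $sData = StringTrimLeft($sData, $iPos + StringLen(@CRLF) - 1)
EndIf

ConsoleWrite($sData & @CRLF) ; the remaining text starts at "line 3"
```

The same idea works from the other end with StringTrimRight, searching for the line break with a negative occurrence value in StringInStr.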


Does it have to be a text file or can you use a SQL database or something?


Use FileOpen and then go through the file with FileReadLine. Here is where it changes, though: instead of rewriting the entire file in place, start a new file. For example, you have a file Testing.txt, and the changed database goes into Testing_1.txt. That will shave off a considerable amount of time, and it reduces your program's footprint compared with taking in the whole file before executing anything. I do know that is how the more advanced editors work, some even keeping a backup of the previous version of the changed file. So you have:

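$hIn = FileOpen("Testing.txt", 0) ; 0 = read mode
$hOut = FileOpen("Testing_1.txt", 2) ; 2 = write mode, erases previous contents
While 1
    $sLine = FileReadLine($hIn)
    If @error Then ExitLoop ; end of file (or read error)
    ; Execute your work on the line; skip the FileWriteLine to delete it
    FileWriteLine($hOut, $sLine)
WEnd
FileClose($hIn)
FileClose($hOut)
Exit

This is a model of what you want to do.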
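Once the loop has finished, the new file still has to take the place of the old one. A minimal sketch of that step, assuming the file names from the model above and a hypothetical backup name Testing.bak:

```autoit
; Keep the previous version as a backup (Testing.bak is a made-up name).
; Flag 1 tells FileMove to overwrite an existing destination file.
If Not FileMove("Testing.txt", "Testing.bak", 1) Then Exit 1

; Put the freshly written file where the original used to be.
If Not FileMove("Testing_1.txt", "Testing.txt", 1) Then Exit 1
```

If the second move fails, the previous contents are still available in the backup file.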



NEVER use FileReadLine in a situation like this. You can find no slower method than that to work with large files.

How are you determining the lines to be deleted? If you have some type of pattern that can be used, then you can simply use FileRead() with StringReplace() or, more likely, StringRegExpReplace().
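A minimal sketch of the FileRead() plus StringRegExpReplace() approach. The file name and the DELETE_ME pattern are placeholders, since the actual rule for choosing lines has not been posted, so treat this as an illustration only.

```autoit
Local $sPath = "Testing.txt" ; placeholder file name

; Read the whole file into memory with a single call.
Local $hIn = FileOpen($sPath, 0) ; 0 = read mode
If $hIn = -1 Then Exit 1
Local $sData = FileRead($hIn)
FileClose($hIn)

; (?m) lets ^ match at the start of every line. The pattern removes each
; line beginning with the placeholder text DELETE_ME, plus its line break.
$sData = StringRegExpReplace($sData, "(?m)^DELETE_ME[^\r\n]*\r?\n?", "")

; Write the result back in one go.
Local $hOut = FileOpen($sPath, 2) ; 2 = write mode, erases previous contents
If $hOut = -1 Then Exit 1
FileWrite($hOut, $sData)
FileClose($hOut)
```

All matching lines are removed in one pass over the string, so the cost is one read and one write of the file no matter how many lines are deleted.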


George


The program is a TCP server which communicates with about 40 clients over a network. It will send out lines of text to clients and will determine whether to delete or change each line on a case-by-case basis. The program will loop through the file continuously. I would also very much like a sudden shutdown of the computer not to roll back any data changes, so a frequent and quick way to save would be a huge plus. SQLite really seems like a pain to delve into, especially not knowing if it can do all this quickly. I appreciate all the previous and future suggestions, thank you!


Still trying to figure out the best way =/


If you tell us more about the purpose of the script, we can possibly show you better ways than using a large text file. However, from your description above I cannot say anything.

Kurt



The program will be dealing with around a million lines with about 7 pieces of information per line, which are:

1. A constant identifier of between 10 and 14 characters.

2. A 5-character variable which will need to be changed around 1% of the time.

3. A single-digit variable which will need to be changed around 0.1% of the time.

4, 5. Constants of around 20 characters.

6, 7. Each is 0 or 1. One of them will be changed every time the line is accessed.

The program will communicate with around 40 client programs, and will change data when receiving data back from a client. A few of the clients will be doing something different with the data; this is the reason for the 0/1 variables (information 6 and 7), which track which client has received which line. For example:

Line 1 to Client1 (Change information 6 to 1)

Line 2 to Client1 (Change information 6 to 1)

Line 3 to Client1 (Change information 6 to 1)

Line 4 to Client1 (Change information 6 to 1)

Line 5 to Client1 (Change information 6 to 1)

Line 6 to Client1 (Change information 6 to 1)

Line 7 to Client1 (Change information 6 to 1)

Line 1 to Client2 (Change information 7 to 1)

Line 8 to Client1 (Change information 6 to 1)

Line 9 to Client1 (Change information 6 to 1)

Line 2 to Client2 (Change information 7 to 1)

Line 10 to Client1 (Change information 6 to 1)

So it will know which line to send. Once it has gone through every line, it will change the information back to 0, rinse and repeat.

I will need the database to save frequently and quickly because this server will be sending out around 100 lines per second. The program will be accessing the database non-concurrently. Sorry if any/all of this is confusing, I've been up too long :P Also I couldn't think of another name besides information so forgive me for that! Any suggestions are VERY much appreciated, thank you.
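Since a database came up earlier in the thread, here is a rough sketch of how the seven pieces of information might map onto a SQLite table using AutoIt's SQLite UDF. The file name lines.db and every column name are made up for illustration, and it assumes sqlite3.dll is available to _SQLite_Startup(). It has not been timed against a million rows, so treat it as a starting point rather than a measured solution.

```autoit
#include <SQLite.au3>

; Assumes sqlite3.dll can be found by _SQLite_Startup().
_SQLite_Startup()
If @error Then Exit 1
Local $hDB = _SQLite_Open("lines.db") ; hypothetical database file
If @error Then Exit 1

; Hypothetical column names for information 1 to 7.
_SQLite_Exec($hDB, "CREATE TABLE IF NOT EXISTS lines (" & _
        "id TEXT PRIMARY KEY, code TEXT, digit INTEGER, " & _
        "const_a TEXT, const_b TEXT, " & _
        "sent_1 INTEGER DEFAULT 0, sent_2 INTEGER DEFAULT 0);")

; Fetch the next line that has not yet been sent to the first kind of client.
Local $aRow
If _SQLite_QuerySingleRow($hDB, "SELECT id, code, digit FROM lines WHERE sent_1 = 0 LIMIT 1;", $aRow) = $SQLITE_OK _
        And UBound($aRow) >= 3 Then
    ; ... send the line to the client here ...
    ; Mark it as sent; only this one row is touched, not the whole data set.
    _SQLite_Exec($hDB, "UPDATE lines SET sent_1 = 1 WHERE id = " & _SQLite_FastEscape($aRow[0]) & ";")
EndIf

; Once every line has been sent, reset the flag for the next pass.
_SQLite_Exec($hDB, "UPDATE lines SET sent_1 = 0;")

_SQLite_Close($hDB)
_SQLite_Shutdown()
```

Outside an explicit transaction, SQLite commits each statement on its own, which speaks to the sudden-shutdown concern. Whether that keeps up with around 100 lines per second would need to be measured on the actual disk.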
