OverloadUT

Reading only part of a file

21 posts in this topic

I need to read a file starting from a particular byte. I do not want to read the entire file into memory and then scan it, because the file could be up to 30 MB and I need to read it every 5 seconds.

I looked through the help file and I couldn't find any file read functions that let you seek to a particular byte and read from there.

Does this exist?




#2 ·  Posted (edited)

I really don't think so... but you could count the number of lines in the file and then start reading from a specific line number.

8)

Edited by Valuater


#3 ·  Posted (edited)

Use FileRead to read up to and including the byte you need, then use StringMid or StringRight to extract just what you need.

That should work.

Edited by gafrost


Yes, both of those solutions would accomplish the goal of reading part of the file, but the fundamental problem is that both of them require reading the entire file into memory. As I said, my file could be 30 MB, so that would be a huge amount of memory to use every 5 seconds.

The file system allows you to "seek" to a particular byte and read only the bytes you want, but all of the AU3 functions seem to read the entire file into memory...


#5 ·  Posted (edited)

"but the fundamental problem is the fact that both of them require reading the entire file in to memory."

No, what I suggested is that you find the line with _FileCountLines(), then use the line or lines closest to the area you want to check.

8)

Edited by Valuater


#6 ·  Posted (edited)

_FileCountLines() is simply a UDF that does this:

Return StringLen(StringAddCR(FileRead($sFilePath, $N))) - $N + 1 ; $N is the size of the file

Therefore, it has to read the entire file into memory with the "FileRead()" call. I need to avoid doing that.

Edited by OverloadUT


#7 ·  Posted (edited)


Yes, initially. However, if you find that your desired information is on (or around) a certain line number, you can write your script based on that line (or lines).

Then you would not need to read the entire file each time.

OK... that's it from me.

Good luck.

8)

Edited by Valuater


Hi,

I am not sure if it helps, but I think Larry's binary UDF only reads small buffers;

_APIFileSetPos

APIFileReadWrite.au3

http://www.autoitscript.com/forum/index.ph...ost&p=86564

Binary File Read/Write


PS do you know which byte you want, or have to find it?

Is it in a text file or exe, jpg etc?

Randall


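; Read 10 characters starting 100 characters into a file, using the Scripting.FileSystemObject COM object
Global Const $ForReading = 1
$fso = ObjCreate("scripting.filesystemobject")
$f = $fso.OpenTextFile(@ScriptFullPath, $ForReading) ; this script itself serves as the test file
$f.Skip(100) ; skip the first 100 characters
$s_text = $f.Read(10) ; read the next 10 characters
$f.Close()
MsgBox(0, "Test", $s_text)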



#11 ·  Posted (edited)

@gary,

Thanks, Good to know,

Best, Randall

Reading from a certain location is easy, writing is another matter.

If someone wants that ability, might be a good idea to create a plug-in and use fseek.

Edited by gafrost


Thank you gafrost, that's exactly what I need!

In case you're curious, my AutoIt program serves as the "bridge" between Civilization 4 and my civstats.com website. The AutoIt program needs to read from a log file created by a Civ4 Python script and upload only the new log items to the web server every time the file changes. This log file could potentially become very large, so just remembering the last byte I stopped at on the previous upload seems like the most logical way of doing it. Every couple of seconds I check whether the log file has grown, and if it has, I read the bytes after the last byte I read.
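For readers on a newer AutoIt release, the "remember the last byte" approach can be sketched with the built-in FileSetPos(), which the posters in this thread do not appear to have had. This is only a sketch under that assumption: the helper name _ReadNewBytes and the log path are hypothetical, and it is not the code OverloadUT actually used.

```autoit
#include <FileConstants.au3>

; Hypothetical helper: returns the text appended to $sPath since byte offset
; $iLastPos, and advances $iLastPos (passed ByRef) past the bytes it read.
Func _ReadNewBytes($sPath, ByRef $iLastPos)
    Local $iSize = FileGetSize($sPath)
    If @error Then Return SetError(1, 0, "")

    ; A smaller file means the log was truncated or recreated: start over.
    If $iSize < $iLastPos Then $iLastPos = 0
    If $iSize = $iLastPos Then Return "" ; nothing new since the last call

    Local $hFile = FileOpen($sPath, $FO_READ + $FO_BINARY)
    If $hFile = -1 Then Return SetError(2, 0, "")

    ; Jump straight to the first unread byte instead of reading from the start.
    FileSetPos($hFile, $iLastPos, $FILE_BEGIN)
    Local $dNew = FileRead($hFile, $iSize - $iLastPos)
    FileClose($hFile)

    $iLastPos += BinaryLen($dNew)
    Return BinaryToString($dNew)
EndFunc   ;==>_ReadNewBytes

; Usage: poll a (hypothetical) log file every 5 seconds.
Global $g_iPos = 0
Global $g_sNew = ""
While 1
    $g_sNew = _ReadNewBytes("C:\path\to\civ4.log", $g_iPos)
    If $g_sNew <> "" Then ConsoleWrite($g_sNew)
    Sleep(5000)
WEnd
```

Only the bytes between the saved offset and the current end of the file are read, so memory use follows the size of the new data rather than the size of the whole log.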


#13 ·  Posted (edited)

@OverloadUT, you probably would do better looking at the tail functions; like in the recent thread;

Tail gone craze

You didn't say you needed to check the "end" of a long log file; might have got there sooner.

@gafrost

Surely that write function is already there in Larry's function.. see the link in my post #8 above?

Randall

Edited by randallc


#14 ·  Posted (edited)

Here is the starter for saving at a certain spot;

speed up by;

1. buffer to an appropriate size for reading / writing the ending

2. change to vbs for read/ write the ending [only needed if "Insert at byte" rather than "replace at byte"]

What do you think?

Randall

[EDIT -zip - see my next post]

Edited by randallc


#16 ·  Posted (edited)

Hi

Look at great Larry's Binary File Read/Write UDF

Yes, that's what I'm using, but he

1. had an off-by-one-byte error [in the write function; it cut off the last character of the string being written]

[** still incorrect; ? adding a @LF as well?] (* Can anyone please help get this correct?)

2. Did not write examples showing usage.

quick example**only

3. Abandoned it, saying "redundant now we have binaryString"

This thread has become superfluous since the BinaryString datatype addition. **quoted from Larry

Larry Please correct me if I'm wrong.

I have therefore ;

1. Corrected the error in the Write function [** still incorrect; ? adding a @LF as well?] (* Can anyone please help get this correct?)

2. Written extra funcs;

Now very rapid file write to a specific byte if only "replace"

3. 3 secs file write to a specific byte for "insert" in a 20 MB file [using obj read, as Larry admitted it was too slow using his binary read]

Binary read is VERY slow **quoted from Larry

Best, Randall

[@gafrost; Larry's func does a "replace " at a certain byte in 5msecs!; ; "Insert" is acceptable speed, I think, without external program - what do you think?]

_CharInsertByByte($s_FileRead, $s_FileCopy, $textInsertF, $i_StartByte, $i_Replace = 1, $i_Buffer = 50000000, $i_Hex = 0)

[Example script garbled in the archived page. The recoverable parts show it copying a test file, calling _CharInsertByByte() first in replace mode and then in insert mode, and timing each call with TimerInit()/TimerDiff().]

[EDIT - did not need vbs OR scripting object- see 3 posts ahead for new attachment

I agree, btw, we do not need the object read as the API is faster for big reads (still 100Mb ? 7 secs) [and therefore Write as Insert function]

]

Edited by randallc


#17 ·  Posted (edited)

@OverloadUT, you probably would do better looking at the tail functions; like in the recent thread;

Tail gone craze

You didn't say you needed to check the "end" of a long log file; might have got there sooner.

Those tail functions use _FileCountLines(), which has to read the entire file into memory. Once again, I cannot have it do that.

The solutions presented by gafrost and randallc do not read the entire file into memory, which is exactly what I need. I'll use gafrost's because it's a very small amount of code and all I need to do is read a file from a particular byte forward, not write to the file.

Thanks everyone!

Edit: Actually, the "scripting.filesystemobject" method used in gafrost's solution takes a while to seek far into a file. I tested it seeking 100 MB into a 200 MB file, and it took a good 3 seconds to execute. The _APIFile functions in randallc's solution work perfectly: seeking 100 MB in takes only a few ms. Thanks!
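A seek like the one described in the edit above can be timed with a few lines. This is a rough sketch that assumes a newer AutoIt release with the built-in FileSetPos() rather than the _APIFile UDF used in the thread; the path, offset, and chunk size are placeholders, and it makes no claim about what timings to expect.

```autoit
#include <FileConstants.au3>

Local $sPath = "C:\path\to\big.log" ; hypothetical test file, larger than the offset below
Local $iOffset = 100 * 1024 * 1024 ; seek 100 MB into the file

Local $hFile = FileOpen($sPath, $FO_READ + $FO_BINARY)
If $hFile = -1 Then Exit 1

Local $hTimer = TimerInit()
FileSetPos($hFile, $iOffset, $FILE_BEGIN) ; move the file pointer without reading the skipped bytes
Local $dChunk = FileRead($hFile, 1024) ; read 1 KB from that position
Local $fMs = TimerDiff($hTimer)
FileClose($hFile)

ConsoleWrite("Read " & BinaryLen($dChunk) & " bytes in " & Round($fMs, 2) & " ms" & @CRLF)
```

Running the same measurement against the FileSystemObject Skip() call would be one way to reproduce the comparison, since Skip() has to step through the skipped characters rather than reposition the file pointer.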

Edited by OverloadUT


OK!

I note they are really Larry's functions, though!

1. I found the binary read slow too; did you only read a small part of the 200Mb file from the 100mb starting point? - if not, could you post me a snippet of your script to show how you called it?

2. "tail.exe" can be wrapped to make tail functions quick; perhaps elsewhere on the forum; but you have your answer anyway, by the sound of it.

Best, randall


#19 ·  Posted (edited)

I agree, btw, we do not need the object read as the API is faster for big reads (still 100Mb ? 7 secs) [and therefore Write as Insert function]

Randall

*** LOOKS safe to me now; don't use previous versions. [changed the workaround in APIFileWrite..]

BinaryPos.zip

Edited by randallc



I know it's an old post, but it's damn useful!

A 100 MB file spiked with 1000 x 1 KB text vars was created in 1.77 sec o.O

That is about 1.7 ms per file-pointer switch + write.

Thanks a lot <_<
