linenoize

notepad text file splitter

7 posts in this topic

I need to take a very long text file and split it into several smaller text files. I have used AutoIt for smaller tasks, but this is a little out of my league. Can anyone point me in the right direction?

Hi,

What about using _FileCountLines, dividing the line count by the number of parts you want, and writing each part to its own file? Or just read the file into an array with _FileReadToArray, then divide the array and write the pieces back out.
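A minimal sketch of the count-and-divide idea, written in Python rather than AutoIt purely to show the logic. The names split_into_parts and split_file and the ".partN" output naming are made up for illustration. It reads all the lines, works out how many lines go into each part, and writes each part to its own file.

```python
import math


def split_into_parts(lines, parts):
    """Split a list of lines into at most `parts` chunks of near-equal size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if not lines:
        return []
    # Round up so no lines are left over; the last chunk may be shorter,
    # and very short inputs can yield fewer chunks than requested.
    per_part = math.ceil(len(lines) / parts)
    return [lines[i:i + per_part] for i in range(0, len(lines), per_part)]


def split_file(path, parts):
    """Write each chunk of `path` to `path.part1`, `path.part2`, ... (hypothetical naming)."""
    with open(path, encoding="utf-8") as src:
        lines = src.readlines()
    out_paths = []
    for index, chunk in enumerate(split_into_parts(lines, parts), start=1):
        out_path = f"{path}.part{index}"
        with open(out_path, "w", encoding="utf-8") as dst:
            dst.writelines(chunk)
        out_paths.append(out_path)
    return out_paths
```

This holds the whole file in memory, which should be fine for a file of a few dozen pages but not for a truly huge one.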

So long,

Mega


#3 ·  Posted (edited)

Depends on what you mean by "really long", and how fast you need the operation to be. Also, to a lesser extent, it depends on how you want to do the breaking up. Do you want the sections to fit on a floppy, CD, flash drive, or tape drive, or is there some time-based or other syntactic logic you're looking to identify / branch on?

If the file is terabytes long, counting the lines might not be the best approach if you're interested in getting done before dinner.

If you're interested in breaking the file up for archival, you'd be well advised to consider one of the many compression / archival utilities that will archive / burn / record files across multiple units of physical media. (You'd be even better advised to make multiple copies and store them in diverse locations.)

If you need syntactical breaks, grep, find, cut, head, tail, and other text examination utils that can pipe data are fabulous for working w/ large files and extracting the useful bits out of them.
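To give a feel for what a grep-style filter does, here is a rough Python sketch (not AutoIt, and not how grep itself is implemented). The name grep_lines is an assumption for illustration. It yields only the lines that match a regular expression, along with their line numbers.

```python
import re


def grep_lines(lines, pattern, ignore_case=False):
    """Yield (line_number, text) for every line that matches `pattern`."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    for number, line in enumerate(lines, start=1):
        if regex.search(line):
            # Strip the trailing newline so callers get clean text.
            yield number, line.rstrip("\n")
```

Because it is a generator, it can be fed an open file handle and will only ever hold one line in memory at a time.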

If your datasets are really large and the extraction logic overly complex, huge speed advantages can be gained by sucking the data into a DBMS. Working w/ 30,000 to 50,000 line spreadsheets and au3 led me to load the data into MySQL. I gained (literally) a ten-thousand-fold performance increase in the querying logic on identical hardware.
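As a small illustration of the "load it into a database and query it" approach, the sketch below uses Python's built-in sqlite3 module instead of MySQL, simply because it needs no server. The table name log and the function names are assumptions, and no performance claim is implied.

```python
import sqlite3


def load_lines(conn, lines):
    """Store each line with its line number in a table named `log` (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS log (line_no INTEGER PRIMARY KEY, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO log (line_no, text) VALUES (?, ?)",
        enumerate(lines, start=1),
    )
    conn.commit()


def find_lines(conn, needle):
    """Return (line_no, text) rows whose text contains `needle`.

    Note: `%` and `_` inside `needle` act as LIKE wildcards.
    """
    cursor = conn.execute(
        "SELECT line_no, text FROM log WHERE text LIKE ? ORDER BY line_no",
        (f"%{needle}%",),
    )
    return cursor.fetchall()
```

For example, after `conn = sqlite3.connect(":memory:")` and a call to `load_lines`, `find_lines(conn, "alpha")` returns the matching rows in file order.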

Of course, au3 is well suited to wrapping almost all of the above w/ nice UDFs.

Edited by flyingboz

Reading the help file before you post... Not only will it make you look smarter, it will make you smarter.

OK, great info from both of you. Here are a few more details:

The file is usually 30 - 80 pages long (if you were to print it).

I think from the last post by flyingboz, you are explaining things that are beyond my reach. This is a simple thing that I could do with grep, but I have to accomplish it in Windows. (Does grep come for Windows?)

Speed isn't really a big factor. It doesn't have to be fast; the output will be going straight to a network drive, appending the data pulled out of the original text file to various files.
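The "pull data out and append it to various files" step described above might look roughly like this. It is a Python sketch rather than AutoIt, and the keyword-to-filename rules, the function name route_lines, and the output directory are all assumptions, since the thread does not say how the lines should be sorted.

```python
import os


def route_lines(lines, rules, out_dir):
    """Append each line to the file of every rule whose keyword it contains.

    `rules` maps a keyword to an output filename (hypothetical example data).
    Returns how many lines were appended to each file.
    """
    counts = {}
    for line in lines:
        for keyword, filename in rules.items():
            if keyword in line:
                # Mode "a" appends, so existing file contents are kept.
                target = os.path.join(out_dir, filename)
                with open(target, "a", encoding="utf-8") as dst:
                    dst.write(line if line.endswith("\n") else line + "\n")
                counts[filename] = counts.get(filename, 0) + 1
    return counts
```

Reopening the target file for every matching line is slow but simple, which seems acceptable when speed is not a concern.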

Does this make sense? Also, what are UDFs?

#5 ·  Posted (edited)

"...also, what are UDFs?"

UDF stands for User Defined Function.

MsgBox ( flag, "title", "text" [, timeout] ) is a built-in AutoIt3 function.

A UDF might be what you see in this post:

http://www.autoitscript.com/forum/index.php?showtopic=22531

A UDF is built out of AutoIt3 functions to perform a particular task, hopefully the way the user wants it performed.

Clear as mud?

Edit: More info to read on the topic:

http://www.autoitscript.com/autoit3/scite/...F_Standards.htm

Edited by herewasplato

"(does grep come for windows?)"

The Windows equivalent (a poor cousin to grep, imo) is called find; from a command prompt, type find /? to see its options. If you like grep, google "grep windows binary"; there are many ports available. One of the popular ports is on SourceForge: UnxUtils, iirc.

"speed isn't a big factor"

Then the approach using the _FileCount...() UDFs will likely work OK for you, though it is inefficient and slow compared to other methodologies, because you have to read the file multiple times rather than in a single pass.
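For comparison, a single-pass split does not need to count the lines first: it starts a new output file every N lines as it reads. The sketch below is Python, not AutoIt, and the function name split_by_line_count and the ".001" style suffixes are assumptions for illustration.

```python
def split_by_line_count(path, lines_per_file):
    """Split `path` into numbered files of `lines_per_file` lines each, in one pass."""
    if lines_per_file < 1:
        raise ValueError("lines_per_file must be at least 1")
    out_paths = []
    dst = None
    try:
        with open(path, encoding="utf-8") as src:
            for index, line in enumerate(src):
                if index % lines_per_file == 0:
                    # Time to start the next output file.
                    if dst is not None:
                        dst.close()
                    out_path = f"{path}.{len(out_paths) + 1:03d}"
                    dst = open(out_path, "w", encoding="utf-8")
                    out_paths.append(out_path)
                dst.write(line)
    finally:
        if dst is not None:
            dst.close()
    return out_paths
```

The trade-off is that you choose the lines per file up front instead of the number of files.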

"I need to take a text file that is very long, and split it into several smaller text files. I have used autoit for smaller functions, but this is a little out of my league. Can anyone point me in the right direction?"

Some pseudocode:

Open text file in input mode
Loop until end of text file reached
     Read next line of the text file into a string variable
     Process the string variable
     Write whatever you need to whatever file you want
Go back to loop
Close all files
Exit program
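The pseudocode above translates almost line for line into a real script. Here is one possible rendering in Python (the thread itself is about AutoIt, so treat this only as an illustration of the loop). The name process_file and the process callback are hypothetical, and a single output file is used to keep it short.

```python
def process_file(src_path, dst_path, process):
    """Read `src_path` line by line, transform each line, and write the results.

    `process` takes one line (without its newline) and returns the text to
    write, or None to skip that line. Returns the number of lines written.
    """
    written = 0
    # The `with` block closes both files when the loop ends.
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:  # loop until end of file
            result = process(line.rstrip("\n"))
            if result is not None:
                dst.write(result + "\n")
                written += 1
    return written
```

For example, passing `lambda s: s.upper() if s else None` as `process` would copy every non-empty line in upper case.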
