AskThatDude

Splitting Multi-Page text file

8 posts in this topic

OK everyone, I am trying to develop a script that reads a multi-page text file containing headers. The file is formatted like this:

Some Company Cool Report Page 1

Batch: 2008 Employee Workload 4/21/2008

234234234234234234 Some Guy Working 20000008

245345345355453545 Some Girl Sleeping 59095093

What I want to do is read the txt file and split it into several txt files. I found the script below, which lets me split the file into many txt files, but is there a way to tell it to split based on a line?

Meaning, look at the top line of the file; that line would mark the start of each txt file. I can get it to work based on line count, but the count can vary. Any ideas?

Here is the code I am working with:

CODE
$lines_per_output_file = 55

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;generate a fake input file for this test
$junk = ""
For $i = 1 To 100
    $junk = $junk & $i & @CRLF
Next
$file = FileOpen("test.txt", 2)
FileWrite($file, $junk)
FileClose($file)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;read and split file into an array (element 0 holds the element count)
;reads the test file generated above; substitute your own file name here
$array_of_whole_file = StringSplit(FileRead("test.txt"), @CRLF, 1)
$filecnt = 1
$linecnt = 1
While 1
    ;open a numbered output file and keep its handle
    $outfile = FileOpen("test_" & $filecnt & ".txt", 2)
    ;write x number of lines to that file
    For $i = 1 To $lines_per_output_file
        FileWriteLine($outfile, $array_of_whole_file[$linecnt])
        $linecnt = $linecnt + 1
        ;the last element is the empty string after the final @CRLF, so stop there
        If $linecnt >= $array_of_whole_file[0] Then
            FileClose($outfile) ;FileClose needs the handle, not the file name
            Exit
        EndIf
    Next
    FileClose($outfile)
    $filecnt = $filecnt + 1
WEnd




Do you know any Regular Expressions? StringRegExp should do what you want, and return an array of the results.

Use regular-expressions.info to learn about it if you don't have any experience.


Regards, Josh


No, I don't think so, because the top line of each file has a page number, so the search will not be accurate.


But that is the beauty of regular expressions.

You can use one to match any number at the end of the line, so it is okay if the page number changes.

I'm a bit busy now, but maybe someone else could come up with an example pattern for you, or I will later today.


Regards, Josh


#5 ·  Posted (edited)

Here is a better example of the file I'm trying to process:

CODE
1REALLY COOL SERVICES FOR YOU A ABC -1234567890 SETDATE 07/16/08 PAGE 1

ABC -PRK163 CUSTOMER ACTIVITY RUNDATE 07/16/08

EDFES-0923435 RUNTIME 18:11:01

0 1200 SOME CRAZY STREET 123456 23000 BLUEBERRY RD. DETROITS 840 NY 481

0 CUSTOMERSS NUMBER TYPE DEBIT CREDIT FEES SW DATE TIME/ CARD INST/ SW TERM SEQ/ EXCEPTION/ REJECT

CUSTOME #/ACCOUNT # LOC DATE TIME TERM INST LOCAL TERM LOC SEQ D2/E1/PA/C4

------------------------------------------------------------------------------------------------------------------------------------

*** PREVIOUS SUSPENSE TOTALS CASH-IN .00 CASH-OUT .00 NET CASH .00

TOTAL DEBITS .00 TOTAL CREDITS .00 TOTAL MEMO .00

WITHDRAWALS .00- ENV DEPOSITS .00 TRANSFERS FROM .00- INQUIRIES .00

PAYMENTS FROM .00 PAYMENT ENCL .00 TRANSFERS TO .00 OTHER MEMO .00

.00 PAYMENTS TO .00 PURCHASE .00

CHECK DEPOSITS .00 CASH DEPOSITS .00

DEPOSIT CSHBK .00

CASHED CHECK .00

------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------------------

*** PREVIOUS SUSPENSE COUNTS

Basically anytime I see a line beginning with "1REALLY COOL SERVICES FOR YOU A" I want to start a new file. Is that possible? I'm having trouble understanding how to properly code that.

Edited by AskThatDude


#6 ·  Posted (edited)

I'm back. Could you post a file that you want to split into multiple files? Then post what you want the first file to contain.

EDIT: For the file you want to split, make sure it has enough content to be split into 2 or more files. The one above only has 1 page.

Edited by JFee

Regards, Josh


OK, starting with your example header: "Some Company Cool Report Page 1" ...

$data = FileReadLine($infile)
If StringRegExp($data, "^Some Company Cool Report Page \d+") Then
  ; close file for previous page, and start file for this page
Else
   FileWriteLine($curr_outfile, $data)
EndIf

The "^" at the beginning of the regular expression anchors the match to the beginning of the text line.

The "\d+" at the end of the regular expression matches a sequence of one or more digits. That makes the search work for "Page 3" and for "Page 123".
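
To show how that fragment might sit inside a full read loop, here is a rough sketch. The input name "report.txt", the "page_N.txt" output names, and the $pagecnt counter are made up for illustration, and unlike the fragment above it also writes the header line into the new file, on the assumption that each page file should start with its header:

```autoit
; Sketch only: "report.txt" and the "page_N.txt" names are assumptions.
$infile = FileOpen("report.txt", 0) ; 0 = read mode
If $infile = -1 Then Exit 1 ; could not open the input file

$curr_outfile = -1 ; no output file is open yet
$pagecnt = 0

While 1
    $data = FileReadLine($infile)
    If @error Then ExitLoop ; end of file (or a read error)

    If StringRegExp($data, "^Some Company Cool Report Page \d+") Then
        ; close file for previous page, and start file for this page
        If $curr_outfile <> -1 Then FileClose($curr_outfile)
        $pagecnt += 1
        $curr_outfile = FileOpen("page_" & $pagecnt & ".txt", 2) ; 2 = overwrite
    EndIf

    ; write every line, header included, to the current page file;
    ; anything before the first header is skipped
    If $curr_outfile <> -1 Then FileWriteLine($curr_outfile, $data)
WEnd

If $curr_outfile <> -1 Then FileClose($curr_outfile)
FileClose($infile)
```

For the second sample report, the same loop should work with the pattern swapped for "^1REALLY COOL SERVICES FOR YOU A", since every page there starts with that text.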

Hope that helps!


Also, if there will be multiple company names in one file, you can do this:

$data = FileReadLine($infile)
If StringRegExp($data, "^(.*) Page \d+") Then
 ; close file for previous page, and start file for this page
Else
   FileWriteLine($curr_outfile, $data)
EndIf

The (.*) I added matches any number of characters that aren't newlines: the . matches any single character except a newline, and the * means zero or more of them.


Regards, Josh
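
If you also want the text that the group matched, StringRegExp can return the captured groups as an array when you pass flag 1. A small sketch, where the sample line, the second group around \d+, and the ConsoleWrite output are just for illustration:

```autoit
; Sketch only: the sample line and the extra (\d+) group are assumptions.
$data = "Some Company Cool Report Page 12"

; flag 1 returns an array of the captured groups and sets @error if nothing matched
$match = StringRegExp($data, "^(.*) Page (\d+)", 1)
If @error Then
    ConsoleWrite("not a header line" & @CRLF)
Else
    ; $match[0] is the text before " Page ", $match[1] is the page number
    ConsoleWrite("title: " & $match[0] & ", page: " & $match[1] & @CRLF)
EndIf
```

Keep in mind that (.*) is greedy, so if " Page " followed by digits appears more than once on a line, the first group runs up to the last occurrence.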

