Modify

Opened 11 years ago

Closed 11 years ago

#198 closed Bug (Fixed)

_FileReadToArray

Reported by: Xenobiologist Owned by: Jpm
Milestone: 3.2.11.11 Component: Standard UDFs
Version: 3.2.11.5 Severity: Blocking
Keywords: _FileReadToArray Cc: Gary

Description

Hi,

this is no really big bug, but it would be nice if someone could explain that to me.

In the function _FileReadToArray there is this code

$aArray = StringSplit(StringStripCR(StringStripWS(FileRead($hFile, FileGetSize($sFilePath)), 2)), @LF)

This cuts off emtpy lines at the end of the file. Is this correct? Why not putting empty lines to the array?

Thanks for clarification.

Mega

Attachments (0)

Change History (12)

comment:1 Changed 11 years ago by Valik

Can I stab whoever wrote that line of code? There's no sense in nesting function calls like that. You don't gain much over:

$sTemp = FileRead($hFile)
$sTemp = StringStripWS($sTemp, 2)
$sTemp = StringStripCR($sTemp, 2)
$aArray = StringSplit($sTemp, @LF)

I could go either way on this, it probably should preserve the file contents as much as possible. However, trailing whitespace is not important since presumably this is going to be used to read a text file.

I will mention a bug with this function, though, it only supports Windows and Unix documents. Mac documents using only @CR will appear as a single line since the @CR will be stripped and the @LF won't match anything to split.

A better way of writing the code would be:

$sTemp = FileRead($hFile)
$sTemp = StringStripWS($sTemp, 2)
$aArray = StringSplit($sTemp, @CRLF) ; Try Windows @CRLF first
If @error Then $aArray = StringSplit($sTemp, @LF) ; Unix @LF is next most common
If @error Then $aArray = StringSplit($sTemp, @CR) ; Finally try Mac @CR

comment:2 follow-up: Changed 11 years ago by Jos

aahh, a case of "stab first then ask"?

Do you mean to use the whole @crlf as string to split on?
Shouldn't that be: $aArray = StringSplit($sTemp, @CRLF,1) ; Try Windows @CRLF first

At the time This UDF was written the 3rd parameter wasn't there.

To answer the OP's question with another question: Should a file that ends with @CRLF be processed with a final Empty record ?

Jos

comment:3 Changed 11 years ago by Xenobiologist

Hi,

I come from here:

#include<File.au3>
#include<Array.au3>

_setLineNr(@ScriptDir & '\test.txt')

Func _setLineNr($path)
	If Not FileExists($path) Then Return SetError(1, 0, -1)
	Local $lines_A
	If Not _FileReadToArray($path, $lines_A) Then SetError(2, 0, -1)
	Local $count = StringLen(_FileCountLines($path))
	For $i = 1 To UBound($lines_A) - 1
		$lines_A[$i] = StringFormat('%0.' & $count & 'd', $i) & ': ' & $lines_A[$i]
	Next
	If _FileWriteFromArray($path, $lines_A, 1) Then Return 1
	Return SetError(3, 0, -1)
EndFunc   ;==>_setLineNr

I wondered why _FileCountLines counts all the empty lines and _FileReadToArray doesn't. That is why I thought I put a question here to clarify it.

comment:4 in reply to: ↑ 2 ; follow-up: Changed 11 years ago by Valik

Replying to Jos:

aahh, a case of "stab first then ask"?

Always.

Do you mean to use the whole @crlf as string to split on?
Shouldn't that be: $aArray = StringSplit($sTemp, @CRLF,1) ; Try Windows @CRLF first

Yes, that's what I meant (dry coded).

At the time This UDF was written the 3rd parameter wasn't there.

Fair enough, though it still doesn't explain all the nested function calls.

To answer the OP's question with another question: Should a file that ends with @CRLF be processed with a final Empty record ?

I expect to be able to read a @CRLF file with _FileReadToArray() and then to write it with _FileWriteFromArray() and produce an identical file. It doesn't seem useful to me to strip trailing whitespace on read (that's an operation that should take place at write-time).

comment:5 in reply to: ↑ 4 Changed 11 years ago by Xenobiologist

Replying to Valik:

To answer the OP's question with another question: Should a file that ends with @CRLF be processed with a final Empty record ?

I expect to be able to read a @CRLF file with _FileReadToArray() and then to write it with _FileWriteFromArray() and produce an identical file. It doesn't seem useful to me to strip trailing whitespace on read (that's an operation that should take place at write-time).

Yes, I agree. That is what I meant. Should be the same.

comment:6 Changed 11 years ago by Valik

  • Severity set to Blocking

comment:7 follow-up: Changed 11 years ago by Valik

My vote is we change this function to stop stripping trailing whitespace. It seems to me there is no real gain in doing this as cleaning up whitespace should be done at write-time, not at read time. Optionally, the code could be modified as I suggested in the first comment so that it will support files using any of the 3 possible line-ending styles.

comment:8 in reply to: ↑ 7 ; follow-up: Changed 11 years ago by jpm

Replying to Valik:

My vote is we change this function to stop stripping trailing whitespace. It seems to me there is no real gain in doing this as cleaning up whitespace should be done at write-time, not at read time. Optionally, the code could be modified as I suggested in the first comment so that it will support files using any of the 3 possible line-ending styles.

The proposed modification does not work for windows, extra empty lines

comment:9 in reply to: ↑ 8 ; follow-up: Changed 11 years ago by Valik

Replying to jpm:

Replying to Valik:

My vote is we change this function to stop stripping trailing whitespace. It seems to me there is no real gain in doing this as cleaning up whitespace should be done at write-time, not at read time. Optionally, the code could be modified as I suggested in the first comment so that it will support files using any of the 3 possible line-ending styles.

The proposed modification does not work for windows, extra empty lines

If you copied what I wrote verbatim, that's expected, as Gary pointed out, I didn't specify the flag to use multi-character delimiters.

comment:10 in reply to: ↑ 9 Changed 11 years ago by Valik

Replying to Valik:

Replying to jpm:

Replying to Valik:

My vote is we change this function to stop stripping trailing whitespace. It seems to me there is no real gain in doing this as cleaning up whitespace should be done at write-time, not at read time. Optionally, the code could be modified as I suggested in the first comment so that it will support files using any of the 3 possible line-ending styles.

The proposed modification does not work for windows, extra empty lines

If you copied what I wrote verbatim, that's expected, as Gary pointed out, I didn't specify the flag to use multi-character delimiters.

Err, it was Jos that pointed it out, but anyway, yeah, the code I posted wasn't quite correct.

comment:11 Changed 11 years ago by Jpm

In fact an identical file is not produced when the last "line" has no line separator _FileWriteFromArray will always add one.

comment:12 Changed 11 years ago by Jpm

  • Milestone set to 3.2.11.11
  • Owner changed from Gary to Jpm
  • Resolution set to Fixed
  • Status changed from new to closed

Fixed in version: 3.2.11.11

Guidelines for posting comments:

  • You cannot re-open a ticket but you may still leave a comment if you have additional information to add.
  • In-depth discussions should take place on the forum.

For more information see the full version of the ticket guidelines here.

Add Comment

Modify Ticket

Action
as closed The owner will remain Jpm.
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.