Opened 12 years ago

Closed 12 years ago

#2117 closed Feature Request (Completed)

Improved _FileCountLines()

Reported by: Spiff59
Owned by: guinness
Milestone: 3.3.9.5
Component: Standard UDFs
Version:
Severity: None
Keywords:
Cc:

Description

The following version of _FileCountLines() is 60-70% faster than the production 3.3.8.0 version. Its main improvement is that it uses roughly half the memory of the current version, which lets it scan larger files and makes it less susceptible to memory allocation errors.

Additionally, the helpfile does not document the existing @error = 2 condition, which is returned when scanning a 0-byte file.

Test script included. Remove the (extra) leading underscore from the function name if you intend to place it in production.
Thanks

#include <File.au3>

$timer = TimerInit()
;$Lines = _FileCountLines(@ScriptDir & "\DataFile3.txt")
$timer = TimerDiff($timer) / 1000
$timer2 = TimerInit()
$Lines2 = __FileCountLines(@ScriptDir & "\DataFile3.txt")
$timer2 = TimerDiff($timer2) / 1000
;MsgBox(0,"", $Lines & " lines" & @CRLF & $timer & " seconds" & @CRLF & $Lines2 & " lines" & @CRLF & $timer2 & " seconds" )


Func __FileCountLines($sFilePath)
	Local $hFile = FileOpen($sFilePath, $FO_READ)
	If $hFile = -1 Then Return SetError(1, 0, 0)
	Local $sFileContent = StringStripWS(FileRead($hFile), 2)
	FileClose($hFile)
	StringRegExpReplace($sFileContent, "\n", "") ; terminated with linefeed terminated
	If Not @extended Then StringRegExpReplace($sFileContent, "\r", "") ; terminated with carriage return
	If Not @extended Then
		If StringLen($sFileContent) Then
			Return 1 ; single-line file
		Else
			Return SetError(2, 0, 0) ; empty file
		EndIf
	EndIf
	Return @extended
EndFunc   ;==>__FileCountLines

Attachments (0)

Change History (7)

comment:1 Changed 12 years ago by anonymous

Do fix the "terminated with linefeed terminated" comment ;) I decided it sounded better to have "terminated" as the first word on the 2 comments, but failed to delete the trailing instance of the word from the end of the linefeed comment. Thanks.

comment:2 Changed 12 years ago by anonymous

Poop on me! I tested without all the frills in the production version (edits, error handling, etc). When I added all the fluff after testing, I should have re-tested. The SRER version of this does not require the StringStripWS() call; in fact, that call causes the function to report one fewer line than exists in the file. The StringStripWS() call needs to be removed.

This SRER version also does not suffer from the problem reported in tracker #1831.
It correctly reads beyond a line containing a NUL, and processes the test file from that ticket correctly.

comment:3 Changed 12 years ago by Spiff59

I guess this could have been a "Bug" rather than "Feature request".
Further testing shows the existing function rarely reports the same number of lines as a file opened in any editor (Notepad, SciTE, etc.), because it strips all trailing whitespace.

An extra test is required to properly handle EOF.
This is my final, fully-tested (really!) version...

Func _FileCountLines($sFilePath)
	Local $hFile = FileOpen($sFilePath, $FO_READ)
	If $hFile = -1 Then Return SetError(1, 0, 0)
	Local $sFileContent = FileRead($hFile), $aTerminators[2] = ["\n", "\r"] ; linefeed, carriage return
	FileClose($hFile)
	Local $count = 0 ; declared before the loop so it exists even when no terminator is found
	For $x = 0 To 1
		StringRegExpReplace($sFileContent, $aTerminators[$x], $aTerminators[$x])
		If @extended Then
			$count = @extended
			If StringRight($sFileContent, 1) <> $aTerminators[$x] Then $count += 1
			ExitLoop
		EndIf
	Next
	If Not $count Then
		If StringLen($sFileContent) Then
			$count = 1 ; single-line file
		Else
			Return SetError(2, 0, 0) ; 0-byte file
		EndIf
	EndIf
	Return $count
EndFunc   ;==>_FileCountLines

comment:4 Changed 12 years ago by TicketCleanup

  • Version 3.3.8.0 deleted

Automatic ticket cleanup.

comment:5 Changed 12 years ago by Beege

I would like to point out that jchd posted a modified version of Spiff59's proposal that removes all memory limitations:
http://www.autoitscript.com/forum/topic/137024-modified-filecountlines/page__view__findpost__p__958820
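For readers who cannot follow the link, one common way to avoid holding the whole file in memory is to read it in fixed-size chunks and count linefeeds per chunk. The sketch below only illustrates that general idea. It is not jchd's code, the function name _CountLinesChunked and the $iChunk parameter are hypothetical, and it assumes lines end in LF or CRLF (CR-only files are not handled).

```autoit
#include <File.au3>

; Hypothetical helper: counts lines by reading the file in chunks of $iChunk characters.
; Assumption: lines are terminated by LF or CRLF.
Func _CountLinesChunked($sFilePath, $iChunk = 1048576)
	Local $hFile = FileOpen($sFilePath, $FO_READ)
	If $hFile = -1 Then Return SetError(1, 0, 0)
	Local $iCount = 0, $sChunk, $sLast = "", $iErr
	While 1
		$sChunk = FileRead($hFile, $iChunk)
		$iErr = @error ; capture immediately, later calls reset @error
		If StringLen($sChunk) Then
			StringReplace($sChunk, @LF, @LF) ; result discarded, @extended = number of linefeeds
			$iCount += @extended
			$sLast = StringRight($sChunk, 1)
		EndIf
		If $iErr Then ExitLoop ; end of file (or read error)
	WEnd
	FileClose($hFile)
	If $sLast = "" Then Return SetError(2, 0, 0) ; 0-byte file
	If $sLast <> @LF Then $iCount += 1 ; final line without a trailing linefeed
	Return $iCount
EndFunc   ;==>_CountLinesChunked
```

Memory use is bounded by the chunk size rather than the file size, at the cost of one FileRead() call per chunk.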

comment:6 Changed 12 years ago by trancexx

  • Component changed from AutoIt to Standard UDFs

comment:7 Changed 12 years ago by guinness

  • Milestone set to 3.3.9.5
  • Owner set to guinness
  • Resolution set to Completed
  • Status changed from new to closed

Changed by revision [7226] in version: 3.3.9.5
