Opened on Jan 29, 2012 at 7:47:10 PM
Closed on Sep 4, 2012 at 8:18:03 AM
#2117 closed Feature Request (Completed)
Improved _FileCountLines()
| Reported by: | Spiff59 | Owned by: | guinness |
|---|---|---|---|
| Milestone: | 3.3.9.5 | Component: | Standard UDFs |
| Version: | | Severity: | None |
| Keywords: | | Cc: | |
Description
The following version of _FileCountLines() is 60-70% faster than the production 3.3.8.0 version. Its main improvement is that it uses roughly half the memory of the current version, enabling it to scan larger files and making it less susceptible to a memory allocation error.
Additionally, the helpfile does not document the existing @error = 2 condition, which is returned when scanning a 0-byte file.
Test script included. Remove the (extra) leading underscore on the function name if you intend to place it in production.
Thanks
    #include <File.au3>

    $timer = TimerInit()
    ;$Lines = _FileCountLines(@ScriptDir & "\DataFile3.txt")
    $timer = TimerDiff($timer) / 1000
    $timer2 = TimerInit()
    $Lines2 = __FileCountLines(@ScriptDir & "\DataFile3.txt")
    $timer2 = TimerDiff($timer2) / 1000
    ;MsgBox(0,"", $Lines & " lines" & @CRLF & $timer & " seconds" & @CRLF & $Lines2 & " lines" & @CRLF & $timer2 & " seconds" )

    Func __FileCountLines($sFilePath)
        Local $hFile = FileOpen($sFilePath, $FO_READ)
        If $hFile = -1 Then Return SetError(1, 0, 0)
        Local $sFileContent = StringStripWS(FileRead($hFile), 2)
        FileClose($hFile)
        StringRegExpReplace($sFileContent, "\n", "") ; terminated with linefeed terminated
        If Not @extended Then StringRegExpReplace($sFileContent, "\r", "") ; terminated with carriage return
        If Not @extended Then
            If StringLen($sFileContent) Then
                Return 1 ; single-line file
            Else
                Return SetError(2, 0, 0) ; empty file
            EndIf
        EndIf
        Return @extended
    EndFunc ;==>_FileCountLines
Change History (7)
comment:2 by , on Jan 29, 2012 at 8:06:23 PM
Poop on me! I tested without all the frills in the production version (edits, error handling, etc.). When I added all the fluff after testing, I should have re-tested. The SRER (StringRegExpReplace) version does not require the StringStripWS() call; in fact, that call causes the function to report one line fewer than exists in the file. The StringStripWS() call needs to be removed.
This SRER version also does not suffer from the problem reported in tracker #1831.
It correctly reads beyond a line containing a NUL, and processes the test file from that ticket correctly.
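To illustrate the off-by-one described above, here is a minimal sketch (not part of the submitted function) that counts linefeeds through the replacement count StringRegExpReplace() leaves in @extended, once on the untouched text and once after StringStripWS() with flag 2 (strip trailing whitespace). The sample string and variable names are made up for the example.

```autoit
; Sample text: three lines, each ending in CRLF (illustrative data only).
Local $sText = "one" & @CRLF & "two" & @CRLF & "three" & @CRLF

; The return value is discarded; only the replacement count in @extended matters.
StringRegExpReplace($sText, "\n", "")
Local $iRaw = @extended ; copy right away, the next function call overwrites @extended

; Flag 2 strips trailing whitespace, which removes the final CRLF before counting.
StringRegExpReplace(StringStripWS($sText, 2), "\n", "")
Local $iStripped = @extended

ConsoleWrite("Untouched: " & $iRaw & ", after StringStripWS: " & $iStripped & @CRLF)
```

With the trailing terminator stripped, the second count should come out one lower than the first, which matches the discrepancy reported in this comment.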
comment:3 by , on Jan 29, 2012 at 8:50:51 PM
I guess this could have been a "Bug" rather than "Feature request".
Further testing shows the existing function rarely reports the same number of lines as a file opened in any editor (Notepad, SciTE, etc.), as it strips all trailing whitespace.
An extra test is required to properly handle EOF.
This is my final, fully-tested (really!) version...
    Func _FileCountLines($sFilePath)
        Local $hFile = FileOpen($sFilePath, $FO_READ)
        If $hFile = -1 Then Return SetError(1, 0, 0)
        Local $sFileContent = FileRead($hFile), $aTerminators[2] = ["\n", "\r"] ; linefeed, carriage return
        FileClose($hFile)
        Local $count = 0 ; line count; stays 0 if no terminator is found
        For $x = 0 to 1
            StringRegExpReplace($sFileContent, $aTerminators[$x], $aTerminators[$x])
            If @extended Then
                $count = @extended
                If StringRight($sFileContent, 1) <> $aTerminators[$x] Then $count += 1
                ExitLoop
            EndIf
        Next
        If Not $count Then
            If StringLen($sFileContent) Then
                $count = 1 ; single-line file
            Else
                Return SetError(2, 0, 0) ; 0-byte file
            EndIf
        EndIf
        Return $count
    EndFunc ;==>_FileCountLines
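For anyone calling the function, a rough usage sketch follows. It assumes the error codes discussed in this ticket (@error = 1 when the file cannot be opened, @error = 2 for a 0-byte file); the file name is just the test file from the script above, and the variable names are illustrative.

```autoit
#include <File.au3>

; Test file name borrowed from the ticket's test script; substitute your own.
Local $sPath = @ScriptDir & "\DataFile3.txt"

Local $iLines = _FileCountLines($sPath)
Local $iError = @error ; capture immediately, later function calls reset @error

Switch $iError
    Case 0
        ConsoleWrite($iLines & " lines" & @CRLF)
    Case 1
        ConsoleWrite("Could not open " & $sPath & @CRLF)
    Case 2
        ConsoleWrite("File is empty (0 bytes)" & @CRLF)
    Case Else
        ConsoleWrite("Unexpected @error value: " & $iError & @CRLF)
EndSwitch
```

Checking @error rather than the return value alone matters here because both failure cases return 0.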
comment:5 by , on Jan 31, 2012 at 6:53:31 AM
I would like to point out that jchd posted a modified version of Spiff59's proposal that removes all memory limitations here:
http://www.autoitscript.com/forum/topic/137024-modified-filecountlines/page__view__findpost__p__958820
comment:6 by , on Jun 3, 2012 at 3:58:32 PM
| Component: | AutoIt → Standard UDFs |
|---|---|
comment:7 by , on Sep 4, 2012 at 8:18:03 AM
| Milestone: | → 3.3.9.5 |
|---|---|
| Owner: | set to |
| Resolution: | → Completed |
| Status: | new → closed |
Changed by revision [7226] in version: 3.3.9.5

Do fix the "terminated with linefeed terminated" comment ;) I decided it sounded better to have "terminated" as the first word on the 2 comments, but failed to delete the trailing instance of the word from the end of the linefeed comment. Thanks.