lbsl Posted May 9, 2012 (edited)

Good day folks,

I'm having problems reading log files with the _FileReadToArray() function, so I'm forced to use FileReadLine() instead. _FileReadToArray() doesn't work because it stops reading the file if it encounters a null character as the first character on a line. (_FileCountLines() fails as well, for that matter.) Unfortunately, the log files I am trying to read contain many of these.

Is there any particular way to crank up the performance of FileReadLine()?

Also, is there any possibility that a future build of AutoIt would use a LOF method on files, instead of trying to estimate the last line in a file from specific character codes, for _FileReadToArray()? Perhaps allow two reading modes, raw and plain: in plain mode, special character codes are replaced by symbol tags like [null], and in raw mode only LF and CR are used to divide the content into array elements. (I can understand that the null character probably has to be tagged anyway, as the array is likely null-terminated as well?)

Regards, Vince.

lbsl Posted May 9, 2012 (edited)

I tried to adjust _FileReadToArray() myself, but I found that something goes wrong with the contents after reading the whole file into a buffer with FileRead(). If I use ConsoleWrite(), it also stops printing the rest of the buffer as soon as it encounters the first null character on the line.

Spiff59 Posted May 9, 2012

Check out this thread:

lbsl Posted May 9, 2012

Thanks for the link. (I found so many references about _FileReadToArray() not working, but this one was the needle in the haystack.) Meanwhile I made a minor adjustment to replace the null characters. It works, but I will see whether the binary functions perform faster.

Func _FileReadToArray($sFilePath, ByRef $aArray)
    Local $hFile = FileOpen($sFilePath, $FO_READ)
    If $hFile = -1 Then Return SetError(1, 0, 0) ;; unable to open the file
    ;; Read the file and remove any trailing white spaces
    Local $tbuffer = FileRead($hFile, FileGetSize($sFilePath))
    Local $aFile = ""
    ;~ $aFile = StringStripWS($aFile, 2)
    ; Copy the buffer character by character, tagging every null character
    For $x = 1 To FileGetSize($sFilePath)
        If Asc(StringMid($tbuffer, $x, 1)) > 0 Then
            $aFile = $aFile & StringMid($tbuffer, $x, 1)
        Else
            $aFile = $aFile & "[null]"
        EndIf
    Next
    ; remove last line separator if any at the end of the file
    If StringRight($aFile, 1) = @LF Then $aFile = StringTrimRight($aFile, 1)
    If StringRight($aFile, 1) = @CR Then $aFile = StringTrimRight($aFile, 1)
    FileClose($hFile)
    If StringInStr($aFile, @LF) Then
        $aArray = StringSplit(StringStripCR($aFile), @LF)
    ElseIf StringInStr($aFile, @CR) Then ;; @LF does not exist so split on the @CR
        $aArray = StringSplit($aFile, @CR)
    Else ;; unable to split the file
        If StringLen($aFile) Then
            Dim $aArray[2] = [1, $aFile]
        Else
            Return SetError(2, 0, 0)
        EndIf
    EndIf
    Return 1
EndFunc   ;==>_FileReadToArray

Zedna Posted May 9, 2012

#include <WinAPI.au3>

Global $sFile, $hFile, $sText, $nBytes, $tBuffer

$sFile = @ScriptDir & '\test.txt'

; read 100 bytes from end of file
$tBuffer = DllStructCreate("byte[100]")
$hFile = _WinAPI_CreateFile($sFile, 2, 2) ; open existing file for reading
_WinAPI_SetFilePointer($hFile, -100, 2) ; 2 = seek relative to end of file
_WinAPI_ReadFile($hFile, DllStructGetPtr($tBuffer), 100, $nBytes)
_WinAPI_CloseHandle($hFile)

$sText = BinaryToString(DllStructGetData($tBuffer, 1))
$sText = StringReplace($sText, Chr(0), '<NULL>')
ConsoleWrite($sText)

Spiff59 Posted May 9, 2012

You could try replacing the entire For/Next loop with:

StringReplace($tbuffer, Chr(0), "[null]")

Processing the entire file in one pass ought to be considerably faster.

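
The single-pass replacement suggested above can be sketched as a small standalone helper. This is only an illustration under stated assumptions: the function name _ReadWithNulTags is hypothetical, the file is opened in binary mode so that FileRead() returns the raw bytes, and the $FO_* constants are taken from FileConstants.au3 (depending on the AutoIt version they may need a different include).

```autoit
#include <FileConstants.au3>

; Hypothetical helper, not part of the thread's code:
; reads a whole file and replaces every null character with a "[null]" tag.
Func _ReadWithNulTags($sFilePath)
    ; Binary mode is an assumption here; it keeps FileRead() from interpreting the bytes as text
    Local $hFile = FileOpen($sFilePath, $FO_READ + $FO_BINARY)
    If $hFile = -1 Then Return SetError(1, 0, "") ; unable to open the file

    Local $dData = FileRead($hFile) ; no count given: read to the end of the file
    FileClose($hFile)

    ; Convert the binary data to a string, then tag all nulls in one StringReplace() call
    Local $sText = BinaryToString($dData)
    Return StringReplace($sText, Chr(0), "[null]")
EndFunc   ;==>_ReadWithNulTags
```

The point of the design is that the per-character work happens inside one built-in call instead of an interpreted For/Next loop with a string concatenation per byte.
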
Zedna Posted May 9, 2012 (edited)

Best optimized way to use FileReadLine?

The best optimized way is to not use FileReadLine.

lbsl Posted May 9, 2012

Zedna wrote: [WinAPI snippet from the post above, reading the last 100 bytes of the file]

This looks quite fast... I have an 8 MB log file to feed it, which should at least give me a noticeable difference.

Spiff59 wrote: "You could try replacing the entire For/Next loop with: StringReplace($tbuffer, Chr(0), "[null]") Processing the entire file in one pass ought to be considerably faster."

Yes, you are right about that one. I did just that, and also cleaned up some code in the _FileReadToArray function by simply replacing all CRLF and CR combinations with an LF. I didn't understand why CR and LF were both filtered separately if either of them would be used to split lines.

At least this works so far:

Func _FileReadToArray($sFilePath, ByRef $aArray)
    Local $hFile = FileOpen($sFilePath, $FO_READ)
    If $hFile = -1 Then Return SetError(1, 0, 0) ;; unable to open the file
    ;; Read the file and remove any trailing white spaces
    Local $tbuffer = FileRead($hFile, FileGetSize($sFilePath))
    FileClose($hFile)
    ; Tag null characters, then normalise CRLF and lone CR to LF
    Local $aFile = StringReplace(BinaryToString($tbuffer), Chr(0), "[nul]")
    $aFile = StringReplace(BinaryToString($aFile), Chr(13)&Chr(10), Chr(10))
    $aFile = StringReplace(BinaryToString($aFile), Chr(13), Chr(10))
    If StringRight($aFile, 1) = @LF Then $aFile = StringTrimRight($aFile, 1)
    If StringInStr($aFile, @LF) Then
        $aArray = StringSplit($aFile, @LF)
    Else ;; unable to split the file
        If StringLen($aFile) Then
            Dim $aArray[2] = [1, $aFile]
        Else
            Return SetError(2, 0, 0)
        EndIf
    EndIf
    Return 1
EndFunc   ;==>_FileReadToArray

I also attempted to add filters for [eth][stx][etx] etc., but that seemed a bit too much for it to process.

BrewManNH Posted May 9, 2012

Some OS's use CRLF, some use LF, some use CR; if you only split on one of them, you can't split the lines correctly. Also, you're tripling the time needed to process the file by replacing the NUL with "" and also replacing the CR and CRLF with LF.

Melba23 (Moderators) Posted May 9, 2012

lbsl,

You might be interested in this little SRE which forces all line endings (whether @CR, @LF or @CRLF) into @CRLF to ensure you can split the lines correctly. It works even if there is a mixture of endings within the same file:

$sText = StringRegExpReplace($sText, "((?<!\x0d)\x0a|\x0d(?!\x0a))", @CRLF)

M23

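
A short sketch of how that pattern might be combined with StringSplit(), using a made-up sample string with mixed endings. The sample text and variable names are illustrative assumptions. The detail worth noting is the third StringSplit() argument: with flag 1 the whole @CRLF string is treated as a single delimiter, whereas the default treats each character of the delimiter string as a separate delimiter.

```autoit
; Sample text with mixed line endings: LF, CR and CRLF (illustrative data only)
Local $sText = "one" & @LF & "two" & @CR & "three" & @CRLF & "four"

; Lone LF or lone CR becomes CRLF; existing CRLF pairs are left untouched
$sText = StringRegExpReplace($sText, "((?<!\x0d)\x0a|\x0d(?!\x0a))", @CRLF)

; Flag 1: split on the entire @CRLF string, not on @CR and @LF individually
Local $aLines = StringSplit($sText, @CRLF, 1)

; $aLines[0] holds the element count; print each line with its index
For $i = 1 To $aLines[0]
    ConsoleWrite($i & ": " & $aLines[$i] & @CRLF)
Next
```

Without flag 1, every CRLF pair would be split twice and produce an empty element between lines, which is one possible way to end up with results that differ from a plain split on @LF.
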
Zedna Posted May 9, 2012 (edited)

lbsl wrote: "This looks quite fast... I have some 8MB log file to feast it on, should at least give me a noticable difference."

My solution is REALLY very fast because it reads only the few desired bytes from the end of the file (and not the whole file from the beginning), so its speed doesn't depend on the size of the file.

EDIT: You can also use StringSplit() on the result ($sText) to get these last few rows in an array.

Spiff59 Posted May 9, 2012

To throw in my two cents on your latest posted version... The $aFile variable returned from your first StringReplace() call would be a string type, so the BinaryToString($aFile) calls in the next 2 lines could be yanked.

lbsl Posted May 9, 2012

BrewManNH wrote: "Some OS's use CRLF, some use LF, some use CR, if you only split on one of them, you can't split the lines correctly. Also, you're tripling the time needed to process the file by replacing the NUL with "" and also replacing the CR and CRLF with LF."

That is true, but I don't notice an immense speed drop from this. I had more performance problems splitting the log contents out to two RichEdit controls. It looks like concatenating results into a string stays fast only up to a certain number of lines; after that you have to append the contents to the RichEdit box and clear the string before refilling it.

Melba23 wrote: "You might be interested in this little SRE which forces all line endings (whether @CR, @LF or @CRLF) into @CRLF to ensure you can split the lines correctly - it works even if there is a mixture of endings within the same file: $sText = StringRegExpReplace($sText, "((?<!\x0d)\x0a|\x0d(?!\x0a))", @CRLF)"

Thanks for the snippet. The results were, however, different from those of the double StringReplace() lines (I did replace the @LF filter with @CRLF in the string matching lines below the replace lines). I also think I read something particular about comparing two characters with StringRegExp in general.

Zedna wrote: "My solution is REALLY very fast because it reads only few desired bytes from end of file (and not whole file from beginning) so its speed doesn't depend on size of file. EDIT: You can also use StringSplit() on result ($sText) to get these few last rows in array ..."

It's not about reading the last few lines; I need to read the whole file anyway. The problem is content going missing because of the null characters.

Spiff59 wrote: "To throw in my two cents on your latest posted version... The $afile variable returned from your first StringReplace() call would be a string type, so the BinaryToString($afile) calls in the next 2 lines could be yanked."

Thanks for the pennies, I applied the change.

lbsl Posted May 10, 2012 (edited)

Zedna wrote: [WinAPI snippet from the earlier post, reading the last 100 bytes of the file]

I have been toying with this snippet. It works great on files that aren't opened by any other program, but it fails if the file is opened by another program, regardless of whether that program opened it in shared read mode. It would have been very handy for quickly picking up changes made to the file being read. I don't get any errors, but $nBytes always returns 0, and I have no idea why no error is reported in this case. I'm going to fool around with FileSetPos() (as FileRead() does work).

Zedna Posted May 10, 2012 (edited)

Use the Share parameter in _WinAPI_CreateFile():

$hFile = _WinAPI_CreateFile($sFile, 2, 2, 7) ; share for READ+WRITE+DELETE

lbsl Posted May 11, 2012

Zedna wrote: "Use Share parameter in _WinAPI_CreateFile() $hFile = _WinAPI_CreateFile($sFile, 2, 2, 7) ; share for READ+WRITE+DELETE"

Yes, I have tried that, no luck; the application that has it open has surely locked it for write access (unfortunately I did not write that application). I even looked into working with the overlapped structure, but that doesn't seem to be of use if no bytes are ever read in the first place.

Based on all the advice above, this is the final modified _FileReadToArray, which allows reading from any arbitrary position in the file and includes the null filtering.

; #FUNCTION# ====================================================================================================================
; Name...........: _FileReadToArray
; Description ...: Reads the specified file into an array.
; Syntax.........: _FileReadToArray($sFilePath, ByRef $aArray, ByRef $offset)
; Parameters ....: $sFilePath - Path and filename of the file to be read.
;                  $aArray    - The array to store the contents of the file.
;                  $offset    - The file position to start reading from (by default 0; always returns the last measured file size)
; Return values .: Success - Returns a 1
;                  Failure - Returns a 0
;                  @Error  - 0 = No error.
;                  |1 = Error opening specified file
;                  |2 = Unable to Split the file
; Author ........: Jonathan Bennett <jon at hiddensoft dot com>, Valik - Support Windows Unix and Mac line separator
; Modified.......: Lbsl - added loading from offset for reading live modified files.
; Remarks .......: $aArray[0] will contain the number of records read into the array.
; Related .......: _FileWriteFromArray
; Link ..........:
; Example .......: Yes
; ===============================================================================================================================
Func _FileReadToArray($sFilePath, ByRef $aArray, ByRef $offset)
    Local $hFile = FileOpen($sFilePath, $FO_READ)
    Local $set_offset_first = False
    If $hFile = -1 Then Return SetError(1, 0, 0) ;; unable to open the file
    ;; Read the file and remove any trailing white spaces
    Local $fSize = FileGetSize($sFilePath)
    FileSetPos($hFile, $offset, $FILE_BEGIN)
    ; When fetching all contents for the first time, we know we are going to read
    ; the $fSize amount of bytes, therefore we set the offset to the current end
    ; of the file for the next pass, if the user wants to continue from that position
    If $offset == 0 Then
        $offset = $fSize
        $set_offset_first = True
    EndIf
    Local $tbuffer = FileRead($hFile, FileGetSize($sFilePath))
    FileClose($hFile)
    ; On a later pass, however, the file may have grown beyond $fSize, so the
    ; offset is set to the current end of the file only after the bytes have been
    ; read and the file has been closed. This is also why the amount to read is not
    ; a fixed number of bytes: you can't tell it for a file that is being updated live.
    If $set_offset_first == False Then
        $offset = FileGetSize($sFilePath)
    EndIf
    Local $aFile = StringReplace(BinaryToString($tbuffer), Chr(0), "[nul]")
    If StringRight($aFile, 1) = @LF Then $aFile = StringTrimRight($aFile, 1)
    If StringRight($aFile, 1) = @CR Then $aFile = StringTrimRight($aFile, 1)
    If StringInStr($aFile, @LF) Then
        $aArray = StringSplit(StringStripCR($aFile), @LF)
    ElseIf StringInStr($aFile, @CR) Then ;; @LF does not exist so split on the @CR
        $aArray = StringSplit($aFile, @CR)
    Else ;; unable to split the file
        If StringLen($aFile) Then
            Dim $aArray[2] = [1, $aFile]
        Else
            Return SetError(2, 0, 0)
        EndIf
    EndIf
    Return 1 ; success, as documented in the header
EndFunc   ;==>_FileReadToArray

Here's the test snippet:

#include <File.au3>

Local $sFilePath = @ScriptDir & "\test.txt"
Local $last_updated = FileGetTime($sFilePath, 0, 1)
Local $sFileCurrentPosition = 0
Local $sLines = 0
Dim $sFileContent = 0

ConsoleWrite('Initial offset:' & $sFileCurrentPosition & @CRLF)
_FileReadToArray($sFilePath, $sFileContent, $sFileCurrentPosition)
If @error Or UBound($sFileContent) < 1 Then
    If UBound($sFileContent) < 1 Then
        ConsoleWrite('Array content empty')
    Else
        ConsoleWrite('Error occurred')
    EndIf
    Exit
EndIf

Dim $sDisplayText[UBound($sFileContent)]
ConsoleWrite('File-> [' & $sFilePath & "]" & @CRLF)
ConsoleWrite("---------------------------------------------------------------------" & @CRLF)
ConsoleWrite("--------------------------Unaltered content--------------------------" & @CRLF)
ConsoleWrite("---------------------------------------------------------------------" & @CRLF)
For $x = 0 To UBound($sFileContent) - 1
    $sDisplayText[$x] = $sFileContent[$x]
    If $x > 0 Then
        ConsoleWrite($sFileContent[$x] & @CRLF)
    EndIf
Next
$sLines = $sFileContent[0]

While 1
    If FileGetTime($sFilePath, 0, 1) > $last_updated Then
        $last_updated = FileGetTime($sFilePath, 0, 1)
        Dim $snwFileContent = 0
        ConsoleWrite("---------------------------------------------------------------------" & @CRLF)
        ConsoleWrite("----------------------File modification detected---------------------" & @CRLF)
        If _FileCountLines($sFilePath) <= $sLines Then
            ; File has no added content, read it again
            $sLines = _FileCountLines($sFilePath)
            $sFileCurrentPosition = 0
            ConsoleWrite("----------------------------Lines modified---------------------------" & @CRLF)
            ConsoleWrite("---------------------------------------------------------------------" & @CRLF)
        Else
            ConsoleWrite("-----------------------------Lines added-----------------------------" & @CRLF)
            ConsoleWrite("---------------------------------------------------------------------" & @CRLF)
        EndIf
        ConsoleWrite('going to read offset:' & $sFileCurrentPosition & @CRLF)
        _FileReadToArray($sFilePath, $snwFileContent, $sFileCurrentPosition)
        ConsoleWrite('next offset:' & $sFileCurrentPosition & @CRLF)
        ConsoleWrite("------------------------------Read lines-----------------------------" & @CRLF)
        If UBound($snwFileContent) > 0 Then
            ReDim $sDisplayText[$sDisplayText[0] + $snwFileContent[0] + 1]
            For $x = 1 To $snwFileContent[0]
                ConsoleWrite($snwFileContent[$x] & @CRLF)
                $sDisplayText[$sDisplayText[0] + $x] = $snwFileContent[$x]
            Next
        Else
            ConsoleWrite('File has no content or inaccessible?' & @CRLF)
        EndIf
    EndIf
WEnd

Just fool around with test.txt in Notepad and save it after each change. Or simply let it loose on a datalogger.