singbass Posted October 9, 2008

I have a simple text file that is 300,000+ lines long and it is double spaced. All I want to do is reformat it and remove all the blank lines. I could do it by opening the file, reading it line by line, and writing only the non-blank lines out to another file. However, I thought _ReplaceStringInFile would work if I replaced every @CRLF & @CRLF with a single @CRLF. Unfortunately, it doesn't. I have tried using Chr(13) & Chr(10) to match the hex 0D0A in the file, but that doesn't work either. Even if I just try to replace @CRLF with something like "X", the function returns 0, which suggests to me that nothing was found to replace. Do I somehow need to escape the @CRLF in the _ReplaceStringInFile call in order to find it in my text file?

Pain Posted October 9, 2008

Use FileRead, then StringStripCR() or StringStripWS(), then FileWrite to save it.

singbass Posted October 9, 2008 (Author)

If I have to FileRead and FileWrite, then I also have to clean up after myself and delete the temporary file I would have to create. Since the function is available and works for replacing other characters, I just thought it would be more elegant.

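For reference, the line-by-line approach described above could look something like the sketch below. It is only an illustration: the function name _CopyNonBlankLines and the ".tmp" file name are made up for this example, and it assumes the file fits the default text read mode. It writes the non-blank lines to a temporary file and then moves that file over the original, so no separate cleanup step is needed.

```autoit
; Hypothetical helper (not part of any UDF): copy the non-blank lines of
; $sIn to $sTmp, then move $sTmp over $sIn. Returns True on success.
Func _CopyNonBlankLines($sIn, $sTmp)
    Local $hIn = FileOpen($sIn, 0) ; 0 = read mode
    If $hIn = -1 Then Return False
    Local $hOut = FileOpen($sTmp, 2) ; 2 = write mode (erase previous contents)
    If $hOut = -1 Then
        FileClose($hIn)
        Return False
    EndIf
    Local $sLine
    While 1
        $sLine = FileReadLine($hIn)
        If @error Then ExitLoop ; @error is non-zero at end of file or on a read error
        ; Flag 8 strips all white space, so a line of only spaces/tabs counts as blank
        If StringLen(StringStripWS($sLine, 8)) > 0 Then FileWriteLine($hOut, $sLine)
    WEnd
    FileClose($hIn)
    FileClose($hOut)
    ; Flag 1 = overwrite the destination, which also removes the temporary file
    Return FileMove($sTmp, $sIn, 1) = 1
EndFunc

; Example call; both file names are placeholders
_CopyNonBlankLines("testfile.txt", "testfile.txt.tmp")
```

Because FileMove replaces the original in one call, the temporary file does not have to be deleted by hand afterwards.
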
enaiman Posted October 9, 2008

StringStripCR and StringStripWS will remove all of them, leaving a huge one-line file. I'm afraid that won't solve this. First you need to see which character you have at the end of each line: it might be @LF, @CR, or @CRLF. Copy and paste a few lines into a new text file, replace @CR with "@CR" and @LF with "@LF" in that file (notice the quotes), then open the file and see what is at the end of each line. Once that is clear, you will know what you need to replace.

SNMP_UDF ... for SNMPv1 and v2c so far, GetBulk and a new example script
wannabe "Unbeatable" Tic-Tac-Toe
Paper-Scissor-Rock ... try to beat it anyway :)

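The check suggested above can also be done on a string in memory rather than on a copy of the file. The sketch below is one way to do it; the function name _ShowLineEnds and the "<CR>"/"<LF>" markers are invented for this example, and "testfile.txt" is just the file name used elsewhere in this thread.

```autoit
; Hypothetical helper: make the line-ending characters in a string visible.
Func _ShowLineEnds($sText)
    ; Mark every carriage return, then every line feed. The line feed itself
    ; is kept after its marker so the output still breaks at the same places.
    $sText = StringReplace($sText, @CR, "<CR>")
    $sText = StringReplace($sText, @LF, "<LF>" & @LF)
    Return $sText
EndFunc

; The first 200 characters are enough to see which line ending is in use
$sSample = FileRead("testfile.txt", 200)
ConsoleWrite(_ShowLineEnds($sSample) & @LF)
```

With Windows line endings each line should end in `<CR><LF>`; a file showing only `<LF>` or only `<CR>` uses Unix or old Mac line endings respectively.
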
Pain Posted October 9, 2008

$file = "C:\test.txt"
$handle1 = FileOpen($file, 0) ; Read mode
$read = FileRead($handle1)
$new = StringStripCR($read)
FileClose($handle1)
$handle2 = FileOpen($file, 2) ; Write mode (erase previous contents)
FileWrite($handle2, $new)
FileClose($handle2)

One I just wrote, not tested.

enaiman Posted October 9, 2008

"Load the file in SciTE Editor and press Ctrl+Shift+9 to see the line end chars."

I had no idea about that (after more than 2 years of using SciTE...). Thanks, Hubertus72, for "opening my eyes".

singbass Posted October 10, 2008 (Author)

I too had no idea about the Ctrl+Shift+9 key combination in SciTE. I had just opened the file in another editor that lets me switch to hex, and I could see the 0x0D 0x0A at the end of each line. Using SciTE, I can now also see the CR LF pair at the end of the lines. Even the suggestion of changing @CR to "@CR" doesn't help. I still don't think the _ReplaceStringInFile function can find the @CR. The contents of the file I'm trying to read look like this:

---------------------
00:00:34 09/23/08
00:01:04 09/23/08
00:01:34 09/23/08
00:02:04 09/23/08
00:02:34 09/23/08
00:03:04 09/23/08
00:03:34 09/23/08
---------------------

Here is the entire block of code I am using:

#include <File.au3> ; _ReplaceStringInFile is defined in this UDF
$file1 = "testfile.txt"
$retval = _ReplaceStringInFile($file1, @CR, "@CR")
MsgBox(0, "", $retval, 1)

When I run the code above, $retval is 0. When I change the call to replace every 0 with a letter, it works great and $retval is the number of lines changed. Therefore, I don't think the function can see the @CR. If I don't get a solution, I guess I will go with the FileOpen/FileWrite method, but I was hoping to avoid that if I could.

Pain Posted October 10, 2008

$filename = "test.txt"
$handle1 = FileOpen($filename, 0) ; Read mode
$read = FileRead($handle1)
FileClose($handle1)
; Strip the CRs, turn each LF into "@", then turn each "@@" pair into one @CRLF.
; This assumes the text is strictly double spaced and contains no "@" of its own.
$new = StringReplace(StringReplace(StringStripCR($read), Chr(10), "@"), "@@", @CRLF)
$handle2 = FileOpen($filename, 2) ; Write mode (erase previous contents)
FileWrite($handle2, $new)
FileClose($handle2)

Bowmore Posted October 10, 2008

This is how I clean up files with blank lines. It will remove all completely blank lines and lines containing only white-space. Works with Windows, Unix and Mac files.

$sFile = "X:\data\test.txt"
$hFile = FileOpen($sFile, 0)
$sData = FileRead($hFile)
FileClose($hFile)
$hFile = FileOpen($sFile, 2)
;; A newline here is \r\n, \n, or a lone \r; the (?!\n) stops a CRLF pair from being matched as two separate newlines
$sData = StringRegExpReplace($sData, '(?:(\r\n|\n|\r(?!\n))+)[ \t]*(?:\r\n|\n|\r(?!\n))', '\1') ;; Comment out to keep white-space lines
$sData = StringRegExpReplace($sData, '(?:(\r\n|\n|\r(?!\n)){2,})', '\1') ;; replace multiple newlines with a single newline
$sData = StringRegExpReplace($sData, '(?:(?:^(\r?\n|\r))|(?:(\r?\n|\r)$))', '') ;; remove newlines from start and end of file
$State = FileWrite($hFile, $sData)
FileClose($hFile)

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the universe trying to build bigger and better idiots. So far, the universe is winning." - Rick Cook

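The regular-expression approach can also be wrapped in a function that works on a string, which makes it easy to try on a few sample lines before running it against a 300,000-line file. The sketch below is only one possible variant, not the code from the post above: the name _StripBlankLines is made up, it keeps the final newline of the file, and it collapses a whole run of blank or white-space-only lines in a single pass.

```autoit
; Hypothetical helper: remove blank and white-space-only lines from a string.
; A newline is \r\n, \n, or a lone \r; the (?!\n) prevents the \r of a CRLF
; pair from being treated as a newline on its own.
Func _StripBlankLines($sData)
    Local $sNL = '(?:\r\n|\n|\r(?!\n))'
    ; Blank lines at the very start of the text have no preceding newline
    $sData = StringRegExpReplace($sData, '^(?:[ \t]*' & $sNL & ')+', '')
    ; A newline followed by one or more blank lines collapses to that newline
    $sData = StringRegExpReplace($sData, '(' & $sNL & ')(?:[ \t]*' & $sNL & ')+', '\1')
    Return $sData
EndFunc

; Quick check on a small double-spaced sample
$sSample = "00:00:34 09/23/08" & @CRLF & @CRLF & "00:01:04 09/23/08" & @CRLF & @CRLF
ConsoleWrite(_StripBlankLines($sSample))
```

The result can then be written back with FileOpen($sFile, 2) and FileWrite, as in the posts above.
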