SiteMaze Posted October 8, 2007

I have a text file with more than half a million lines, and I want to split it in two at a given line. I wrote the simple script below. It works for small files but not for big ones. What gives? The original text file is around 500 MB, but my computer has ample free memory.

$file = FileOpen("C:\CD8M.csv",0)
$i = 1
$oldstring = ""
$newstring = ""
$j = 50000 ; the line where to split
While 1
    $line = FileReadLine($file,$i)
    If @error = -1 then ExitLoop
    If $i < $j Then
        $oldstring = $oldstring & $line & @CRLF
    Else
        $newstring = $newstring & $line & @CRLF
    EndIf
    $i = $i + 1
WEnd
FileClose($file)
$file = FileOpen("C:\first.csv",1)
FileWrite($file,$oldstring)
FileClose($file)
$file = FileOpen("C:\second.csv",1)
FileWrite($file,$newstring)
FileClose($file)

Arsenal Football Fan Club in Singapore
Achilles Posted October 8, 2007

Try posting in "General Help and Support"... This isn't really an example script.

My Programs: Knight Media Player, Multiple Desktops, Daily Comics, Journal
SiteMaze Posted October 8, 2007

Sorry, can the mods please move it?
Achilles Posted October 8, 2007

Well, either they moved it or I'm going crazy and it was always in General Help and Support.
Nahuel Posted October 8, 2007

"Well, either they moved it or I'm going crazy and it was always in General Help and Support."

Nah, I saw it there too. Mods are fast.
Achilles Posted October 8, 2007 (edited)

"Nah, I saw it there too. Mods are fast."

Ok good... Wasn't too sure about myself for a second there.

Anyway, since this is my third post in this thread, I decided it should be helpful. I found this in the helpfile, which might be part of your problem:

"From a performance standpoint it is a bad idea to read line by line specifying "line" parameter whose value is incrementing by one. This forces AutoIt to reread the file from the beginning until it reach the specified line."

I'm guessing that there is an AutoIt limit or something here (even though it's not mentioned anywhere that I can see). What do you mean when you say it doesn't work? Does AutoIt crash, or does the script execute successfully and not produce the desired files? If you're positive it works on small files and doesn't work on your huge file, then logically there is nothing wrong with your code, but rather a limit in either AutoIt or your computer.

Edited October 8, 2007 by Piano_Man
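The helpfile note quoted above suggests that passing an incrementing line number to FileReadLine makes every call restart from the top of the file, so the total work grows roughly with the square of the line count. A minimal sketch of the alternative follows: it reads sequentially without a line number and writes each line straight to its output file instead of building two huge strings. The file paths, the variable names, and the choice of write mode 2 (overwrite) are assumptions made for illustration, not part of the original script.

```autoit
; Sketch: split a text file at line $iSplitLine in one sequential pass.
; Paths and variable names are illustrative assumptions.
Local $sSource = "C:\CD8M.csv"
Local $iSplitLine = 50000 ; lines before this one go to the first file

Local $hIn = FileOpen($sSource, 0) ; 0 = read mode
If $hIn = -1 Then Exit 1 ; the source file could not be opened

Local $hFirst = FileOpen(@ScriptDir & "\first.csv", 2) ; 2 = write mode, erases previous contents
Local $hSecond = FileOpen(@ScriptDir & "\second.csv", 2)
If $hFirst = -1 Or $hSecond = -1 Then Exit 2 ; an output file could not be opened

Local $i = 1, $sLine
While 1
    $sLine = FileReadLine($hIn) ; no line number: continue from the current position
    If @error = -1 Then ExitLoop ; end of file reached
    If $i < $iSplitLine Then
        FileWriteLine($hFirst, $sLine) ; FileWriteLine appends @CRLF to the line
    Else
        FileWriteLine($hSecond, $sLine)
    EndIf
    $i += 1
WEnd

FileClose($hIn)
FileClose($hFirst)
FileClose($hSecond)
```

Because each line is written as soon as it is read, memory use stays small regardless of file size; the trade-off is one write call per line.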
randallc Posted October 8, 2007 (edited)

Hi,

All the above is true. Additionally:
1. You can optimise your code with "&=".
2. It is about 3x faster to read the whole file with a single FileRead and split the resulting string, if you have the memory etc.

; filelargesplit.au3
#include <File.au3>
Local $file1 = "C:\CD8M.csv", $i = 1
;~ Local $file1 = FileOpenDialog("test.csv", @ScriptDir & "\", "csv (*.csv)", 1 + 4)
;~ Local $file1 = @ScriptDir & "\DELL9150FastSearchAllNew.txt"
ConsoleWrite(_FileCountLines($file1) & @LF)
Local $time1 = TimerInit()
$file = FileOpen($file1, 0)
$oldstring = ""
$newstring = ""
$j = 50000 ; the line where to split
While 1
    $line = FileReadLine($file) ; no line number, so reading continues from the current position
    If @error = -1 Then ExitLoop
    If $i < $j Then
        $oldstring &= $line & @CRLF
    Else
        $newstring &= $line & @CRLF
    EndIf
    $i += 1
WEnd
FileClose($file)
$file = FileOpen(@ScriptDir & "\first.csv", 1)
FileWrite($file, $oldstring)
FileClose($file)
$file = FileOpen(@ScriptDir & "\second.csv", 1)
FileWrite($file, $newstring)
FileClose($file)
ConsoleWrite("filelargesplit.au3=" & Round(TimerDiff($time1)) & " msec" & @LF)
#cs
606268
filelargesplit.au3=18325 msec
#ce
;========================================================================
; Second approach: read the whole file once and cut the string at the 50000th @CRLF
FileDelete(@ScriptDir & "\first.csv")
FileDelete(@ScriptDir & "\second.csv")
$time1 = TimerInit()
Local $fileRead = FileRead($file1), $i_pos = StringInStr($fileRead, @CRLF, 0, 50000)
FileWrite(@ScriptDir & "\first.csv", StringLeft($fileRead, $i_pos - 1))
FileWrite(@ScriptDir & "\second.csv", StringMid($fileRead, $i_pos + 2)) ; + 2 skips the @CRLF itself
ConsoleWrite("fileread=" & Round(TimerDiff($time1)) & " msec" & @LF)
#cs
606268 lines
filelargesplit.au3=18736 msec
fileread=5623 msec
#ce

Best, randall

Edited October 8, 2007 by randallc

ExcelCOM... AccessCom.. Word2... FileListToArrayNew...SearchMiner... Regexps...SQL...Explorer...Array2D.. _GUIListView...array problem...APITailRW
SiteMaze Posted November 6, 2007

"What do you mean when you say it doesn't work? Does AutoIt crash or does the script execute successfully and not produce the desired files? If you're positive it works on small files and it doesn't work on your huge file then logically there is nothing wrong with your code but rather a limit to either AutoIt or your computer."

For small files, it gives the correct output. For large files, it keeps on processing with the CPU at 100%. I waited up to 24 hours before I ended the process. I wouldn't say it crashed; I think it is just inefficient coding.

I believe there must be a more efficient way of handling large text files, because freeware programs can split my 300 MB text file within 5 minutes. The limit shouldn't be my computer but rather the efficiency of the code.

I will try out RandallC's suggestion and report back. Thanks a lot.
SiteMaze Posted November 6, 2007 (edited)

Thanks @RandallC

$file1 = "C:\largefile.csv"
$i = 50000
$fileRead = FileRead($file1)
$i_pos = StringInStr($fileRead, @CRLF, 0, $i)
FileWrite(@ScriptDir & "\first.csv", StringLeft($fileRead, $i_pos - 1))
FileWrite(@ScriptDir & "\second.csv", StringMid($fileRead, $i_pos + 2))

I split a 150 MB text file in 25 seconds. A proper implementation of text manipulation.

Edited November 6, 2007 by SiteMaze
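One edge case in the FileRead approach above: StringInStr returns 0 when the file contains fewer than the requested number of @CRLF line breaks, and the StringLeft/StringMid arithmetic then produces a wrong split. FileWrite called with a file name also appends to an existing file, so old output is worth removing first. The variant below is a sketch under those observations; the variable names and the decision to copy a short file whole into first.csv are assumptions, not something the thread specifies.

```autoit
; Sketch: FileRead-based split with a guard for short files.
; Names and the short-file behaviour are illustrative assumptions.
Local $sSource = "C:\largefile.csv"
Local $iSplitAfter = 50000 ; number of lines to keep in the first file

If Not FileExists($sSource) Then Exit 1

; FileWrite with a file name appends, so remove output from earlier runs
FileDelete(@ScriptDir & "\first.csv")
FileDelete(@ScriptDir & "\second.csv")

Local $sData = FileRead($sSource)

; Position of the $iSplitAfter-th @CRLF, or 0 if there are fewer line breaks
Local $iPos = StringInStr($sData, @CRLF, 0, $iSplitAfter)
If $iPos = 0 Then
    ; Too few lines to split: keep the whole file as the first part
    FileWrite(@ScriptDir & "\first.csv", $sData)
Else
    FileWrite(@ScriptDir & "\first.csv", StringLeft($sData, $iPos - 1))
    FileWrite(@ScriptDir & "\second.csv", StringMid($sData, $iPos + 2)) ; + 2 skips the @CRLF itself
EndIf
```

This still assumes the file uses @CRLF line endings and fits in memory; a file with bare @LF endings would need @LF as the search string and an offset of 1 instead of 2.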