mite Posted July 14, 2008

Hello everybody,

I'm new here, and I hope to learn and share ideas around.

First, I'm not an AutoIt programmer yet, and my problem is simple: I need to split a text file into multiple files (based on the number of lines), so that text.txt becomes text_1.txt, text_2.txt, text_3.txt, and so on.

I searched Google and this whole forum for a way to do it, but I didn't find any solution at all.

I'm trying to code something here, but without success so far:

    $file = "file.txt"
    $hFile = FileOpen($file, 0)
    $sRead = FileRead($hFile)
    FileClose($hFile)
    Local $a = StringSplit($sRead, "|")
    For $element In $a
        ConsoleWrite($element & @CRLF)
    Next

Can someone give me some guidance?

Many thanks, folks!!
Moderators SmOke_N Posted July 14, 2008

You need to be a lot more specific on the criteria of the split. _FileReadToArray() will split the file into individual lines, and the [0] element will hold the total number of lines in the file as well.

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.
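To make the suggestion above concrete, here is a minimal sketch of reading a file with _FileReadToArray() and using the [0] element as the line count. The input path is hypothetical and only there for illustration.

```autoit
#include <File.au3>

; Hypothetical input path, used only for illustration.
Local $sPath = @ScriptDir & "\text.txt"
Local $aLines

; _FileReadToArray() returns 0 and sets @error on failure.
If Not _FileReadToArray($sPath, $aLines) Then
    ConsoleWrite("Could not read " & $sPath & " (@error = " & @error & ")" & @CRLF)
    Exit 1
EndIf

; $aLines[0] holds the line count; the lines themselves start at index 1.
ConsoleWrite("Lines in file: " & $aLines[0] & @CRLF)
For $i = 1 To $aLines[0]
    ConsoleWrite($i & ": " & $aLines[$i] & @CRLF)
Next
```

With the lines in an array like this, any splitting rule (every N lines, on a marker line, and so on) becomes a loop over indexes 1 to $aLines[0].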
herewasplato Posted July 14, 2008

    "...I need to split a text file in multiples files (based on the number of lines). and output something text.txt to text_1.txt text_3.txt text_2.txt go on.."

If you promise not to laugh too hard - you can see one way to do it:

    $lines_per_output_file = 10

    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;generate a fake input file for this test
    $junk = ""
    For $i = 1 To 100
        $junk = $junk & $i & @CRLF
    Next
    $file = FileOpen("test.txt", 2)
    FileWrite($file, $junk)
    FileClose($file)
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

    ;read and split file into an array
    $array_of_whole_file = StringSplit(FileRead("test.txt"), @CRLF, 1)

    $filecnt = 1
    $linecnt = 1
    While 1
        ;open a numbered output file
        FileOpen("test_" & $filecnt & ".txt", 2)

        ;write x number of lines to that file
        For $i = 1 To $lines_per_output_file
            FileWriteLine("test_" & $filecnt & ".txt", $array_of_whole_file[$linecnt])
            $linecnt = $linecnt + 1
            If $array_of_whole_file[0] = $linecnt Then
                FileClose("test_" & $filecnt & ".txt")
                Exit
            EndIf
        Next
        FileClose("test_" & $filecnt & ".txt")
        $filecnt = $filecnt + 1
    WEnd

...but SmOke_N is right - there is more than one way to interpret your request.

Edit1: Doh! - Welcome to the forums!
Edit2: added comments to my messy code

Edited July 14, 2008 by herewasplato
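The script above addresses the output files by name, although FileClose() is documented as taking the handle returned by FileOpen(). A variant that keeps the handle and passes it to FileWriteLine() and FileClose() might look like the sketch below. It is not herewasplato's code: the function name _SplitByLines and its parameters are made up for illustration, and it assumes the whole input fits in memory.

```autoit
#include <File.au3>

; Sketch: split $sSource into numbered files of $iLinesPerFile lines each.
; Returns the number of files written, or 0 with @error set on failure.
Func _SplitByLines($sSource, $sOutPrefix, $iLinesPerFile)
    Local $aLines
    If Not _FileReadToArray($sSource, $aLines) Then Return SetError(1, 0, 0)

    Local $iFileCnt = 0, $hOut = -1
    For $i = 1 To $aLines[0]
        ; Start a new output file at the first line of every chunk.
        If Mod($i - 1, $iLinesPerFile) = 0 Then
            If $hOut <> -1 Then FileClose($hOut)
            $iFileCnt += 1
            $hOut = FileOpen($sOutPrefix & $iFileCnt & ".txt", 2) ; 2 = overwrite
            If $hOut = -1 Then Return SetError(2, 0, 0)
        EndIf
        FileWriteLine($hOut, $aLines[$i])
    Next
    If $hOut <> -1 Then FileClose($hOut)
    Return $iFileCnt
EndFunc

; Example call: test.txt -> test_1.txt, test_2.txt, ...
ConsoleWrite(_SplitByLines("test.txt", "test_", 10) & " file(s) written" & @CRLF)
```

Because the loop runs over every index up to $aLines[0], the last line is written whether or not the input ends with a line break, and a final chunk shorter than $iLinesPerFile still gets its own file.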
Paulie Posted July 14, 2008

    "...and output something"

lol, you know when the word "something" is used, the request isn't specific enough. Here is what I'll assume you meant though, since SmOke_N brought it up:

    #include <File.au3>

    Dim $FilePath = "c:\path\test.txt", $BaseName = "test_", $Lines
    _FileReadToArray($FilePath, $Lines)
    For $i = 1 To $Lines[0]
        $File = FileOpen(@ScriptDir & "\" & $BaseName & $i & ".txt", 2)
        FileWrite($File, $Lines[$i])
        FileClose($File)
    Next

This will make a new file for every line of the base file. However, I doubt that's what you meant...
herewasplato Posted July 14, 2008

    "...However, i doubt thats what you meant though..."

Oh, you may be much closer than I was. At least you stuck to the code/concept that was in the OP. :-)
mite Posted July 14, 2008 Author

    "You need to be a lot more specific on the criteria of the split. _FileReadToArray() will split the file into individual lines, and [0] element will have the total number of lines in the file as well."

SmOke_N, I guess the StringSplit() used in herewasplato's code did the same thing.

    "If you promise not to laugh too hard - you can see one way to do it: ..."

    "This will make a new file for every line of the base file. However, i doubt thats what you meant though..."

I confess, the "something" wasn't clear.

Thank you herewasplato, it did the trick! The fact is, I have 20000 lines to save in chunks of 2000 each, and your code handled it.

The only weird thing I found was that the AVG anti-virus Resident Shield (latest version, on Vista) was blocking the script from creating the new files: the processor went to 100% (in the anti-virus process instead of the AutoIt one) and the files were never saved. Even after compiling the script to an EXE, the problem remained. After closing it several times and disabling it, the script began to create the files.

It's off topic, and I don't know if it will help AutoIt programmers, but some people may use this anti-virus (the most widely used free one, I guess) and be unable to run programs created with AutoIt because of this issue.

Anyway, thank you guys for the help, you rock!

Edited July 14, 2008 by mite
herewasplato Posted July 14, 2008

    "...Even after compile the EXE program the problem remains..."

Lucky you - you are being protected from yourself :-)

My guess is that the heuristics function in AVG sees the rapid creation of multiple files with similar names as a bad thing. You can try to:

- Compile the script without the UPX compression.*
- List the compiled script as an exception - not to be monitored by AVG.

* Start > Programs > AutoIt v3 > Compile Script to .exe
  Select "Compression" from the GUI's menu bar
  Uncheck "UPX Compress .exe stub"

[The resulting file will be about twice the size it would be compressed - but this will lessen the chance that an anti-virus product will decide to quarantine all compiled scripts at some point in the future. AVG, AVAST, Symantec and many other AV products have all done this at some point in time.]

I'm glad that the code worked for you... Enjoy AutoIt :-)
mite Posted July 15, 2008 Author

Hello herewasplato,

Thanks. I checked here and AVG is fine now! Yes, I love AutoIt!

Just one more thing, this time about performance. I found a little old tool for splitting files called csplit.exe (it seems to be a UNIX split command-line tool ported to Win32): http://man.root.cz/1/csplit/

It doesn't support UTF-16 as AutoIt does, but when the text is ASCII it works just fine. The point is, this tool splits the text in 1-2 secs, while the AutoIt script takes 7-10 secs.

I think it is possible to optimize your script to run a little faster. What do you think:

- Instead of using FileWriteLine(), build a temporary array with the split content and save it in one go using FileWrite().
- Or, slice the array from where a chunk begins to where it ends. Fewer loops should make it faster.

I am an ActionScript and PHP programmer, and since I am not an AutoIt programmer, I am not sure how to do things like that yet. Do you think it would make it faster?

I'm trying to do it right now; if I finish, I'll post the script here. You helped save my day today. Thanks again!

Edited July 15, 2008 by mite
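The first idea above (collect a chunk in memory, then write it with one FileWrite() call) could be sketched as follows. This is only an illustration under assumptions: the file names and the helper _WriteChunk are made up, the chunk is collected in a string rather than an array, and whether it actually beats the FileWriteLine() version would have to be measured on the real data.

```autoit
; Sketch: build each chunk in memory, then write it with a single FileWrite().
Local $iLinesPerFile = 2000          ; chunk size mentioned earlier in the thread
Local $sText = FileRead("test.txt")  ; hypothetical input name
Local $aLines = StringSplit(StringStripCR($sText), @LF)

Local $sChunk = "", $iInChunk = 0, $iFileCnt = 0
For $i = 1 To $aLines[0]
    ; Skip the empty element that a trailing line break produces.
    If $i = $aLines[0] And $aLines[$i] = "" Then ExitLoop
    $sChunk &= $aLines[$i] & @CRLF
    $iInChunk += 1
    If $iInChunk = $iLinesPerFile Then
        $iFileCnt += 1
        _WriteChunk("test_" & $iFileCnt & ".txt", $sChunk)
        $sChunk = ""
        $iInChunk = 0
    EndIf
Next

; Flush a final, shorter chunk if there is one.
If $iInChunk > 0 Then
    $iFileCnt += 1
    _WriteChunk("test_" & $iFileCnt & ".txt", $sChunk)
EndIf

; Hypothetical helper: open, write once, close.
Func _WriteChunk($sName, $sData)
    Local $hOut = FileOpen($sName, 2) ; 2 = overwrite
    If $hOut = -1 Then Return SetError(1, 0, 0)
    FileWrite($hOut, $sData)
    FileClose($hOut)
    Return 1
EndFunc
```

The design choice here is one open/write/close cycle per output file instead of one per line, which is where a by-name FileWriteLine() loop spends much of its time.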
Moderators SmOke_N Posted July 15, 2008

    "SmOke_N, I guess this StringSplit() as used below did the same thing."

Well, you can do it a ton of ways really... _FileReadToArray() does in fact use StringSplit ... you being new, I figured I'd give you the simple means.

Here's just another example (pseudo example as I haven't tested it) ... to show you what I meant.

    #include <Array.au3> ; needed for _ArrayDisplay()

    $a_each_element_is_the_file_to_write = _SplitFileToArray("output.txt", 2000)
    _ArrayDisplay($a_each_element_is_the_file_to_write)

    Func _SplitFileToArray($s_file, $n_split_lines)
        If FileExists($s_file) = 0 Then Return SetError(1, 0, 0)
        Local $s_read = FileRead($s_file)
        Local $a_sre = StringRegExp($s_read, "(?s)" & _
                "((?:.+?(?:\z|\r\n)){0," & $n_split_lines & "}|" & _
                "(?:.+?(?:\z|\n)){0," & $n_split_lines & "}|" & _
                "(?:.+?(?:\z|\r)){0," & $n_split_lines & "})", 3)
        If IsArray($a_sre) = 0 Then Return SetError(2, 0, 0)
        Local $n_ubound = UBound($a_sre) - 1
        ReDim $a_sre[$n_ubound]
        Return $a_sre
    EndFunc

Edit: Interesting enough. In StringRegExp, {number,number} reads left to right as minimum to maximum, the max here being $n_split_lines. I decided to test the above. Seems that 789 is the max allowed ... anything over that failed.

Edit2: Doesn't seem to be an AutoIt limitation but a PCRE one. I just replicated it in another language as well; even changing the oveccount doesn't help.

Edited July 15, 2008 by SmOke_N
herewasplato Posted July 15, 2008

    "...What do you think: Stead of use FileWriteLine() create a temporary array with the splited content and save one time using FileWrite() stead. Or, slice the array part where the split line begin and where it ends. less loops, would do it faster...."

I actually thought of several ways to code it, but I was having a bad coding day. I could not concentrate on what I was supposed to be doing - so, I came to the forum. I had a hard time writing the code above as it was... I kept barfing the syntax. If I hit another lull, I'll attempt to code it better/faster. I'm not a programmer by training or trade... so I might never arrive at the best approach and I'm not so good at error checking.

Using the _FileReadToArray() UDF will handle the split better than the one line that I threw in. Take a look at the code. It is on line 175, in this file:

    C:\Program Files\AutoIt3\Include\File.au3

in a typical install of v3.2.12.0. You can also make use of the error checking that is built into that UDF.

If you decide to keep this part:

    If $array_of_whole_file[0] = $linecnt Then
        FileClose("test_" & $filecnt & ".txt")
        Exit
    EndIf

You can change it to:

    If $array_of_whole_file[0] = $linecnt Then Exit

The help file says that AutoIt will close all open files upon exiting (but hints that it is better to code the close)... but a single-line "If" is quicker than "If/EndIf". If you want to "code the close" - look in the help file for OnAutoItExit.

Lines like this:

    $filecnt = $filecnt + 1

should be

    $filecnt += 1

for speed... I just never remember that syntax.

I often wonder how those in this forum that are real programmers put up with seeing code like mine - perhaps it is through sheer moral fortitude that they don't go postal on me.

...later...

Edit: WYSI not WYG

Edited July 15, 2008 by herewasplato
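Speed claims like the ones above are easy to check on your own machine with TimerInit() and TimerDiff(). The sketch below times the two increment styles; the iteration count is arbitrary and the result will vary with AutoIt version and hardware, so no particular outcome is implied.

```autoit
; Sketch: time two equivalent loops with TimerInit()/TimerDiff().
Local $iRuns = 1000000 ; arbitrary iteration count, chosen for illustration
Local $n, $hTimer

$n = 0
$hTimer = TimerInit()
For $i = 1 To $iRuns
    $n = $n + 1
Next
ConsoleWrite("$n = $n + 1 : " & Round(TimerDiff($hTimer)) & " ms" & @CRLF)

$n = 0
$hTimer = TimerInit()
For $i = 1 To $iRuns
    $n += 1
Next
ConsoleWrite("$n += 1     : " & Round(TimerDiff($hTimer)) & " ms" & @CRLF)
```

The same pattern can wrap the whole file-splitting loop to compare a FileWriteLine() version against a buffered FileWrite() version.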
mite Posted July 15, 2008 Author

Hey guys,

I'll check your new posts tomorrow, because I am tired today after 18 hours of coding.

This is the current code I am working with. It takes its values from the command line (mode, filename, number of lines, encoding)... and it takes the content from the clipboard, which is the reason I'm not using _FileReadToArray().

In my tests it saved 100,000 lines to 25 files in approx. 45 secs. I'm happy with this result and I won't need more than that (100,000).

I'm too lazy to comment it:

    If ($CmdLine[1] = "simple") Then
        $a = FileOpen($CmdLine[2], Number($CmdLine[3]) + 2)
        $ok = FileWrite($a, ClipGet())
        FileClose($a)
        ConsoleWrite($ok)
    EndIf

    If ($CmdLine[1] = "split") Then
        $lines = Number($CmdLine[4])
        $file = StringSplit(ClipGet(), @CRLF, 1)
        $filecnt = 1
        $linecnt = 1
        While 1
            $handle = FileOpen($CmdLine[2] & $filecnt & ".xml", Number($CmdLine[3]) + 2)
            FileWriteLine($handle, "<node>")
            For $i = 1 To $lines
                FileWriteLine($handle, $file[$linecnt])
                $linecnt = $linecnt + 1
                If $file[0] = $linecnt Then
                    FileWriteLine($handle, "</node>")
                    FileClose($handle)
                    ConsoleWrite($filecnt)
                    Exit
                EndIf
            Next
            FileWriteLine($handle, "</node>")
            FileClose($handle)
            $filecnt = $filecnt + 1
        WEnd
    EndIf

See ya
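The script above indexes $CmdLine directly, so it stops with an array error if it is started with too few arguments. A small guard at the top could catch that. This is a sketch only: the usage text is made up, and the argument positions are read off the code above (mode, file, encoding flag, then the line count for split mode).

```autoit
; Sketch: check the argument count before indexing $CmdLine.
; $CmdLine[0] holds the number of parameters passed to the script.
If $CmdLine[0] < 3 Then
    ConsoleWrite("usage: <mode> <file> <encoding> [lines]" & @CRLF)
    Exit 1
EndIf

; Split mode reads a fourth parameter, the number of lines per file.
If $CmdLine[1] = "split" And $CmdLine[0] < 4 Then
    ConsoleWrite("split mode also needs the number of lines" & @CRLF)
    Exit 1
EndIf
```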
mite Posted July 15, 2008 Author

    "Well, you can do it a ton of ways really... _FileReadToArray() does in fact use StringSplit ... you being new, I figured I'd give you the simple means. Here's just another example (pseudo example as I haven't tested it) ... to show you what I meant. ..."

Hey SmOke_N,

Very interesting. I don't understand StringRegExp very well, but I'll check the documentation.

    "I actually thought of several ways to code it, but I was having a bad coding day. ... If I hit another lull, I'll attempt to code it better/faster. ..."

I decided not to change anything in the code for now. Since my project is very big, I can't lose time on performance issues right now; the most important thing is that it works, isn't it? After everything is finished, I'll come back to improve that.

Thanks for your help, and good luck with your code too. Actually, I'm not a programmer by training either.