brodie28 Posted September 26, 2008 (edited)

    #include <Array.au3>
    #Include <File.au3>

    for $i = 1 to 100
        $fname = "passlist" & $i & ".txt"
        dim $aArray[9999999]
        ConsoleWrite($fname & @CRLF)
        _FileReadToArray($fname, $aArray)
        $size = $aArray[0]
        redim $aArray[$size]
        for $x = 1 to $aArray[0]
            ConsoleWrite("file " & $i & " " & $x & " from " & $aArray[0] & @CRLF)
            if StringIsAlNum ($aArray[$x]) = 0 Then
                _ArrayDelete($aArray, $x)
            EndIf
        Next
        _FileWriteFromArray($fname, $aArray)
    Next

Basically I have 100 text files, each with about 95,000 lines of text. I want to go through all of these files and delete every line that is not purely alphanumeric. This works... but god, is it slow. Any ideas on why it is only writing to the console about once a second?

EDIT: It is actually about twice a second... Maybe that's just as fast as it can go?

Edited September 26, 2008 by brodie28

dbzfanatic Posted September 26, 2008

You keep re-dimming that massive array. Try taking that out of your for loop and doing it once; that might help.

SmOke_N (Moderators) Posted September 26, 2008

ConsoleWrite is why it's slower.

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.

brodie28 (Author) Posted September 26, 2008

I have to keep redimming the array because I'm not sure how to define an array of unknown size in AutoIt (I'm not sure you can), so I have to redim it for every text file. Anyway, the script gets stuck in the nested for loop, which is where the time is being taken.

SmOke_N, if I get rid of ConsoleWrite, do you think it will make a difference? It's probably going to take hours anyway, and I would kind of like to know where it's up to. Any way to do that without ConsoleWrite?

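One way to keep some progress output without paying for a console write on every line is to report only every Nth line. The snippet below is a rough sketch of that idea, not something posted in the thread: the variable names, the 95,000 line count (taken from the figure in the first post) and the 5,000-line interval are all assumptions for illustration.

```autoit
; Sketch only: $iTotal and $iEvery are assumed values, not from the original script.
Local $iTotal = 95000 ; approximate lines per file, as quoted in the first post
Local $iEvery = 5000  ; report progress once per this many lines

For $x = 1 To $iTotal
    ; ... the per-line check would go here ...

    ; Mod() returns the remainder, so this fires only on every $iEvery-th line.
    If Mod($x, $iEvery) = 0 Then
        ConsoleWrite("line " & $x & " of " & $iTotal & @CRLF)
    EndIf
Next
```

With these numbers the loop writes 19 progress lines per file instead of 95,000.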
brodie28 (Author) Posted September 26, 2008 (edited)

I ran into an error where _ArrayDelete was redimming the array to a smaller size, so after a while the for loop would run past the end of the array. So I did this. Strangely, it is going much, much faster this way, seemingly for no reason.

    #include <Array.au3>
    #Include <File.au3>

    for $i = 1 to 100
        $fname = "passlist" & $i & ".txt"
        dim $aArray[9999999]
        ConsoleWrite($fname & @CRLF)
        _FileReadToArray($fname, $aArray)
        $size = $aArray[0]
        redim $aArray[$size]
        for $x = 1 to $size
            ConsoleWrite("file " & $i & " " & $x & " from " & $size & @CRLF)
            if StringIsAlNum ($aArray[$x]) = 0 Then
                _ArrayDelete($aArray, $x)
                $size = $size - 1
            EndIf
        Next
        _FileWriteFromArray($fname, $aArray)
    Next

Edited September 26, 2008 by brodie28

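A side note on deleting while looping forwards: once element $x is removed, the next element slides into slot $x and the loop steps past it, so two bad lines in a row can leave the second one behind. A common workaround is to walk the array from the end towards the start. The sketch below assumes a small hand-made array in place of the real file contents; the sample strings and the name $aLines are hypothetical.

```autoit
#include <Array.au3>

; Hypothetical sample data; element 0 holds the count, the way _FileReadToArray() fills it.
Local $aLines[6] = [5, "abc123", "bad line!", "also-bad", "xyz", "42"]

; Walk backwards so a deletion never shifts an element that has not been checked yet.
For $x = $aLines[0] To 1 Step -1
    If StringIsAlNum($aLines[$x]) = 0 Then
        _ArrayDelete($aLines, $x)
        $aLines[0] -= 1 ; keep the stored count in step with the real size
    EndIf
Next

; Print what is left.
For $x = 1 To $aLines[0]
    ConsoleWrite($aLines[$x] & @CRLF)
Next
```

This fixes the skipping, but each _ArrayDelete() call still resizes the array, so it does nothing for the speed problem discussed in the rest of the thread.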
SmOke_N (Moderators) Posted September 26, 2008 (edited)

Well, the redim issue is really the issue with _ArrayDelete(). You have to remember, you're redimming 90+ thousand times per file, across 100 files. That's CRAZY lol.

This is a partial concept (not tested):

    #include <file.au3>

    Local $a_array
    For $i = 1 To 100
        If FileExists("passlist" & $i & ".txt") Then _FileRemoveNonAlNumLines_Array($a_array, "passlist" & $i & ".txt")
    Next
    _FileWriteFromArray("passlist_checked_.txt", $a_array, 1)

    Func _FileRemoveNonAlNumLines_Array(ByRef $av_array, $s_file)
        Local $a_split_file = StringSplit(StringStripCR(FileRead($s_file)), @LF)
        Local $i_add = 0, $a_ret, $i_ub
        If IsArray($av_array) = 0 Then
            ; First file: compact the split array in place
            $a_ret = $a_split_file
        Else
            ; Later files: grow the existing array and keep appending after its last element
            $a_ret = $av_array
            $i_ub = UBound($a_ret)
            ReDim $a_ret[$i_ub + ($a_split_file[0] + 1)]
            $i_add = $i_ub - 1
        EndIf
        ; Always scan the new file from its first line
        For $i = 1 To $a_split_file[0]
            If StringIsAlNum($a_split_file[$i]) Then
                $i_add += 1
                $a_ret[$i_add] = $a_split_file[$i]
            EndIf
        Next
        If Not $i_add Then Return
        ReDim $a_ret[$i_add + 1]
        $av_array = $a_ret
        Return $av_array
    EndFunc

Edit: This way, instead of redimming 9 million times, you redim no more than 200 times. Also, I'd seriously think about writing to a file after every return and just emptying the array. Looping through 9 million elements to write to a file is ridiculous. That's more than half of the max allowed elements. You'd see a big speed increase from start to finish.

Edited September 26, 2008 by SmOke_N

SmOke_N (Moderators) Posted September 26, 2008 (edited)

I was amazed that _FileWriteFromArray() didn't have an append option!!!! Anyway, this would probably be faster...

    Local $a_array
    For $i = 1 To 2
        If FileExists("passlist" & $i & ".txt") Then
            _FileRemoveNonAlNumLines_Array($a_array, "passlist" & $i & ".txt")
            _FileWriteFromArray_Append("passlist_checked_.txt", $a_array, 1)
            $a_array = ""
        EndIf
    Next

    Func _FileWriteFromArray_Append($s_file, $a_array, $i_base = 0, $i_ubound = 0)
        If IsArray($a_array) = 0 Then Return SetError(1, 0, 0)
        If FileExists($s_file) = 0 Then FileClose(FileOpen($s_file, 2))
        If $i_ubound = 0 Or $i_ubound = -1 Or $i_ubound = Default Then $i_ubound = UBound($a_array) - 1
        Local $s_write_to_file = FileRead($s_file)
        If $s_write_to_file <> "" Then
            $s_write_to_file = StringRegExpReplace($s_write_to_file, "[\r\n]+\z", "") & @CRLF
        EndIf
        For $i = $i_base To $i_ubound
            $s_write_to_file &= $a_array[$i] & @CRLF
        Next
        ; FileWrite() with a filename appends, so write through a handle opened in
        ; overwrite mode (2); otherwise the old contents would be written a second time
        Local $h_file = FileOpen($s_file, 2)
        If $h_file = -1 Then Return SetError(2, 0, 0)
        Local $i_ret = FileWrite($h_file, StringTrimRight($s_write_to_file, 2))
        FileClose($h_file)
        Return $i_ret
    EndFunc

    Func _FileRemoveNonAlNumLines_Array(ByRef $av_array, $s_file)
        Local $a_split_file = StringSplit(StringStripCR(FileRead($s_file)), @LF)
        Local $i_add = 0, $a_ret, $i_ub
        If IsArray($av_array) = 0 Then
            ; First file: compact the split array in place
            $a_ret = $a_split_file
        Else
            ; Later files: grow the existing array and keep appending after its last element
            $a_ret = $av_array
            $i_ub = UBound($a_ret)
            ReDim $a_ret[$i_ub + ($a_split_file[0] + 1)]
            $i_add = $i_ub - 1
        EndIf
        ; Always scan the new file from its first line
        For $i = 1 To $a_split_file[0]
            If StringIsAlNum($a_split_file[$i]) Then
                $i_add += 1
                $a_ret[$i_add] = $a_split_file[$i]
            EndIf
        Next
        If Not $i_add Then Return
        ReDim $a_ret[$i_add + 1]
        $av_array = $a_ret
        Return $av_array
    EndFunc

But this:

    For $i = 1 To 100
        If FileExists("passlist" & $i & ".txt") Then
            _myCustomFunction("passlist" & $i & ".txt", "passlist_checked_.txt")
        EndIf
    Next

    Func _myCustomFunction($s_file, $s_out_file)
        Local $a_split_file = StringSplit(StringStripCR(FileRead($s_file)), @LF)

        Local $s_hold_line = ""
        For $i = 1 To $a_split_file[0]
            If StringIsAlNum($a_split_file[$i]) Then
                $s_hold_line &= $a_split_file[$i] & @CRLF
            EndIf
        Next

        Return FileWrite($s_out_file, $s_hold_line)
    EndFunc

Would ultimately be faster than anything we've done thus far.

Edited September 26, 2008 by SmOke_N

SmOke_N (Moderators) Posted September 26, 2008

Damn, I used 2 AutoIt tags, so I can't edit!! Anyway...

The first suggestion above, the output was:
It took: 0.875693977865902 seconds to finish 1 file with 95,000 lines.

My method I suggested last:
It took: 0.803650743320729 seconds to finish 1 file with 95,000 lines.

So, with that said... looks like you could finish in a couple of minutes with that method.

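For anyone wanting to take similar measurements, AutoIt's TimerInit() and TimerDiff() are the usual tools; TimerDiff() reports elapsed milliseconds. The post does not show how the figures above were actually produced, so the following is only a guess at the shape of such a test, with Sleep(250) standing in as a placeholder for the real per-file work.

```autoit
; Sketch only: Sleep() is a stand-in for the real work being timed.
Local $hTimer = TimerInit()

Sleep(250) ; placeholder for processing one file

; TimerDiff() returns milliseconds, so divide by 1000 for seconds.
Local $fSeconds = TimerDiff($hTimer) / 1000
ConsoleWrite("It took: " & $fSeconds & " seconds to finish 1 file." & @CRLF)
```

At roughly 0.8 seconds per file, 100 files works out to about 80 seconds, which fits the "couple of minutes" estimate.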
brodie28 (Author) Posted September 26, 2008 (edited)

Thanks, that last one worked perfectly, and MUCH faster than anything else I tried. I assume it worked anyway; Notepad really struggles to open text files of this size.

EDIT: Something didn't work. The output is much, much smaller than it should be, and a lot of things seem to have been deleted when they shouldn't have been. I'll try to see why.

Edit2: I'm an idiot. The script I used to split one massive text file into 100 smaller ones had an error (I accidentally left something inside a loop when it should have been outside), so all the text files turned out the same. That's now been corrected and the last script is working perfectly.

Edited September 26, 2008 by brodie28