blink314 Posted June 21, 2005

I have to search through a text file. I have read the entire file into an array using _FileReadToArray and am now looping through the array (reading the file in only takes 1-2 seconds, so that is not a problem). I've been using StringInStr to check whether my search term is in the current array line; however, this is proving to be very slow.

Is there an alternative to StringInStr? Are there any DLL find functions I could use (I am an extreme novice with DLLs)? Or is there any way I could optimize the process? I have taken as much junk as possible out of my code, and the slowdown occurs at StringInStr. Would stripping whitespace from both ends make it faster? Any other ideas?

The text files can be up to 400,000 lines, so any small speed increase would greatly help!

Thanks, Kevin

JSThePatriot Posted June 21, 2005

Why are you reading the file into an array? If you don't really have a reason to, I would read the file line by line instead (see FileReadLine() in the help file). It uses a loop, and in that same loop you can search each line for the string. If the search string is found, you can write that line to another file, call a function, or whatever you need.

JS

blink314 Posted June 21, 2005

Right... I was under the impression that reading from an array would be quicker than reading from a file. The time required to load the file into the array is not a problem; I'm just looking to make the loop through the array quicker. I'll try it and let you know. Thanks!

Kevin

JSThePatriot Posted June 21, 2005

Please do let me know. I was thinking it would save you a step: you could analyze each line as you go down the file.

JS

blink314 Posted June 21, 2005

Well, I save the step of reading into the array, but each line read from the text file is slower. Like I said, I don't mind the time required to read the file into the array. I'm trying to cut down the time spent within the loop, which looks something like this:

    For $linecount = 1 To $Filelength
        $Currentline = $logarray[$linecount]
        If StringInStr($currentline, $SearchString, 1, 1) <> 0 Then
            ...
            ...
            ...
        EndIf
    Next For

If I run the script without reading the file into an array and give it a search string that is NOT in the file (so none of the additional processing in the ... lines takes place), it takes about 24 seconds to search the first 100,000 lines. If I read the file into an array it takes about 25 seconds, including the time to read the file into the array.

Is there any way to make the loop faster?

Kevin

Nutster Posted June 21, 2005

That code looks good, except that the Next should not have For after it. If the file is small enough to fit in one string, try reading the whole file into a variable and then running StringInStr on that. Do you need to know which line the search string is on, or just whether it is there or not?

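A minimal sketch of the whole-file idea, assuming the file fits in a single string and a current AutoIt release (the fifth, "start" parameter of StringInStr is used here). The file name, search term and variable names are placeholders, not taken from the thread. It also shows one way the line number could still be recovered: count the line feeds in front of each match, relying on StringReplace reporting its replacement count in @extended.

```autoit
; Sketch only: $sPath and $sTerm are placeholder values.
Local $sPath = "somefile.log"
Local $sTerm = "SomeText"

Local $hFile = FileOpen($sPath, 0) ; 0 = read mode
If $hFile = -1 Then
    MsgBox(0, "Error", "Unable to open file " & $sPath & ".")
    Exit
EndIf
Local $sAll = FileRead($hFile, FileGetSize($sPath)) ; whole file as one string
FileClose($hFile)

Local $iPos = 0   ; position of the current match (0 = none yet)
Local $iLast = 1  ; position up to which line feeds were already counted
Local $iLine = 1  ; line number of the current match

While 1
    ; Case-sensitive search for the next occurrence after the previous match
    $iPos = StringInStr($sAll, $sTerm, 1, 1, $iPos + 1)
    If $iPos = 0 Then ExitLoop

    ; Replacing @LF with itself changes nothing, but @extended then holds
    ; the number of line feeds between the previous position and this match
    StringReplace(StringMid($sAll, $iLast, $iPos - $iLast), @LF, @LF)
    $iLine = $iLine + @extended
    $iLast = $iPos

    ConsoleWrite("Match on line " & $iLine & @CRLF)
WEnd
```

Whether this beats the per-line loop on a 400,000-line file would need to be timed; it trades many small StringInStr calls for a few large ones.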
blink314 Posted June 21, 2005

Sorry, that Next For was a typo. Yes, I do have to know what line it's on. I'm reading an AutoCAD log file and I need to find other info around the search match.

Thanks though, Kevin

JSThePatriot Posted June 22, 2005 (edited)

I would like to see your full script if that is possible. I am not sure where you are getting some of your variables from, and that may be why it is going a bit slowly. I have never really had speed troubles using the method I described. I have used it on a 3MB file and it finished in 5 seconds. It is a very fast method so long as there aren't other things slowing it down.

Edit: The 3MB file was 146,408 lines. I have since increased the file size to 7,466,758 with 20MB and tested the speed on it. It ended up taking 904.605 seconds, which is just over 15 minutes.

JS

mother9987 Posted June 22, 2005

I've been tailing a log file in a game, and what I found really sped up my script was reading the file about 8k at a time and using one StringInStr to see whether what I was looking for was anywhere in that chunk.

If it's not, throw out everything before the last carriage return and read another 8k.

If it is, process everything between the carriage return before the match and the one after it, and then throw out everything before the carriage return after the match.

If what you're looking for doesn't occur often in the log file, you should get a lot more speed going that way. I think I got on the order of 10x faster than checking each line individually.

If it sounds helpful, I'll try to make enough sense of my code to post a bit.

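The chunked approach described above might look roughly like the following. This is a sketch under assumptions, not mother9987's actual code: the file name, search term, 8192-character chunk size and all variable names are placeholders, and it splits on line feeds rather than carriage returns.

```autoit
; Sketch only: placeholder file name, search term and chunk size.
Local $sPath = "somefile.log"
Local $sTerm = "SomeText"
Local $iChunk = 8192

Local $hFile = FileOpen($sPath, 0) ; 0 = read mode
If $hFile = -1 Then
    MsgBox(0, "Error", "Unable to open file " & $sPath & ".")
    Exit
EndIf

Local $sBuf = "", $sData, $sBlock, $aLines, $iCut
Local $bEOF = False
Local $iLineNo = 0 ; complete lines already consumed

While Not $bEOF
    $sData = FileRead($hFile, $iChunk)
    If @error Then $bEOF = True ; end of file (or read error)
    $sBuf = $sBuf & $sData

    If $bEOF Then
        $iCut = StringLen($sBuf) ; at the end, process whatever is left
    Else
        $iCut = StringInStr($sBuf, @LF, 0, -1) ; last line feed in the buffer
    EndIf
    If $iCut = 0 Then ContinueLoop ; no complete line yet, read more

    ; Split off the complete lines; keep the partial last line for next time
    $sBlock = StringLeft($sBuf, $iCut)
    $sBuf = StringTrimLeft($sBuf, $iCut)

    ; One StringInStr for the whole block; per-line work only on a hit
    If StringInStr($sBlock, $sTerm, 1) Then
        $aLines = StringSplit(StringStripCR($sBlock), @LF)
        For $i = 1 To $aLines[0]
            If StringInStr($aLines[$i], $sTerm, 1) Then
                ConsoleWrite("Line " & ($iLineNo + $i) & ": " & $aLines[$i] & @CRLF)
            EndIf
        Next
    EndIf

    ; Advance the line counter by the number of line feeds in this block
    StringReplace($sBlock, @LF, @LF)
    $iLineNo = $iLineNo + @extended
WEnd
FileClose($hFile)
```

If lines before the match are needed as context (as with the AutoCAD log, where up to 14 preceding lines matter), the tail of the previous block would also have to be kept; the sketch does not do that.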
blink314 Posted June 22, 2005

OK, here it is. The For...Next loop is the primary concern here. Basically, every object in AutoCAD is listed in blocks of text, and each line is an attribute of that block. I am looking for text that matches my search term in the lines that would contain the text shown on the screen. Then I have to back up a few lines (no more than 14) to find the page and coordinate info.

Speed is key because I am making a find function. I don't want to have to sit for 5 minutes waiting for the find function to work. I realize it may take a minute or slightly more for large files, but Excel VBA can clean up the same text file in less than half a minute. Would there be any way to use VBA functions in AutoIt? (I doubt it...)

Thanks, Kevin

    Func SearchLogFile($SearchTerm, $SearchFile)
        Dim $logarray[1] ;The array that stores the logfile
        If GUICtrlRead($StatusLabel) = " No index yet..." Then ;Ensures that an index exists
            MsgBox(0, "User ERROR", "You must first make an index file for the selected drawing!")
        Else
            HotKeySet("{esc}", "StopSearch") ;Allow user to exit search
            $Stop = 0
            $SearchFile = $DumpDir & $SearchFile & "-Index.log"
            $FileLength = _FileCountLines($SearchFile)
            _FileReadToArray($SearchFile, $logarray) ;I am NOT concerned with the time taken here
            _GUICtrlListViewDeleteAllItems ($ResultsList) ;Deletes previous search results from listview
            $LineCount = 14
            $Track = 4999 ;Counter for screen update
            $LineCount = 1
            GUICtrlSetData($StatusLabel, " Line: 1 of " & $FileLength)
            For $LineCount = 1 To $FileLength
                If $LineCount > $Track Then ;Only increments display every 5000 lines
                    GUICtrlSetData($StatusLabel, " Line: " & $LineCount & " of " & $FileLength)
                    $Track = $Track + 5000
                EndIf
                If $Stop = 1 Then ;Catches hotkey (via another function)
                    ExitLoop
                EndIf
                $fstring = 0 ;Flag for search result
                $CurrentLine = StringStripWS($logarray[$LineCount], 3)
                Select ;Only lines that might contain search results need to be looked at
                    Case StringLeft($CurrentLine, 4) = "text"
                        $fstring = 1
                    Case StringLeft($CurrentLine, 5) = "Conte"
                        $fstring = 1
                    Case StringLeft($CurrentLine, 5) = "value"
                        $fstring = 1
                    Case Else
                        $fstring = 0 ;No search result on this line!
                EndSelect
                If $fstring = 1 Then
                    If StringInStr(StringStripWS($CurrentLine, 3), $SearchTerm, 1, 1) <> 0 Then
                        $Found = 0
                        $PreCount = 1
                        $Coord = "NA" ;Default the fields so weird entries are noticed
                        $Page = "NA"
                        $text = StringTrimLeft($CurrentLine, StringInStr($CurrentLine, " ", 0, 1))
                        $text = StringStripWS($text, 3)
                        Do ;If the search term is found, I need to find two pieces of info in surrounding lines
                            $NewCount = $LineCount - $PreCount
                            If StringInStr($logarray[$NewCount], "Y= ", 1, 1) Then ;Finds coordinates of object
                                $Coord = StringReplace($logarray[$NewCount], " ", "")
                                $CoordChar = StringInStr($Coord, "point,", 1, 1)
                                If $CoordChar <> 0 Then
                                    $Coord = StringTrimLeft($Coord, $CoordChar + 5)
                                    $Coord = StringReplace($Coord, "X=", "")
                                    $Coord = StringReplace($Coord, "Y=", ",")
                                    $Coord = StringReplace($Coord, "Z=", ",")
                                    $Coord = StringStripWS($Coord, 2)
                                EndIf
                            ElseIf StringInStr($logarray[$NewCount], "layout:", 0, 1) <> 0 Then ;Finds Page of object in drawing
                                $Page = StringStripWS(StringReplace($logarray[$NewCount], "layout:", ""), 3)
                                $Found = 1
                            EndIf
                            If $PreCount = 14 Then ;Each text block is about 14 lines long... no need to search any further
                                $Found = 1
                            EndIf
                            $PreCount = $PreCount + 1
                        Until $Found = 1
                        $Temp = $Page & "|" & $text & "|" & $Coord
                        $ItemNum = GUICtrlCreateListViewItem($Temp, $ResultsList)
                    EndIf
                EndIf
            Next
            HotKeySet("{Esc}") ;Release Hotkey so AutoCAD can use it
            Dim $logarray ;Clear Array
            GUICtrlSetData($StatusLabel, " Line: " & $LineCount & " of " & $FileLength)
        EndIf
    EndFunc ;==>SearchLogFile

scriptkitty Posted June 22, 2005

With the new Obj/COM support in the new AutoIt beta, you can control Excel, Access, Word, IE, etc. quickly and easily. So if you can do it from Excel, you can do it in AutoIt. It's pretty easy to set up as well.

blink314 Posted June 22, 2005

Right, but I thought the COM stuff just controls the objects. Can you actually use the functions as well?

Kevin

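As a hedged illustration of what calling Excel code through COM could look like: Excel's Application object exposes a Run method that invokes a VBA macro by name, so an existing macro can be called rather than rewritten. The workbook path, macro name and argument below are hypothetical, and the sketch assumes an AutoIt build with COM support (ObjCreate) and an installed copy of Excel.

```autoit
; Sketch only: the workbook path, macro name and log path are hypothetical.
Local $oExcel = ObjCreate("Excel.Application")
If Not IsObj($oExcel) Then
    MsgBox(0, "Error", "Could not start Excel through COM.")
    Exit
EndIf

$oExcel.Visible = False ; keep Excel in the background

; Open the workbook that contains the VBA macro
Local $oBook = $oExcel.Workbooks.Open("C:\Logs\CleanLog.xls")

; Application.Run calls a macro by name and passes the arguments through
$oExcel.Run("CleanLogFile", "C:\Logs\Drawing-Index.log")

$oBook.Close(False) ; close without saving changes
$oExcel.Quit()
```

The macro itself still runs inside Excel, so its speed should be that of the VBA code; AutoIt only starts it and waits for it to return.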
JSThePatriot Posted June 22, 2005

@blink314 It is as I thought, and as was explained in the other topic you were in. You don't need to get the number of lines in the file. That is taking your time, and it is exactly why each line read is taking longer. The code below is what you need to implement instead of _FileCountLines().

Also, if anything, I bet removing spaces just takes up more time. Not sure if you are worried about that. I would bet this should take a max of 30 seconds for 400k lines. Possibly up to 1 minute, but I doubt it.

You are also setting the same variable multiple times: $LineCount = 14, then 1, then 1 again in the For...Next loop. Remove the first two. Setting its value in the For...Next loop is fine.

    Dim $fileO = "somefile.log"
    $file = FileOpen($fileO, 0)
    If $file = -1 Then
        MsgBox(0, "Error", "Unable to open file " & $fileO & ".")
        Exit
    EndIf
    While 1
        $line = FileReadLine($file)
        If @error = -1 Then ExitLoop
        If StringInStr($line, "SomeText") Then
            ;Do your stuff here...
        EndIf
    WEnd
    FileClose($file)

That will read line by line until it reaches the end of the file. I don't believe there is any reason to read the file into an array or to know how many lines there are. You tell me. Now try using that; it will be much faster.

JS

blink314 Posted June 23, 2005

OK, I tried reading from the file like you said. It still takes ~23 seconds to go through the first 80,000 lines. Incidentally, I put a timer on the _FileCountLines function and got ~80 ticks, so not a whole lot of time is being spent there. As I've said before, _FileCountLines and _FileReadToArray aren't the problem; it's the looping.

The reason I used an array is that accessing memory is faster than accessing the disk. For some reason AutoIt doesn't seem to benefit much from it, but in VBA you can get a HUGE speed increase by reading things into an array and working on the array. Even if (in Excel) you read in a 26x40000 range, search through it, and write the results to a GUI after processing, it takes about 5 seconds using an array, and maybe 10 or more reading cells from the spreadsheet.

Here is the redone function. It's not really cleaned up, though I did change my logic slightly. I wrote a function that cleans the log file, getting rid of most of the waste info. This gets the 400,000-line file down to ~80,000. It still takes 2 minutes to clean the log file, but I can accept that, I suppose. Since VBA can parse the log file much quicker (a co-worker has an Excel script to do it, which takes about 5 seconds on the 400,000-line log file), I'm thinking about letting Excel do the cleaning. Does anyone know of a good way to control Excel (NOT the objects but the scripting, mind) from AutoIt?

Here is the code from my newer search function:

    Func SearchLogFile($SearchTerm, $SearchFile)
        Dim $logarray[1]
        If GUICtrlRead($StatusLabel) = " No index yet..." Then
            MsgBox(0, "User ERROR", "You must first make an index file for the selected drawing!")
        Else
            HotKeySet("{esc}", "StopSearch")
            $Stop = 0
            $CleanFile = $DumpDir & $SearchFile & "-Clean.log"
            $time1 = TimerInit()
            $FileLength = _FileCountLines($CleanFile)
            $Diff1 = TimerDiff($time1)
            MsgBox(0, "", $Diff1)
            Dim $currentline1
            Dim $currentline2
            Dim $currentline
            If FileExists($CleanFile) Then
                MsgBox(0, "", $CleanFile)
                $cleanpath = FileOpen($CleanFile, 0)
                _GUICtrlListViewDeleteAllItems ($ResultsList)
                $LineCount = 1
                $Track = 4999
                GUICtrlSetData($StatusLabel, " Line: 1 of " & $FileLength)
                While 1
                    $currentline2 = $currentline1
                    $currentline1 = $currentline
                    $currentline = FileReadLine($cleanpath)
                    If @error = -1 Then ExitLoop ;Check @error before another function call resets it
                    $currentline = StringStripWS($currentline, 3)
                    If $LineCount > $Track Then
                        GUICtrlSetData($StatusLabel, " Line: " & $LineCount & " of " & $FileLength)
                        $Track = $Track + 5000
                    EndIf
                    If $Stop = 1 Then
                        ExitLoop
                    EndIf
                    If StringInStr($currentline, $SearchTerm, 1, 1) <> 0 Then
                        $text = StringTrimLeft($currentline, StringInStr($currentline, " ", 1, 1))
                        $Page = StringStripWS(StringTrimLeft($currentline2, StringInStr($currentline2, ":", 1, 1) + 1), 3)
                        $Coord = StringTrimLeft($currentline1, StringInStr($currentline1, ",", 1, 1) + 1)
                        $Coord = StringReplace($Coord, " ", "")
                        $Coord = StringReplace($Coord, "X=", "")
                        $Coord = StringReplace($Coord, "Y=", ",")
                        $Coord = StringReplace($Coord, "Z=", ",")
                        $Coord = StringStripWS($Coord, 2)
                        $Temp = $Page & "|" & $text & "|" & $Coord
                        $ItemNum = GUICtrlCreateListViewItem($Temp, $ResultsList)
                    EndIf
                    $LineCount = $LineCount + 1
                WEnd
                FileClose($cleanpath)
            EndIf
            HotKeySet("{Esc}")
            Dim $logarray
            GUICtrlSetData($StatusLabel, " Line: " & $LineCount & " of " & $FileLength)
        EndIf
    EndFunc