LostUser Posted September 2, 2008

I am making a program to get a list of files that were modified during a specified time frame. Currently I read through the entire directory once just to find those files. Once I have the total number of files that meet the criteria, I dimension an array of that size and then read properties from only the matching files.

What I would like to do is read the directory only once (making it run a little faster), getting the names of the files that meet the criteria and saving the properties of those files at the same time. I would (afaik) have to use ReDim to keep enlarging the array as I find more files that fit the criteria.

What I am wondering is whether there are any problems with using ReDim on a variable over and over ... memory, slowness, bad programming, etc.
Nahuel Posted September 2, 2008

Shouldn't be... I don't know exactly how ReDim works internally, but it's what _ArrayAdd() uses, for example. I think it should be better than creating a hugeass array just to make sure you'll fit everything. If you are using UDFs to work with arrays, consider passing them ByRef so there's no copy process. Now that last method IS great for speed and memory.
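A minimal sketch of what passing an array ByRef looks like (the _AppendItem helper and file names are only illustrative):

#include <Array.au3>

; Illustrative helper: the array is passed ByRef, so the function works on
; the caller's array directly instead of on a copy.
Func _AppendItem(ByRef $aList, $vValue)
    ReDim $aList[UBound($aList) + 1]
    $aList[UBound($aList) - 1] = $vValue
EndFunc

Local $aFiles[1] = ["first.txt"]
_AppendItem($aFiles, "second.txt")
_ArrayDisplay($aFiles)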
seandisanti Posted September 2, 2008

Have you looked at FileFindFirstFile and FileFindNextFile?
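The bare FileFindFirstFile / FileFindNextFile pattern looks roughly like this (C:\Windows is only an example path):

; Walk a directory one entry at a time; no array is needed for the walk itself.
Local $sFile
Local $hSearch = FileFindFirstFile("C:\Windows\*.*")
If $hSearch = -1 Then Exit ; no files matched or the path is bad

While 1
    $sFile = FileFindNextFile($hSearch)
    If @error Then ExitLoop ; no more files
    ; check the file's modified time here and keep it only if it qualifies
WEnd
FileClose($hSearch)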
weaponx Posted September 2, 2008

Use DirGetSize to get the number of files in the folder, then use that number to declare your array.
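A rough sketch of that idea; note that DirGetSize counts files in subfolders too, so the number is an upper bound rather than an exact count for a single-folder scan (C:\Windows is only an example path):

; Extended mode (flag 1) makes DirGetSize return an array:
; [0] = size in bytes, [1] = number of files, [2] = number of folders.
Local $aInfo = DirGetSize("C:\Windows", 1)
If @error Then Exit ; path not found

Local $aFiles[$aInfo[1] + 1] ; one slot per file, plus [0] for the used count
$aFiles[0] = 0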
LostUser Posted September 2, 2008

Thanks Nahuel. I didn't even think about looking at the UDFs for ReDim. I'll check out the UDFs and see what goes on behind the scenes.

"Have you looked at FileFindFirstFile and FileFindNextFile?" - That is how I am getting the list of files in the directory.
Nahuel Posted September 2, 2008

Have you tried using _FileListToArray()? I use it all the time and it's quite fast.
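Typical usage, assuming the stock File.au3 UDF (C:\Windows is only an example path):

#include <File.au3>
#include <Array.au3>

; Returns the file names (no paths); element [0] holds the count.
Local $aNames = _FileListToArray("C:\Windows", "*", 1) ; flag 1 = files only
If @error Then Exit ; path invalid or no files found
_ArrayDisplay($aNames, "Files in C:\Windows")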
LostUser Posted September 2, 2008

"Use DirGetSize to get the number of files in the folder, then use that number to declare your array." - That's a good idea, but I only want an array the size of what I need. Since I have specific criteria for the files, I just want an array that size.

"Have you tried using _FileListToArray()? I use it all the time and it's quite fast." - I checked out the code for this and it is basically using FileFindNextFile. The thing is, it doesn't have a way to check for file attributes other than whether an entry is a directory or not. I suppose I could modify it and create my own function to allow for file modification time (or other attributes), but I don't know if I want to work on other code. Though I may borrow something from that UDF.
seandisanti Posted September 2, 2008

Ok, here's an idea... post some code. You probably don't need to do it the way that you're trying to, and you look to be dismissing good ideas offered in favor of a less efficient approach. What do you need the array for? Are you going to be doing something to each file in the array? If so, why use an array at all, since you're already qualifying the files one at a time? If you're going to act on them one at a time once they're in the array, you're actually decreasing efficiency, because instead of just qualifying and acting on each file in turn, you've got a qualifying step, array manipulations, and then the actions. Having your array start a little big is not going to affect performance as much as resizing it several times over. Post code and you will get help targeted at your problem, whether you really know what it is or not.
SmOke_N (Moderator) Posted September 2, 2008

_ArrayAdd uses ReDim every time you add to it. This is slow and cumbersome. Instead:

1. List all the files to an array.
2. Create a temporary array sized to the UBound of that array.
3. Loop through each file to find out if it meets the criteria.
4. If a file meets the criteria, increase a counter by 1: $i_add += 1
5. Store that file in the temporary array at index $i_add.
6. After the loop ends, ReDim one last time with ReDim $a_temp[$i_add + 1], then on the next line set $a_temp[0] = $i_add.

#include <File.au3>
#include <Array.au3>

Local $a_array = _FileListToArray("dir")
Local $a_temp[$a_array[0] + 1]
Local $i_add = 0 ; counts the files that meet the criteria

For $i = 1 To $a_array[0]
    If My conditional statement is true Then ; placeholder - put your criteria check here
        $i_add += 1
        $a_temp[$i_add] = $a_array[$i]
    EndIf
Next

ReDim $a_temp[$i_add + 1]
$a_temp[0] = $i_add
$a_array = $a_temp
$a_temp = ""

_ArrayDisplay($a_array)
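If the array really does need to grow while scanning, one common alternative is to ReDim geometrically, for example doubling the size whenever it fills up, so the number of ReDim calls stays small. A rough sketch (the _AddHit helper and $aHits array are just illustrative names):

; $aHits[0] tracks how many slots are in use; capacity only doubles when full,
; so ReDim runs a handful of times instead of once per matching file.
Local $aHits[16]
$aHits[0] = 0

Func _AddHit(ByRef $aArr, $sItem)
    $aArr[0] += 1
    If $aArr[0] > UBound($aArr) - 1 Then ReDim $aArr[UBound($aArr) * 2]
    $aArr[$aArr[0]] = $sItem
EndFunc

; inside the file loop:
;     If <file meets the criteria> Then _AddHit($aHits, $s_File)
; after the loop, trim once:
;     ReDim $aHits[$aHits[0] + 1]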
LostUser Posted September 2, 2008

Ok, what I am trying to do is make an automated process that mimics what I do when I am removing/finding malware on a PC. I could run AdAware (which I do), but my initial scan on a computer is to look at specific locations and other things and see what is there. There are comment sections at the beginning that are for things later in the program.

Ok, here is my code. I don't always fully 'get' when to use global or local variables and this is a work in progress, so forgive my ignorance and the variable names. Any suggestions on what I am trying to do are appreciated.

; ----------------------------------------------------------------------------
;
; AutoIt Version: v3.2.12.1
;
; Script Function: Testing for common malware file locations, registry
;   locations, pointers, etc.
;
; ----------------------------------------------------------------------------

#include <GUIConstants.au3>
#include <Date.au3>

#cs
    Use a .ini file to hold common file locations and registry locations: registry run keys,
    registry BHO locations, registry locations for disabled items like taskbar and display
    properties tabs, etc. Files will be listed in suspect locations based on newer date/time
    and lack of version and/or identifying information.
    Possibly embed these into the script later?
#ce

#cs
    Registry keys - run location keys
    HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run
    HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\RunOnce
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnceEx
    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\SharedTaskScheduler
    Maybe check here for files found below:
    HKEY_CLASSES_ROOT\CLSID
#ce

#cs
    File locations - check the local user and the All Users sections
    C:\Documents and Settings\[local user]\Start Menu\Programs\Startup
    C:\Documents and Settings\[local user]\Local Settings\Temp
    C:\Documents and Settings\[local user]\Local Settings\Temporary Internet Files
    C:\Documents and Settings\[local user]\Local Settings\Temporary Internet Files\Content.IE5
    C:\Documents and Settings\[local user]\Application Data
    C:\Documents and Settings\[local user]\Local Settings\Application Data
    C:\Temp
    C:\WINDOWS\Temp
    C:\WINDOWS\system32
#ce

;@UserName

#cs
    ini file format
    [paths]
    path1=C:\Documents and Settings\[local user]\Start Menu\Programs\Startup
    path2=C:\Documents and Settings\[local user]\Local Settings\Temp
    path3=C:\Documents and Settings\[local user]\Local Settings\Temporary Internet Files
    path4=C:\Documents and Settings\[local user]\Local Settings\Temporary Internet Files\Content.IE5
    path5=C:\Documents and Settings\[local user]\Application Data
    path6=C:\Documents and Settings\[local user]\Local Settings\Application Data
    path7=C:\Temp
    path8=C:\WINDOWS\Temp
    path9=C:\WINDOWS\system32
#ce

#cs
    Command line applications to help with finding malware
    netstat -a or -na or -nao [time]
    reg query [registry key you want to query]
    dir /a (finds all files, including hidden and system)
    net users (shows user accounts on the system)
    net localgroup administrators (shows users that are members of the administrators group)
    tasklist /svc (shows processes along with all the services running from each process)
#ce

Global $arr_FileList[12]
Global $arr_FileInfo[12] = ["Comments", "InternalName", "ProductName", "CompanyName", "LegalCopyright", "ProductVersion", "FileDescription", "LegalTrademarks", "PrivateBuild", "FileVersion", "OriginalFilename", "SpecialBuild"]
Global $SearchPath = "C:\windows\"
Global $Filter = "*.*"
Global $v_FileName = ""
$FullPath = ""

If FileExists("C:\Test.ini") Then FileDelete("C:\Test.ini")

_MalFileFind()
;MsgBox(0, "", "Array size = " & UBound($arr_FileList) & @CRLF & "contents = " & $arr_FileList[1])

For $z = 1 To UBound($arr_FileList) - 1
    ;MsgBox(0, "", "Doing the For Next Loop")
    $s_GetName = $arr_FileList[$z]
    ;Look for the last '\' or '/' and separate the file name from the end of the path if necessary
    If StringInStr($s_GetName, "\") Then
        $v_FileName = StringMid($s_GetName, StringInStr($s_GetName, "\", 0, -1) + 1, StringLen($s_GetName))
    Else
        If StringInStr($s_GetName, "/") Then
            $v_FileName = StringMid($s_GetName, StringInStr($s_GetName, "/", 0, -1) + 1, StringLen($s_GetName))
        Else
            $v_FileName = $s_GetName
        EndIf
    EndIf
    $FullPath = $SearchPath & $v_FileName
    ;MsgBox(0, "", "Full Path=" & $FullPath)
    If FileExists($FullPath) Then
        IniWrite("C:\Test.ini", $v_FileName, "Location", $FullPath)
        $count = 0
        If StringInStr(FileGetAttrib($FullPath), "D") Then
            IniWrite("C:\Test.ini", $s_GetName, "Type", "Directory")
        Else
            IniWrite("C:\Test.ini", $s_GetName, "Type", "File")
            While 1
                If $count = 12 Then ExitLoop
                $value = FileGetVersion($FullPath, $arr_FileInfo[$count])
                IniWrite("C:\Test.ini", $s_GetName, $arr_FileInfo[$count], $value)
                $count += 1
            WEnd
        EndIf
        IniWrite("C:\Test.ini", $v_FileName, ".", "********************")
    EndIf
Next

Run("notepad.exe c:\test.ini")

Func _MalFileFind() ;Find malware or suspect files based on just being newer.
    $Hours = 192
    $Today = _NowCalc()
    $NewCount = 0
    $count = 0
    $TotalCount = 0
    $hnd_search = FileFindFirstFile($SearchPath & $Filter)
    ;MsgBox(0, "", "FindFirstFile error = " & $hnd_search)

    ;First check and see if there are any files newer than 192 hours (8 days * 24 hours)
    While 1
        $TotalCount += 1
        $s_File = FileFindNextFile($hnd_search)
        $err = @error
        If $err = 1 Then ExitLoop
        $s_FTime = FileGetTime($SearchPath & $s_File, 0, 1)
        $s_FTime = StringMid($s_FTime, 1, 4) & "/" & StringMid($s_FTime, 5, 2) & "/" & StringMid($s_FTime, 7, 2) & " " & StringMid($s_FTime, 9, 2) & ":" & StringMid($s_FTime, 11, 2) & ":" & StringMid($s_FTime, 13, 2)
        $diff = _DateDiff('h', $s_FTime, $Today)
        If $diff < $Hours Then
            $count += 1
        EndIf
    WEnd
    ;FileClose($hnd_search)
    ;MsgBox(0, "", "Error = " & $err)
    ;MsgBox(0, "", "Done checking. Found " & $count & " files less than " & $Hours & " hours old out of " & $TotalCount & " files.")

    ;If any files newer than 192 hours were found, go back through, recheck, and store the file
    ;   names in an array. Have not tested doing one loop and using ReDim.
    $OldCount = $count
    $OldTotal = $TotalCount
    $TotalCount = 0
    $count = 0
    If $OldCount > 0 Then
        ReDim $arr_FileList[$OldCount + 1]
        ;MsgBox(0, "", "Array size = " & UBound($arr_FileList) & @CRLF & "contents = " & $arr_FileList[1])
        $hnd_search = FileFindFirstFile($SearchPath & $Filter)
        While 1
            $TotalCount += 1
            ;If $TotalCount = 0 Then ExitLoop
            $s_File = FileFindNextFile($hnd_search)
            $err = @error
            ;ToolTip("Error = " & $err & @CRLF & "TotalCount = " & $TotalCount & @CRLF & "File = " & $s_File & @CRLF & "File Modified = " & $s_FTime)
            If $err = 1 Then ExitLoop
            $s_FTime = FileGetTime($SearchPath & $s_File, 0, 1)
            $s_FTime = StringMid($s_FTime, 1, 4) & "/" & StringMid($s_FTime, 5, 2) & "/" & StringMid($s_FTime, 7, 2) & " " & StringMid($s_FTime, 9, 2) & ":" & StringMid($s_FTime, 11, 2) & ":" & StringMid($s_FTime, 13, 2)
            $diff = _DateDiff('h', $s_FTime, $Today)
            If $diff < $Hours Then
                $count += 1
                If $count > $OldCount Then
                    ReDim $arr_FileList[$count + 1] ;keep index $count valid if the second scan finds more files
                    MsgBox(0, "Change of number of files scanned.", "The second scan shows a different number of files than the first scan." & @CRLF & "A second message will display once all the files are done" & @CRLF & "being scanned. It will show the number of files from the first and second scans.")
                EndIf
                ;MsgBox(0, "", "NewCount = " & $count & @CRLF & "File = " & $s_File & @CRLF & "File Modified = " & $s_FTime)
                $arr_FileList[$count] = $s_File
                ;MsgBox(0, "", "File: " & $arr_FileList[$count])
            EndIf
        WEnd
        FileClose($hnd_search)
    EndIf
    If $TotalCount <> $OldTotal Or $count <> $OldCount Then
        MsgBox(0, "Change of number of files scanned - report", "First scan:" & @CRLF & "All files counted = " & $OldTotal & @CRLF & "Modified files counted = " & $OldCount & @CRLF & "Second scan:" & @CRLF & "All files counted = " & $TotalCount & @CRLF & "Modified files counted = " & $count)
    EndIf
    ToolTip("")
EndFunc ;_MalFileFind

;Get file information to determine if there is missing information, which could mean that the file is malware.
;Should values be assigned to certain indicators, with the idea that higher values mean
;   it is more likely that a file is malware?

Good luck reading this.
seandisanti Posted September 2, 2008

Good deal, going through it right now, and I'll edit this post with any suggestions. I'm sorry if I sounded negative; everyone really is here to help (myself included), but the dismissals without the code kind of pushed one of my buttons.
LostUser Posted September 2, 2008

Thanks SmOke_N, that is basically what I am wanting to do. I was just wondering if ReDim (over and over) is an efficient way to do it. Sorry about not posting my code; I just wanted to know whether using ReDim over and over could be good or bad.

Also, I haven't looked at this script in a few weeks and I think I may have mis-stated some of what I am already doing.

Looking back at my script, I see that I am initially getting a list of files that meet the modified-time criteria, and then I go back through and use the FileGetVersion command to get all the other file information, which I put into a .ini file. I thought it was going back completely through all the files twice, but it isn't. However, it might be faster to get the version information as soon as a file meets the criteria, keep all that information in an array, and then save it to a .ini file at the end?

I also intend to check file versions only for executables, but this is only the beginning stages.
seandisanti Posted September 2, 2008

At a glance it looks like you can bypass the array completely and just work with each of the files as it is qualified. I mean, currently you're looking to:

1. identify the next file
2. qualify the file
3. ReDim the array
4. add an element to the array
5. access the array UBound times, writing each file to the ini

when you could be just doing:

1. identify the next file
2. qualify the file
3. write to the ini if qualified

No matter how efficiently you handle the extra steps, it's still creating extra steps and technically hurting performance. If you want to see the difference, just make a copy of your script, remove the array portions and handle the ini writing in your initial loop, and use _Timer_Init and _Timer_Diff to benchmark both ways. I'm sure you'll see consistent results verifying that fewer instructions means shorter run time and lower resource usage.
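A self-contained example of that kind of timing, using the built-in TimerInit()/TimerDiff() to compare growing an array one ReDim at a time against declaring it at full size once (5000 is an arbitrary element count):

Local $iCount = 5000

; Approach 1: ReDim on every add.
Local $hTimer = TimerInit()
Local $aGrow[1]
For $i = 1 To $iCount
    ReDim $aGrow[$i + 1]
    $aGrow[$i] = $i
Next
ConsoleWrite("ReDim every add: " & Round(TimerDiff($hTimer)) & " ms" & @CRLF)

; Approach 2: size the array once up front.
$hTimer = TimerInit()
Local $aOnce[$iCount + 1]
For $i = 1 To $iCount
    $aOnce[$i] = $i
Next
ConsoleWrite("Pre-sized array: " & Round(TimerDiff($hTimer)) & " ms" & @CRLF)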
LostUser Posted September 3, 2008

I was going to do it that way, but I think the reason I was using an array was that I wanted to limit disk activity until the process was all done. However, upon reflection, there probably won't be much disk-writing activity, as the files that actually meet the criteria are usually (in my experience) very few ... usually fewer than 20 or 30 in my work environment.

I think I'll work it out writing the ini file as it finds qualified files.

Thanks for all the help, folks. If anyone has any more suggestions I'd be glad to see other ideas. The 'idea' that this is all a part of is still in its infancy.
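A single-pass sketch along those lines, qualifying each file as it is found and writing its details straight to the ini with no intermediate array (the paths, the 192-hour window, and the FileVersion field mirror the script posted earlier and are only examples):

#include <Date.au3>

Local $sDir = "C:\windows\", $sIni = "C:\Test.ini", $iHours = 192
Local $sFile, $sTime

Local $hSearch = FileFindFirstFile($sDir & "*.*")
If $hSearch = -1 Then Exit ; nothing to scan

While 1
    $sFile = FileFindNextFile($hSearch)
    If @error Then ExitLoop
    ; FileGetTime format 1 returns YYYYMMDDHHMMSS; rebuild it for _DateDiff
    $sTime = FileGetTime($sDir & $sFile, 0, 1)
    $sTime = StringMid($sTime, 1, 4) & "/" & StringMid($sTime, 5, 2) & "/" & StringMid($sTime, 7, 2) & " " & _
             StringMid($sTime, 9, 2) & ":" & StringMid($sTime, 11, 2) & ":" & StringMid($sTime, 13, 2)
    If _DateDiff('h', $sTime, _NowCalc()) < $iHours Then
        ; qualified: write it out immediately instead of storing it in an array
        IniWrite($sIni, $sFile, "Location", $sDir & $sFile)
        IniWrite($sIni, $sFile, "FileVersion", FileGetVersion($sDir & $sFile, "FileVersion"))
    EndIf
WEnd
FileClose($hSearch)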