Using ReDim over and over...any issues I need to know about?


LostUser

I am making a program to get a list of files that were modified during a specified time frame.

Currently I read the entire directory once to count the files that qualify. Once I have the total number of files that meet the criteria, I dimension an array for that number and go back through, reading properties from only the files that meet the specified criteria.

What I would like to do is read the directory only once (making it run a little faster), getting the names of the files that meet the criteria and saving the properties of those files at the same time. I would (afaik) have to use ReDim to keep enlarging the array as I find more files that fit the criteria.

What I am wondering is whether there are any problems with using ReDim on a variable over and over ... memory, slowness, bad programming, etc.



Shouldn't be a problem... I don't know exactly how ReDim works internally, but _ArrayAdd(), for example, uses it.

I think it's better than creating a huge array just to make sure everything fits. If you are using UDFs to work with arrays, consider passing them ByRef so there's no copy made; that really is great for speed and memory.
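For example, a minimal sketch of the ByRef idea; the _AppendItem() helper here is made up, just to show the pattern:

#include <Array.au3>

; Hypothetical helper: the array is passed ByRef, so no copy is made
; when the function grows it in place.
Func _AppendItem(ByRef $a_list, $v_item)
    ReDim $a_list[UBound($a_list) + 1]
    $a_list[UBound($a_list) - 1] = $v_item
EndFunc   ;==>_AppendItem

Local $a_files[1] = ["first.txt"]
_AppendItem($a_files, "second.txt")
_ArrayDisplay($a_files)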



Have you looked at FileFindFirstFile and FileFindNextFile?
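A bare-bones loop with those two looks something like this (example path):

Local $s_file
Local $h_search = FileFindFirstFile("C:\Windows\*.*") ; example path
If $h_search = -1 Then Exit ; nothing matched the pattern

While 1
    $s_file = FileFindNextFile($h_search)
    If @error Then ExitLoop ; search exhausted
    ConsoleWrite($s_file & @CRLF)
WEnd
FileClose($h_search)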

Thanks Nahuel. I didn't even think about looking at the UDFs that use ReDim. I'll check them out and see what goes on behind the scenes.

Have you looked at FileFindFirstFile and FileFindNextFile?

That is how I am getting the list of files in the directory.



Use DirGetSize to get the number of files in the folder, then use that number to declare your array.
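Presumably that means the extended mode, which returns the file count directly; a minimal sketch (example path):

; Extended mode (flag 1) returns an array: [0] = size in bytes,
; [1] = file count, [2] = folder count
Local $a_info = DirGetSize("C:\Windows", 1) ; example path
If Not @error Then MsgBox(0, "DirGetSize", "Files found: " & $a_info[1])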

That's a good idea, but since I have specific criteria for the files, I only want an array the size of what I actually need.

Have you tried using _FileListToArray()? I use it all the time and it's quite fast.

I checked out the code for this and it basically uses FileFindNextFile. The thing is, it doesn't have a way to check for file attributes other than whether an entry is a directory or not. I suppose I could modify its structure and create my own function that allows filtering on file modification (or other attributes), but I don't know if I want to work on other code. Though I may borrow something from that UDF.
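Something like this attribute check is probably what a modified version would need; a quick sketch (example path):

; FileGetAttrib returns a string of attribute letters ("D" = directory,
; "H" = hidden, "S" = system, "A" = archive, ...)
Local $s_attribs = FileGetAttrib("C:\Windows\system32") ; example path
If StringInStr($s_attribs, "D") Then
    MsgBox(0, "Attributes", "It's a directory (" & $s_attribs & ")")
Else
    MsgBox(0, "Attributes", "It's a file (" & $s_attribs & ")")
EndIf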




That's a good idea, but since I have specific criteria for the files, I only want an array the size of what I actually need.

Ok, here's an idea... post some code. You probably don't need to do it the way you're trying to, and you look to be dismissing good ideas offered in favor of a less efficient approach. What do you need the array for? Are you going to be doing something to each file in the array? If so, why use an array, since you're already qualifying the files one at a time? If you're going to act on them one at a time once they're in the array, you're actually decreasing efficiency: instead of just qualifying and acting on each file in turn, you've got a qualifying step, array manipulations, and then the actions. Having your array start a little big is not going to affect performance as much as modifying the size of the array several times over. Post code and you will get help targeted at your problem, whether you really know what it is or not.


_ArrayAdd uses ReDim every time you add to it. This is slow and cumbersome. Better:

1. List all the files to an array.

2. Create a temp array sized to the UBound of that array.

3. Loop through each file to find out if it meets the criteria.

4. If a file meets the criteria, increase a counter variable by 1: $i_add += 1.

5. Set $a_temp[$i_add] to the current file that meets the criteria.

6. After the loop ends, ReDim one last time, ReDim $a_temp[$i_add + 1], then on the next line set $a_temp[0] = $i_add.

#include <File.au3>
#include <Array.au3>
Local $a_array = _FileListToArray("dir") ; list everything first
Local $a_temp[$a_array[0] + 1] ; worst case: every file qualifies
Local $i_add = 0 ; counts the files that qualify
For $i = 1 To $a_array[0]
    If StringInStr($a_array[$i], ".") Then ; placeholder condition - replace with your own criteria
        $i_add += 1
        $a_temp[$i_add] = $a_array[$i]
    EndIf
Next
ReDim $a_temp[$i_add + 1] ; one final ReDim to trim the excess
$a_temp[0] = $i_add ; element [0] holds the count
$a_array = $a_temp
$a_temp = ""
_ArrayDisplay($a_array)
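If you truly can't pre-count and have to grow as you go, a common compromise (not what the snippet above does) is to ReDim in chunks instead of one element at a time, so you only ReDim a handful of times; a rough sketch:

Local $a_list[16] ; start with some spare capacity
Local $i_used = 0

For $i = 1 To 1000
    If $i_used = UBound($a_list) Then ReDim $a_list[UBound($a_list) * 2] ; double when full
    $a_list[$i_used] = "item " & $i
    $i_used += 1
Next

ReDim $a_list[$i_used] ; one final trim of the unused slots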



Ok, here's an idea... post some code. [...]

Ok, what I am trying to do is make an automated process that mimics what I do when I am finding/removing malware on a PC. I could run Ad-Aware (which I do), but my initial scan on a computer is to look at specific locations and other things and see what is there. The comment sections at the beginning are for things planned later in the program.

Ok, here is my code. I don't always fully 'get' when to use global or local variables, and this is a work in progress, so forgive my ignorance and the variable names. Any suggestions about what I am trying to do are appreciated.

; ----------------------------------------------------------------------------
;
; AutoIt Version: v3.2.12.1
; 
;
; Script Function: Testing for common malware file locations, registry locations, pointers, etc.
;   Template AutoIt script.
;
; ----------------------------------------------------------------------------

#include <guiconstants.au3>
#include <date.au3>

#cs
Use .ini file to hold common file locations and registry locations, registry run, 
 registry BHO locations, registry locations for disabled items like taskbar and
 display properties tabs, etc..
Files will be listed in suspect locations based on newer date/time, lack of
 version and/or identifying information.
 possibly embed these into the script later?
#ce 

#cs Registry keys
run location keys
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\RunOnce
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Run
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnceEx
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\SharedTaskScheduler
Maybe check here for files found below: HKEY_CLASSES_ROOT\CLSID
#ce
#cs File locations
Check the local user and the All Users sections
C:\Documents and Settings\[local user]\Start Menu\Programs\Startup
C:\Documents and Settings\[local user]\Local Settings\Temp
C:\Documents and Settings\[local user]\Local Settings\Temporary Internet Files
C:\Documents and Settings\[local user]\Local Settings\Temporary Internet Files\Content.IE5
C:\Documents and Settings\[local user]\Application Data
C:\Documents and Settings\[local user]\Local Settings\Application Data
C:\Temp
C:\WINDOWS\Temp
C:\WINDOWS\system32

#ce
;@UserName

#cs
ini file format (each key needs a unique name)
[paths]
path1=C:\Documents and Settings\[local user]\Start Menu\Programs\Startup
path2=C:\Documents and Settings\[local user]\Local Settings\Temp
path3=C:\Documents and Settings\[local user]\Local Settings\Temporary Internet Files
path4=C:\Documents and Settings\[local user]\Local Settings\Temporary Internet Files\Content.IE5
path5=C:\Documents and Settings\[local user]\Application Data
path6=C:\Documents and Settings\[local user]\Local Settings\Application Data
path7=C:\Temp
path8=C:\WINDOWS\Temp
path9=C:\WINDOWS\system32
#ce

#cs Command line applications to help with finding malware
netstat -a or -na or -nao [time]
reg query [registry key you want to query]
dir /a (finds all files, including hidden and system)
net users (shows user accounts on the system)
net localgroup administrators (shows users that are members of the administrators group)
tasklist /svc (shows processes along with all the services running from each process)


#ce
;
Global $arr_FileList[12]
Global $arr_FileInfo[12]=["Comments","InternalName","ProductName","CompanyName","LegalCopyright","ProductVersion","FileDescription","LegalTrademarks","PrivateBuild","FileVersion","OriginalFilename","SpecialBuild"]
Global $SearchPath="C:\windows\"
Global $Filter="*.*"
Global $v_FileName=""
$FullPath=""
If FileExists("C:\Test.ini") Then FileDelete("C:\Test.ini")

_MalFileFind ()

;MsgBox(0,"","Array size = " & UBound($arr_FileList) & @CRLF & "contents = " & $arr_FileList[1])
    For $z=1 To UBound($arr_FileList)-1
;MsgBox(0,"","Doing the For Next Loop")
    $s_GetName=$arr_FileList[$z]
;Look for the last '\' or '/' and separate the file name from the end of the path if necessary
        If StringInStr($s_GetName,"\") Then
            $v_FileName=StringMid($s_GetName,StringInStr($s_GetName,"\",0,-1)+1,StringLen($s_GetName))
        Else
            If StringInStr($s_GetName,"/") Then
                $v_FileName=StringMid($s_GetName,StringInStr($s_GetName,"/",0,-1)+1,StringLen($s_GetName))
            Else
                $v_FileName=$s_GetName
            EndIf
        EndIf
        $FullPath=$SearchPath&$v_FileName
;MsgBox(0,"","Full Path=" & $FullPath)
        If FileExists($FullPath) Then
            IniWrite("C:\Test.ini",$v_FileName,"Location",$FullPath)
            $count=0
            If StringInStr(FileGetAttrib($FullPath),"D") Then
                IniWrite("C:\Test.ini",$s_GetName,"Type","Directory")
            Else
                IniWrite("C:\Test.ini",$s_GetName,"Type","File")
                while 1
                    If $count=12 Then ExitLoop
                    $value=FileGetVersion($FullPath,$arr_FileInfo[$count])
                    IniWrite("C:\Test.ini",$s_GetName,$arr_FileInfo[$count],$value)
                    $count+=1
                WEnd
            EndIf
            IniWrite("C:\Test.ini",$v_FileName,".","********************")
        EndIf
    Next
    Run("notepad.exe c:\test.ini")

Func _MalFileFind ()
;Find malware or suspect files based on just being newer.
    $Hours=192
    $Today=_NowCalc()
    $NewCount=0
    $count=0
    $TotalCount=0
    $hnd_search=FileFindFirstFile($SearchPath&$Filter)
;   MsgBox(0,"","FindFirstFile error = " & $hnd_search)
;First check and see if there are any files newer than 192 hours (8 days * 24 hours)
    While 1
        $TotalCount+=1
        $s_File=FileFindNextFile($hnd_search)
        $err=@error
        If $err=1 Then ExitLoop
        $s_FTime=FileGetTime($SearchPath&$s_File,0,1)
        $s_FTime=StringMid($s_FTime,1,4)&"/"&StringMid($s_FTime,5,2)&"/"&StringMid($s_FTime,7,2)&" "&StringMid($s_FTime,9,2)&":"&StringMid($s_FTime,11,2)&":"&StringMid($s_FTime,13,2)
        $diff=_DateDiff('h', $s_FTime,$Today)
        If $diff < $Hours Then
            $count+=1
        EndIf
    WEnd
    FileClose($hnd_search) ; close the first search handle before reopening it below
;   MsgBox(0,"","Error = " & $err)
;   MsgBox(0,"","Done checking.  Found "&$count&" files less than "&$Hours&" hours old out of "&$TotalCount&" files.")
;If there were any files found newer than 192 hours, then go back through and recheck and store the file
; names in an array.  Have not tested doing one loop and using ReDim.
    $OldCount=$count
    $OldTotal=$TotalCount
    $TotalCount=0
    $count=0
    If $OldCount > 0 Then
        ReDim $arr_FileList[$OldCount+1]
;       MsgBox(0,"","Array size = " & UBound($arr_FileList) & @CRLF & "contents = " & $arr_FileList[1])
        $hnd_search=FileFindFirstFile($SearchPath&$Filter)
        While 1
            $TotalCount+=1
;           If $TotalCount=0 Then ExitLoop
            $s_File=FileFindNextFile($hnd_search)
            $err=@error
;ToolTip("Error = " &$err & @CRLF & "TotalCount = " & $TotalCount & @CRLF & "File = " & $s_File & @CRLF & "File Modified = "&$s_FTime)
            If $err=1 Then ExitLoop
            $s_FTime=FileGetTime($SearchPath&$s_File,0,1)
            $s_FTime=StringMid($s_FTime,1,4)&"/"&StringMid($s_FTime,5,2)&"/"&StringMid($s_FTime,7,2)&" "&StringMid($s_FTime,9,2)&":"&StringMid($s_FTime,11,2)&":"&StringMid($s_FTime,13,2)
            $diff=_DateDiff('h', $s_FTime,$Today)
            If $diff < $Hours Then
                $count+=1
                If $count > $OldCount Then 
                    ReDim $arr_FileList[$count+1] ; +1 so index $count below stays in bounds
                    MsgBox(0,"Change of number of files scanned.","The second scan shows a different number of files than the first scan." & @CRLF & "A second message will display once all the files are done" & @CRLF & "being scanned.  It will show the number of files from the first and second scans.")
                EndIf
;               MsgBox(0,"","NewCount = " & $count & @CRLF & "File = " & $s_File & @CRLF & "File Modified = "&$s_FTime)
                $arr_FileList[$count]=$s_File
;               MsgBox(0,"","File: " & $arr_FileList[$Count])
            EndIf
        WEnd
        FileClose($hnd_search)  
    EndIf
    If $TotalCount <> $OldTotal Or $count<>$OldCount Then
        MsgBox(0,"Change of number of files scanned - report","First scan:" & @CRLF & "All files counted =" & $OldTotal & @CRLF & "Modified files counted =" & $OldCount & @CRLF & "Second scan:" & @CRLF & "All files counted =" & $TotalCount & @CRLF & "Modified files counted ="&$count)
    EndIf
    ToolTip("")

EndFunc;_MalFileFind


;Get file information to determine if there is missing and possibly missing information
; which could mean that the file is malware.
;Should values be assigned to certain indicators with the idea that higher values mean
; that it is more likely that a file is malware?

Good luck reading this.



Ok, here is my code. [...] Good luck reading this.

Good deal, going through it right now, and I'll edit this post with any suggestions. I'm sorry if I sounded negative; everyone really is here to help (myself included), but the dismissals without the code kind of pushed one of my buttons.


_ArrayAdd uses ReDim every time you add to it. This is slow and cumbersome. [...]
Thanks SmOke_N, that is basically what I am wanting to do. I was just wondering if ReDim (over and over) is an efficient way to do it.

Sorry about not posting my code; I was just wondering whether using ReDim over and over would be good or bad.

Also, I haven't looked at this script in a few weeks, and I think I may have misstated some of what I am already doing.

Looking back at my script, I see that I am initially getting a list of files that meet the modified-date criteria, but then I go back through and use the FileGetVersion command to get all the other file information. Then I put that into a .ini file.

I thought it was going back completely through all the files twice, but it isn't. However, it might be faster to get the version information as soon as a file meets the criteria, and then, once all that information is in an array, save it into a .ini file.

I also intend on only checking the file versions of executables, but this is only the beginning stages.
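Roughly what I have in mind, as a sketch; FileGetVersion takes an optional field name, and the path and trimmed-down field list here are just examples (untested):

; Same idea as the loop in my script, but run immediately when a file
; qualifies, and only for executables
Local $a_fields[3] = ["CompanyName", "FileDescription", "FileVersion"]
Local $s_file = "C:\Windows\notepad.exe" ; example path
If StringRight($s_file, 4) = ".exe" Then
    For $i = 0 To UBound($a_fields) - 1
        IniWrite("C:\Test.ini", $s_file, $a_fields[$i], FileGetVersion($s_file, $a_fields[$i]))
    Next
EndIf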



Thanks SmOke_N, that is basically what I am wanting to do. [...]

At a glance it looks like you can bypass the array completely and just work with each file as it is qualified. I mean, currently you're looking to:

1. identify next file

2. qualify file

3. redim array

4. add element to array

5. access array UBound times writing each file to ini

when you could be just doing

1. identify next file

2. qualify file

3. write to ini if qualified

No matter how efficiently you handle the extra steps, they're still extra steps, and they technically hurt performance. If you want to see the difference, just make a copy of your script, remove the array portions, handle the ini writing in your initial loop, and use _Timer_Init and _Timer_Diff to benchmark both ways. I'm sure you'll see consistent results verifying that fewer instructions mean a shorter run time and lower resource usage.
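A skeleton for that benchmark, using the built-in TimerInit()/TimerDiff():

Local $h_timer = TimerInit()
; ... run the array-based version here ...
ConsoleWrite("Array version: " & TimerDiff($h_timer) & " ms" & @CRLF)

$h_timer = TimerInit()
; ... run the write-as-you-go version here ...
ConsoleWrite("Direct version: " & TimerDiff($h_timer) & " ms" & @CRLF)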


At a glance it looks like you can bypass the array completely and just work with each file as it is qualified. [...]

I was going to do it that way, but I think the reason I was using an array was that I wanted to limit disk activity until the process was all done. However, upon reflection, there probably won't be much disk-writing activity, as the files that meet the criteria are usually (in my experience) very few: fewer than 20 or 30 in my work environment.

I think that I'll work it out writing the ini file as it finds qualified files.
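A rough single-pass sketch of that plan (example path, untested; same kind of date check as in my script above):

#include <Date.au3>

Local $s_path = "C:\Windows\", $i_hours = 192
Local $s_today = _NowCalc(), $s_file, $s_fTime
Local $h_search = FileFindFirstFile($s_path & "*.*")
If $h_search <> -1 Then
    While 1
        $s_file = FileFindNextFile($h_search)
        If @error Then ExitLoop
        ; FileGetTime returns YYYYMMDDHHMMSS; reformat for _DateDiff
        $s_fTime = FileGetTime($s_path & $s_file, 0, 1)
        $s_fTime = StringRegExpReplace($s_fTime, "(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})", "$1/$2/$3 $4:$5:$6")
        If _DateDiff('h', $s_fTime, $s_today) < $i_hours Then
            ; qualified: write straight to the ini, no array step
            IniWrite("C:\Test.ini", $s_file, "Location", $s_path & $s_file)
        EndIf
    WEnd
    FileClose($h_search)
EndIf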

Thanks for all the help, folks. If anyone has any more suggestions, I'd be glad to see other ideas. The 'idea' that this is all part of is still in its infancy.


