crazycrash

Search for corrupted files, memory allocation problem

5 posts in this topic

#1 ·  Posted (edited)

Hey guys

I'm currently writing a macro to find out which of my files were corrupted, for reasons that are so far unknown. The corrupted files consist of more than 99% NULLs, so I go through every file and calculate the percentage of NULLs, which lets me quickly spot the corrupted ones in my file jungle. The problem I now have is that as soon as a file reaches, let's say, 100 MB, AutoIt runs out of memory...

Thanks for the help/recommendations!

Cheers,

Adrian

#cs ----------------------------------------------------------------------------
AutoIt Version: 3.3.6.1
Author:      Adrian
#ce ----------------------------------------------------------------------------

$logfile = FileOpen("filelist.txt", 1)
; Check if file opened for writing OK
If $logfile = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

$CurrentDir = @WorkingDir
$FilesInFolder = _DriveList(@WorkingDir, 2)
ProgressOn("Parsing Directory", "Progress:", "Working...")

For $i = 1 To $FilesInFolder[0]
    ; Read the whole file in binary mode (16)
    $fh = FileOpen($FilesInFolder[$i], 16)
    $sString = FileRead($fh)
    FileClose($fh)
    ; Count occurrences of $sChar; @extended holds the number of replacements made
    $sChar = "0"
    $iCount = StringRegExpReplace($sString, $sChar, $sChar, 0)
    $TotalNull = @extended
    $TotalChars = StringLen($sString)
    $PercentNull = Round(100 * $TotalNull / $TotalChars, 2)
    FileWriteLine($logfile, $FilesInFolder[$i] & ", " & $PercentNull)
    $currentProgress = Round(100 * $i / $FilesInFolder[0], 0)
    ;MsgBox( 0, "", $currentProgress)
    ProgressSet($currentProgress, $currentProgress & "%")
Next
FileClose($logfile)
Sleep(750)
ProgressOff()


Func _DriveList($sInitialPath, $iMaxLevel = 0) ; CREDIT TO Melba23
Local $asFolderList[2][2] = [[1, 0],[0, 0]]
Local $asFileList[1] = [0]
; Add trailing \ if required
If StringRight($sInitialPath, 1) <> "\" Then $sInitialPath = $sInitialPath & "\"
; Store result
$asFolderList[1][0] = $sInitialPath
; Search in listed folders
While $asFolderList[0][0] > 0
  ; Check if we have exceeded the level limit
  If $asFolderList[$asFolderList[0][0]][1] <= $iMaxLevel Then
   ConsoleWrite("+ Searching: " & $asFolderList[$asFolderList[0][0]][0] & @CRLF)
   ; Set current level
   $iCurrLevel = $asFolderList[$asFolderList[0][0]][1]
   ; Set path to search
   $sCurrentPath = $asFolderList[$asFolderList[0][0]][0]
   ; Reduce folder array count
   $asFolderList[0][0] -= 1
   ; Get Search handle
   $hSearch = FileFindFirstFile($sCurrentPath & "*.*")
   ; If folder empty move to next in list
   If $hSearch = -1 Then ContinueLoop
   ; Search folder
   While 1
    $sName = FileFindNextFile($hSearch)
    ; Check for end of folder
    If @error Then ExitLoop
    ; Check for subfolder - @extended set in 3.3.1.1 +
    If @extended Then ; Add to folder list
     ; Increase folder list count
     $asFolderList[0][0] += 1
     ; Double folder list size if too small (fewer ReDim needed)
     If UBound($asFolderList) <= $asFolderList[0][0] Then ReDim $asFolderList[UBound($asFolderList) * 2][2]
     ; Add folder name
     $asFolderList[$asFolderList[0][0]][0] = $sCurrentPath & $sName & "\"
     ; Add folder level
     $asFolderList[$asFolderList[0][0]][1] = $iCurrLevel + 1
    Else ; Add every file to the file list (no extension filter is applied)
     ; Increase file list count
     $asFileList[0] += 1
     ; Double file list size if too small (fewer ReDim needed)
     If UBound($asFileList) <= $asFileList[0] Then ReDim $asFileList[UBound($asFileList) * 2]
     ; Add file name
     $asFileList[$asFileList[0]] = $sCurrentPath & $sName
    EndIf
   WEnd
   ; Close current search
   FileClose($hSearch)
  Else
   ConsoleWrite("! Ignoring: " & $asFolderList[$asFolderList[0][0]][0] & @CRLF)
   ; Reduce folder array count
   $asFolderList[0][0] -= 1
  EndIf
WEnd
; Remove any unused return list elements from last ReDim
ReDim $asFileList[$asFileList[0] + 1]
Return $asFileList
; Display results
;_ArrayDisplay($asFileList) ; Just for demo - will take a long time if you have a lot of files
EndFunc   ;==>_DriveList

#2 ·  Posted (edited)

:) Obviously, reading a 100 MB file in one go (using the FileRead function) will trigger a crash ;) That's because AutoIt needs at least 100 MB of RAM just to hold the contents of your file...

:) My suggestion: read the file(s) in a loop. The example below comes from a completely different context, but it shows the idea. It copies WinPE.cab (around 200 MB) from the Z: drive to the folder where the script resides, and it can resume the copy if it was interrupted by an error...

#include <WinAPI.au3>
Global $nBytes
Global Const $SIZE = FileGetSize("Z:\WinPE.cab")

$FILER = FileOpen("WinPE.cab", 1)
$hFile = _WinAPI_CreateFile("Z:\WinPE.cab", 2, 2)
If FileExists("WinPE.cab") = 0 Then
    ; Fresh copy: start at offset 0
    For $x = 0 To FileGetSize("Z:\WinPE.cab") Step 40960
        $tBuffer = DllStructCreate("byte[40960]")
        _WinAPI_SetFilePointer($hFile, $x)
        _WinAPI_ReadFile($hFile, DllStructGetPtr($tBuffer), 40960, $nBytes)
        $sText = BinaryToString(DllStructGetData($tBuffer, 1))
        FileWrite($FILER, $sText)
    Next
Else
    ; Resume: start at the current size of the partial copy
    For $x = FileGetSize("WinPE.cab") To FileGetSize("Z:\WinPE.cab") Step 40960
        $tBuffer = DllStructCreate("byte[40960]")
        _WinAPI_SetFilePointer($hFile, $x)
        _WinAPI_ReadFile($hFile, DllStructGetPtr($tBuffer), 40960, $nBytes)
        $sText = BinaryToString(DllStructGetData($tBuffer, 1))
        FileWrite($FILER, $sText)
    Next
EndIf
_WinAPI_CloseHandle($hFile)
FileClose($FILER)

; Truncate the copy to the size of the source file
$FILER2 = _WinAPI_CreateFile("WinPE.cab", 2, 4)
_WinAPI_SetFilePointer($FILER2, $SIZE)
_WinAPI_SetEndOfFile($FILER2)
_WinAPI_CloseHandle($FILER2)

So no, reading the whole file with a single FileRead isn't going to work for you...
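
For the null-counting problem itself, the same "read in a loop" idea might look like the following. This is only a minimal sketch, not code from the thread: `_NullPercent` and its 64 KB default chunk size are made-up names and values. It relies on FileRead accepting a byte count, and on binary data turning into its "0x..." hex text when passed to string functions.

```autoit
; Hypothetical helper (an assumption, not from the thread):
; returns the percentage of 0x00 bytes in a file, or -1 if it cannot be opened.
; The file is read in fixed-size chunks so memory use stays bounded.
Func _NullPercent($sPath, $iChunk = 65536)
    Local $hFile = FileOpen($sPath, 16) ; 16 = binary mode, opened for reading
    If $hFile = -1 Then Return SetError(1, 0, -1)

    Local $iNulls = 0, $iTotal = 0
    Local $bChunk, $iErr, $iLen, $sHex
    While 1
        $bChunk = FileRead($hFile, $iChunk)
        $iErr = @error ; save it right away, later calls reset @error
        $iLen = BinaryLen($bChunk)
        If $iLen > 0 Then
            $iTotal += $iLen
            ; "0x4100" -> "4100" -> "41 00 ": one hex pair per byte, each followed by a space
            $sHex = StringRegExpReplace(StringTrimLeft(String($bChunk), 2), "..", "$0 ")
            ; "00 " can only match a whole byte, never the end of one byte plus the start of the next
            StringReplace($sHex, "00 ", "")
            $iNulls += @extended ; number of replacements = number of null bytes in this chunk
        EndIf
        If $iErr Or $iLen = 0 Then ExitLoop ; end of file or read error
    WEnd
    FileClose($hFile)

    If $iTotal = 0 Then Return 0 ; empty file
    Return Round(100 * $iNulls / $iTotal, 2)
EndFunc   ;==>_NullPercent
```

Calling `_NullPercent($FilesInFolder[$i])` inside the logging loop would then replace the FileRead / StringRegExpReplace pair. Only one chunk and its hex text are held in memory at a time, whatever the file size.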

----------------------------------------

:bye: Hey there, was I helpful?

----------------------------------------

My Current OS: Win8 PRO (64-bit); Current AutoIt Version: v3.3.8.1

MKISH,

Should you be creating the $tBuffer structure inside the loop?

kylomas
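
To illustrate the point of the question: the structure could be allocated once before the loop and reused for every read. A minimal sketch under that assumption follows. `_ReadWithOneBuffer` is a hypothetical helper, not part of the posted script, and it only totals the bytes it reads. Because the reads are sequential, the file pointer advances by itself and no _WinAPI_SetFilePointer call is needed.

```autoit
#include <WinAPI.au3>

; Hypothetical helper (an assumption, not from the thread):
; reads a file through a single reusable buffer and returns the number of
; bytes read, or -1 if the file cannot be opened.
Func _ReadWithOneBuffer($sPath, $iChunk = 40960)
    Local $hFile = _WinAPI_CreateFile($sPath, 2, 2) ; 2 = open existing, 2 = read access
    If $hFile = 0 Then Return SetError(1, 0, -1)

    ; Allocate the buffer once, outside the loop
    Local $tBuffer = DllStructCreate("byte[" & $iChunk & "]")
    Local $pBuffer = DllStructGetPtr($tBuffer)
    Local $nBytes = 0, $iTotal = 0, $bData

    ; ReadFile reports 0 bytes once the end of the file is reached
    While _WinAPI_ReadFile($hFile, $pBuffer, $iChunk, $nBytes) And $nBytes > 0
        ; Only the first $nBytes bytes of the buffer belong to this pass
        $bData = BinaryMid(DllStructGetData($tBuffer, 1), 1, $nBytes)
        $iTotal += BinaryLen($bData)
    WEnd

    _WinAPI_CloseHandle($hFile)
    Return $iTotal
EndFunc   ;==>_ReadWithOneBuffer
```

Trimming each pass to `$nBytes` also means the last, shorter chunk is not padded out to the full buffer size.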


"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

#4 ·  Posted (edited)

Thanks for the tip, that is exactly what I needed. However, after a good night's sleep the problem turned out to be even more trivial: if the first megabyte of a file consists of 100% NULLs, the file is very likely corrupt. Checking only that part not only solves the memory problem but also speeds the whole thing up. Here's the latest version:

#cs ----------------------------------------------------------------------------
AutoIt Version: 3.3.6.1
Author:      Adrian
Script Function: Find corrupted NULL files
#ce ----------------------------------------------------------------------------
$logfile = FileOpen("filelist.csv", 1)
; Check if file opened for writing OK
If $logfile = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

$CurrentDir = @WorkingDir
$FilesInFolder = RecursiveFileSearch(@WorkingDir)
ProgressOn("Search for corrupted NULL files", "Progress:", "Working...")

For $i = 1 To $FilesInFolder[0]
    $fh = FileOpen($FilesInFolder[$i], 16)
    $sString = FileRead($fh, 1048576) ; read only the first megabyte of the file
    FileClose($fh)
    ; Count occurrences of $sChar; @extended holds the number of replacements made
    $sChar = "0"
    $iCount = StringRegExpReplace($sString, $sChar, $sChar, 0)
    $TotalNull = @extended
    $TotalChars = StringLen($sString)
    $PercentNull = Round(100 * $TotalNull / $TotalChars, 2)
    FileWriteLine($logfile, $PercentNull & ';' & $FilesInFolder[$i])
    $currentProgress = Round(100 * $i / $FilesInFolder[0], 0)
    ;MsgBox( 0, "", $currentProgress)
    ProgressSet($currentProgress, $currentProgress & "%", $i & " of " & $FilesInFolder[0] & " processed...")
Next
FileClose($logfile)
Sleep(750)
ProgressOff()
#cs ----------------------------------------------------------------------------
AutoIt Version: 3.2.10.0
Author: WeaponX
Updated: 2/21/08
Script Function: Recursive file search
2/21/08 - Added pattern for folder matching, flag for return type
1/24/08 - Recursion is now optional
Parameters:
RFSstartdir: Path to starting folder
RFSFilepattern: RegEx pattern to match
"\.(mp3)" - Find all mp3 files - case sensitive (by default)
"(?i)\.(mp3)" - Find all mp3 files - case insensitive
"(?-i)\.(mp3|txt)" - Find all mp3 and txt files - case sensitive
RFSFolderpattern:
"(Music|Movies)" - Only match folders named Music or Movies - case sensitive (by default)
"(?i)(Music|Movies)" - Only match folders named Music or Movies - case insensitive
"(?!(Music|Movies)\b)\b.+" - Match folders NOT named Music or Movies - case sensitive (by default)
RFSFlag: Specifies what is returned in the array
0 - Files and folders
1 - Files only
2 - Folders only
RFSrecurse: TRUE = Recursive, FALSE = Non-recursive
RFSdepth: Internal use only
#ce ----------------------------------------------------------------------------
Func RecursiveFileSearch($RFSstartDir, $RFSFilepattern = ".", $RFSFolderpattern = ".", $RFSFlag = 1, $RFSrecurse = True, $RFSdepth = 0)
;Ensure starting folder has a trailing backslash
If StringRight($RFSstartDir, 1) <> "\" Then $RFSstartDir &= "\"
If $RFSdepth = 0 Then
  ;Get count of all files in subfolders for initial array definition
  $RFSfilecount = DirGetSize($RFSstartDir, 1)
  ;File count + folder count (will be resized when the function returns)
  Global $RFSarray[$RFSfilecount[1] + $RFSfilecount[2] + 1]
EndIf
$RFSsearch = FileFindFirstFile($RFSstartDir & "*.*")
If @error Then Return
;Search through all files and folders in directory
While 1
  $RFSnext = FileFindNextFile($RFSsearch)
  If @error Then ExitLoop
  ;If folder and recurse flag is set and regex matches
  If StringInStr(FileGetAttrib($RFSstartDir & $RFSnext), "D") Then
   If $RFSrecurse And StringRegExp($RFSnext, $RFSFolderpattern, 0) Then
    RecursiveFileSearch($RFSstartDir & $RFSnext, $RFSFilepattern, $RFSFolderpattern, $RFSFlag, $RFSrecurse, $RFSdepth + 1)
    If $RFSFlag <> 1 Then
     ;Append folder name to array
     $RFSarray[$RFSarray[0] + 1] = $RFSstartDir & $RFSnext
     $RFSarray[0] += 1
    EndIf
   EndIf
  ElseIf StringRegExp($RFSnext, $RFSFilepattern, 0) And $RFSFlag <> 2 Then
   ;Append file name to array
   $RFSarray[$RFSarray[0] + 1] = $RFSstartDir & $RFSnext
   $RFSarray[0] += 1
  EndIf
WEnd
FileClose($RFSsearch)
If $RFSdepth = 0 Then
  ReDim $RFSarray[$RFSarray[0] + 1]
  Return $RFSarray
EndIf
EndFunc   ;==>RecursiveFileSearch
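
The "first megabyte is 100% NULLs" test can also be written as a plain yes/no check instead of a percentage. The sketch below is one possible way, not the script's actual method: `_StartsWithOnlyNulls` is a hypothetical name. It uses the fact that a byte is 0x00 exactly when both of its hex digits are 0, so the data is all-null when its hex text contains no character other than "0".

```autoit
; Hypothetical helper (an assumption, not from the thread):
; True if the first $iLimit bytes of the file are all 0x00.
; Empty or unreadable files return False.
Func _StartsWithOnlyNulls($sPath, $iLimit = 1048576)
    Local $hFile = FileOpen($sPath, 16) ; 16 = binary mode, opened for reading
    If $hFile = -1 Then Return SetError(1, 0, False)
    Local $bHead = FileRead($hFile, $iLimit)
    FileClose($hFile)

    If BinaryLen($bHead) = 0 Then Return False ; nothing to judge

    ; Drop the leading "0x", then look for any hex digit other than 0
    Local $sHex = StringTrimLeft(String($bHead), 2)
    Return Not StringRegExp($sHex, "[^0]")
EndFunc   ;==>_StartsWithOnlyNulls
```

In the logging loop this could be used as `If _StartsWithOnlyNulls($FilesInFolder[$i]) Then FileWriteLine($logfile, $FilesInFolder[$i])`, so that only the suspect files end up in the list.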

Greetings kylomas. I know you are right, since the contents of $tBuffer are overwritten on every pass of the loop anyway. Still, I didn't run into any real problems with it. Furthermore, I wrote this script around a year ago and haven't bothered much with it since... Many cheers and thanks for your correction.

