Overlapped (Asynchronous) & UnBuffered ReadFile Example

3 posts in this topic

#1 ·  Posted (edited)

HiHo Forum,

in my program SMF - Search my Files I use a special method to calculate fake (and fast) MD5 hashes to identify duplicate files. With the standard settings, SMF reads the first 8 KB, 8 KB from the middle and the last 8 KB of a file and calculates an MD5 hash over that data. Now I've noticed that the larger the files, the slower the calculation (makes sense :D). I assume that's because the SetFilePointer() operation also takes its time for large files. Throughput drops from about 300 files/sec to 30 files/sec (depending, of course, on the file size and your machine's power).
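The sampling idea described above can be sketched roughly as follows. This is a minimal Python illustration, not SMF's actual code: the function name `sample_md5` is hypothetical, and hashing small files in full is an assumption made only so the sketch handles every file size.

```python
import hashlib
import os


def sample_md5(path, block=8192):
    """Hash the first, middle and last `block` bytes of a file (a 'fake' MD5)."""
    size = os.path.getsize(path)
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        if size <= 3 * block:
            # Assumption: files too small for three separate blocks are hashed in full.
            md5.update(f.read())
        else:
            # Three seeks and three reads, regardless of how large the file is.
            for offset in (0, (size - block) // 2, size - block):
                f.seek(offset)
                md5.update(f.read(block))
    return md5.hexdigest()
```

Two files with the same sampled hash are only candidates for being duplicates; a full comparison would still be needed to confirm it.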

Looking for a way to further improve search and hashing speed, I stumbled over "overlapped" file access (that's Microsoft's term for asynchronous I/O). You request several portions of a file and wait for the results to show up; the ReadFile() function itself returns immediately and does not wait for the buffer to be filled (as opposed to the standard behavior, where the function only returns once the operation has finished).

Currently SMF requests the 3 blocks of data and calculates the hash after all of them have been read. What I want to try out and test is to request all 3 blocks in overlapped mode and start calculating the hash for each block independently as soon as it arrives (maybe even in a different thread, as pointed out by Ward). My assumption is that this might improve performance for large files (> 10 MB? I'll just have to give it a shot); for smaller files I think the standard operation should be superior.
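The "request everything, then process each block as it arrives" pattern can be illustrated with a small Python sketch. Note that this swaps in a thread pool as a stand-in for overlapped I/O; it does not use the Win32 OVERLAPPED mechanism, and the names `read_block` and `hash_blocks_as_completed` are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor, as_completed


def read_block(path, offset, length):
    # Each request opens its own handle, so no shared file pointer is involved.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def hash_blocks_as_completed(path, offsets, length=8192):
    """Request all blocks at once and hash each one as soon as its read finishes.

    `offsets` must contain at least one entry. Returns {offset: md5 hex digest}.
    """
    digests = {}
    with ThreadPoolExecutor(max_workers=len(offsets)) as pool:
        pending = {pool.submit(read_block, path, off, length): off for off in offsets}
        # as_completed yields the reads in the order they finish, not the order requested.
        for future in as_completed(pending):
            digests[pending[future]] = hashlib.md5(future.result()).hexdigest()
    return digests
```

Whether this beats three plain sequential reads depends on file size, caching and the storage device, so it would have to be measured rather than assumed.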

As I haven't found any example of asynchronous file access on the forum, I thought I'd just post the WIP code for those interested to take a look and help me improve it... It's just a crude and raw example, but at least it works (on my machines anyhow).

Yashied's most excellent WinAPIEx UDF is required for the example to work.

#region ;**** Directives created by AutoIt3Wrapper_GUI ****
#endregion ;**** Directives created by AutoIt3Wrapper_GUI ****
#include <StructureConstants.au3>
#include <WinAPIEx.au3>
#include <Memory.au3>
#include <array.au3>

Global $nBytes, $hFile
Global Const $ERROR_IO_INCOMPLETE = 996 ; Overlapped I/O event is not in a signaled state
Global Const $ERROR_IO_PENDING = 997 ; Overlapped I/O operation is in progress
$sFile = FileOpenDialog("Select a large file to open for overlapped (asynchronous) reading...", StringLeft(@WindowsDir, 3), "All (*.*)", 3)
If @error Then Exit
$sFile = StringReplace($sFile, "|", @CRLF)
If FileGetSize($sFile) < 1024 * 100 Then
    MsgBox(0, "", "Larger than 100kb would make sense...")
    Exit
EndIf
$aDrive = _WinAPI_GetDriveNumber(StringLeft($sFile, 2))
$aData = _WinAPI_GetDriveGeometryEx_RO($aDrive[1])
ConsoleWrite(_WinAPI_GetLastErrorMessage() & @TAB & @error & @CRLF)
; 'Bytes per Sector: ' & $aData[4]
; File access sizes, including the optional file offset in the OVERLAPPED structure, if specified,
; must be for a number of bytes that is an integer multiple of the volume sector size
$iBytesToRead = 16 * $aData[4] ; = 8,192 bytes with 512 bytes per sector
; Because buffer addresses for read and write operations must be sector-aligned, the application must have direct control of how these buffers are allocated.
; One way to sector-align buffers is to use the VirtualAlloc function to allocate the buffers
$pBuffer_Mem = _MemVirtualAlloc(0, $iBytesToRead, $MEM_COMMIT, $PAGE_READWRITE) ;
$tBuffer = DllStructCreate("byte[" & $iBytesToRead & "];byte[" & $iBytesToRead & "];byte[" & $iBytesToRead & "]")
; $GENERIC_READ = 0x80000000
; $FILE_FLAG_OVERLAPPED = 0x40000000
; $FILE_FLAG_NO_BUFFERING = 0x20000000
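Global Const $FILE_FLAG_OVERLAPPED = 0x40000000
Global Const $FILE_FLAG_NO_BUFFERING = 0x20000000
; Open the file with both flags set (assumed call; argument order follows _ReadFile_Standard below)
$hFile = _WinAPI_CreateFileEx($sFile, 3, $GENERIC_READ, 7, BitOR($FILE_FLAG_OVERLAPPED, $FILE_FLAG_NO_BUFFERING))
If Not $hFile Then Exit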
$iTimer = TimerInit()
$iFileGetSize = _WinAPI_GetFileSizeEx($hFile)
;ConsoleWrite(@CRLF & "+ Filesize " & $iFileGetSize & @CRLF & @CRLF)
$iTimer_0 = TimerDiff($iTimer)
; Global Const $tagOVERLAPPED = "int Internal;int InternalHigh;int Offset;int OffsetHigh;int hEvent"
$tOverlapped1 = DllStructCreate($tagOVERLAPPED)
$pOverlapped1 = DllStructGetPtr($tOverlapped1)
_WinAPI_ReadFile($hFile, DllStructGetPtr($tBuffer, 1), $iBytesToRead, $nBytes, $pOverlapped1)
$iTimer_1 = TimerDiff($iTimer)
$sTimer_1 = _WinAPI_GetLastError()
$tOverlapped2 = DllStructCreate($tagOVERLAPPED)
$pOverlapped2 = DllStructGetPtr($tOverlapped2)
$iOffset = Int(($iFileGetSize / 2) - ($iBytesToRead / 2 + 1))
$iOffset = (Floor($iOffset / $aData[4])) * $aData[4]
DllStructSetData($tOverlapped2, "Offset", _WinAPI_LoDWord($iOffset)) ; Setting "Filepointer" to the middle of the file (SetFilePointer is not valid for overlapped operations)
DllStructSetData($tOverlapped2, "OffsetHigh", _WinAPI_HiDWord($iOffset))
_WinAPI_ReadFile($hFile, DllStructGetPtr($tBuffer, 2), $iBytesToRead, $nBytes, $pOverlapped2)
$iTimer_2 = TimerDiff($iTimer)
$sTimer_2 = _WinAPI_GetLastError()
$tOverlapped3 = DllStructCreate($tagOVERLAPPED)
$pOverlapped3 = DllStructGetPtr($tOverlapped3)
$iOffset = $iFileGetSize - $iBytesToRead
$iOffset = (Floor($iOffset / $aData[4])) * $aData[4]
DllStructSetData($tOverlapped3, "Offset", _WinAPI_LoDWord($iOffset)) ; Setting "Filepointer" to the end of the file (SetFilePointer is not valid for overlapped operations)
DllStructSetData($tOverlapped3, "OffsetHigh", _WinAPI_HiDWord($iOffset))
_WinAPI_ReadFile($hFile, DllStructGetPtr($tBuffer, 3), $iBytesToRead, $nBytes, $pOverlapped3)
$iTimer_3 = TimerDiff($iTimer)
$sTimer_3 = _WinAPI_GetLastError()
$sRes1 = _WinAPI_GetOverlappedResult($hFile, $pOverlapped1, $nBytes)
$sRes1 = $sRes1 & @TAB & $nBytes
$iTimer_4 = TimerDiff($iTimer)
$sTimer_4 = _WinAPI_GetLastError()
$sRes2 = _WinAPI_GetOverlappedResult($hFile, $pOverlapped2, $nBytes)
$sRes2 = $sRes2 & @TAB & $nBytes
$iTimer_5 = TimerDiff($iTimer)
$sTimer_5 = _WinAPI_GetLastError()
$sRes3 = _WinAPI_GetOverlappedResult($hFile, $pOverlapped3, $nBytes)
$sRes3 = $sRes3 & @TAB & $nBytes
$iTimer_6 = TimerDiff($iTimer)
$sTimer_6 = _WinAPI_GetLastError()
_WinAPI_CloseHandle($hFile) ; seems to block until the pending reads have finished
$iTimer_7 = TimerDiff($iTimer)
$sTimer_7 = _WinAPI_GetLastError()
$sRes4 = _WinAPI_GetOverlappedResult($hFile, $pOverlapped1, $nBytes)
$sRes4 = $sRes4 & @TAB & $nBytes
$sRes5 = _WinAPI_GetOverlappedResult($hFile, $pOverlapped2, $nBytes)
$sRes5 = $sRes5 & @TAB & $nBytes
$sRes6 = _WinAPI_GetOverlappedResult($hFile, $pOverlapped3, $nBytes)
$sRes6 = $sRes6 & @TAB & $nBytes

_MemVirtualFree($pBuffer_Mem, $iBytesToRead, $MEM_DECOMMIT)
$iTimer = TimerInit()
$t_ReadFile_Standard = _ReadFile_Standard($sFile, $iFileGetSize)
ConsoleWrite("_WinAPI_CloseHandle - Before " & $sRes1 & @TAB & $sRes2 & @TAB & $sRes3 & @CRLF)
ConsoleWrite("_WinAPI_CloseHandle - After " & $sRes4 & @TAB & $sRes5 & @TAB & $sRes6 & @CRLF & @CRLF)
ConsoleWrite($iTimer_0 & @CRLF & $iTimer_1 & @CRLF & $iTimer_2 & @CRLF & $iTimer_3 & @CRLF & $iTimer_4 & @CRLF & $iTimer_5 & @CRLF & $iTimer_6 & @CRLF & $iTimer_7 & @CRLF & @CRLF & TimerDiff($iTimer) & @CRLF & @CRLF)
ConsoleWrite($sTimer_1 & @CRLF & $sTimer_2 & @CRLF & $sTimer_3 & @CRLF & $sTimer_4 & @CRLF & $sTimer_5 & @CRLF & $sTimer_6 & @CRLF & $sTimer_7 & @CRLF)
MsgBox(0, "", StringLeft(DllStructGetData($tBuffer, 1), 5) & StringRight(DllStructGetData($tBuffer, 1), 5) & @CRLF _
         & StringLeft(DllStructGetData($tBuffer, 2), 5) & StringRight(DllStructGetData($tBuffer, 2), 5) & @CRLF _
         & StringLeft(DllStructGetData($tBuffer, 3), 5) & StringRight(DllStructGetData($tBuffer, 3), 5) & @CRLF _
         & @CRLF & @CRLF _
         & StringLeft(DllStructGetData($t_ReadFile_Standard, 1), 5) & StringRight(DllStructGetData($t_ReadFile_Standard, 1), 5) & @CRLF _
         & StringLeft(DllStructGetData($t_ReadFile_Standard, 2), 5) & StringRight(DllStructGetData($t_ReadFile_Standard, 2), 5) & @CRLF _
         & StringLeft(DllStructGetData($t_ReadFile_Standard, 3), 5) & StringRight(DllStructGetData($t_ReadFile_Standard, 3), 5))
Func _ReadFile_Standard($sFile, $iFileGetSize, $iFlag = 0)
    ; Local $hFile = _WinAPI_CreateFile($Checksum_Filename, 2, 2, 7), $Checksum_Result, $nBytes
    ; FILE_FLAG_SEQUENTIAL_SCAN = 0x08000000
    Local $hFile = _WinAPI_CreateFileEx($sFile, 3, $GENERIC_READ, 7, $iFlag), $nBytes
    If $hFile = 0 Then Return SetError(4) ;"File was locked and could not be analyzed..."
    Local $tBuffer = DllStructCreate("byte[" & $iBytesToRead & "];byte[" & $iBytesToRead & "];byte[" & $iBytesToRead & "]")
    _WinAPI_ReadFile($hFile, DllStructGetPtr($tBuffer, 1), $iBytesToRead, $nBytes)
    $iOffset = Int(($iFileGetSize / 2) - ($iBytesToRead / 2 + 1))
    $iOffset = (Floor($iOffset / $aData[4])) * $aData[4]
    _WinAPI_SetFilePointerEx($hFile, $iOffset)
    _WinAPI_ReadFile($hFile, DllStructGetPtr($tBuffer, 2), $iBytesToRead, $nBytes)
    $iOffset = $iFileGetSize - $iBytesToRead
    $iOffset = (Floor($iOffset / $aData[4])) * $aData[4]
    _WinAPI_SetFilePointerEx($hFile, $iOffset) ; $iBytesToRead/2 +1
    _WinAPI_ReadFile($hFile, DllStructGetPtr($tBuffer, 3), $iBytesToRead, $nBytes)
    _WinAPI_CloseHandle($hFile) ; release the handle before returning
    Return $tBuffer
EndFunc   ;==>_ReadFile_Standard

Func _WinAPI_GetDriveGeometryEx_RO($iDrive)

    ; Desired access 0 (instead of $GENERIC_READ) is enough to query the drive geometry
    Local $hFile = _WinAPI_CreateFileEx('\\.\PhysicalDrive' & $iDrive, 3, 0, 0x01)

    If Not $hFile Then
        Return SetError(1, 0, 0)
    EndIf

    Local $tDGEX = DllStructCreate('int64;dword;dword;dword;dword;int64')
    ; 0x000700A0 = IOCTL_DISK_GET_DRIVE_GEOMETRY_EX
    Local $Ret = DllCall('kernel32.dll', 'int', 'DeviceIoControl', 'ptr', $hFile, 'dword', 0x000700A0, 'ptr', 0, 'dword', 0, 'ptr', DllStructGetPtr($tDGEX), 'dword', DllStructGetSize($tDGEX), 'dword*', 0, 'ptr', 0)

    If (@error) Or (Not $Ret[0]) Then
        $Ret = 0
    EndIf
    _WinAPI_CloseHandle($hFile)
    If Not IsArray($Ret) Then
        Return SetError(2, 0, 0)
    EndIf

    Local $Result[6]

    ; $Result[4] holds the bytes per sector used by the main script
    For $i = 0 To 5
        $Result[$i] = DllStructGetData($tDGEX, $i + 1)
    Next
    Return $Result
EndFunc   ;==>_WinAPI_GetDriveGeometryEx_RO
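
The offset arithmetic in the listing above rounds every file offset down to a multiple of the sector size, which unbuffered reads require. A minimal Python sketch of the same calculation; the function name `aligned_offsets` is hypothetical and the 512-byte sector size is only a default assumption (the listing queries the real value from the drive):

```python
def aligned_offsets(file_size, bytes_to_read, sector=512):
    """Return sector-aligned offsets for the first, middle and last block."""

    def align_down(n):
        # Round n down to the nearest multiple of the sector size.
        return (n // sector) * sector

    middle = align_down(file_size // 2 - (bytes_to_read // 2 + 1))
    last = align_down(file_size - bytes_to_read)
    return 0, middle, last


print(aligned_offsets(1_000_000, 8192))  # (0, 495616, 991744)
```

Because the last offset is rounded down, the last block does not necessarily end exactly at the end of the file: in the example it stops at byte 999,936, so the final 64 bytes are not covered.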


Just wanted to point out that this technique only makes sense in very special cases. The default synchronous FileRead() goes through the system's cache manager, is much faster (and easier to handle) than this asynchronous, unbuffered ReadFile method, and should be the choice for 99.9% of your needs. For further info take a look at these pages:

MSDN - ReadFile function

MSDN - Synchronous and Asynchronous I/O

MSKB - Asynchronous Disk I/O Appears as Synchronous

One observation I've made is that the _WinAPI_CloseHandle($hFile) call seems to block until the pending operations have finished. Before that, the progress of the individual requests can be monitored using the _WinAPI_GetOverlappedResult() function.

Best Regards

Edited by KaFu


Very interesting KaFu. Thanks :D


#3 ·  Posted (edited)

Just realized that the example fails on Win7. The reason is that the _WinAPI_GetDriveGeometryEx() function requires admin rights to work. I've changed the requested access rights for the _WinAPI_CreateFileEx() call in that function from 0x80000000 ($GENERIC_READ) to 0, and now it works fine :D. This is also documented on MSDN ("Note The dwDesiredAccess parameter can be zero...").

Edited by KaFu
