Geir1983

Read file to (huge) array

20 posts in this topic

#1 ·  Posted (edited)

Hi

Since PluginOpen was discontinued in 3.3.10, I need to change my script a bit. I guess I can just use DllCall, but it would be cool to convert my DLL into AutoIt.

What I need to do is read a big file containing an array of 32487834 32-bit integers and place them in an array. It seems AutoIt only allows about 16 million elements in an array, so I was thinking of splitting it over two or more arrays. My problem is how to do this: is there a function like memcpy or something similar I could use?

What i have so far:

#include <FileConstants.au3> ; provides $FO_BINARY

; Create 32 bit int array
; Local $HR[32487834] Too big!
Local $HR1[10829278]
Local $HR2[10829278]
Local $HR3[10829278]

; Initialise every element to integer zero
For $idx = 0 To UBound($HR1) - 1
    $HR1[$idx] = 0
Next
$HR2 = $HR1
$HR3 = $HR1

; Read data from disk (~123 MB) as one binary blob
Local $hFile = FileOpen(@ScriptDir & "\HandRanks.dat", $FO_BINARY)
Local $Data = FileRead($hFile)
FileClose($hFile)

; Now insert $Data into the $HR arrays
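
Splitting one logical table over several arrays comes down to index arithmetic: a global index is divided by the chunk size to pick the array, and the remainder is the position inside it. A minimal sketch in C (the language of the DLL in this thread), assuming three equal chunks of 10829278 elements as in the declarations above. The names `CHUNK_SIZE`, `chunk_of` and `offset_in_chunk` are illustrative and not part of the original script.

```c
#include <stddef.h>

/* Assumption: three equal chunks, matching the $HR1..$HR3 declarations. */
#define CHUNK_SIZE 10829278u

/* Which chunk (0, 1 or 2) holds the element at the given global index. */
size_t chunk_of(size_t index)
{
    return index / CHUNK_SIZE;
}

/* Position of that element inside its chunk. */
size_t offset_in_chunk(size_t index)
{
    return index % CHUNK_SIZE;
}
```

Every lookup then pays for one division and one remainder, which is part of why a single contiguous block of memory is the more attractive option here.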

For reference this is the DLL:

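#include <stdio.h>
#include <string.h>

int HR[32487834];

/* Load the lookup table; returns 1 on success, 0 on failure. */
int InitData()
{
    memset(HR, 0, sizeof(HR));
    FILE * fin = fopen("HANDRANKS.DAT", "rb");
    if (!fin)
        return 0;
    /* fread returns the number of whole blocks read: 1 if the full table was loaded */
    size_t blocksread = fread(HR, sizeof(HR), 1, fin);
    fclose(fin);
    return blocksread == 1;
}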

int GetHandValue(int* pCards)
{
    int p = HR[53 + *pCards++];
    p = HR[p + *pCards++];
    p = HR[p + *pCards++];
    p = HR[p + *pCards++];
    p = HR[p + *pCards++];
    p = HR[p + *pCards++];
    return HR[p + *pCards++];
}

Edited by Geir1983


You might be better off using DllStructs; I don't think they have a limit.

For example, if your file is structured like so...

Test.txt

123
456
789
123
456
789
123
456
789
123
456
789

Then...

$strStruct = "int[32487834]"

$intStruct = DllStructCreate($strStruct)
$sFile = "Test.txt"

_FileToStruct($sFile, $intStruct)

For $i = 1 To 20
    ConsoleWrite(DllStructGetData($intStruct, 1, $i) & @LF)
Next

Func _FileToStruct($file, ByRef $struct)

    If Not IsDllStruct($struct) Then
        Exit MsgBox(0, "Error", "Not struct")
    EndIf

    $size = FileGetSize($file)
    $open = FileOpen($file)
    $index = 1

    Do
        $line = FileReadLine($open)
        DllStructSetData($struct, 1, Int($line), $index)
        $index += 1
    Until FileGetPos($open) >= $size

    FileClose($open)

EndFunc   ;==>_FileToStruct


Thanks, this worked :)

My file is all binary, so I had to make some small changes to your example: reading 4 bytes at a time instead of line by line, and then placing each value into the array. However, it takes a lot of time to read all the elements, about 130 seconds. Reading just the file itself takes about 3-4 seconds. Is there a smarter way to populate the DllStruct than one element at a time?

Reading the array is not super fast, but I can do 140k queries (plus some additional code) in about 2 seconds, so I'm pretty happy with that. It is pretty close to what I had with PluginOpen (maybe even faster, but I am comparing on different machines).

$strStruct = "int[32487834]"

Global $intStruct = DllStructCreate($strStruct)
$sFile = @scriptdir & "\HandRanks.dat"

_FileToStruct($sFile, $intStruct)

Func _FileToStruct($file, ByRef $struct)

    If Not IsDllStruct($struct) Then
        Exit MsgBox(0, "Error", "Not struct")
    EndIf

    $size = FileGetSize($file)
    $open = FileOpen($file, $FO_BINARY)
    $index = 0
    $begin = TimerInit()
    Do
        ; Sequential 4-byte reads advance the file position on their own, so no FileSetPos call is needed
        $line = FileRead($open, 4)
        DllStructSetData($struct, 1, Int($line), $index)
        $index += 1
    Until FileGetPos($open) >= $size

    $dif = TimerDiff($begin)
    ConsoleWrite("File placed in array in (ms): " & $dif & ", Number of elements read: " & $index & @CRLF)
    FileClose($open)

EndFunc   ;==>_FileToStruct


#5 ·  Posted (edited)

Yes, a 32-bit integer always occupies 4 bytes in memory (binary mode).

Edit: This means my file is identical to the way the array is stored as a variable in memory. It will always have the same number of bytes allocated (elements * size) in memory. This also gives the same file size no matter the content of the array: elements with a zero value still occupy space, so the total is always about 123 MB (32487834 elements * 4 bytes). Some languages have functions for copying memory contents directly to another area, but I have not seen this in AutoIt, and it also seems different from this DllStruct.
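
Because the file is a byte-for-byte image of the array, the whole table can be moved with a single block copy instead of an element-by-element loop. A minimal sketch in C of that kind of direct memory copy, assuming the raw bytes are already in a buffer; the function name `bytes_to_int32` and its parameters are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a raw byte image into a table of 32-bit integers with one memcpy.
   Returns the number of whole elements copied (never more than out_capacity). */
size_t bytes_to_int32(const unsigned char *bytes, size_t byte_count,
                      int32_t *out, size_t out_capacity)
{
    size_t count = byte_count / sizeof(int32_t); /* ignore any trailing partial element */
    if (count > out_capacity)
        count = out_capacity;
    memcpy(out, bytes, count * sizeof(int32_t));
    return count;
}
```

The copy does no per-element conversion, so it only gives the right values when the file was written with the same byte order and integer size as the machine reading it.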

Edited by Geir1983


#7 ·  Posted (edited)

Hmm, that might work, I'll test it.

I tried it now, but reading the data from the file is a lot slower than reading it from the DllStruct. It is also a lot more complex to calculate the address to read. With 140k queries (this is the most important part), the timer went from 2 seconds to 8 seconds, so it's no good.

Is it possible to read all of the file into a variable and access it like FileRead does (with an offset and a size)?

Edit: Actually, I could remove some statements and improve the FileRead method to 2.9 s by removing Int() and the redundant multiplication (compared with 1.9 s when using the DllStruct). Maybe you have some more tips on optimizing?

Edit 2: Never mind, I think I need the Int(). When I put it back in, I'm back to 8.5 s.

DllStruct:

Func AnalyzeHand(Byref $Card1)
    $p = DllStructGetData($intStruct, 1, 53 + $Card1[0])
    $p = DllStructGetData($intStruct, 1, $p + $Card1[1])
    $p = DllStructGetData($intStruct, 1, $p + $Card1[2])
    $p = DllStructGetData($intStruct, 1, $p + $Card1[3])
    $p = DllStructGetData($intStruct, 1, $p + $Card1[4])
    $p = DllStructGetData($intStruct, 1, $p + $Card1[5])
    $p = DllStructGetData($intStruct, 1, $p + $Card1[6])
    Return DllStructGetData($intStruct, 1, $p)
EndFunc

FileRead:

Func AnalyzeHand(Byref $Card1)
    FileSetPos($open, (53 + $Card1[0])*4, $FILE_BEGIN)
    $p = Int(FileRead($open, 4))
    FileSetPos($open, ($p+$Card1[1])*4, $FILE_BEGIN)
    $p = Int(FileRead($open, 4))
    FileSetPos($open, ($p+$Card1[2])*4, $FILE_BEGIN)
    $p = Int(FileRead($open, 4))
    FileSetPos($open, ($p+$Card1[3])*4, $FILE_BEGIN)
    $p = Int(FileRead($open, 4))
    FileSetPos($open, ($p+$Card1[4])*4, $FILE_BEGIN)
    $p = Int(FileRead($open, 4))
    FileSetPos($open, ($p+$Card1[5])*4, $FILE_BEGIN)
    $p = Int(FileRead($open, 4))
    FileSetPos($open, ($p+$Card1[6])*4, $FILE_BEGIN)
    Return Int(FileRead($open, 4))
EndFunc
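
Both variants perform the same chained lookup: each value read from the table becomes the base index for the next card. A hedged sketch in C of that walk with bounds checking added, so it can be tried on a tiny synthetic table; `walk_table` and its parameters are hypothetical names, and the check is an addition that the original DLL does not have.

```c
#include <stddef.h>

/* Chained table lookup: start from `start`, and for every card move to
   table[state + card]. Returns the final state, or -1 if any index
   falls outside the table. */
int walk_table(const int *table, size_t table_len, int start,
               const int *cards, size_t card_count)
{
    long long state = start;
    for (size_t i = 0; i < card_count; ++i) {
        long long idx = state + cards[i];
        if (idx < 0 || (unsigned long long)idx >= table_len)
            return -1; /* corrupt table or invalid card value */
        state = table[idx];
    }
    return (int)state;
}
```

With the real table, calling it with `start` set to 53 and seven card values corresponds to the seven lookups in `GetHandValue`. Each step is one memory read, which is why seeking in a file for every step is so much slower.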
Edited by Geir1983


The Dll*() functions should only be used for APIs and not internally in AutoIt.



Then do you have any other suggestion for the 32487834 element array?


SQL database?



Take no notice of guinness, he's an old fuddy duddy. :P

Have you considered converting your dat file to SQLite db?

That's pretty quick.



Hmm, I guess it could be done, but I doubt it is faster to query an SQLite db than a DllStruct from AutoIt. It would eliminate the long time needed to populate the array at startup, though.


Looks like there is a need to evaluate card hands (poker?).

Even assuming this is legit, an array won't do since AutoIt arrays are limited to 2^24 (16,777,216) entries.

Again assuming legit application, DllStruct is the fastest way here.



Geir1983, have you tried something like this:

 

$hFile = FileOpen( "file", 16 ) ; 16 = binary mode
$sBinData = FileRead( $hFile )
$iBytes = @extended
FileClose( $hFile )

$tInts = DllStructCreate( "int[" & $iBytes/4 & "]" )
$pInts = DllStructGetPtr( $tInts )
$tBytes = DllStructCreate( "byte[" & $iBytes & "]", $pInts )
DllStructSetData( $tBytes, 1, $sBinData )

Now $tInts should contain your integers.
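
The speed-up comes from filling the whole block of memory in one call and only then viewing it as integers. For comparison, a minimal sketch of the same bulk-load idea in C. The function name `load_int32_table` is hypothetical, and the sketch assumes the file holds nothing but raw 32-bit integers in the machine's native byte order.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Read a file of raw 32-bit integers with a single fread.
   On success returns a malloc'd table (the caller frees it) and stores the
   element count in *count_out; returns NULL on any failure. */
int32_t *load_int32_table(const char *path, size_t *count_out)
{
    FILE *fin = fopen(path, "rb");
    if (!fin)
        return NULL;

    /* Find the file size so the whole table can be allocated at once. */
    if (fseek(fin, 0, SEEK_END) != 0) { fclose(fin); return NULL; }
    long size = ftell(fin);
    if (size < (long)sizeof(int32_t) || fseek(fin, 0, SEEK_SET) != 0) {
        fclose(fin);
        return NULL;
    }

    size_t count = (size_t)size / sizeof(int32_t);
    int32_t *table = malloc(count * sizeof(int32_t));
    if (!table) { fclose(fin); return NULL; }

    /* One bulk read instead of one call per element. */
    size_t got = fread(table, sizeof(int32_t), count, fin);
    fclose(fin);
    if (got != count) { free(table); return NULL; }

    *count_out = count;
    return table;
}
```

One thing to keep in mind for the AutoIt version: DllStructGetData indexes array elements from 1, so the first integer of the file ends up at index 1, and lookups ported from 0-based C code may need adjusting by one.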


Lars, that is exactly the magic I was looking for: the array is populated in no time. Thanks :sorcerer:

John, I did a test populating the DllStruct one element at a time without any FileRead operations (just inserting a constant value into every element), and it still took nearly a minute. Thanks for all your help!  :thumbsup: 

Oh, and by the way, this is legit.
