squishy Posted June 2, 2007 Posted June 2, 2007 Hello! I am trying to read large binary files (specifically, .dbf files) into an array. However, using my current code, this takes FOREVER! I have started learning about using DLLStructs...would using these be any faster when loading each record? Any help would be appreciated... Thanks! expandcollapse popup#include <File.au3> #include <Date.au3> Local $fDBF,$data,$rsHeader[6],$dLastUpdate,$numfields,$temp,$fielddesc[2],$rsData[2],$x,$y,$numrecs,$numbytesheader,$reclen,$pos,$currecord,$curfield,$t $t = _NowCalc() $fDBF = FileOpen("G:\bvinfo.dbf",16) $data = FileRead($fDBF,32);get header $dLastUpdate = "200" & Asc(BinaryToString(BinaryMid($data,2,1))) & "/" & Asc(BinaryToString(BinaryMid($data,3,1))) & "/" & Asc(BinaryToString(BinaryMid($data,4,1))) $numrecs = Dec(Hex(BinaryMid($data,8,1) & BinaryMid($data,7,1) & BinaryMid($data,6,1) & BinaryMid($data,5,1))) $numbytesheader = Dec(Hex(BinaryMid($data,10,1) & BinaryMid($data,9,1))) $numfields = Int(Dec(Hex(BinaryMid($data,10,1) & BinaryMid($data,9,1))) / 32) - 1 $reclen = Dec(Hex(BinaryMid($data,12,1) & BinaryMid($data,11,1))) ReDim $fielddesc[$numfields+1][6] $pos = 32 For $x = 1 To $numfields;load field descriptions $data = FileRead($fDBF,32) $fielddesc[$x][0] = BinaryToString(BinaryMid($data,1,11));field name $fielddesc[$x][1] = BinaryToString(BinaryMid($data,12,1));field type $fielddesc[$x][2] = BinaryToString(BinaryMid($data,13,4));data address $fielddesc[$x][3] = Asc(BinaryToString(BinaryMid($data,17,1)));field length $fielddesc[$x][4] = BinaryToString(BinaryMid($data,18,1));decimal count ;$fielddesc[$x][5] = BinaryToString(BinaryMid($data,19,14));junk $pos = $pos + 32 Next ReDim $rsData[$numrecs+1][$numfields+1] $data = FileRead($fDBF,$numbytesheader - $pos);seek to beginning of data For $x = 1 To $numrecs $rsData[$x][0] = BinaryToString(FileRead($fDBF,$reclen)) If StringLeft($rsData[$x][0],1) <> "*" Then;active record $pos = 2;set start of data (first pos is deleted field) For $y = 1 To $numfields $curfield = StringMid($rsData[$x][0],$pos,$fielddesc[$y][3]) Switch $fielddesc[$y][1] Case "D";date $rsData[$x][$y] = StringLeft($curfield,4) & "/" & StringMid($curfield,5,2) & "/" & StringRight($curfield,2) Case "C";char $rsData[$x][$y] = StringStripWS($curfield,3) Case "N";numeric $rsData[$x][$y] = Number($curfield) Case "L";logical $rsData[$x][$y] = ($curfield = "T" Or $curfield = "Y") Case "@";timestamp $rsData[$x][$y] = $curfield Case "I";long $rsData[$x][$y] = $curfield Case "F";float $rsData[$x][$y] = Number($curfield) Case "O";double $rsData[$x][$y] = Number($curfield) Case "M";memo $rsData[$x][$y] = $curfield EndSwitch $pos = $pos + $fielddesc[$y][3] Next EndIf Next MsgBox(0,"Time Diff",_DateDiff("s",$t,_NowCalc()));display time to load file FileClose($fDBF) Troy.
randallc Posted June 2, 2007 Posted June 2, 2007 (edited) Hi, [EDIT] PS. if it is possible to do a single fileread and then handle the resulting binaryresult with binary mid etc?; do that for speed! - - or do the ones we have work with that binary from reading a whole binary file anyway??] I've not seen that done before; but i am not sure if you can get it faster or not. 1. Is it the read part which is slow, or the processing (the latter could be faster with vbscript subfunctions) 2. Could you possibly send me a sample dbf file to check [i have been trying to get tailRW working for just reading the end of a binary file, and this would help me too] 3. is it dbase4 or some other dbf file? Sorry I don't know your answer, but I would be interested to have a look Best, Randall [The read part would be 40x as fast on a small file; see attached file example] Edited June 3, 2007 by randallc ExcelCOM... AccessCom.. Word2... FileListToArrayNew...SearchMiner... Regexps...SQL...Explorer...Array2D.. _GUIListView...array problem...APITailRW
randallc Posted June 3, 2007 Posted June 3, 2007 PS - See new APITailRW in signature for option of direct binary tail or part read; I still do not recommend it here as for this purpose you need to read the whole file anyway and that would be a lot quicker. Best, randall ExcelCOM... AccessCom.. Word2... FileListToArrayNew...SearchMiner... Regexps...SQL...Explorer...Array2D.. _GUIListView...array problem...APITailRW
squishy Posted June 4, 2007 Author Posted June 4, 2007 Thanks, randallc, for the responses! Basically, I need to work on a few different files, all of which are .dbf format. I will need to essentially run queries on each file and return results to custom forms, adding functionality to some outdated software we use. The biggest problem I am running into is the speed factor...the smallest file we work with is around 9 MB, and the largest is over 100 MB. 1. Is it the read part which is slow, or the processing (the latter could be faster with vbscript subfunctions)I have tried reading the whole file first, which works fairly quickly, and then the processing slows it down greatly. I didn't think of using vbscript functions...just assumes AutoIt would run them just as fast... 2. Could you possibly send me a sample dbf file to check [i have been trying to get tailRW working for just reading the end of a binary file, and this would help me too]Not a problem...just send me a message with an account that can handle a 9MB email attachment... 3. is it dbase4 or some other dbf file?Not sure...it does follow the xbase file format for the header and record formats tho. Once again, thanks for your help! Perhaps we can work together to come up with some mutual solutions! Troy.
randallc Posted June 4, 2007 Posted June 4, 2007 (edited) Hi, As you may see, I fixed the tailRW for binary anyway, so don't need the 9Mb file! [vbscript for string manipulation can be 10-50x faster; never used it for binary, though! - not sure if it is supported] I wonder if the AutoIt inbuilt SqLiteExe Import would import the file [tho I doubt it - perhaps Excel would import it; convert it to csv, then use SqLite?]; the SqLiteExe Import runs at low-level speed, but no idea whether your format would be supported? @Piccaso might know.. [of course if that worked, you could just query it in Excel anyway?..] Best, Randall Edited June 4, 2007 by randallc ExcelCOM... AccessCom.. Word2... FileListToArrayNew...SearchMiner... Regexps...SQL...Explorer...Array2D.. _GUIListView...array problem...APITailRW
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now