erebus

7Z and ZIP real size


Hello,

I have written a program that extracts 7Z and ZIP archives. My problem is that I have not found a way to read an archive's contents and get the real (uncompressed) size of the data inside, so that I know whether my target disk has enough space before extracting.

I don't really care how this is achieved (DLL, COM or whatever), so any suggestion would be appreciated.

Thank you,

P.S. Yes, I've already searched the forums and found nada. :)


Great info, thank you.

However, do you have any idea of how to extract these data?


#4 ·  Posted (edited)

Well I tried using this:

;Open in binary mode
$zipFile = FileOpen("somefile.zip", 16)

;Retrieve first 26 bytes
$header = FileRead($zipFile, 26)

;Retrieve 4 bytes representing uncompressed filesize (bytes 23-26 of the local file header)
$uncompressedSizeBinary = BinaryMid($header, 23, 4)

But then I am stuck because the returned data doesn't convert to a string. I read somewhere that those 4 bytes could be in low/high DWORD format, so I have no clue.
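
For what it's worth, the multi-byte fields in a ZIP header are stored least significant byte first (little-endian), which is why the raw bytes don't read as a plain number. Below is a minimal sketch of one way to convert them, assuming a DllStruct overlay is acceptable. LE32ToNumber is a made-up helper name, not an AutoIt built-in.

```autoit
; Sketch: interpret 4 little-endian bytes as an unsigned 32-bit number.
; LE32ToNumber is a hypothetical helper name used only for illustration.
Func LE32ToNumber($bFourBytes)
    ; Overlay a byte[4] view and a dword view on the same 4 bytes of memory
    Local $tDword = DllStructCreate("dword")
    Local $tBytes = DllStructCreate("byte[4]", DllStructGetPtr($tDword))
    DllStructSetData($tBytes, 1, $bFourBytes)
    ; Reading the dword view lets the CPU do the byte-order work
    Return DllStructGetData($tDword, 1)
EndFunc

; 50 4B 03 04 is how the local file header signature 0x04034B50 sits on disk
ConsoleWrite(Hex(LE32ToNumber(Binary("0x504B0304")), 8) & @CRLF)
```

The same helper could be fed BinaryMid($header, 23, 4) to get the uncompressed size of the first entry as a number.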

EDIT: Okay it gets even more complicated if you read this text file on the Pkware website:

http://www.pkware.com/documents/casestudies/APPNOTE.TXT

compressed size: (4 bytes)
uncompressed size: (4 bytes)

The size of the file compressed and uncompressed, respectively. When a decryption header is present it will be placed in front of the file data and the value of the compressed file size will include the bytes of the decryption header. If bit 3 of the general purpose bit flag is set, these fields are set to zero in the local header and the correct values are put in the data descriptor and in the central directory. If an archive is in ZIP64 format and the value in this field is 0xFFFFFFFF, the size will be in the corresponding 8 byte ZIP64 extended information extra field. When encrypting the central directory, if the local header is not in ZIP64 format and general purpose bit flag 13 is set indicating masking, the value stored for the uncompressed size in the Local Header will be zero.
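
The two special cases in that excerpt (bit 3 of the flag, and the 0xFFFFFFFF marker for ZIP64) can be tested before trusting a size read from a local header. This is a rough sketch only. LocalSizeIsUsable is a hypothetical helper name, and it assumes both values were already converted from their raw bytes to numbers.

```autoit
; Hypothetical helper: can the size stored in a local file header be trusted?
; $iFlags = general purpose bit flag, $iSize = size field, both as numbers.
Func LocalSizeIsUsable($iFlags, $iSize)
    ; Bit 3 (value 0x0008) set: the header holds zero and the real size is
    ; in the data descriptor and the central directory
    If BitAND($iFlags, 0x0008) Then Return False

    ; ZIP64 marker: the real size is in the extended information extra field.
    ; A hex literal 0xFFFFFFFF is the signed value -1 in AutoIt, so compare
    ; against the decimal form, and against -1 in case a signed type was used.
    If $iSize = 4294967295 Or $iSize = -1 Then Return False

    Return True
EndFunc

ConsoleWrite(LocalSizeIsUsable(0x0008, 0) & @CRLF)    ; bit 3 set
ConsoleWrite(LocalSizeIsUsable(0x0000, 1234) & @CRLF) ; plain entry
```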

Edited by weaponx


I took it a little further but still no luck.

;Open in binary mode
$zipFile = FileOpen("filename.zip", 16)

;Retrieve first 26 bytes
$header = FileRead($zipFile, 26)

$string = ""

$localFileHeaderSignature = BinaryMid ($header, 1, 4)
ConsoleWrite($localFileHeaderSignature & @CRLF)
$string &= "$localFileHeaderSignature: " & $localFileHeaderSignature & @CRLF

$versionNeededToExtract = BinaryMid ($header, 5, 2)
ConsoleWrite($versionNeededToExtract & @CRLF)
$string &= "$versionNeededToExtract: " & $versionNeededToExtract & @CRLF

$generalPurposeBitFlag = BinaryMid ($header, 7, 2)
ConsoleWrite($generalPurposeBitFlag & @CRLF)
$string &= "$generalPurposeBitFlag: " & $generalPurposeBitFlag & @CRLF

$compressionMethod = BinaryMid ($header, 9, 2)
ConsoleWrite($compressionMethod & @CRLF)
$string &= "$compressionMethod: " & $compressionMethod & @CRLF

$lastModFileTime  = BinaryMid ($header, 11, 2)
ConsoleWrite($lastModFileTime & @CRLF)
$string &= "$lastModFileTime: " & $lastModFileTime & @CRLF

$lastModFileDate  = BinaryMid ($header, 13, 2)
ConsoleWrite($lastModFileDate & @CRLF)
$string &= "$lastModFileDate: " & $lastModFileDate & @CRLF

$crc32  = BinaryMid ($header, 15, 4)
ConsoleWrite($crc32  & @CRLF)
$string &= "$crc32: " & $crc32 & @CRLF

$compressedSize  = BinaryMid ($header, 19, 4)
ConsoleWrite($compressedSize  & @CRLF)
$string &= "$compressedSize: " & $compressedSize & @CRLF

$uncompressedSize  = BinaryMid ($header, 23, 4)
ConsoleWrite($uncompressedSize  & @CRLF)
$string &= "$uncompressedSize: " & $uncompressedSize & @CRLF

MsgBox(0,"","Actual compressed size: " & FileGetSize ("filename.zip") & " bytes" & @CRLF & @CRLF & $string)


I had a situation where I needed info from zip files, and I know I ended up using the command-line version of 7-Zip (7za.exe), but I don't remember precisely how I returned the info I needed. I am sure that 7za will do it though.
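
One way the command-line route might look is sketched below. It is not necessarily what was used back then. It assumes 7za.exe sits next to the script and that the installed version supports the -slt (technical listing) switch, which prints a "Size = N" line for each entry. SevenZipTotalSize is a made-up name.

```autoit
#include <Constants.au3> ; provides $STDOUT_CHILD

; Sketch: add up the "Size = N" lines printed by "7za l -slt <archive>".
; Assumption: 7za.exe is in @ScriptDir. SevenZipTotalSize is hypothetical.
Func SevenZipTotalSize($sArchive)
    Local $sCmd = '"' & @ScriptDir & '\7za.exe" l -slt "' & $sArchive & '"'
    Local $iPid = Run($sCmd, @ScriptDir, @SW_HIDE, $STDOUT_CHILD)
    If $iPid = 0 Then Return SetError(1, 0, -1)

    ; Collect everything the process writes to stdout until the stream closes
    Local $sOut = ""
    While 1
        $sOut &= StdoutRead($iPid)
        If @error Then ExitLoop
        Sleep(10)
    WEnd

    ; Only lines that start with "Size = " count; "Packed Size" and
    ; "Physical Size" lines do not match because of the line anchor
    Local $aSizes = StringRegExp($sOut, "(?m)^Size = (\d+)", 3)
    If @error Then Return SetError(2, 0, -1)

    Local $iTotal = 0
    For $i = 0 To UBound($aSizes) - 1
        $iTotal += Number($aSizes[$i])
    Next
    Return $iTotal
EndFunc

ConsoleWrite(SevenZipTotalSize(@ScriptDir & "\somefile.zip") & " bytes" & @CRLF)
```

Because this goes through 7za, the same call should cover both ZIP and 7Z archives, at the cost of shipping an extra executable.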


George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


Thank you all for your help.

Ok, a command line executable would be a very dirty way to achieve this. I have been searching and testing around and after many hours, came up with this:

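$dll = DllOpen("7-zip32.dll")
;Open the archive; element [0] of the returned array is the archive handle
$steerwheel = DllCall($dll, "int", "SevenZipOpenArchive", "hwnd", 0, "str", "somefile.zip", "dword", 0)
;Size of the archive file itself (compressed), not of its contents
$check = DllCall($dll, "int", "SevenZipGetArcFileSize", "int", $steerwheel[0])
;The handle is a number, so pass it as "int" here too, not "str"
$closearc = DllCall($dll, "int", "SevenZipCloseArchive", "int", $steerwheel[0])
DllClose($dll)
MsgBox(0,"",$check[0])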

However this is not what I want, as the most I managed was to get the size of the compressed ZIP file in the above example. In any case, for someone with very limited DllCall skills like me, it is something...

I would like to achieve this result with 7-zip32.dll, as I am already using it in other parts of my code for other jobs. However, this is beyond my skills.

If anyone wants to help, here is the dll file and here is the English documentation for it. Some info you may find also here.

Thank you all,


Have you looked at DWORD WINAPI SevenZipGetOriginalSize(HARC _harc) in the docs?


George


Yes, but I didn't manage to get it to work. Any ideas?


#11 ·  Posted (edited)

<CODE REMOVED> See here:

#626436

Edited by weaponx


#12 ·  Posted (edited)

Here is another version that uses the Win API:

<CODE REMOVED> See here:

#626436

Edited by weaponx


... come on weaponx, wtf is that?!? :)

All that _WinAPI_ReadFile() ??? Align the structure and call it just once.

btw, you are using signed instead of unsigned types, and as far as I know every zip file is made of as many parts as there are files inside it (one zip file with two compressed files inside is in fact two merged zip files). There is no "main header" to get the size from; you must collect the data for every file inside the zip file. This should be done from the central directory of the zip file and the file headers there, not from the local file header like you did.
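
To illustrate the central directory approach, here is a rough sketch. It assumes a single-disk archive without ZIP64 and small enough to read into memory in one go. ZipTotalUncompressed and ReadLE are made-up helper names, and the offsets follow the record layouts in PKWARE's APPNOTE.TXT.

```autoit
; Sketch: total uncompressed size taken from the central directory.
; Assumptions: no ZIP64, single-disk archive. Both function names are hypothetical.
Func ZipTotalUncompressed($sZipPath)
    Local $hFile = FileOpen($sZipPath, 16) ; 16 = binary mode
    If $hFile = -1 Then Return SetError(1, 0, -1)
    Local $bData = FileRead($hFile) ; whole archive in memory, fine for a sketch
    FileClose($hFile)

    ; The end-of-central-directory record is 22 bytes plus a comment of up to
    ; 65535 bytes, so scan backwards from the last place it could start
    Local $iLen = BinaryLen($bData), $iEocd = 0
    Local $iStop = $iLen - 21 - 65535
    If $iStop < 1 Then $iStop = 1
    For $i = $iLen - 21 To $iStop Step -1
        If ReadLE($bData, $i, 4) = 0x06054B50 Then
            $iEocd = $i
            ExitLoop
        EndIf
    Next
    If $iEocd = 0 Then Return SetError(2, 0, -1)

    Local $iEntries = ReadLE($bData, $iEocd + 10, 2) ; total number of entries
    Local $iPos = ReadLE($bData, $iEocd + 16, 4) + 1 ; directory offset, 1-based for BinaryMid

    Local $iTotal = 0
    For $n = 1 To $iEntries
        ; Every central directory entry starts with signature 0x02014B50
        If ReadLE($bData, $iPos, 4) <> 0x02014B50 Then Return SetError(3, 0, -1)
        $iTotal += ReadLE($bData, $iPos + 24, 4) ; uncompressed size field
        ; Skip the 46 fixed bytes plus file name, extra field and file comment
        $iPos += 46 + ReadLE($bData, $iPos + 28, 2) + ReadLE($bData, $iPos + 30, 2) + ReadLE($bData, $iPos + 32, 2)
    Next
    Return $iTotal
EndFunc

; Read $iLen (2 or 4) little-endian bytes at 1-based $iPos as an unsigned number
Func ReadLE(ByRef $bData, $iPos, $iLen)
    Local $tNum = DllStructCreate("dword")
    Local $tRaw = DllStructCreate("byte[4]", DllStructGetPtr($tNum))
    DllStructSetData($tRaw, 1, BinaryMid($bData, $iPos, $iLen))
    Return DllStructGetData($tNum, 1)
EndFunc

ConsoleWrite(ZipTotalUncompressed(@ScriptDir & "\somefile.zip") & " bytes" & @CRLF)
```

Reading from the central directory also sidesteps the case where the local headers hold zero sizes because bit 3 of the general purpose flag is set.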


♡♡♡

.

eMyvnE


#14 ·  Posted (edited)

I must have missed the code that you posted.

Also, the WinAPI example was just a rehash of some code monoceres posted. This example doesn't require endian conversion like FileRead does.

http://www.autoitscript.com/forum/index.php?showtopic=79986

I merely looked at the PKZIP structure described here and assumed uncompressed size was a total for all files:

http://www.pkware.com/documents/casestudies/APPNOTE.TXT

Edited by weaponx


That WinAPI example that you point to is written when it's written. If it was written now it would be written differently. Btw thanks for that link, some nice code there.

And you can use FileRead() for that too with no problems.

But obviously I wrote something wrong there for your first sentence here.

I can post some code if you think it would give me credibility to say that your code here is clumsy.



The WinAPI example was just that, an alternative to FileRead(). I don't need help fixing it nor do I intend to because FileRead is a lot simpler. You were correct that the header only contains information for the first file.

I am almost finished with a proper version that cycles through all files in the zip, I will post tomorrow.


I have tested this with zip files containing multiple files and folders. It will return a 2-dimensional array in this format:

[0][0] = File count
[0][1] = -
[0][2] = Total uncompressed bytes
[0][3] = Total compressed bytes
[0][4] = Overall compression ratio

[x][0] = File name
[x][1] = CRC
[x][2] = File uncompressed bytes
[x][3] = File compressed bytes
[x][4] = File compression ratio

#include <Array.au3> ;Only used for display

Global $bDebug = true

$array = ZipInformation(@Scriptdir & "\filename2.zip")

_ArrayDisplay($array)

Func ZipInformation($sFilename)

    ;$fileSize = FileGetSize($sFilename)
    Local $aArray[1][5]
    
    Local $local_file_header_signature = 0x04034B50
    Local $central_file_header_signature = 0x02014B50
    Local $end_of_central_dir_signature = 0x06054B50
    
    ;Open in binary mode
    Local $zipFile = FileOpen($sFileName, 16)

    Local $header = FileRead($zipFile)
    
    Local $next_byte = 1
    
    Local $iCount = 1
    Local $iElement = 1
    
    While 1

        $signature_raw = BinaryMid($header, $next_byte, 4) ;1-4
        $signature = Bin2Num($signature_raw)
        $next_byte += 4
        
        ;Read file until signature is end of central directory
        Switch $signature
            Case $local_file_header_signature
                Debug("+Signature: " & Hex($signature) & " (local_file_header_signature)" & @CRLF)
            Case $central_file_header_signature
                Debug("+Signature: " & Hex($signature) & " (central_file_header_signature)" & @CRLF)
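            Case Else
                ;Any other signature (e.g. end of central directory) ends the scan
                ;Debug("+Signature: " & Hex($signature) & @CRLF)
                ;Close the file here, because returning skips the FileClose after the loop
                FileClose($zipFile)
                Return $aArray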
        EndSwitch

        If $signature = $central_file_header_signature Then
            $versionmadeby  = BinaryMid($header, $next_byte, 2)
            Debug("Version Made By: " & Bin2Num($versionmadeby) & @CRLF)
            $next_byte += 2
        EndIf

        $versionNeededToExtract_raw = BinaryMid($header, $next_byte, 2) ;5-6
        Debug("Version Needed To Extract: " & Bin2Num($versionNeededToExtract_raw) & @CRLF)
        $next_byte += 2

        $generalPurposeBitFlag_raw = BinaryMid($header, $next_byte, 2) ;7-8
        Debug("General Purpose Bit Flag: " & Bin2Num($generalPurposeBitFlag_raw) & @CRLF)
        $next_byte += 2

        $compressionMethod_raw = BinaryMid($header, $next_byte, 2) ;9-10
        Debug("Compression Method: " & Bin2Num($compressionMethod_raw) & @CRLF)
        $next_byte += 2

        $lastModFileTime_raw = BinaryMid($header, $next_byte, 2) ;11-12
        Debug("Last Mod File Time: " & $lastModFileTime_raw & @CRLF)
        $next_byte += 2

        $lastModFileDate_raw = BinaryMid($header, $next_byte, 2) ;13-14
        Debug("Last Mod File Date: " & $lastModFileDate_raw & @CRLF)
        $next_byte += 2

        $crc32_raw = BinaryMid($header, $next_byte, 4) ;15-18
        Debug("Crc-32: " & $crc32_raw  & @CRLF)
        $next_byte += 4

        $compressedSize_raw = BinaryMid($header, $next_byte, 4) ;19-22
        $compressedSize = Bin2Num($compressedSize_raw)
        Debug("Compressed Size: " & $compressedSize & " bytes" & @CRLF)
        $next_byte += 4

        $uncompressedSize_raw = BinaryMid($header, $next_byte, 4) ;23-26
        $uncompressedSize = Bin2Num($uncompressedSize_raw)
        Debug("Uncompressed Size: " & $uncompressedSize & " bytes" & @CRLF)
        $next_byte += 4

        $filenamelength_raw = BinaryMid($header, $next_byte, 2) ;27-28
        $filenamelength = Bin2Num($filenamelength_raw)
        Debug("Filename Length: " & $filenamelength & " bytes" & @CRLF)
        $next_byte += 2

        $extrafieldlength_raw = BinaryMid($header, $next_byte, 2) ;29-30
        $extrafieldlength = Bin2Num($extrafieldlength_raw)
        Debug("Extra Field Length: " & $extrafieldlength & " bytes" & @CRLF)
        $next_byte += 2
        
        If $signature = $central_file_header_signature Then
            $filecommentlength_raw = BinaryMid($header, $next_byte, 2) ;33-34 of the central header
            $filecommentlength = Bin2Num($filecommentlength_raw)
            Debug("File Comment Length: " & $filecommentlength & " bytes" & @CRLF)
            $next_byte += 2
            
            $disknumberstart_raw = BinaryMid($header, $next_byte, 2) ;35-36
            $disknumberstart = Bin2Num($disknumberstart_raw)
            Debug("Disk Number Start: " & $disknumberstart & @CRLF)
            $next_byte += 2
            
            $internalfileattributes_raw = BinaryMid($header, $next_byte, 2) ;37-38
            $internalfileattributes = Bin2Num($internalfileattributes_raw)
            Debug("Internal File Attributes: " & $internalfileattributes & @CRLF)
            $next_byte += 2
            
            $externalfileattributes_raw = BinaryMid($header, $next_byte, 4) ;39-42
            $externalfileattributes = Bin2Num($externalfileattributes_raw)
            Debug("External File Attributes: " & $externalfileattributes & @CRLF)
            $next_byte += 4
            
            $relativeoffsetoflocalheader_raw = BinaryMid($header, $next_byte, 4) ;43-46
            $relativeoffsetoflocalheader = Bin2Num($relativeoffsetoflocalheader_raw)
            Debug("Relative Offset Of Local Header: " & $relativeoffsetoflocalheader & @CRLF)
            $next_byte += 4
        EndIf

        ;------------------------------------------------------
        ;Retrieve dynamic number of bytes ($filenamelength)
        ;------------------------------------------------------
        $filename_raw = BinaryMid($header, $next_byte, $filenamelength) ;31
        $filename = BinaryToString ($filename_raw)
        Debug("Filename: " & $filename & @CRLF)
        $next_byte += $filenamelength

        ;------------------------------------------------------
        ;Retrieve dynamic number of bytes ($extrafieldlength)
        ;------------------------------------------------------
        $extrafield_raw = BinaryMid($header, $next_byte, $extrafieldlength)
        Debug("Extra Field: " & $extrafield_raw  & @CRLF)
        $next_byte += $extrafieldlength
        
        If $signature = $central_file_header_signature Then
            ;------------------------------------------------------
            ;Retrieve dynamic number of bytes ($filecommentlength)
            ;------------------------------------------------------
            $filecomment_raw = BinaryMid($header, $next_byte, $filecommentlength)
            Debug("File Comment: " & $filecomment_raw  & @CRLF)
            $next_byte += $filecommentlength        
        EndIf

        If $signature = $local_file_header_signature Then
            ;------------------------------------------------------
            ;Skip over file data
            ;------------------------------------------------------
            $next_byte += $compressedSize
            Debug("File Data: " & "<SKIPPED> " & $compressedSize & " bytes" & @CRLF)
            
            ;Add array element (skip folder names, whose CRC is zero)
            If Bin2Num($crc32_raw) <> 0 Then

                Redim $aArray[$iElement+1][5]
                
                $aArray[0][0] = $iElement
                $aArray[0][2] += $uncompressedSize
                $aArray[0][3] += $compressedSize
                $aArray[0][4] = ($aArray[0][2]-$aArray[0][3])/$aArray[0][2]
                
                $aArray[$iElement][0] = $filename
                $aArray[$iElement][1] = $crc32_raw
                $aArray[$iElement][2] = $uncompressedSize
                $aArray[$iElement][3] = $compressedSize
                $aArray[$iElement][4] = ($uncompressedSize-$compressedSize)/$uncompressedSize
                
                $iElement += 1
            EndIf
        EndIf
        
        Debug("next_byte: " & $next_byte & @CRLF)
        Debug(@CRLF)
        
        $iCount += 1
    WEnd    

    FileClose($zipFile)
EndFunc

;If global debug variable is set, output to console
Func Debug($sString)
    If $bDebug Then
        ConsoleWrite($sString)
    EndIf
EndFunc

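;Convert up to 4 little-endian bytes to an unsigned number
Func Bin2Num($4Bytes)
    ;"dword" is unsigned, so sizes of 2 GB and above do not come back negative
    Local $dllStruct2_Integer = DllStructCreate("dword")
    Local $dllStruct2_Binary = DllStructCreate("byte[4]", DllStructGetPtr($dllStruct2_Integer))

    DllStructSetData($dllStruct2_Binary, 1, $4Bytes)
    Return DllStructGetData($dllStruct2_Integer, 1)
EndFunc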

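NOTE: This doesn't support ZIP64, and it relies on the sizes stored in each local file header, so it will not work for entries written with bit 3 of the general purpose bit flag set.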


Hi there

I'm sorry to bump this topic... but I have a few questions

1) Why on earth does this code, executed on its own, return different results from the code in the previous post (for the same variables, obviously)?

$zipFile = FileOpen($sFileName, 16)
$header = FileRead($zipFile)
FileClose($zipFile)
$compressedSize_raw = BinaryMid($header, 19, 4) ;19-22
$compressedSize = Bin2Num($compressedSize_raw)

$uncompressedSize_raw = BinaryMid($header, 23, 4) ;23-26
$uncompressedSize = Bin2Num($uncompressedSize_raw)

2) How do I replace $compressedSize_raw and $uncompressedSize_raw in $header ?

thanks in advance




A zip file has no single header holding the total compressed and uncompressed sizes. The first local file header only describes the first file, so the totals must be calculated by looping through every file entry in the archive.
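
Once a total has been calculated that way, the check from the first post (is there enough room on the target disk?) could be sketched as follows. HasRoomFor is a hypothetical helper name, and the byte count in the usage line is a made-up example value.

```autoit
; Hypothetical helper: True when the target drive can hold $iBytesNeeded.
Func HasRoomFor($iBytesNeeded, $sTargetDrive)
    ; DriveSpaceFree reports megabytes, so convert to bytes before comparing
    Local $fFreeMB = DriveSpaceFree($sTargetDrive)
    If @error Then Return SetError(1, 0, False)
    Return ($fFreeMB * 1048576) >= $iBytesNeeded
EndFunc

; Example: would 700 MB of extracted data fit on drive C: ?
If HasRoomFor(700 * 1048576, "C:\") Then
    ConsoleWrite("Enough free space" & @CRLF)
Else
    ConsoleWrite("Not enough free space (or the drive could not be queried)" & @CRLF)
EndIf
```

It may be wise to leave some margin on top of the total, since files take up whole clusters on disk.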
