monoceres

FAF Archive format.

monoceres

First of all, this is not even close to being a fully fledged UDF, but I thought I'd post it here anyway, since I suspect it will be some time before I can work seriously on it again, and I'd forget about it otherwise.

So what is FAF? FAF stands for Flexible Archive Format, a new archive format that lets you store multiple files in one big archive. A strength of the format is that each file can have its own compression scheme and encryption inside the archive. Another is that each file is logically separated, so if one file gets corrupted inside the archive, chances are that the rest of the archive will remain intact.

As of now there are only two ways to add files to the archive: raw storing and some lightweight native Windows compression (LZ1, thanks to trancexx). I plan to add more algorithms and encryption.

The structure of the format can be viewed like this:

Posted Image

UDF:

FAF_Archiver.au3 (Previous downloads: 69)

Example:

Example.au3

To run the example, you'll need test.faf

I've also made an icon to be used with .faf archives.

http://monoceres.se/Uploads/faf.ico

P.S. I do know this sucks compared to 'real' archive formats, but it is made purely in AutoIt, has been a great learning experience for me, and could be the same for others.

Enjoy :P

Broken link? PM me and I'll send you the file!

Authenticity

What is the purpose of the 'a' chunk? Is it only the function's capabilities that determine how to decompress it? I mean, if the function doesn't "know" the algorithm, is it deduced that the compressor's algorithm is the key? Or are you using a fixed, pre-assumed algorithm per flag value?

Heh, sorry for all these questions. Nice share, thanks.

monoceres

Thanks for the interest :P

Yeah, the 'a' chunk is there so the extraction function can determine which algorithm it should use to extract the file from the raw data in the archive. I'm not very satisfied with this approach, so it will probably change when I implement more algorithms and try to standardize the way new algorithms are added.
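A rough sketch of how an algorithm field like the 'a' chunk can drive extraction. This is Python rather than AutoIt, purely for illustration. The identifier values, the `DECODERS` table and the `extract` function are all hypothetical (the thread does not give the real values), and zlib is only a stand-in for the native Windows compression the UDF uses.

```python
import zlib

# Hypothetical algorithm identifiers; the real FAF values are not given in the thread.
ALGO_RAW = 0
ALGO_DEFLATE = 1  # stand-in for the native Windows compression

# Table mapping an algorithm id to the function that reverses it.
DECODERS = {
    ALGO_RAW: lambda data: data,
    ALGO_DEFLATE: zlib.decompress,
}


def extract(algo_id: int, stored: bytes) -> bytes:
    """Return the original bytes of one archive entry."""
    try:
        decoder = DECODERS[algo_id]
    except KeyError:
        # An archive written with a newer algorithm fails loudly instead of
        # producing garbage output.
        raise ValueError(f"unknown algorithm id {algo_id}") from None
    return decoder(stored)


payload = b"hello hello hello hello"
print(extract(ALGO_RAW, payload) == payload)
print(extract(ALGO_DEFLATE, zlib.compress(payload)) == payload)
```

With a table like this, adding an algorithm means adding one entry, which is one possible way to standardize how new algorithms are plugged in.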


Flamingwolf

"... each file can has ..."

Lulz. ;D

Is there any compression here? Or is it just grouping them as a single .faf file?

monoceres

Flamingwolf wrote: "... each file can has ..." Lulz. ;D

Hah! I'm tired :P

Flamingwolf wrote: "Is there any compression here? Or is it just grouping them as a single .faf file?"

There is, if you want it. So far I have included LZ1 compression, since it's native in WinXP and later.

http://www.autoitscript.com/forum/index.ph...exx+compression


Authenticity

What is the point of increasing its size by using Unicode strings? Not that it's crucial, but for the sake of greediness it should be considered. lol, I'm not one to kick away those wonderful languages, but...

monoceres

I guess I just want to be able to shout: "Full unicode support!" :P

But seriously, this is another thing I thought about. Instead of removing the unicode strings I'm thinking of making them dynamic instead. Should save lotsa space. It's just that reading headers that keep changing size is annoying work, but it could really be worth it since every file entry now has a header size of more than 0.5 kB.

Authenticity

Yup, Unicode is more of a gain than the small amount of space ANSI characters would save. By the way, as is well known, the strength of a format or its compression ratio only shows with files of a few hundred MB and up, so Unicode is much less of a sacrifice than a gain. On second thought, keep it ;p.

FireFox

@monoceres

Nice example there, good job :P

Cheers, FireFox.


 

OS : Win XP SP2 (32 bits) / Win 7 SP1 (64 bits) / Win 8 (64 bits) | Autoit version: latest stable / beta.
Hardware : Intel(R) Core(TM) i5-2400 CPU @ 3.10Ghz / 8 GiB RAM DDR3.

My UDFs : Skype UDF | TrayIconEx UDF | GUI Panel UDF | Excel XML UDF | Is_Pressed_UDF

My Projects : YouTube Multi-downloader | FTP Easy-UP | Lock'n | WinKill | AVICapture | Skype TM | Tap Maker | ShellNew | Scriptner | Const Replacer | FT_Pocket | Chrome theme maker

My Examples : Capture tool | IP Camera | Crosshair | Draw Captured Region | Picture Screensaver | Jscreenfix | Drivetemp | Picture viewer

My Snippets : Basic TCP | Systray_GetIconIndex | Intercept End task | Winpcap various | Advanced HotKeySet | Transparent Edit control

 

JRSmile

Maybe you could get inspired by the MPQ file format. Just google around a bit, it's amazing.


$a=StringSplit("547275737420796F757220546563686E6F6C75737421","")For $b=1 To UBound($a)+(-1*-1*-1)step(2^4/8);&$b+=1*2/40*µ&Asc(4)Assign("c",Eval("c")&Chr(Dec($a[$b]&$a[$b+1])))''Chr("a")&"HI"Next;time_U&r34d,ths,U-may=get$the&c.l.u.e;b3st-regards,JRSmile;MsgBox(0x000000,"",Eval("c"));PiEs:d0nt+*b3.s4d.4ft3r.1st-try:-)

trancexx

I like it.

- unicode most definitely

- maybe using unsigned types in headers to eliminate possible errors that could occur otherwise
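To illustrate the kind of error the unsigned suggestion avoids, here is a small Python sketch. It assumes a 4-byte little-endian size field, which is only a guess for the example, since the real FAF header layout is shown in the thread only as an image.

```python
import struct

size = 3_000_000_000  # a file of roughly 2.8 GiB

# Store the size as an unsigned 32-bit little-endian field (assumed layout).
raw = struct.pack("<I", size)

as_unsigned = struct.unpack("<I", raw)[0]  # read back with the same type
as_signed = struct.unpack("<i", raw)[0]    # same 4 bytes misread as signed

print(as_unsigned)  # 3000000000
print(as_signed)    # -1294967296
```

Any size above 2 GiB sets the top bit, so a signed read turns it into a negative number, and offsets computed from it go wrong.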

Apropos JRSmile's comment... I'm for header structures. MPQ works differently (as far as I've read). Since this is written in and for AutoIt, and AutoIt doesn't have a ByteInByte() function, MPQ's way would be less safe here.


♡♡♡

.

eMyvnE

monoceres

Sorry for the delay, been busy.

@Kastout, it's very easy, it's just a single call to the _FAF_AddFileToArchive() function :P

And yeah, the mpq format seems to be quite nice but it's not very suited for this thing.

@trancexx, unicode will not be removed and I might as well change the types to unsigned.


trancexx

monoceres, you don't have a FAF signature apart from the magic (which may be too little) that could be used to determine what the file is. The file header should start with something like 0x46414603.

You could also use pascal style strings for strings in headers. That would save lots of space.
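As a rough illustration of the saving, here is a Python sketch comparing a fixed-size Unicode field with a Pascal-style (length-prefixed) one. The 260-character fixed length, the 2-byte length prefix and the function names are assumptions made for the example, not the actual FAF layout.

```python
import struct

FIXED_CHARS = 260  # hypothetical fixed field length, in UTF-16 code units


def pack_fixed(name: str) -> bytes:
    """Fixed-size UTF-16LE field, padded with zero bytes."""
    data = name.encode("utf-16-le")
    if len(data) > FIXED_CHARS * 2:
        raise ValueError("name too long for the fixed field")
    return data.ljust(FIXED_CHARS * 2, b"\x00")


def pack_pascal(name: str) -> bytes:
    """Length-prefixed field: 2-byte unsigned byte count, then UTF-16LE data."""
    data = name.encode("utf-16-le")
    return struct.pack("<H", len(data)) + data


def unpack_pascal(buf: bytes, offset: int = 0):
    """Return (string, offset of the next field)."""
    (length,) = struct.unpack_from("<H", buf, offset)
    start = offset + 2
    return buf[start:start + length].decode("utf-16-le"), start + length


name = "readme.txt"
print(len(pack_fixed(name)))   # 520
print(len(pack_pascal(name)))  # 22
print(unpack_pascal(pack_pascal(name)))
```

The price is that the header no longer has a constant size, so the reader has to walk the fields one by one using the returned offset.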

It would be cool if rasim found time to modify his UnRARIt.au3 for this, once you say that you are done with the FAF definitions.

...who knows


JRSmile

What about an MD5 sum of the file, to see whether it got corrupted?


martin

header, 3. Read it.

I'm probably just as blind because I don't see that.

Header 3 is

short magic; Should be 325

What header do you mean Manadar?

Serial port communications UDF: includes functions for binary transmission and reception. | Printing UDF: useful for graphs, forms, labels, reports etc. | Add User Call Tips to SciTE for functions in UDFs not included with AutoIt and for your own scripts. | Functions with parameters in OnEvent mode and for Hot Keys: one function replaces GuiSetOnEvent, GuiCtrlSetOnEvent and HotKeySet. | UDF IsConnected2 for notification of status of connected state of many urls or IPs, without slowing the script.

jvanegmond

Header 3. It's a simple check to see if the file is what you'd expect and if it hasn't been severely damaged. Of course it could be expanded but that's another thing.

JRSmile

martin wrote: "I'm probably just as blind because I don't see that. Header 3 is 'short magic; Should be 325'. What header do you mean Manadar?"

I like standards, and a custom checksum is reinventing the wheel. We already have really fast, standardized hashing algorithms that can easily be reimplemented in other languages.

Just wanted to state that I did, of course, read before writing :-)

Best regards, j.


monoceres

trancexx wrote:

"monoceres, you don't have FAF signature except magic (maybe too little), that could be used to determine what the file is. Header (file) should start with something like 0x46414603."

"You could also use pascal style strings for strings in headers. That would save lots of space."

"It would be cool that rasim finds time to modify his UnRARIt.au3 for this after you say that you are done with the FAF definitions. ...who knows"

Yeah, maybe the signature could be expanded, I really don't know the odds for it being the same in another unrelated file, but adding a few more bytes will definitely be more secure.
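For a rough feel of those odds, here is a small Python sketch. It treats the start of an unrelated file as uniformly random bytes, which real files are not, so the numbers are only a ballpark. The `looks_like_faf` helper is hypothetical, and it assumes the suggested value 0x46414603 is written as the byte sequence 46 41 46 03 (46 41 46 is ASCII "FAF").

```python
def false_match_odds(signature_bytes: int) -> float:
    """Chance that uniformly random bytes happen to equal a fixed signature."""
    return 1 / 256 ** signature_bytes


# Assumed on-disk byte order for the suggested signature.
SIGNATURE = bytes.fromhex("46414603")


def looks_like_faf(header: bytes) -> bool:
    """Hypothetical check: does the file start with the signature?"""
    return header[:len(SIGNATURE)] == SIGNATURE


print(false_match_odds(2))  # 2-byte magic: 1 in 65,536
print(false_match_odds(4))  # 4-byte signature: 1 in 4,294,967,296
print(looks_like_faf(SIGNATURE + b"rest of header"))
print(looks_like_faf(b"PK\x03\x04 some zip file"))
```

Going from a 2-byte magic to a 4-byte signature makes an accidental match 65,536 times less likely under this simple model.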

Yeah, but it will probably take some time, as I try to add more features I think of new ways of structuring all the time.

I even got some suggestions over at codeguru:

http://www.codeguru.com/forum/showthread.php?t=472050

JRSmile wrote: "what about an md5 sum of the file to see if it got corrupted."

If I add any checksum, it will most definitely be a CRC. Maybe I'll even add a CRC to every file in the archive.
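A minimal Python sketch of a per-file CRC check, assuming the common CRC-32 variant provided by zlib (the thread does not say which CRC would be used). The helper names are made up for the example.

```python
import zlib


def crc_of(data: bytes) -> int:
    """CRC-32 of the data as an unsigned 32-bit value."""
    return zlib.crc32(data) & 0xFFFFFFFF


def is_intact(data: bytes, stored_crc: int) -> bool:
    """Compare the CRC stored at archive time with a freshly computed one."""
    return crc_of(data) == stored_crc


original = b"contents of one archived file"
stored_crc = crc_of(original)      # would be written into the file's header

damaged = b"C" + original[1:]      # simulate a single corrupted byte

print(is_intact(original, stored_crc))  # True
print(is_intact(damaged, stored_crc))   # False
```

Storing one CRC per file fits the idea that entries are logically separated: a damaged entry can be reported while the rest of the archive is still extracted.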

@all

Any suggestions for a nice encryption algo to use with this? I would prefer a UDF that works with either a pointer to a buffer or an AutoIt binary type. I'll probably just use a regular XOR encryption for starters, as it would be fast and easy without having to include any library.
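For reference, a repeating-key XOR over a binary buffer can be sketched in a few lines of Python. The function name and the key are made up for the example. The same call both encrypts and decrypts, and a scheme like this only obscures the data rather than protecting it against a determined reader.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every data byte with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


plain = b"file contents to hide"
key = b"secret"  # hypothetical key

cipher = xor_bytes(plain, key)
print(cipher != plain)
print(xor_bytes(cipher, key) == plain)  # applying it twice restores the data
```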

