netmask

Reproducible binary builds?

9 posts in this topic

Hi,

I need to produce reproducible builds from the same au3 source: the MD5/SHA-1 of the resulting binary has to match across multiple compiles.

So far I've been unsuccessful; even when I use the same source, the MD5 of the compiled binary is always different from that of the previous build.
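For reference, the check described here amounts to something like the following. This is only a minimal sketch in Python; the helper names `file_digests` and `builds_match` are made up for illustration and are not part of any AutoIt tooling.

```python
import hashlib


def file_digests(path, chunk_size=65536):
    """Return (md5_hex, sha1_hex) for the file at `path`, reading it in chunks."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def builds_match(path_a, path_b):
    """True only if both files have identical MD5 and SHA-1 digests."""
    return file_digests(path_a) == file_digests(path_b)
```

Calling `builds_match` on two compiles of the same script is the comparison that currently fails for me.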

I tried playing around with the compile options (not including any FileInstall(), not using UPX, etc).

Has anyone ever had to do this with AutoIt, or does anyone have any insight as to whether it is possible at all?

 

Thanks!


Why are you compiling it more than once and not just making copies of it?

The MD5 hash should be done on the source and then you can verify that the source hasn't been changed.




In any process that involves "compiling" (and with AutoIt it isn't true compiling), it's always possible that the tools don't take care to initialize unused parts of buffers to a fixed value during the build. Relying on the whole image being predictable is probably beyond what can be expected from a tool chain unless this detail becomes part of the tools' specifications, which AFAICT is not the norm.




Why are you compiling it more than once and not just making copies of it?

The MD5 hash should be done on the source and then you can verify that the source hasn't been changed.

 

The use case is in a regulated industry.  We compile the source and the resulting binary has hash X.  The regulators compile the same source, and will check that their hash matches our hash X.

By simply looking at the hash of the source, you cannot tell anything about where the binary came from.

We already do this with C++ and Java with various hacks/options/manipulations.

Whatever process involves "compiling" (and with AutoIt it isn't true compiling) it's always possible that the tools don't take care to initialize to a fixed value some unused parts of buffers in the build process. Relying on the whole image being predictable is probably beyond what can be expected from a tool chain unless this detail becomes part of the tools' specifications, which AFAICT is not the norm.

 

As mentioned above, we already do it for certain mainstream languages, but I can understand that it may not be possible with AutoIt.

I guess I may end up just running the script on our servers instead of running the "compiled" AutoIt output.

 

Thanks for the input!



Just wanted to give a small update.

Digging further into the binary comparison, it seems (for my script anyway) that only 30 bytes between offsets 0xAF618 and 0xAF6D6 differ across rebuilds.

And really, it is always exactly 30 bytes out of 123 bytes that differ, right after some sort of magic string of 8 bytes at offset 0xAF610 ("AU3!EA06").

Now knowing this, I could definitely do some binary compares and ignore those 123 bytes altogether (and eventually try to figure out what they mean :P)
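A rough sketch of what such a masked compare could look like, in Python. The marker and the 123-byte window come from my observations above and are assumptions for any other script or AutoIt version; the function names are hypothetical, and the code simply uses the first occurrence of the marker.

```python
import hashlib
from typing import List

MAGIC = b"AU3!EA06"   # marker observed in my binary
VOLATILE_LEN = 123    # size of the varying window I observed; an assumption elsewhere


def masked_sha1(data: bytes, magic: bytes = MAGIC,
                volatile_len: int = VOLATILE_LEN) -> str:
    """SHA-1 of `data` with the window after the first `magic` zeroed out."""
    pos = data.find(magic)
    if pos < 0:
        raise ValueError("magic marker not found")
    start = pos + len(magic)
    end = min(start + volatile_len, len(data))
    # Replace the volatile window with zero bytes so it no longer affects the hash
    masked = data[:start] + bytes(end - start) + data[end:]
    return hashlib.sha1(masked).hexdigest()


def diff_offsets(a: bytes, b: bytes) -> List[int]:
    """Offsets at which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs differ in length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
```

`diff_offsets` is the kind of check I used to find the differing bytes; `masked_sha1` gives a digest that should stay stable across rebuilds as long as everything outside that window is identical.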



There was talk of altering the way scripts are compiled; the thinking was that when one compiled script got flagged by an antivirus, all of them would be flagged.

As I recall, some new method of compilation might mean that compiled scripts would no longer all produce the same signature.

Whether or not this was implemented, only the devs know, but you could test by trying an older version, like 3.3.6.*, for example.




 

As mentioned above, we already do it for certain mainstream languages, but I can understand that it may not be possible with AutoIt.

I see this as a tool implementation detail, subject to change without prior notice. Open-source projects prefer to rely on hashing the source and letting users compile their own copy using a publicly available tool, possibly also hashed.

 

By simply looking at the hash of the source, you cannot tell anything about where the binary came from.

I don't see how the procedure you mention effectively increases the confidence one can have in a product.

Download the source, check its hash, compile it with the known tool and you're set.




I see this as a tool implementation detail, subject to change without prior notice. Open-source projects prefer to rely on hashing the source and letting users compile their own copy using a publicly available tool, possibly also hashed.

I don't see how the procedure you mention effectively increases the confidence one can have in a product.

Download the source, check its hash, compile it with the known tool and you're set.

 

I think you missed my use case:  *I* don't care what the binary hash is, our regulators do.  

For example: we run a certain piece of management software on the server, and we provide the binary hashes of that product to certain regulatory agencies, along with the source code.  Those regulators then compile the source, check our provided binary hashes and make sure they match theirs.  Then they'll audit our production server, calculate the hashes of the running binaries, and again make sure they are what we said they were.

The conundrum is:  we say we're running version X of a binary, and regulators need proof that we are in fact running version X of that binary.  Deterministic/reproducible builds are usually how this is solved.
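To make the audit step concrete, here is a minimal Python sketch of checking deployed binaries against a list of declared hashes. The manifest layout (file name mapped to SHA-1 hex digest) and the names `sha1_of` and `verify_manifest` are assumptions for illustration only, not what our regulators actually use.

```python
import hashlib
from pathlib import Path


def sha1_of(path):
    """SHA-1 hex digest of the whole file at `path`."""
    return hashlib.sha1(Path(path).read_bytes()).hexdigest()


def verify_manifest(manifest, root):
    """Check files under `root` against declared hashes.

    `manifest` maps a relative file name to its expected SHA-1 hex digest.
    Returns the sorted list of names that are missing or do not match.
    """
    failures = []
    for name, expected in sorted(manifest.items()):
        target = Path(root) / name
        if not target.is_file() or sha1_of(target) != expected.lower():
            failures.append(name)
    return failures
```

An empty result means every audited binary matches what was declared. This only works if a rebuild from source yields the same bytes, which is exactly the problem.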

Is it clearer now?

Thanks!
