Purge some build reproducibility sections and add TODOs (authored by Richard Pospesel)
......@@ -436,70 +436,21 @@ Corresponding bugs in the [Tor Browser issue tracker](https://gitlab.torproject.
## 5. Build Security and Package Integrity
**TODO** Rewrite this section with up-to-date information about the current build system, annoyances around code-signing requirements for verifying reproducibility, etc. We can probably outsource the background on reproducible builds elsewhere.
<!--
In the age of state-sponsored malware, [we believe](https://blog.torproject.org/blog/deterministic-builds-part-one-cyberwar-and-global-compromise) it is impossible to expect to keep a single build machine or software signing key secure, given the class of adversaries that Tor has to contend with.
For this reason, we have deployed a build system that allows anyone to use our source code to reproduce byte-for-byte identical binary packages to the ones that we distribute.
-->
### 5.1 Achieving Binary Reproducibility
The GNU toolchain has been working on providing reproducible builds for some time; however, a large software project such as Firefox typically ends up embedding a large number of details about the machine it was built on, both intentionally and inadvertently.
Additionally, manual changes to the build machine configuration can accumulate over time and are difficult for others to replicate externally, which leads to difficulties with binary reproducibility.
For this reason, we decided to leverage the work done by the [Gitian Project](https://gitian.org/) from the Bitcoin community.
Gitian is a wrapper around Ubuntu's virtualization tools that allows you to specify an Ubuntu or Debian version, architecture, a set of additional packages, a set of input files, and a bash build scriptlet in a YAML document called a "Gitian Descriptor".
This document is used to install a qemu-kvm image, and execute your build scriptlet inside it.
We have created a set of wrapper scripts around Gitian to automate dependency download and authentication, as well as transfer intermediate build outputs between the stages of the build process.
Because Gitian creates a Linux build environment, we must use cross-compilation to create packages for Windows and macOS.
For Windows, we use mingw-w64 as our cross compiler.
For macOS, we use cctools and clang and a binary redistribution of the Mac OS 10.7 SDK.
The use of the Gitian system eliminates build non-determinism by normalizing the build environment's hostname, username, build path, uname output, toolchain versions, and time.
On top of what Gitian provides, we also had to address the following additional sources of non-determinism:
1. **Filesystem and archive reordering**
By far the most prevalent source of non-determinism in the components of the browser was the various ways in which archives (such as zip, tar, jar/ja, DMG, and Firefox manifest lists) could be reordered.
Many file archivers walk the file system in inode structure order by default, which will result in ordering differences between two different archive invocations, especially on machines of different disk and hardware configurations.
The fix for this is to perform an additional sorting step on the input list for archives, but care must be taken to instruct libc and other sorting routines to use a fixed locale to determine lexicographic ordering, or machines with different locale settings will produce different sort results.
We chose the 'C' locale for this purpose.
We created wrapper scripts for tar, zip, and DMG to aid in reproducible archive creation.
2. **Uninitialized memory in toolchain/archivers**
We ran into difficulties with both binutils and the DMG archive script using uninitialized memory in certain data structures that ended up written to disk.
Our binutils fixes were merged upstream, but the DMG archive fix remains an independent patch.
3. **Fine-grained timestamps and timezone leaks**
The standard way of controlling timestamps in Gitian is to use libfaketime, which hooks time-related library calls to provide a fixed timestamp.
However, due to our use of wine to run py2exe for python-based pluggable transports, pyc timestamps had to be addressed with an additional helper script.
The timezone leaks were addressed by setting the `TZ` environment variable to UTC in our descriptors.
4. **Deliberately generated entropy**
In two circumstances, deliberately generated entropy was introduced in various components of the build process.
First, the BuildID Debuginfo identifier (which associates detached debug files with their corresponding stripped executables) was introducing entropy from some unknown source.
We removed this header using objcopy invocations in our build scriptlets, and opted to use GNU DebugLink instead of BuildID for this association.
Second, on Linux, Firefox builds detached signatures of its cryptographic libraries using a temporary key for FIPS-140 certification.
A rather insane subsection of the FIPS-140 certification standard requires that you distribute signatures for all of your cryptographic libraries.
The Firefox build process meets this requirement by generating a temporary key, using it to sign the libraries, and discarding the private portion of that key.
Because there are many other ways to intercept the crypto outside of modifying the actual DLL images, we opted to simply remove these signature files from distribution.
There simply is no way to verify code integrity on a running system without both OS and co-processor assistance.
Download package signatures make sense of course, but we handle those another way (as mentioned above).
5. **LXC-specific leaks**
Gitian provides an option to use LXC containers instead of full qemu-kvm virtualization.
Unfortunately, these containers can allow additional details about the host OS to leak.
In particular, umask settings as well as the hostname and Linux kernel version can leak from the host OS into the LXC container.
We addressed umask by setting it explicitly in our Gitian descriptor scriptlet, and addressed the hostname and kernel version leaks by directly patching the aspects of the Firefox build process that included this information into the build.
It also turns out that some libraries (in particular: libgmp) attempt to detect the current CPU to determine which optimizations to compile in.
This CPU type is uniform on our KVM instances, but differs under LXC.
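
As a rough illustration of the archive-ordering and timestamp points above, the following Python sketch builds a tar archive whose bytes do not depend on filesystem walk order, file timestamps, or file ownership. It is not the wrapper script referred to above; the function name `deterministic_tar`, the fixed `mtime` of 0, and the normalized modes are assumptions chosen for the example. Sorting by raw bytes stands in for sorting under the 'C' locale.

```python
import io
import os
import tarfile


def deterministic_tar(src_dir: str, mtime: int = 0) -> bytes:
    """Return an uncompressed tar of src_dir with stable ordering and metadata."""
    # Collect relative paths, then sort them by raw bytes. This mirrors what
    # the 'C' locale does and does not depend on the machine's locale settings.
    rel_paths = []
    for root, dirs, files in os.walk(src_dir):
        for name in dirs + files:
            full = os.path.join(root, name)
            rel_paths.append(os.path.relpath(full, src_dir))
    rel_paths.sort(key=os.fsencode)

    buf = io.BytesIO()
    # Pin the header format so the output does not follow the library default.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for rel in rel_paths:
            full = os.path.join(src_dir, rel)
            info = tar.gettarinfo(full, arcname=rel)
            # Normalize everything that varies between build machines.
            info.mtime = mtime
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            executable = info.isdir() or bool(info.mode & 0o100)
            info.mode = 0o755 if executable else 0o644
            if info.isreg():
                with open(full, "rb") as fh:
                    tar.addfile(info, fh)
            else:
                tar.addfile(info)
    return buf.getvalue()
```

Two directories with the same contents, populated in a different order and at different times, should yield byte-identical output from this function. Compression is left out on purpose, since compressors can add their own headers and timestamps.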
**TODO** No longer using gitian, use this section for rbm/tor-browser-build
### 5.2 Package Signatures and Verification
**TODO** review/edit
The build process generates a single sha256sums-unsigned-build.txt file that contains a sorted list of the SHA-256 hashes of every package produced for that build version.
Each official builder uploads this file and a GPG signature of it to a directory on a Tor Project web server.
The build scripts have an optional matching step that downloads these signatures, verifies them, and ensures that the local builds match this file.
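
The matching step can be pictured with a minimal Python sketch, assuming the file uses the usual `sha256sum` line layout (hex digest, whitespace, file name) and is sorted by file name; the source specifies neither detail. The helper names `build_sums` and `mismatches` are hypothetical and are not taken from the build scripts. GPG signature verification is deliberately left out and would have to happen before the published file is trusted.

```python
import hashlib
import os


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large packages need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_sums(package_dir):
    """Return 'digest  name' lines for each regular file, sorted by name."""
    names = sorted(
        name for name in os.listdir(package_dir)
        if os.path.isfile(os.path.join(package_dir, name))
    )
    return [f"{sha256_file(os.path.join(package_dir, n))}  {n}" for n in names]


def parse_sums(text):
    """Parse sha256sum-style text into a {file name: digest} mapping."""
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, name = line.split(None, 1)
        # sha256sum marks binary mode with a leading '*' on the file name.
        if name.startswith("*"):
            name = name[1:]
        sums[name] = digest
    return sums


def mismatches(local_lines, published_text):
    """List file names whose digest differs or that appear on one side only."""
    local = parse_sums("\n".join(local_lines))
    published = parse_sums(published_text)
    all_names = local.keys() | published.keys()
    return sorted(n for n in all_names if local.get(n) != published.get(n))
```

An empty result from `mismatches` means the local build matches the published list for every package. Any name it returns needs investigation.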
......@@ -515,15 +466,20 @@ In order to verify package integrity, the signature must be stripped off using t
### 5.3 Anonymous Verification
**TODO** Update this section, detail our current build verification process, probably rename the section to 'Build Verification'
<!--
Because bit-identical packages can be produced by anyone, the security of this build system extends beyond the security of the official build machines.
In fact, it is still possible for build integrity to be achieved even if all official build machines are compromised.
By default, all Tor-specific dependencies and inputs to the build process are downloaded over Tor, which allows build verifiers to remain anonymous and hidden.
Because of this, any individual can use our anonymity network to privately download our source code, verify it against public, signed, audited, and mirrored git repositories, and reproduce our builds exactly, without being subject to targeted attacks.
If they notice any differences, they can alert the public builders/signers, hopefully using a pseudonym or our anonymous bug tracker account, to avoid revealing the fact that they are a build verifier.
-->
### 5.4 Update Safety
**TODO** Verify this is accurate and update with any new changes
We use the Firefox updater to provide automatic updates to users.
To ensure that update checks cannot be tampered with, we enforce certificate pinning by setting `security.cert_pinning.enforcement_level` to **2**, and we sign the individual MAR update files with keys that are rotated every year.