make arti available as build artifact

trinity-1686a requested to merge trinity-1686a/arti:reproducible-build into main

fix #122 (closed)

This merge request attempts to make builds reproducible. The binaries are statically linked; otherwise they would not be portable across distros that ship different, possibly binary-incompatible libc versions (for instance, Arch binaries can't be run on Debian because of its older glibc).

I'll try to document why things were done the way they are:

  • --target x86_64-unknown-linux-musl (static build): this is how static builds are made in Rust.

  • usage of vendored/bundled feature flags (static build): these compile openssl and sqlite from source, which is required for static builds (without them, sqlite segfaults and openssl runs into linker issues)

  • Cargo.lock has been committed and is now tracked (reproducibility): without it, there is no guarantee that someone won't compile with a different version of $crate and get a different binary

  • Use of the image 'rust:1.54.0-alpine3.14' instead of 'rust:alpine' (reproducibility): any similar image is fine, but results might diverge between two builds made with 1.54.0-alpine if the underlying alpine version is newer or older. The same goes for alpine3.14, for which rustc might be 1.54, or 1.55 once it's out. I think image tags of the form <version>-alpine<version> are never reassigned to different images. If I'm wrong, the best solution might be to reference the image by its sha256 directly.

  • CFLAGS/RUSTFLAGS targeting westmere (reproducibility): the target architecture must be fixed to get reproducibility. I chose westmere because it's the oldest architecture to support AES-NI. It's 11 years old, and should (untested) be compatible with the AMD Bulldozer architecture (10 years old). It is compatible with the newer AMD Zen architecture (as well as anything from Intel after westmere). This choice is a bit arbitrary: it means builds won't work on older CPUs, and won't use the advanced vectorization instructions of newer CPUs. I think westmere is old enough that it should cover most use cases, while not losing too much performance. Users can still compile themselves to get better compatibility or performance.

  • moving build dir to /arti (reproducibility): Rust captures the build path. Using --remap-path-prefix=<path>=/arti should have been enough, but is actually not, so the workaround is to move the repository to a fixed place ourselves.

  • symbolic link of /usr/local/cargo/registry/src (reproducibility): for some reason, the build is very sensitive to the file system this folder (which contains the source code of the crates we depend on) resides on. Using it on an ext4 fs, clearing the file system and doing it again gives the same result. But reformat the fs, and the build result changes; I have no idea why. tmpfs seems to give the same result everywhere, so this folder is put on a tmpfs using a symbolic link. The issue seems to be mainly related to lzma and zstd (C dependencies). I was however not able to understand what exactly the problem is. Note that there appears to be no issue with openssl, despite it being C too.

  • /sys/fs/cgroup/ ????: registry/src is about 270 MiB, and the rw tmpfs available in docker or in CI are 64 MiB by default. For some reason, /sys/fs/cgroup is a mountpoint for a tmpfs that is rw and has no size limit. This is a bug in docker which is patched in newer versions. Using it is a temporary workaround; the real solution is to make /dev/shm bigger (it's configurable in the gitlab-ci runner's config file), and use that instead.
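
As a rough illustration of how the target, CPU and lockfile points above fit together, here is a minimal sketch. It is not the content of maint/reproducible_build.sh: the function names are hypothetical, `--locked` is an assumption about how Cargo.lock could be enforced, and the vendored/bundled feature flags are left out because their exact names are not given in this description. The cargo command is only printed, not run.

```shell
# Sketch only: function names are hypothetical, not taken from the
# actual maint/reproducible_build.sh script.

TARGET="x86_64-unknown-linux-musl"   # static musl target named above
CPU="westmere"                       # fixed CPU baseline named above

# Pin code generation for both the C dependencies and the Rust code.
reproducible_env() {
    export CFLAGS="-march=${CPU}"
    export RUSTFLAGS="-C target-cpu=${CPU}"
}

# Print the cargo invocation instead of running it.
# --locked (an assumption here) makes cargo refuse to update Cargo.lock.
build_command() {
    printf 'cargo build --release --locked --target %s\n' "$TARGET"
}

reproducible_env
build_command
```

Running the printed command from a fixed directory such as /arti would then cover the build-path point as well.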

Testing locally with docker: it's possible to test in docker with maint/docker_reproducible_build.sh. That script increases the size of shm (--shm-size=512m), so the lines about /sys/fs/cgroup in maint/reproducible_build.sh should be commented out, in favor of those with /dev/shm.
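
The symlink step that switches between /sys/fs/cgroup and /dev/shm could look roughly like the sketch below. The function name `link_registry_src` and the `registry_src` subdirectory are hypothetical; the actual lines in maint/reproducible_build.sh may differ.

```shell
# Sketch only: link_registry_src and the registry_src directory name
# are hypothetical.

# Replace SRC_DIR with a symlink to a fresh directory under TMPFS_DIR.
link_registry_src() {
    src_dir=$1
    tmpfs_dir=$2
    mkdir -p "$tmpfs_dir/registry_src"     # backing directory on the tmpfs
    rm -rf "$src_dir"                      # drop the on-disk copy, if any
    mkdir -p "$(dirname "$src_dir")"       # make sure the parent exists
    ln -s "$tmpfs_dir/registry_src" "$src_dir"
}

# In docker, with --shm-size=512m:
#   link_registry_src /usr/local/cargo/registry/src /dev/shm
# In CI, as a temporary workaround:
#   link_registry_src /usr/local/cargo/registry/src /sys/fs/cgroup
```

Taking the tmpfs location as a parameter would let the same function serve both environments, instead of commenting lines in and out.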

There is currently no documentation on how to download the resulting binary. A link should be added somewhere (in the README maybe?). There is also no documentation (except in this text) on how to reproduce the builds at home (change which lines are commented out and run the script, as said above, but it should be written down elsewhere).
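
Once a local rebuild exists, one way to check it against the downloaded artifact is to compare SHA-256 digests. The following is a minimal sketch, assuming `sha256sum` is available; the function names and the file paths in the usage comment are hypothetical.

```shell
# Sketch only: digest_of, same_digest and the example paths are hypothetical.

# Print the SHA-256 digest of a file, without the file name.
digest_of() {
    sha256sum "$1" | cut -d' ' -f1
}

# Succeed only if both files exist and have the same digest.
same_digest() {
    [ -f "$1" ] && [ -f "$2" ] || return 2
    [ "$(digest_of "$1")" = "$(digest_of "$2")" ]
}

# Example:
#   same_digest ./arti-from-ci ./arti-local && echo "build reproduced"
```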

It would be nice if the sysadmin team managing the gitlab-ci runners could increase the shm size. How to do this is documented here, search for "shm_size". This would allow using the exact same script in CI and docker, without relying on what is essentially a bug in docker to get a big enough tmpfs.

Edited by trinity-1686a