Build Windows/Mac CI infrastructure that is usable by all teams in the near future
The projects we have in Tor currently use a mixture of different CI systems to ensure some form of quality assurance as part of the software development process:
- Jenkins (provided by TPA)
- Gitlab CI (currently Docker builders kindly provided by the FDroid project via Hans from The Guardian Project)
- Travis CI (used by some of our projects such as tpo/core/tor.git for Linux and macOS builds)
- Appveyor (used by tpo/core/tor.git for Windows builds)
One big benefit that we have seen with Gitlab CI is how easy it is for each project to configure CI initially and then maintain it without sysadmin/CI-admin(?) involvement. I believe this is an important requirement here, because it distributes the workload of actually setting this up.
None of the goals of this ticket will solve the issue that Apple has recently announced the M1 processor and that we have no way of virtualizing/emulating ARM64 macOS builds yet. This will have to be something we look into in the future. Other organizations will have this problem too, so we might be able to piggyback on their work.
Jenkins has been hard for the network team to maintain, and weasel has been a great help there. I am not sure how Jenkins is used by other teams right now, except that I know the web teams are using it to publish changes to our websites to the production servers.
Travis CI recently announced a new scheme where macOS builds will become a scarcer resource on their platform. This, combined with the wish to have faster builds for the network team, is what triggered this post. We are already on some "free software beneficial plan" where they support us with more points, but unfortunately it won't be enough to cover a month of macOS builds for the network team's needs.
Appveyor is very slow, which often leads to frustration among the network team members.
It would be awesome if we could somehow reserve two (ideally) "fast" Debian-based machines on TPO infrastructure for the following:
- Run Gitlab CI runners via KVM (initially with focus on Windows x86-64 and macOS x86-64). This will replace the need for Travis CI and Appveyor. This should allow the network team, application team, and anti-censorship team to test software on these platforms (either by building in the VMs or by fetching cross-compiled binaries on the hosts via the Gitlab CI pipeline feature). Since none(?) of our engineering staff are working full-time on macOS and Windows, we rely quite a bit on this for QA.
- Run Gitlab CI runners via KVM for the BSDs. Same argument as above, but this is much less urgent.
- Spare capacity (once we have measured it) can be used as a generic Gitlab CI Docker runner in addition to the FDroid builders.
- The faster the CPU the faster the builds.
- Lots of RAM allows us to do things such as having CoW filesystems in memory for the ephemeral builders and should speed up builds due to faster I/O.
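To make the in-memory CoW idea a bit more concrete, here is a rough sketch of how an ephemeral builder could get a throwaway qcow2 overlay on a RAM-backed filesystem, leaving the base image untouched. This is only an illustration under assumptions: the image path, the `/dev/shm/ci-overlays` scratch directory, and the `overlay_command` helper are all made up for this example, and the actual runner setup may end up looking quite different.

```python
import os
import uuid


def overlay_command(base_image, scratch_dir="/dev/shm/ci-overlays"):
    """Return (argv, overlay_path) for creating a qcow2 overlay of base_image.

    The overlay only stores blocks the build changes (copy-on-write), so
    placing it on a RAM-backed filesystem keeps build I/O in memory.
    Both paths are hypothetical examples, not an existing TPO layout.
    """
    overlay = os.path.join(scratch_dir, f"build-{uuid.uuid4().hex}.qcow2")
    argv = [
        "qemu-img", "create",
        "-f", "qcow2",                       # format of the new overlay
        "-b", os.path.abspath(base_image),   # read-only backing image
        "-F", "qcow2",                       # format of the backing image
        overlay,
    ]
    return argv, overlay


if __name__ == "__main__":
    # Hypothetical base image path, only used to print the command.
    argv, overlay = overlay_command("/srv/ci/images/windows-base.qcow2")
    print(" ".join(argv))
```

After a build finishes, deleting the overlay file is enough to get back to a clean state; the next job simply creates a new overlay on top of the same base image.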
I am by no means an expert on this, but I don't believe these machines can be virtual machines themselves, as we need to spawn other virtual machines using the "full virtualization" that is provided by "modern" x86-64 CPUs. It might be that "recursive" (nested) virtualization works (some cloud providers offer that), but I have no idea what the implications of that are, especially with the cluster management software we use for other physical hosts in TPO.
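If it helps with evaluating candidate machines, something like the sketch below could be run on a Linux host (or inside a VM) to see whether the CPU exposes hardware virtualization and whether the kvm module has nested virtualization switched on. It relies on the `vmx` (Intel) and `svm` (AMD) flags in `/proc/cpuinfo` and on the `nested` parameter of the `kvm_intel`/`kvm_amd` modules; the helper names are my own, and this says nothing about how well nested virtualization performs or how it interacts with our cluster management software.

```python
from pathlib import Path


def cpu_virt_flags(cpuinfo_text):
    """Return the hardware virtualization flags found: 'vmx' (Intel), 'svm' (AMD)."""
    found = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, flags = line.partition(":")
            found |= {"vmx", "svm"} & set(flags.split())
    return found


def nested_enabled(value):
    """Interpret the kvm module's 'nested' parameter ('Y' or '1' means enabled)."""
    return value.strip() in ("Y", "y", "1")


def read_nested_param(sys_module=Path("/sys/module")):
    """Return (module_name, enabled) for the first loaded kvm vendor module."""
    for module in ("kvm_intel", "kvm_amd"):
        param = sys_module / module / "parameters" / "nested"
        if param.exists():
            return module, nested_enabled(param.read_text())
    return None, False


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    flags = cpu_virt_flags(cpuinfo.read_text()) if cpuinfo.exists() else set()
    module, nested = read_nested_param()
    print("hardware virtualization flags:", sorted(flags) or "none")
    print("kvm module:", module, "nested enabled:", nested)
```

Run inside a candidate VM, an empty flag list would mean KVM-accelerated guests cannot be started there, which is the case I am worried about above.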
Please let me know if I need to add more details here :-)
I have no idea what label to put on this, so folks from TPA who organize these things are welcome to figure out where this belongs best.