it seems like we need to set up a special runner that would run with --privileged but not as root, so that --privileged would not grant any extra privileges. this might be easier to do with podman as well. a simple way to reproduce this outside gitlab CI is:
```
podman run -it --rm debian:latest
```
then:
```
apt update
apt install mmdebstrap
```
then:
```
root@4775f7cc1056:/# mmdebstrap bullseye - > a.tar
I: automatically chosen mode: root
E: root mode requires mount which requires CAP_SYS_ADMIN
root@4775f7cc1056:/# mmdebstrap --mode=unshare bullseye - > a.tar
unshare: unshare failed: Operation not permitted
E: unable to unshare the mount namespace
root@4775f7cc1056:/#
```
that's for bootstrapping an image, of course; maybe buildah would be more lenient and able to build a tarball in some other way. but i was kind of happy about my reproducible, minimalist build script...
i did manage to build an image with mmdebstrap, so that's encouraging. in other words, if you have a way of generating a tarball, you can feed it to docker import, and that works out of the box in our runners. the bootstrap job here worked with mmdebstrap --mode=fakeroot, at the very least.
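for reference, this is roughly the flow i mean; a minimal sketch outside of CI (the image name here is just an example, not what the job actually uses):

```
# build a Debian rootfs tarball without needing root or CAP_SYS_ADMIN
mmdebstrap --mode=fakeroot bullseye rootfs.tar
# feed the tarball to docker import to get a usable image
docker import rootfs.tar localhost/debian-bullseye:test
```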
but buildah does need namespace support as well, as this more involved job shows:
this is probably the same "unshare" error as we were getting with mmdebstrap --mode=unshare earlier (which we worked around with fakeroot). i did try to allow containers to do their own userns thing inside docker, but couldn't quite figure it out. i tried kernel.unprivileged_userns_clone=1 (in tor-puppet.git's 08fd91af "try to enable userns mode to allow building containers inside CI") and that didn't work.
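for the record, that attempt amounted to roughly this on the runner host (a debian-specific sysctl that gates unprivileged user namespace creation):

```
sysctl -w kernel.unprivileged_userns_clone=1
```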
i'm still unclear on the userns option in the runner configuration. i thought --userns=host would enable the user namespaces inside the container, but according to this upstream doc, it's the opposite: that disables it.
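to illustrate the semantics as i understand them, assuming the daemon runs with remapping enabled (e.g. dockerd --userns-remap=default):

```
# with userns-remap on the daemon, this container sees a shifted uid mapping...
docker run --rm debian:latest cat /proc/self/uid_map
# ...while --userns=host opts this one container back out of the remapping
docker run --rm --userns=host debian:latest cat /proc/self/uid_map
```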
from what i understand, we might need a "privileged" runtime for this to work. this sounds scary to me, but i was told it might be safe if we run Docker rootless or use podman (rootless as well?), both of which require a special runner.
I don't think I did. I tried another tool that claimed it could do the same thing (avoid D-in-D builds without running Docker in privileged mode) but it didn't work.
one thing i heard from @hacim is that kaniko has trouble with multi-stage builds. it otherwise works without DinD which is nice, but that's a major caveat...
I am not sure what you mean with "multi-stage" builds. I checked a bit on the Internet, and what I found suggests we might be able to work around the problems with "multi-stage" builds. But as I said, I'm not sure what you had in mind, and I am fine with trying to cross that bridge once we have to.
That said: I was able to build all images in tpo/core/tor-ci-support with kaniko in our Gitlab CI. That's pretty promising. @anarcat: I would like to test a bit more with our own Gitlab registry. Could we set something like that up for testing purposes? I'd then push the images to that registry and check whether they are usable for core Tor CI purposes. If so, this could be a reasonable way forward, tackling #89 (closed) as well.
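Roughly, such a job runs something along these lines inside the gcr.io/kaniko-project/executor image (the destination here is only a placeholder, since we don't have our own registry yet):

```
/kaniko/executor \
  --context "$CI_PROJECT_DIR" \
  --dockerfile "$CI_PROJECT_DIR/Dockerfile" \
  --destination "$CI_REGISTRY_IMAGE:latest"
```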
> I am not sure what you mean with "multi-stage" builds. I checked a bit on the Internet, and what I found suggests we might be able to work around the problems with "multi-stage" builds. But as I said, I'm not sure what you had in mind, and I am fine with trying to cross that bridge once we have to.
I am actually not absolutely certain about this, because it's based on something @hacim said, but in my book, multi-stage Docker builds basically build a Docker image (or multiple, to be accurate) from a single Dockerfile which contains multiple FROM statements. The example in the above linked documentation is:
```
# syntax=docker/dockerfile:1
FROM golang:1.16
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go ./
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/github.com/alexellis/href-counter/app ./
CMD ["./app"]
```
> That said: I was able to build all images in tpo/core/tor-ci-support with kaniko in our Gitlab CI. That's pretty promising. @anarcat: I would like to test a bit more with our own Gitlab registry. Could we set something like that up for testing purposes? I'd then push the images to that registry and check whether they are usable for core Tor CI purposes. If so, this could be a reasonable way forward, tackling #89 (closed) as well.
That is super exciting! I'll check with the team before enabling the registry, though. We already have a lot of disk issues and I just want to make sure this doesn't go nuts either...
and yeah, maybe that part of the conversation would belong to #89 (closed)... could you use Docker hub in the short term?
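if Docker hub works as a stopgap, i assume kaniko would just need the registry credentials dropped into its docker config before the executor runs, something like this (the variable names are made up, they'd be masked CI variables):

```
mkdir -p /kaniko/.docker
AUTH="$(printf '%s:%s' "$DOCKERHUB_USER" "$DOCKERHUB_TOKEN" | base64)"
cat > /kaniko/.docker/config.json <<EOF
{"auths": {"https://index.docker.io/v1/": {"auth": "$AUTH"}}}
EOF
# then point --destination at docker.io/<namespace>/<image>:<tag>
```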
It's been a bit of a hassle due to a kaniko bug, but I managed to get that to work and tested tor's CI with my kaniko-built oldstable image: all is still green. So I think this is a viable way forward, in particular as I tried out multi-stage builds as well and they worked fine.
@anarcat: as you are a fan of closing tickets I believe we are done here. ;p
@gk Do you need to use it for something? I'm very interested in getting this done as we could greatly speed up the netteam CI builds I think.
I want to use it to get us away from the YOLO mode we are currently in with respect to Docker images. :) Like, using random images from wherever, which do whatever they are doing, seems not an ideal solution for us as a project. I don't have a direct use for that in my day-to-day work, which is why I need to work on this topic in my spare time/on my weekends (with all the caveats that implies).
But if the side-effect of that effort is greatly speeding up the network team's CI builds then even better. :)
> @gk Do you need to use it for something? I'm very interested in getting this done as we could greatly speed up the netteam CI builds I think.
>
> I want to use it to get us away from the YOLO mode we are currently in with respect to Docker images. :) Like, using random images from wherever, which do whatever they are doing, seems not an ideal solution for us as a project. I don't have a direct use for that in my day-to-day work, which is why I need to work on this topic in my spare time/on my weekends (with all the caveats that implies).
If you have the capacity to work on this, I think you should, as part of your day-to-day work, to be honest. It might not immediately be obvious to you that it will be part of your work, but I am absolutely certain it eventually will. ;)
> But if the side-effect of that effort is greatly speeding up the network team's CI builds then even better. :)
That too: if it helps other teams, i think it can pass as work. ;)
```
# syntax=docker/dockerfile:1
FROM golang:1.16
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go ./
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/github.com/alexellis/href-counter/app ./
CMD ["./app"]
```
That does work, but I think it's more legible to label your stages, like this:
```
FROM golang:1.16 AS build
...
COPY --from=build /go/src/github.com/alexellis/href-counter/app ./
```
The basic idea of these is that you can build in one layer, then take what you want from that layer, and then discard everything else. You end up with only the specific binary in your final layer. Although if your final one is all of alpine:latest you probably aren't saving a whole lot.
Regarding multi-stage build problems with kaniko, I used kaniko for a long time without issues, but if you browse their issue tracker, just for the 'stage' keyword, you will see people have a lot of issues:

for me it was just frustrating because it wasn't doing what I expected it to do, and it took a long time to finally peel back the layers to determine the problem wasn't with something I was doing, but a problem with kaniko itself. Once I got to that layer, I realized my only solution was to not use kaniko, which I've stuck to since.
... but that means you need a privileged runner to build images, which is what we're trying to avoid here. the point of using kaniko is that it seems to be the only solution that allows normal runners to build images.

maybe we could use kaniko for most containers and have exceptions for staged builds?
Staged builds work as far as I can tell. I played with the example you linked to (https://docs.docker.com/develop/develop-images/multistage-build/), including @hacim's proposed legibility improvements, and while it was a bit of a PITA to get going (mostly because the example in Docker's docs is not up to date), eventually it worked fine in a kaniko context. So, there is no need to think about any potential exceptions at this point, IMO.
That said, what I envision is that we assemble the kaniko images we use as base images ourselves, with a reproducible build process. That way we also have the option to be a proper free software project, fix stuff that breaks for us, and get those fixes upstreamed. (Yes, I know that's easily said, but I'd be willing to take on that responsibility if the need arises at some point.)
that sounds like a nice ticket to open. ;)
otherwise, yes, this seems like a ticket that could be closed, but i think we should first document this more clearly.
@gk could you see if you can submit an MR on the wiki replica to share how you did your magic "hey, i built a docker container in gitlab ci" sauce? probably as a patch to service/ci.md...