Commit 9a3d9cd2 authored by anarcat

draft a minio based static-component replacement

parent 3fb7c3f7
......@@ -739,6 +739,59 @@ Next steps:
architecture, [see this comment from anarcat](https://gitlab.com/groups/gitlab-org/-/epics/1316#note_496404589) outlining
some of those concerns
### GitLab pages and Minio replacement
The above approach doesn't scale easily: the old GitLab pages
implementation relied on NFS to share files between the main server
and the GitLab pages server, so it was hard to deploy and scale.
The newer implementation relies on "object storage" (i.e. [S3](https://en.wikipedia.org/wiki/Amazon_S3)) for
content, and queries the main GitLab Rails app for configuration.
In [this comment](https://gitlab.com/groups/gitlab-org/-/epics/1316#note_497254184) of the related architecture update, it was
acknowledged that "*the transition from NFS to API seems like
something that eventually will reduce the availability of Pages*" but:
> it is not that simple because how Pages discovers configuration has
> impact on availability too. In environments operating in a high
> scale, NFS is actually a bottleneck, something that reduces the
> overall availability, and this is certainly true at GitLab. Moving
> to API allows us to simplify Pages <-> GitLab communication and
> optimize it beyond what would be possible with modeling
> communication using NFS.
>
> \[...\] But requests to GitLab API are also cached so GitLab Pages can
> survive a short outage of GitLab API. [Cache expiration policy is
> currently hard-coded in the codebase](https://gitlab.com/gitlab-org/gitlab-pages/-/blob/2cb80834597f8e0d818bb28b60e3338e0a3e6acb/internal/source/gitlab/cache/cache.go#L11), but once we address
> [issue #281](https://gitlab.com/gitlab-org/gitlab-pages/-/issues/281) we might be able to make it configurable for users
> running their GitLab on-premises too. This can help with reducing
> the dependency on the GitLab API.
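
The caching behavior described in the quote can be illustrated with a
minimal sketch. This is not the GitLab Pages implementation (which is
written in Go): the `ConfigCache` class, its `fetch` callback, and the
`ttl` parameter are all hypothetical names chosen for illustration. It
only shows why a frontend holding a fresh cache entry can keep serving
through an API outage shorter than the expiration delay.

```python
import time
from typing import Callable, Dict, Tuple


class ConfigCache:
    """Hypothetical TTL cache in front of a configuration lookup."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # assumed to call the configuration API
        self._ttl = ttl      # expiration delay, in seconds
        self._clock = clock  # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def lookup(self, domain: str) -> dict:
        now = self._clock()
        entry = self._entries.get(domain)
        # A fresh entry is served without contacting the API at all,
        # so an outage shorter than the TTL goes unnoticed.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Expired or missing: this call fails if the API is down.
        config = self._fetch(domain)
        self._entries[domain] = (now, config)
        return config
```

In this sketch, a longer `ttl` tolerates longer outages at the cost of
serving staler configuration, which is presumably the trade-off a
configurable expiration policy would expose.
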
Object storage (typically implemented with [minio](https://www.minio.io/)) is
itself [scalable](https://min.io/product/scalable-object-storage) and [highly available](https://min.io/product/active-data-replication-for-object-storage), including
Active-Active replicas. Object storage could also be used for other
artifacts like Docker images, packages, and so on.
*That* design would take an approach similar to the above, but
possibly discarding the [cache system](cach) in favor of GitLab pages as
caching frontends. In that sense:
* the **mirror** hosts could be replaced by GitLab pages and
Minio
* the **source** hosts could be replaced by some parts of the [GitLab
Pages](https://docs.gitlab.com/ee/administration/pages/) system. Unfortunately, that system relies on a custom
webserver, but it might be possible to bypass that and directly
access the on-disk files provided by the CI.
* there would be no **master** intermediate service
The architecture would look something like this:
![Static system redesign with Minio architecture diagram](static-component/architecture-gitlab-pages.png)
This would deprecate the entire static-component architecture, which
would eventually be completely retired.
### Replacing Jenkins with GitLab CI as a builder
See the [Jenkins documentation](service/jenkins#gitlab-ci-replacement)
......
digraph gitlabpagesminio {
    label="high availability GitLab pages and Minio design brainstorm, August 2021"
    node [shape=record]
    subgraph "clusterminio" {
        label="Minio cluster"
        labelloc=bottom
        minio_0 [ label="Minio 0" ]
        minio_1 [ label="Minio 1" ]
        minio_etc [ label="Minio ..." ]
        minio_n [ label="Minio n" ]
        minio_0 -> minio_1 [label="sync" dir=none]
        minio_0 -> minio_n [label="sync" dir=none]
        minio_0 -> minio_etc [label="sync" dir=none]
        minio_1 -> minio_n [label="sync" dir=none]
        minio_1 -> minio_etc [label="sync" dir=none]
        minio_etc -> minio_n [label="sync" dir=none]
    }
    subgraph "clusterhosts" {
        label="hosts"
        labelloc=bottom
        GitLab [ label="<host> GitLab rails | <git> gitaly | <CI> CI" ]
        runner [ label="GitLab runner"]
        gitlab_pages [ label="GitLab pages" ]
        GitLab:CI -> runner [label="pull"]
        runner -> GitLab:host [label="push"]
        GitLab:host -> minio_etc [label="push"]
        gitlab_pages -> GitLab:host [label="caches config"]
        gitlab_pages -> minio_etc [label="caches content"]
        puppet
    }
    subgraph "clusterusers" {
        label="users"
        labelloc="bottom"
        TPA
        GitLab_user [label="GitLab user"]
        public
    }
    TPA -> puppet
    puppet -> GitLab:host [taillabel="deploys"]
    puppet -> { runner, minio_etc, gitlab_pages }
    GitLab_user -> GitLab:git
    public -> gitlab_pages [ label="browses" ]
}
howto/static-component/architecture-gitlab-pages-minio.png