GitLab is used only for code review, issue tracking and project management. Canonical locations for source code are still https://gitweb.torproject.org/ https://git.torproject.org/ and git-rw.torproject.org.

Add documentation on our design and architecture.

...and put it in the doc/ directory.

This patch fixes #16
parent 2ed369b1
......@@ -35,3 +35,8 @@ Usage
2. Start a distributor, which distributes the backend's resources:
./rdsys-distributor -name salmon -config /path/to/config.json
More documentation
==================
* [Design and architecture](doc/architecture.md)
Design and architecture
=======================
If you are new to rdsys and seek to understand its design and architecture, this
document is for you. It helps you understand rdsys's information flow, its
layers of abstraction, and how the code is organised.
Layers of abstraction
---------------------
Rdsys aspires to implement the design philosophy of [The Clean
Architecture](https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html).
Applied to our domain of circumvention proxy distribution, this means the
following:
At the lowest layer, we have rdsys's *domain logic*. These are abstract
building blocks that implement rdsys's "axioms". An example is the notion of a
*resource* – a "thing" that rdsys hands out to users. This can be a Tor bridge,
a VPN proxy, a Snowflake proxy, or download links for circumvention software.
In Go, a resource is an interface (it's called `Resource` and is defined in
[domain.go](https://gitlab.torproject.org/tpo/anti-censorship/rdsys/-/blob/master/pkg/core/domain.go)),
which dictates what methods an object has to implement.
On top of the domain logic are *use cases*. These are actually useful
manifestations of the building blogs in the domain logic. An abstract resource
turns into a Tor bridge – a concrete "thing" that we can hand out to users.
Distributors are a different kind of use case because they turn the abstract
idea of hashrings into code that actually hands out things to users.
Rdsys's processes talk to each other via a *delivery mechanism* – currently
implemented as HTTP connections but this could also be domain sockets, or remote
procedure calls. The Go interface `Mechanism` (defined in
[ipc.go](https://gitlab.torproject.org/tpo/anti-censorship/rdsys/-/blob/master/pkg/delivery/ipc.go))
specifies the methods that a delivery mechanism must implement.
Finally, there's the *presentation* layer, which implements the user interfaces
that users interact with. An example is a Web page that allows users to request
resources. The presentation layer wraps the distributor code in the use cases
layer.
It is important to understand that software dependencies *only point downwards*
in the abstraction layers: The presentation layer uses code that's part of the
delivery and use cases layer but the use cases layer *must not* use code that's
in the presentation layer. This principle keeps the code reasonably easy to
maintain and allows for drop-in replacements of, say, the delivery layer.
The following diagram illustrates a simplification of the relationships
discussed above. The black arrows denote dependencies. Note how all
dependencies only point downwards.
```mermaid
graph TB
subgraph "Core logic"
Resource[Resource]
Hashring[Hashring]
end
subgraph "Delivery"
IPC[IPC mechanisms]
end
subgraph "Use cases"
Transports[Tor bridge, e.g. obfs4] --> Resource
Distributor[Distributor, e.g. Salmon] --> Hashring
Distributor --> Resource
Distributor --> IPC
end
subgraph "Presentation"
Frontend[Web frontend] --> Distributor
Frontend --> IPC
end
```
Software architecture
---------------------
Rdsys implements a microservice-based architecture and consists of one backend
process and several distributor processes. The distributors talk to the backend
over an HTTP streaming API. The backend exposes two APIs and one Web page:
1. A public API that lets resources register themselves.
2. A private API that supplies distributors with resources.
3. A Web page that shows the status of specific resources.
To test resources (or more specifically: Tor bridges), rdsys relies on
[bridgestrap](https://gitlab.torproject.org/tpo/anti-censorship/bridgestrap).
```mermaid
graph TB
subgraph "rdsys"
Backend[Backend] --> Bridgestrap[Bridgestrap]
Distributor1[Distributor 1] --> Backend
Distributor2[Distributor 2] --> Backend
Distributor3[Distributor 3] --> Backend
end
```
Information flow
----------------
This subsection discusses how a resource goes from being added to rdsys to being
handed out to users.
1. A resource can make it into rdsys in two ways:
1.1 By being passively pulled by rdsys (see
[kraken.go](https://gitlab.torproject.org/tpo/anti-censorship/rdsys/-/blob/master/internal/kraken.go)).
An example are Tor bridge descriptors, which rdsys parses periodically.
1.2 By actively registering itself. Resources can send an HTTP GET request
to rdsys to register themselves for distribution (see
[backend.go](https://gitlab.torproject.org/tpo/anti-censorship/rdsys/-/blob/master/internal/backend.go).
An example are [Snowflakes proxies](https://snowflake.torproject.org)
at least in theory because Snowflakes are stand-alone proxies without a
Tor process that would take care of "publishing" a resource.
2. Once a resource is in rdsys's backend, it's ready to be handed out to a
distributor because the backend does not handle resource distribution. Each
resource maps to one – and only one – distributor.
3. Distributors obtain resources from rdsys's backend by talking to an HTTP API.
This API provides distributors with an initial, deterministically-selected
set of resources, followed by resource updates, which are sent whenever the
set of resources changes, i.e. resources disappear, change, or are added.
4. The distribution of resources is at the discretion of distributors. It is
the distributor's responsibility to 1) smartly hand out resources to users,
2) implement intuitive UIs, and 3) thwart Sybil attacks.
Directory structure
-------------------
Here is an overview of rdsys's directory structure and the purpose of each
directory.
* bin/
* cmd/ (Command line tools that invoke the backend and distributors.)
* conf/ (Rdsys's configuration file.)
* doc/ (Documentation, e.g. the document you are reading right now.)
* internal/ (code that's internal to rdsys)
* pkg/ (code that can be reused by other projects)
- core/ (Rdsys's elementary building blocks; the lowest layer of abstraction.)
- usecases/ (Use cases that emerge from the building blocks, e.g. resources and distributors.)
- delivery/ (Rdsys's IPC mechanisms.)
- presentation/ (The user-facing frontends for distributors.)
If you are new to rdsys's code base, it's best to work your way up the layers
of abstraction. Read the code in the following order:
1. pkg/core/
2. pkg/usecases/
3. pkg/delivery/
4. pkg/presentation/
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment