Decide on user directory structure for arti relays

The current plan is to add relay support to the arti binary under the "relay" subcommand. Relays will behave differently than clients, so we will want:

separate configuration files: Proxies and relays need different configuration options, and while they might have a few in common, most won't be shared. Some options will have the same name but have different defaults. See #1616 (closed).
separate key store directories: I think we want the client's keys (such as hs keys) to be stored in a different directory from the relay keys.
separate state files: The relay and proxy shouldn't share things like guards.
separate cache directories: I'm not sure if there's a strong reason to use different cache directories, but if we're using different directories for everything else, it probably makes sense to use different cache directories too. Also arti doesn't currently allow two running processes to share a cache directory (#1497).

So essentially, I think we want all relay files to be separate from client/proxy files. This becomes a bit of a mess when they all live under the same "arti" application. It becomes even more of a mess if we don't want to change any of arti's existing files/directories: changing them would be a breaking change for current arti users, but would better accommodate relays.

Arti's current directory structure on Linux looks something like:

.config/arti/
|-- arti.d
`-- arti.toml

.local/share/arti/
|-- hss
|   |-- demo
|   `-- demo.lock
|-- keystore
|   |-- client
|   `-- hss
|-- pt_state
|   `-- obfs4proxy
`-- state
    |-- circuit_timeouts.json
    |-- guards.json
    `-- state.lock

.cache/arti/
|-- dir.lock
|-- dir.sqlite3
`-- dir_blobs

The question then is: How will arti-relay fit into this structure?

I can think of three paths forward, all with various downsides.

1. Separate application "namespaces"

NOTE: I use the term "namespace" here to refer to the general ("org", "torproject", "Arti") tuple, which on Linux resolves to just "arti" (e.g. ~/.config/arti/). I'm not sure if there's a better name to refer to these.

While arti's "proxy" subcommand will store files under the "arti" namespace, arti's "relay" subcommand will store files under the "arti-relay" namespace.

The directory structure would look something like:

.config/arti/
|-- arti.d
`-- arti.toml

.config/arti-relay/
|-- arti-relay.d
`-- arti-relay.toml

.local/share/arti/
|-- hss
|   |-- demo
|   `-- demo.lock
|-- keystore
|   |-- client
|   `-- hss
|-- pt_state
|   `-- obfs4proxy
`-- state
    |-- circuit_timeouts.json
    |-- guards.json
    `-- state.lock

.local/share/arti-relay/
|-- keystore
|   `-- relay
`-- state
    |-- circuit_timeouts.json
    |-- guards.json
    `-- state.lock

.cache/arti/
|-- dir.lock
|-- dir.sqlite3
`-- dir_blobs

.cache/arti-relay/
|-- dir.lock
|-- dir.sqlite3
`-- dir_blobs

Note that the "arti" name is hardcoded into tor-config, so we'd need to refactor tor-config and things using tor-config to allow multiple application "namespaces". We'd also need to add support for path variables such as ARTI_RELAY_CONFIG in addition to ARTI_CONFIG.
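To make the refactor a bit more concrete, here is a minimal sketch of what namespace-parameterized path resolution could look like. It is not tor-config's actual API: the `Namespace` enum, `BaseDirs` struct, and `path_variables` function are hypothetical names, the base directories are the Linux defaults from the trees above, and the `*_LOCAL_DATA` variable name is an assumption made for illustration.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical: which subcommand's files are being resolved.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Namespace {
    Proxy,
    Relay,
}

impl Namespace {
    /// Directory name used below each base directory.
    fn dir_name(self) -> &'static str {
        match self {
            Namespace::Proxy => "arti",
            Namespace::Relay => "arti-relay",
        }
    }

    /// Prefix for path variables (ARTI_CONFIG vs ARTI_RELAY_CONFIG).
    fn var_prefix(self) -> &'static str {
        match self {
            Namespace::Proxy => "ARTI",
            Namespace::Relay => "ARTI_RELAY",
        }
    }
}

/// Base directories; only the Linux defaults are modelled here.
struct BaseDirs {
    config: PathBuf,
    data: PathBuf,
    cache: PathBuf,
}

impl BaseDirs {
    fn linux_defaults(home: &Path) -> Self {
        BaseDirs {
            config: home.join(".config"),
            data: home.join(".local").join("share"),
            cache: home.join(".cache"),
        }
    }
}

/// Returns (variable name, resolved directory) pairs for one namespace.
/// The `_LOCAL_DATA` suffix is an assumed name, not taken from this issue.
fn path_variables(ns: Namespace, base: &BaseDirs) -> Vec<(String, PathBuf)> {
    let prefix = ns.var_prefix();
    let dir = ns.dir_name();
    vec![
        (format!("{prefix}_CONFIG"), base.config.join(dir)),
        (format!("{prefix}_LOCAL_DATA"), base.data.join(dir)),
        (format!("{prefix}_CACHE"), base.cache.join(dir)),
    ]
}
```

With this shape, `path_variables(Namespace::Relay, &base)` would yield ARTI_RELAY_CONFIG pointing at .config/arti-relay, while the proxy variables keep resolving to the existing "arti" directories.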

Advantages:

The existing arti files stay where they are (no name or path changes).
The relay and client/proxy files are kept completely separate.

Disadvantages:

Confusing that the "arti proxy" lives under "arti/" and "arti relay" lives under "arti-relay".
Typically a single application doesn't use two different application "namespaces".
Easy for users to misconfigure their relay by changing the client/proxy config instead of their relay config.

Open questions:

Is it confusing if the "arti proxy" subcommand lives under the name "arti" while the "arti relay" subcommand lives under the name "arti-relay"? People may think that updating .config/arti/arti.toml will update their relay config, but instead their changes will be silently ignored.
Similar to the previous question, is it confusing if we have both ARTI_CACHE and ARTI_RELAY_CACHE as path variables?
Can we rename the "arti/" directory to "arti-proxy/" or "arti-client/"?
Can we rename the "arti.toml" file to "arti-proxy.toml" or "arti-client.toml"?

2. Relay-specific subdirectory inside arti application "namespace"

Everything will be stored under the "arti" name, but relays will have their own subdirectories.

The directory structure would look something like:

.config/arti/
|-- arti.d
|-- arti.toml
|-- arti-relay.d
`-- arti-relay.toml

.local/share/arti/
|-- hss
|   |-- demo
|   `-- demo.lock
|-- keystore
|   |-- client
|   `-- hss
|-- pt_state
|   `-- obfs4proxy
|-- relay
|   |-- keystore
|   |   `-- relay
|   `-- state
|       |-- circuit_timeouts.json
|       |-- guards.json
|       `-- state.lock
`-- state
    |-- circuit_timeouts.json
    |-- guards.json
    `-- state.lock

**edit**: gabi mentions below that the keystore is already namespaced,
          so clients/proxies and relays could share a top-level keystore

.cache/arti/
|-- dir.lock
|-- dir.sqlite3
|-- dir_blobs
`-- relay
    |-- dir.lock
    |-- dir.sqlite3
    `-- dir_blobs

Advantages:

The existing arti files stay where they are (no name or path changes).
Everything is under the "arti" namespace.

Disadvantages:

The directory structure is a mess.
Easy for users to misconfigure their relay by changing the client/proxy config instead of their relay config.
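As a rough illustration of this option, the sketch below resolves per-role paths inside the single "arti" namespace. The `Role` enum, `Layout` struct, and `option2_layout` function are hypothetical names, not existing arti code, and only the Linux defaults from the trees above are modelled.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical: which subcommand is running.
#[derive(Clone, Copy)]
enum Role {
    Client,
    Relay,
}

/// The per-role paths this sketch cares about.
struct Layout {
    config_file: PathBuf,
    config_dir: PathBuf,
    state_dir: PathBuf,
    cache_dir: PathBuf,
}

/// Resolve paths for option 2: client paths stay where they are today,
/// relay paths are nested inside the same "arti" directories.
fn option2_layout(home: &Path, role: Role) -> Layout {
    let config = home.join(".config").join("arti");
    let data = home.join(".local").join("share").join("arti");
    let cache = home.join(".cache").join("arti");
    match role {
        Role::Client => Layout {
            config_file: config.join("arti.toml"),
            config_dir: config.join("arti.d"),
            state_dir: data.join("state"),
            cache_dir: cache,
        },
        Role::Relay => Layout {
            config_file: config.join("arti-relay.toml"),
            config_dir: config.join("arti-relay.d"),
            state_dir: data.join("relay").join("state"),
            cache_dir: cache.join("relay"),
        },
    }
}
```

One thing this makes visible: the relay's cache directory is a child of the client's cache directory (`relay.cache_dir.starts_with(&client.cache_dir)` holds), so anything that walks or clears the client cache would need to know to skip relay/.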

3. Option (2), but also move/rename existing arti files/directories

Everything will be stored under the "arti" name, and proxies/clients and relays will have their own subdirectories.

The directory structure would look something like:

.config/arti/
|-- client
|   |-- arti-client.d
|   `-- arti-client.toml
`-- relay
    |-- arti-relay.d
    `-- arti-relay.toml

.local/share/arti/
|-- client
|   |-- hss
|   |   |-- demo
|   |   `-- demo.lock
|   |-- keystore
|   |   |-- client
|   |   `-- hss
|   |-- pt_state
|   |   `-- obfs4proxy
|   `-- state
|       |-- circuit_timeouts.json
|       |-- guards.json
|       `-- state.lock
`-- relay
    |-- keystore
    |   `-- relay
    `-- state
        |-- circuit_timeouts.json
        |-- guards.json
        `-- state.lock

**edit**: gabi mentions below that the keystore is already namespaced,
          so clients/proxies and relays could share a top-level keystore

.cache/arti/
|-- client
|   |-- dir.lock
|   |-- dir.sqlite3
|   `-- dir_blobs
`-- relay
    |-- dir.lock
    |-- dir.sqlite3
    `-- dir_blobs

Since we'd be making breaking changes, we'd probably also want to rename the path variables like ARTI_CONFIG to ARTI_CLIENT_CONFIG while we're at it. Then we could also add path variables like ARTI_RELAY_CONFIG. This could make any kind of automatic migration much more difficult.

Advantages:

Arguably the cleanest and most intuitive directory structure.
Hopefully minimizes confusion for users editing configuration files (a user editing .config/arti/client/arti-client.toml is less likely to think they're editing the relay config).

Disadvantages:

Requires moving/renaming existing files for arti users. This could be somewhat painful if we want to automatically migrate users to this new directory structure.
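To get a feel for the size of an automatic migration, here is a minimal sketch that only plans the moves and does not perform them. The `MOVES` table is derived from the before/after trees in this issue; the `plan_migration` function is a hypothetical name, and the sketch deliberately ignores locking, running arti processes, renamed path variables, and non-Linux platforms.

```rust
use std::path::{Path, PathBuf};

/// (old, new) locations relative to the home directory, read off the
/// current tree and the option 3 tree in this issue.
const MOVES: &[(&str, &str)] = &[
    (".config/arti/arti.toml", ".config/arti/client/arti-client.toml"),
    (".config/arti/arti.d", ".config/arti/client/arti-client.d"),
    (".local/share/arti/hss", ".local/share/arti/client/hss"),
    (".local/share/arti/keystore", ".local/share/arti/client/keystore"),
    (".local/share/arti/pt_state", ".local/share/arti/client/pt_state"),
    (".local/share/arti/state", ".local/share/arti/client/state"),
    (".cache/arti/dir.lock", ".cache/arti/client/dir.lock"),
    (".cache/arti/dir.sqlite3", ".cache/arti/client/dir.sqlite3"),
    (".cache/arti/dir_blobs", ".cache/arti/client/dir_blobs"),
];

/// Plan which paths to move: only those where the old location exists
/// and the new one does not. `exists` is injected so the planning logic
/// can be exercised without touching the filesystem.
fn plan_migration(home: &Path, exists: impl Fn(&Path) -> bool) -> Vec<(PathBuf, PathBuf)> {
    MOVES
        .iter()
        .map(|(old, new)| (home.join(old), home.join(new)))
        .filter(|(old, new)| exists(old.as_path()) && !exists(new.as_path()))
        .collect()
}
```

A real run could pass `|p| p.exists()` as the predicate and then rename each planned pair. Skipping entries whose new location already exists keeps the plan safe to re-run, which might matter if the migration code stays in the codebase across several releases.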

Open questions:

Would we need to do an automatic migration? Or can we just let users know in release notes?
How long would we keep the automatic migration code around for in the arti codebase?

Edited Oct 03, 2024 by opara