A possible approach: two-way pow-params descriptor exchange
Summary
There may be many ways to solve this problem; one of them is sketched here.
In short, instead of a one-way data exchange between frontend and backend
descriptors, a two-way exchange is suggested:
The Onionbalance frontend will need to include the pow-params field in its
superdescriptor, deriving some of its params from those present in its
backends' descriptors.
Each backend will need to fetch the frontend descriptor and use some of its
params as its own, if it is configured with
HiddenServiceOnionbalanceInstance 1.
Descriptor fields
The suggested authoritative behavior for each of the PoW parameters is as follows:
type:
Some alternatives here:
This configuration MUST be hardcoded, and be the same for all the backends
and for the frontend (a config file setting).
This MUST be defined by the Onionbalance frontend, and each backend should use
the type param from the frontend descriptor.
As of August 2023 there is only one defined type (v1), so this may only become
an issue in the future.
seed-b64:
This MUST be defined by the Onionbalance frontend, as a shared seed.
Each backend MUST use the seed-b64 param from the frontend descriptor.
suggested-effort:
Here things start to get more complicated.
The Onionbalance frontend MUST get the suggested-effort value from each
backend descriptor.
Then it MUST combine them into a single suggested-effort (see the sketch after this list of descriptor fields), for example by:
Doing a simple average.
Adopting the greater suggested-effort, which might be the best
approach, since attackers may be attacking specific introduction points.
expiration-time:
This SHOULD be derived by the Onionbalance frontend, but taking into account
the expiration-time from each backend.
There may be many ways to do this, like:
Getting the smallest expiration-time from all the backends.
Doing an average, if that makes any sense.
The frontend MAY also need to take into account the expiration-time of each
backend to schedule when it has to re-fetch descriptors and re-publish its
superdescriptor.
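As a rough illustration of the combination logic suggested above (adopting the greatest suggested-effort and the smallest expiration-time), here is a minimal Python sketch; the descriptor objects and their pow_params attributes are hypothetical, not Onionbalance's actual API:

```python
def combine_pow_params(backend_descs):
    # Hypothetical attributes: each backend descriptor carries an optional
    # pow_params object with suggested_effort and expiration_time fields.
    params = [d.pow_params for d in backend_descs if d.pow_params is not None]
    # Adopt the greatest effort (attackers may target specific backends)...
    suggested_effort = max((p.suggested_effort for p in params), default=0)
    # ...and the smallest expiration-time, so the published params never
    # outlive what any single backend expects.
    expiration_time = min((p.expiration_time for p in params), default=None)
    return suggested_effort, expiration_time
```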
Verification change
In order for PoW to work across backends, syncing pow-params with the
superdescriptor won't be enough.
Sections [POW_VERIFY] and [CLIENT_POW] from Proposal 327 also mention
an ID variable,
the current 32-byte blinded public ID (KP_hs_blind_id) of the onion service,
which is used to calculate and verify the puzzle solution.
Then, services configured with HiddenServiceOnionbalanceInstance 1 (i.e.,
Onionbalance backends) MUST use the frontend's blinded public ID when verifying
solutions.
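A minimal sketch of that backend-side rule, assuming hypothetical helpers blind_public_key() and current_time_period() and a hypothetical service object; the only point is that a backend configured as an Onionbalance instance derives the verification ID from MasterOnionAddress rather than from its own identity:

```python
def pow_verification_id(service):
    # Hypothetical fields: is_onionbalance_instance mirrors
    # HiddenServiceOnionbalanceInstance, master_onion_address mirrors
    # MasterOnionAddress from the instance configuration.
    if service.is_onionbalance_instance:
        # Use the frontend's blinded public ID (KP_hs_blind_id), so that
        # solutions computed by clients against the superdescriptor verify.
        return blind_public_key(service.master_onion_address,
                                current_time_period())
    return service.own_blinded_id
```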
This sounds reasonable. I am a fan of "Adopting the greater suggested-effort, which might be the best approach, since attackers may be attacking specific introduction points." and "Getting the smallest expiration-time from all the backends."
It sounds like the only change to C-Tor is some way for it to use the frontend's blinded public ID for verification on the backends, yeah?
This sounds reasonable. I am a fan of "Adopting the greater suggested-effort, which might be the best approach, since attackers may be attacking specific introduction points." and "Getting the smallest expiration-time from all the backends."
+1
It sounds like the only change to C-Tor is some way for it to use the frontend's blinded public ID for verification on the backends, yeah?
I believe that implementing this proposal on C Tor would require the following changes:
Using the frontend's blinded public ID as you mentioned. That may be relatively simple to implement, since the C Tor backend service should already have all the needed information (from HiddenServiceOnionbalanceInstance and MasterOnionAddress configs).
Functions initialize_pow_defenses() and update_all_descriptors_pow_params() would need to use the latest seed-b64 from the frontend service. This is not available right now to a C Tor backend service, and implementing it would mean the backend service needs to fetch the frontend's descriptor to get this information (falling back to some random value if the descriptor is unavailable; see the sketch after this list).
Possibly some special care/attention to the frequencies/timings of descriptor updates, to avoid unwanted race conditions in the shared seed (see next comment).
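For illustration, a rough sketch of the fallback mentioned above (a sketch under assumptions, not C Tor code; the frontend_pow_params argument and its fields are hypothetical):

```python
import base64
import os

def backend_seed(frontend_pow_params):
    # Prefer the shared seed published by the frontend; if the frontend
    # descriptor could not be fetched, fall back to a locally generated
    # random value (PoW still works, but solutions are not portable across
    # backends until the shared seed is known).
    if frontend_pow_params is not None:
        return base64.b64decode(frontend_pow_params.seed_b64)
    return os.urandom(32)  # assuming a 32-byte seed as in prop327 v1
```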
In general, the most difficult thing here is how to properly manage the shared seed. And just for the sake of completeness, it seems there are two distinct approaches:
Using a shared seed generated by the frontend (like in my proposal above), which backend services should adopt.
Using a shared seed derived from a consensus parameter.
Option 1 has the following characteristics:
Pros:
Offers more guarantees than option 2.
Cons:
Backend services would need to fetch descriptors, which could increase complexity and lines of code, and hurt performance (as backend services would need to await the frontend's descriptor arrival event before publishing their own descriptor).
It may be tricky to properly implement it (see next comment).
And for option 2:
Pros:
Simpler, less complexity involved: no need for any backend service to fetch descriptors.
Cons:
This value would be shared by all unrelated PoW-enabled Onionbalance setups.
No suitable consensus parameter exists. There's the shared random value ([PUB-SHAREDRANDOM] from rend-spec-v3.txt, detailed in srv-spec.txt) that could be used to derive seed-b64, but this parameter only changes once a day, whereas the PoW seed needs to be changed more often (every 2 hours according to prop327) to avoid precomputed PoW attacks.
From the two points above, it seems less effective and unsuitable for now.
Yeah, I think the consensus parameter option is out.
Forgive my ignorance: Does existing onionbalance v3 have any communication channel between the frontend and the backend instances, like the control port etc, or is it all done only through descriptor updates?
Forgive my ignorance: Does existing onionbalance v3 have any communication channel between the frontend and the backend instances, like the control port etc, or is it all done only through descriptor updates?
As I understand, it's all done through descriptor updates. Right now, Onionbalance v3 runs only in the frontend node, fetching the backend descriptors and then publishing its own "superdescriptor".
Option 1*: create a control port command ("please set this as the seed for service xxxxx.onion"); then descriptor fetching will reside in OB, not C Tor.
Yes, this approach seems doable, but would require:
Changes on C Tor anyway (handling the ControlPort command plus the underlying seed handling logic).
Running Onionbalance also in the backend nodes (as we probably would not want to expose the backends' ControlPorts to the frontend, say by using Onion Services; that would introduce complexity and weaken security).
Another approach to be considered is a two-way communication channel by using a special, Client Authorization-enabled Onion Service managed by the frontend, and backends could fetch the seed using a simple API. But maybe this is as complex (or even more) than @sready's suggestion.
It may be time to flag @beth on if she has any opinions about exporting the seed, blinded public identity, and difficulty via control port or other mechanism.
I think the minimum change in c-tor would be a control port command that sets (seed, blinded_id, expiration_time) and disables the built-in seed rotation for that service.
The concept could be something like a "replacement service identity for proof of work".
Depending on how this all works, it could be useful to ask backends not to publish service descriptors. Any time effort is changing, each backend will be writing to the HsDir every five minutes. If we're trying to pick up these changes on the frontend, we'll also need to be reading from the HsDir roughly that fast. It adds up to a lot of HsDir traffic that might be worth replacing with something onionbalance-specific.
I think the minimum change in c-tor would be a control port command that sets (seed, blinded_id, expiration_time) and disables the built-in seed rotation for that service.
The concept could be something like a "replacement service identity for proof of work".
Sounds like a good idea, keeping business logic away from the tor implementation.
Depending on how this all works, it could be useful to ask backends not to publish service descriptors. Any time effort is changing, each backend will be writing to the HsDir every five minutes. If we're trying to pick up these changes on the frontend, we'll also need to be reading from the HsDir roughly that fast. It adds up to a lot of HsDir traffic that might be worth replacing with something onionbalance-specific.
This is also interesting, and would require descriptor information to be available by other means, like through a control port command.
We may be talking about control port commands that (see the sketch after this list):
Disable descriptor publishing.
Set (seed, blinded_id, expiration_time) and disable the built-in seed rotation for that service.
Fetch a descriptor from a local Onion Service (like HSFETCH, but without passing through HSDirs).
Having those as different commands would also give flexibility for either a "replacement service identity for proof of work" or a "replacement service for publishing descriptors".
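For illustration, a sketch of how an onionbalance controller might drive such commands via stem; the command names and arguments below are entirely hypothetical (none of them exist in C Tor today):

```python
from stem.control import Controller

with Controller.from_port(port=9051) as ctl:
    ctl.authenticate()
    # Hypothetical command: stop publishing this backend's own descriptor.
    ctl.msg('ONIONBALANCE NOPUBLISH SERVICE=backendaddr.onion')
    # Hypothetical command: install the frontend's PoW identity and seed,
    # disabling the built-in seed rotation for that service.
    ctl.msg('ONIONBALANCE SETPOW SERVICE=backendaddr.onion '
            'SEED=<b64> ID=<b64> EXPIRATION=<unix-ts>')
    # Hypothetical command: fetch a descriptor from a local Onion Service,
    # like HSFETCH but without going through the HSDirs.
    ctl.msg('ONIONBALANCE LOCALDESC SERVICE=frontendaddr.onion')
```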
We might even want a way for a control port event to be generated immediately any time the suggested_effort has changed. I think we want to avoid polling effort as much as possible, for two reasons:
a service that isn't under attack has a constant effort, so the polling is unnecessary
when we are under attack, effort is quite latency sensitive. Adding an extra 5 minutes to the existing 5 minute polling cycle would mean we respond slower to attacks than a non-onionbalance service would.
A problem that might arise with this proposal is backend services being unaware of a recently updated seed.
Suppose the following sequence, during a DoS attack (i.e., with non-negligible suggested-effort):
Some Onion Service backends (say b1, b2, ..., bn) refresh their descriptors, updating their seed-b64 from the current value found in the frontend descriptor (r1).
The Onionbalance frontend refreshes its descriptor, updating its seed-b64 (r2).
A client fetches the frontend's descriptor, parses pow-params, finds that suggested_effort != 0, and solves the puzzle based on the latest seed-b64 available (r2) in the frontend descriptor.
Client sends PoW in INTRO1 cell ([INTRO1_POW]) to one of the backends above (say b2).
The service verifies the PoW ([SERVICE_VERIFY]), which will fail, since backend b2 still does not know about the newer seed r2, as it is still relying on r1.
Some backends (like b2) finally refresh their copies of the frontend descriptor. But by now it's too late: the client has already lost its place in the queue.
Possible solutions may include:
Make backends refresh their copies of the frontend's descriptor more frequently, say at twice the frequency of its regular update, also taking the backend's expiration-time parameter into account (see the sketch after this list).
Some sort of "seed rotation", whereby the frontend announces both the current and the next seed-b64; but this could increase the effectiveness of a precomputed PoW attack, and may also require a spec change and more lines of code than the first solution.
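A sketch of the first solution's scheduling logic, assuming the backend knows the frontend's regular update period and the currently announced expiration-time (both argument names are illustrative):

```python
import random
import time

def next_frontend_fetch(regular_update_period, expiration_time):
    # Poll at twice the frontend's regular update frequency...
    candidate = time.time() + regular_update_period / 2
    # ...but never later than the announced expiration-time, minus a bit of
    # jitter so all backends don't hit the HSDirs at the same moment.
    deadline = expiration_time - random.uniform(0, 60)
    return min(candidate, deadline)
```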
+1 for setting effort as the maximum of all backend efforts. In addition to the attack scenario mentioned, I think this will help ensure we only decrease the overall effort when we're not under attack. (This is desirable so newer/honest clients don't always get stuck behind older/dishonest clients in the queue. It's best to only decrease effort when we expect to complete the whole pqueue without drops.)
For seeds: The service already tracks both a current and a previous seed. Seeds rotate approximately every 2 hours. If possible backends should know about a new seed before it appears in the frontend descriptor. Some lag at each step is fine as long as order is correct.
I think the frontend should be in charge of the rotation clock, and generating each seed and expiration time. After it's generated a seed, it needs to push that seed to the backends before clients begin using it. After the seed is known to all backends, it can go into the frontend descriptor.
Is there maybe a different way the onionbalance server could publish the seed? Something separate from client-visible descriptors, but also maybe something less powerful than the control port. The backends would know when to fetch the next one if they get both seed-b64 and expiration-time from this feed.
This early seed would be of limited value to attackers, we might even be able to put it somewhere public.
For effort: It would work for the frontend to query this from backend's published descriptors, but maybe we want to look for a more efficient option? Effort changes are quite frequent. (HS_UPDATE_PERIOD == 300, or 5 minutes)
I don't know how feasible it is to have some communication channel between frontend and backend. Perhaps a single bidirectional request could pass efforts from backend to frontend and seeds the other way.
For seeds: The service already tracks both a current and a previous seed. Seeds rotate approximately every 2 hours. If possible backends should know about a new seed before it appears in the frontend descriptor. Some lag at each step is fine as long as order is correct.
+1
I think the frontend should be in charge of the rotation clock, and generating each seed and expiration time. After it's generated a seed, it needs to push that seed to the backends before clients begin using it. After the seed is known to all backends, it can go into the frontend descriptor.
+1
Is there maybe a different way the onionbalance server could publish the seed? Something separate from client-visible descriptors, but also maybe something less powerful than the control port. The backends would know when to fetch the next one if they get both seed-b64 and expiration-time from this feed.
This early seed would be of limited value to attackers, we might even be able to put it somewhere public.
For effort: It would work for the frontend to query this from backend's published descriptors, but maybe we want to look for a more efficient option? Effort changes are quite frequent. (HS_UPDATE_PERIOD == 300, or 5 minutes)
I don't know how feasible it is to have some communication channel between frontend and backend. Perhaps a single bidirectional request could pass efforts from backend to frontend and seeds the other way.
The frontend could keep a dedicated, client authorization-enabled private Onion Service as the communication channel with backends. That would be an entirely different architecture for Onionbalance:
Instead of a one-way "fetch descriptors from HSDirs", a bi-directional link between frontend and backends would handle the needed exchanges.
Each backend would also be required to run an "onionbalance-backend" application to control the tor backend process.
It would be expected that these exchanges would happen with a greater frequency than HS_UPDATE_PERIOD.
I don't know how much work would be involved in making this change to onionbalance, but the new architecture sounds good to me from an hs_pow perspective.
I'd just caution against relying on fast polling again, HS_UPDATE_PERIOD is pretty short compared to most of Tor's timers. If possible we should try and push effort changes from backend to frontend.
I don't know how much work would be involved in making this change to onionbalance
Neither do I... I'm trying to evaluate it based on some stages, roughly speaking:
Isolating the current descriptor handling into a "legacy" operation mode, so we don't disrupt existing setups and give time for a migration.
Creating the "new" operation mode (frontend and backend) code. Probably the bulk of the work.
Plugging the new mode into the existing frontend code, which would have both operation modes available for a while, and controlled by a configuration option.
But I'm not sure if that's a good way to attack the problem.
Anyway, such an estimate could help decide how to proceed in terms of existing team availability, the possible need for a grant proposal, and the timeline.
Another question: would that need a torspec? Maybe only the new control commands need to be specified as required API calls (for an Arti implementation, in the future)?
but the new architecture sounds good to me from an hs_pow perspective.
Great! An advantage of this architecture is that it would be suited for further dynamic descriptor updates, as well as paving the way for fixing the cross-certs issue (tpo/core/torspec#155) through prop307 or other solution.
And not publishing the backend descriptors would not only alleviate HSDirs a bit but also make it difficult for "random third parties" to "grab a onion service's introduction points and claim that they are introduction points for a different service" (see prop307's Introduction).
I'd just caution against relying on fast polling again, HS_UPDATE_PERIOD is pretty short compared to most of Tor's timers. If possible we should try and push effort changes from backend to frontend.
Right. If I understood correctly, an onionbalance-backend process could monitor effort on its local service using MetricsPort (say every 2 minutes), pushing the value to the frontend whenever a significant change is detected.
Sounds good to me! Yes I agree we could poll MetricsPort, I think the only benefits to adding a control port event would be some saved CPU cycles and we wouldn't require metrics to be enabled. (Fewer configuration dependencies, I guess.) Might be fine to just use metrics, especially if onionbalance ends up wanting to poll metrics for any other reason.
I think the only benefits to adding a control port event would be some saved CPU cycles and we wouldn't require metrics to be enabled. (Fewer configuration dependencies, I guess.)
Yes, two (or three) good advantages.
Might be fine to just use metrics, especially if onionbalance ends up wanting to poll metrics for any other reason.
Yes, exactly. General metrics may be useful, say in helping the frontend to determine whether a backend is in bad health and should be (temporarily) removed from the pool. But this could be implemented later on.
Do you have any tips on what could be used for the protocol between frontend and backends? Maybe something stateless over TCP? One question I have is how a backend could prove it is in fact in control of a given .onion address, perhaps by signing (some) requests with the .onion key (with replay protection etc.?).
I've been thinking of ways this could be done. I'm not saying I will or
won't do it yet; I'm just sketching both a push and a pull approach.
Push approach
That would be a one-way backend API for the frontend(s) to connect to:
Each backend has an API exposed by an authenticated .onion service through a
UNIX socket, so it is only possible to connect to this service with the auth
key (or direct socket access).
No limit on the number of frontends that can connect (although more than one
active frontend could race when setting backend params).
This API would be implemented by a Python script, connected through Tor over a
socket, so not many C Tor changes would be needed besides some additional ControlPort commands.
The API could be RESTful or something else (it does not matter at this point; a minimal sketch is given at the end of this Push approach section).
This API would have only a very limited set of actions related to load
balancing; initially only regarding PoW and some basic metrics.
The API would accept GET requests to query:
Basic info for each onionbalance-enabled Onion Service (including PoW params).
The current descriptor for each Onion Service, or just a subset of its fields
(like introduction point information).
The API would accept PUT requests to:
Set PoW params (seed, expiration, suggested effort).
Also,
Backends would not need to publish their descriptors
(the PublishHidServDescriptors torrc option could be set to 0), alleviating
load on the HSDirs.
Frontends could decide not to include some introduction points in the
superdescriptor (e.g. for unhealthy backends), or prioritize those with the
best metrics, in case there are more backends than fit in a superdescriptor.
With this setup it's easier to have more than one Onionbalance frontend, although
only one would be the "active" one; this doesn't cover how a frontend would
become aware that the active one is gone and should take its place (that would
be another, separate feature).
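To make the push approach more concrete, here is a minimal sketch of the backend-side API, assuming it is only reachable through the authenticated .onion (or the local socket); the endpoint path, JSON layout and port are all illustrative, not a spec:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-memory state; a real controller would push updates into the local tor
# instance (e.g. through the hypothetical control port commands above) and
# read metrics/descriptor data back from it.
STATE = {"pow-params": {"seed-b64": None, "expiration-time": None,
                        "suggested-effort": 0}}

class BackendAPI(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/v0/pow-params":
            body = json.dumps(STATE["pow-params"]).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def do_PUT(self):
        if self.path == "/v0/pow-params":
            length = int(self.headers.get("Content-Length", 0))
            STATE["pow-params"].update(json.loads(self.rfile.read(length)))
            self.send_response(204)
            self.end_headers()
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    # Bind to localhost only; the authenticated .onion forwards here.
    HTTPServer(("127.0.0.1", 8642), BackendAPI).serve_forever()
```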
Pull approach
This would be a two-way info system: a MetricsPort(-like) info service through
authenticated .onions:
This API would be implemented by a Python script, connected through Tor over a
socket, so not many C Tor changes would be needed besides some additional ControlPort commands.
A frontend would:
Offer the prev/next seed, prev/next suggested effort and prev/next expiration
time (defining a next-update window) on its info service (see the example layout at the end of this section).
Connect to each backend's info service to query for params, and then update
the values offered on its own info service.
Backends would:
Offer the current effective suggested effort on their info
services.
Query the frontend info service to look for new params.
This:
Is more prone to out-of-sync conditions, and would not support fallback
frontends easily.
Reduces the attack surface area (no commands received by the backends).
Seems easier to implement, but hard to get it right.
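For illustration, the frontend's info service could publish something like the following (all field names and values are illustrative, not a spec); backends would poll it and switch to the "next" values once their clocks pass next-update:

```python
# Example payload a frontend info service might serve as JSON.
FRONTEND_INFO = {
    "pow": {
        "prev": {"seed-b64": "<b64>", "suggested-effort": 2000,
                 "expiration-time": 1693000000},
        "next": {"seed-b64": "<b64>", "suggested-effort": 2500,
                 "expiration-time": 1693007200},
        "next-update": 1693003600,  # when backends should switch to "next"
    }
}
```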
Common for both approaches
Both approaches require:
Frontend(s) and backends to have their clocks in sync.
We're not entirely sure about the best design here.
And especially we're not sure this should be implemented in the first place, because:
Some people came up with an idea for a new design, where Onion Service backends could be listed in a descriptor, and clients could then pick one of those at random instead of connecting to the main address, transparently to users. This behavior would be similar to Alt-Svc, and each backend would have its own PoW parameters, so there would be no need to sync parameters between instances.
Load balancing in Arti won't use Onionbalance, but will probably be built-in, whatever technology it uses (Onionbalancing or descriptor-based).
A descriptor-based load balancing mechanism
This is one of the potential things we can do regarding load balancing (roughly edited notes from many discussions).
Rationale
The problem space seems to be: how to make a service available through many distinct .onion addresses?
Solutions may either require propagating PoW parameters between nodes (complex), or have independent nodes with their own parameters (simpler).
The CAA field in the descriptor, from ACME for Onions, is an example of what can be done: descriptors start to work in a DNS-like manner, but without being centralized/hierarchical documents.
So it could make sense to add a "this service is also available at these .onions" field to a descriptor.
Compatibility
Recursive lookup would happen in the client.
Would be backward compatible. Old Tor clients would fall back to the main
address.
Newer clients could pick randomly from the list of backends.
In the future, when the feature is widely deployed, it would allow for publishing-only frontends that would not accept connections, but would just publish the descriptors, and clients would pick the backends whenever they're in a descriptor.
Impact
Small penalty of an additional HSDir lookup, but would provide load balancing
with independent PoW on each backend.
This would be transparent: it wouldn't change the .onion address at the user
level.
This could be an Alt-Svc-like behavior for Onion Services, but at the descriptor level.
May also have other applications beyond load balancing (such as instant
messengers like Quiet that need a fallback introduction procedure for new
users).
Example
Onion Service A could say in its descriptor that it is also available at
Onion Services B and C.
Implementation
Implementation details need to account for:
How requests would be evenly distributed across backend services. The simplest solution is to let clients randomly pick a backend found in a descriptor, and optionally to make frontends randomly sort backends when building a descriptor (see the sketch after this list).
How connections would be transparent (users would still "see" A's address).
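A minimal client-side sketch, assuming the descriptor parser exposes a hypothetical also_available_at list of alternative .onion addresses:

```python
import random

def pick_connect_address(descriptor, main_address):
    # New clients pick one of the advertised backends at random; old clients
    # (or descriptors without the new field) simply use the main address.
    alternates = getattr(descriptor, "also_available_at", []) or []
    return random.choice(alternates) if alternates else main_address
```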
To sum up
This could remove the need for Onionbalance.
This could allow Onion Service A to be only the publisher.
This would support PoW.
Would be similar to Alt-Svc for HTTP, but at the descriptor level.
What's next?
This still needs a design, a proposal, discussion and implementation, for which we unfortunately lack time.
Implementation in either C Tor or Arti may not involve a lot of effort, but it would still be work to do.
I just had another idea in how PoW could be implemented for Onionbalance.
It consists of having another pow-params scheme in the descriptor (say,
v2), almost the same as v1, but supporting a (list of)
introduction points as the last parameter(s).
This could be used by Onionbalance to query for pow-params at each backend
descriptor, and include those in the superdescriptor.
Not all introduction points would need a corresponding pow-params entry:
only those with active PoW (see the sketch below).
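A sketch of how Onionbalance could build such lines while grouping each backend's intro points, under the assumption of a hypothetical "pow-params v2" syntax whose last parameters are intro point identifiers (none of this is specified anywhere yet):

```python
def pow_params_v2_lines(backend_descs):
    lines = []
    for desc in backend_descs:
        params = desc.pow_params  # hypothetical parsed pow-params, or None
        if params is None:
            continue  # only backends with active PoW get an entry
        intro_ids = " ".join(ip.identifier for ip in desc.intro_points)
        lines.append(
            f"pow-params v2 {params.seed_b64} {params.suggested_effort} "
            f"{params.expiration_time} {intro_ids}"
        )
    return lines
```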
Pros:
Might not need heavy changes in Onionbalance. It would need to fetch
descriptors within the expiration-time windows, and add any
pow-params found in the v2 syntax.
No need for fancy client/service architecture in Onionbalance.
No need for new ControlPort commands in C Tor.
No need for a controller script in the load-balancer backends.
Cons:
Can increase descriptor size, but that can be mitigated by allowing
multiple introduction points to be specified in a single pow-params
line, so all intro points from a given backend instance can be grouped together.
May need some changes in C Tor as it's unprepared for proof-of-work
extensibility. It would need to support more than one pow-params line,
and also be able to select the right pow-params v2 when connecting to a
specific introduction point.
There is a lot of "C Tor fatigue", so we may not want to implement this.
Anyway, I'm just documenting the idea. We may also need to first survey PoW
adoption and check if people are intending to run it with Onionbalance as well.
This is an interesting idea! I like the idea of having some way to separate out pow-params by intro point. I'd really rather not combine this with the algorithm type, if it's at all possible to keep these concepts separate. This may be a good prompt for a broader discussion on extensibility for the hsdesc, since it seems like we should make it easier for somewhat-external modules to include data in the descriptors. But also maybe we could redefine the params to be per-intro-point anyway, since C Tor has a sparse implementation of any verification as it is.