Add flatfile DirMgr
This is a first shot at implementing the changes needed for #267 (closed).
-
Definition of a
DirProvider
trait, abstracting theDirMgr
. This is not complete, and in particular there is an issue with the client callingbootstrap()
: the call cannot be abstracted in the trait, becauseasync fn
's are not (yet) supported. -
Changes in
TorClient
to use aDirProvider
instead of aDirMgr
. Not complete, due to the above (currently has a nasty hack of keeping a reference to the actualDirMgr
). -
Implementation of a minimalist
DirProvider
calledFlatFileDirMgr
, that only loads the directory from cache files. Hidden behind alightarti
feature. -
Changes in
TorClient
to useFlatFileDirMgr
under alightarti
feature. -
Addition of a couple of methods to
UnvalidatedMdConsensus
, to help manipulating the relays inFlatFileDirMgr
. -
Addition of
FromStr
forAuthority
, to allow us to load an authority from a file. The use of it is actually outside of Arti, in our client code, feeding the authorities toTorClient
through configuration.
All comments and suggestions for improvements welcome.
Merge request reports
Activity
mentioned in issue #267 (closed)
requested review from @nickm
Thanks, I'll have a look today! Some questions that I have already:
Does this conflict with !317 (closed)?
Can you fix the errors in the CI pipeline? (It looks like there are some compilation problems)
Does this conflict with !317 (closed)?
It should not. !317 (closed) was the path we were first following, which @tharvik developed into a generic feature. This PR does not need it as we are "shortcutting" the whole DirMgr and implementing a minimal one for our purposes.
Can you fix the errors in the CI pipeline? (It looks like there are some compilation problems)
Looking into it. Apparently
Sync
andSend
not implemented by theDirProvider
trait (I did not see this locally, using arti as a lib).- Resolved by cgrigis
- Resolved by cgrigis
99 99 /// A Result as returned by this crate. 100 100 pub type Result<T> = std::result::Result<T, Error>; 101 101 102 /// Trait for DirMgr implementations 103 pub trait DirProvider<R: Runtime> { I wonder if it makes sense to split this trait into two traits: one for an actual provider of directories (it would get the
latest_netdir
andevents
functions), and one for the common setup and configuration API (it would get thebootstrap
andreconfigure
functions). Having separate traits there could help us down the road if we want to exposelatest_netdir
to other crates without giving them free rein to reconfigure the manager.changed this line in version 3 of the diff
Having separate traits there could help us down the road if we want to expose
latest_netdir
to other crates without giving them free rein to reconfigure the manager.Personally I'm not sure that usecase warrants a separate trait; we could just implement that with our own
struct
that only let you access those two methods.If we think it's feasible that the thing providing a netdir might be a different thing to the thing providing the ability to bootstrap, splitting would make sense, but I don't see that happening (?)
- Resolved by cgrigis
- Resolved by cgrigis
- crates/tor-dirmgr/src/flatfiledirmgr.rs 0 → 100644
20 use std::collections::HashSet; 21 use std::fs; 22 use std::path::Path; 23 use std::sync::Arc; 24 25 use crate::config::DirMgrConfig; 26 use crate::err::Error; 27 use crate::event::DirEvent; 28 use crate::{DirProvider, Result}; 29 30 // 1/CHURN_FRACTION is the threshold of the consensus relays that we can remove with the churn 31 const CHURN_FRACTION: usize = 6; 32 33 /// A directory manager that loads the directory information from flat files read from the cache 34 /// directory. 35 pub struct FlatFileDirMgr<R: Runtime> { Would it be possible for this type to exist as a freestanding provider outside of the
arti
source code? Possibly we could make a mechanism inarti-client
for setting up aTorClient
with a nonstandardDirMgr
, rather than having the code inarti-client
have to always know about every availableDirProvider
that a developer might want to use.I ask because my big motivation behind this
Provider
architecture is to give a way for projects with unusual needs to override arti's default behavior without making them upstream their code into the Arti codebase.(If it would help, I'd be happy to export
SharedMutArc
andFlagPublisher
from tor-dirmgr if an appropriate feature (maybe something likeexpose-helpers
) is enabled.)That would be ideal, so that we don't give extra burden to the
Arti
maintainers with our special use case, and also so that we can be more agile in updating it.If I understand correctly, we would need a method to create a
TorClient
passing it aDirProvider
. Is that what you had in mind?Oh hm. I think that would work, but it might be a good idea to think about a
builder
-style mechanism. (There is a parallel conversation happening here in #350 (closed))changed this line in version 9 of the diff
I like the
builder
implementation that is in progress in !337 (merged)!To advance a bit on my end, I worked on an implementation that temporarily uses an extra paremeter, just to see what it entails. It should be easily adapted to the builder once !337 (merged) is merged. This allowed me to externalize our whole
FlatFileDirMgr
, at the expense of some extra exports.Here are some notes:
- I did not find an easy way to pass a
DirProvider
instance, due to the fact that the way it is constructed withincreate_inner()
uses freshly created objects not available before (namely, thecircmgr
). Instead, I pass a closure, which is a constructor that will be used when appropriate to create thedirmgr
. Do you see an easier way? - I limited the exposed APIs to the ones I needed to compile our
FlatFileDirMgr
from an external crate. This probably will need to be adjusted, and hidden behind a feature as you suggested. - The CI fails because I did not adapt all the other
TorClient
call sites; I wanted to have your opinion first.
Can you have a look and let me know what you think? Thanks!
- I did not find an easy way to pass a
added 2 commits
@nickm drew my attention to this. I think this is an interesting problem that deserves a good solution.
I read your docs about why you're doing this, especially https://github.com/c4dt/lightarti-rest#custom-tor-consensus , and I have some questions. I decided to pose them here but feel free to redirect me.
Some of my questions are probably daft - please forgive me. Also please forgive my tendency to ask architectural "why have you done it like this" questions well after implementation...
I didn't (with a cursory search) find a document describing who signs what and why, in your new scheme. Obviously you sign your mobile phone app. Then presumably the mobile phone app contains public key(s) - presumably yours, with which you sign your reduced-size directory download. I think from reading the MR that you are using the Tor directory consensus signing protocol, but feeding your own public keys in via the configuration mechanism?
I'm not sure I much like the "churn" design. Firstly, not signing this seems a clear oversight. But, this is just a bandwidth optimisation. Often it is better to apply general tools than to invent an ad-hoc compression/update scheme. So here, maybe what you ought to do is sign a new complete new consensus and distribute updates via something like the rsync protocol? (rsync "offline" updates distributed via https, maybe) Secondly, the approach you give here seems to risk giving different clients different effective consensuses, which might make them more distinguishable than ideal.
I don't see a discussion of the distribution mechanism for these consensuses (and the churn updates). I guess you're just using plain https? So your lightarti consensus distribution service is a spof for your application?
I didn't see a mechanism for updating the directory in a running
TorClient
. Is the idea that you recreate aTorClient
for every request? That seems like it might be more expensive than ideal.As for the code here in this arti MR: There are some slightly awkward interface splits: eg, this arrangement seems to imply exposing the arti directory cache format as the external interface? And the "churn file" format becomes part of the interface to arti here too. The generator for this is in the lightarti-rest tree. I'm not sure these things are a showstopper but it's a thing that we need to be aware of. ICBW but I don't think we (Arti devs) promise not to change these formats.
@wouter I believe this one is for you. :)
Hi Ian,
As Wouter is quite busy, I'll have a first go at explaining what we're doing here and give some background. I hope it makes it easier for Wouter and not more complicated to navigate your questions and my (potentially wrong) answers.
First of all, as to the source of all of this: the goal was to add an anonymization layer for the SwissCovid app that would make it possible to share some statistics. So once a day the app would connect through Tor to the Swiss server, and send the statistics. Using tor, the Swiss server would not be able to know who it is.
As this was a 'once per day' data transfer, having to download 10MB of directory information from the tor servers every day was not acceptable for the SwissCovid app: as some people have limited data plans, the goal was to reduce traffic to a minimum, so that this could be accepted.
So the idea came up to have a reduced directory information that can be downloaded once per week.
The next problem was that even by creating the directory information only from the 100 most reliable servers, some of them would still drop out. So as an optimisation exercice we added this churn file that only indicates which servers from the current directory information are not valid anymore.
Unfortunately, the Swiss government was not interested in all of this. So development stopped after an initial proof-of-concept. However, at C4DT.org we had some spare cycles, so we decided to revive the project and make it so that it runs with the latest upstream code. Hoping that it would also be useful to other people.
Which lets me answer your questions:
I didn't (with a cursory search) find a document describing who signs what and why, in your new scheme. Obviously you sign your mobile phone app. Then presumably the mobile phone app contains public key(s) - presumably yours, with which you sign your reduced-size directory download. I think from reading the MR that you are using the Tor directory consensus signing protocol, but feeding your own public keys in via the configuration mechanism?
You got most of what we're doing. I also added https://github.com/c4dt/lightarti-rest/issues/89 because that's an important part. Even though I'm not really sure it will add much to the security, as we're downloading it over https, and we have to trust the server anyway - see also below.
In our first PoC we trust the Swiss server - or currently the https://github.com/c4dt/lightarti-directory. If there is interest, we would also very much like to give the user of the library a choice to use either the prepared directory information, or download it directly from the tor nodes.
I'm not sure I much like the "churn" design. Firstly, not signing this seems a clear oversight. But, this is just a bandwidth optimisation. Often it is better to apply general tools than to invent an ad-hoc compression/update scheme. So here, maybe what you ought to do is sign a new complete new consensus and distribute updates via something like the rsync protocol? (rsync "offline" updates distributed via https, maybe) Secondly, the approach you give here seems to risk giving different clients different effective consensuses, which might make them more distinguishable than ideal.
If the server is malicious, signing the churn will not really help, will it? Because the server can sign a wrong, or different, churn anyway. What would probably be the best is to publish the hash of the files to a reliable bulletin-board (:cough: blockchain :cough:). So that at least all clients can be sure that they get the same files and are not deanonymized by a special churn file.
I don't see a discussion of the distribution mechanism for these consensuses (and the churn updates). I guess you're just using plain https? So your lightarti consensus distribution service is a spof for your application?
Yes - as you'll understand from the history of the project, that seemed reasonable to do. The only reason we do sign the directory information was to being able to feed it to arti. From my understanding of security we would not lose much security if we wouldn't sign it at all. But that depends a lot on how the signing is implemented on the server side: if the signing is implemented in a way that doesn't allow somebody owning the server to sign files, then it would be better to sign.
I didn't see a mechanism for updating the directory in a running
TorClient
. Is the idea that you recreate aTorClient
for every request? That seems like it might be more expensive than ideal.Absolutely. As the first PoC was supposed to make 1 GET request per day, that didn't seem to be a problem. Again, if we get enough interest, I'm more than happy to make it better in that regards.
As for the code here in this arti MR: There are some slightly awkward interface splits: eg, this arrangement seems to imply exposing the arti directory cache format as the external interface? And the "churn file" format becomes part of the interface to arti here too. The generator for this is in the lightarti-rest tree. I'm not sure these things are a showstopper but it's a thing that we need to be aware of. ICBW but I don't think we (Arti devs) promise not to change these formats.
Just thinking out loud: could we take our implementation of the DirMgr in our own repo, lightarti-rest? Would that remove part of the pain? Then we would only ask you to accept the changes to have a trait, but the implementation would not be needed in here.
I hope that helped a bit, and I hope @wouter will not have to correct me on too many points...
Linus
added 2 commits
added 83 commits
-
6c7d1011...1572fc52 - 80 commits from branch
tpo/core:main
- 3038ef94 - Add basic trait, use it in client
- 934141f9 - Implement minimalist FlatFileDirMgr
- c94e612a - Silence some "unused import" and "dead code" warnings.
Toggle commit list-
6c7d1011...1572fc52 - 80 commits from branch
mentioned in issue #350 (closed)