Skip to content
Snippets Groups Projects

Add flatfile DirMgr

Merged cgrigis requested to merge cgrigis/arti:add-flatfile-dirmgr into main

This is a first shot at implementing the changes needed for #267 (closed).

  • Definition of a DirProvider trait, abstracting the DirMgr. This is not complete, and in particular there is an issue with the client calling bootstrap(): the call cannot be abstracted in the trait, because async fn's are not (yet) supported.

  • Changes in TorClient to use a DirProvider instead of a DirMgr. Not complete, due to the above (currently has a nasty hack of keeping a reference to the actual DirMgr).

  • Implementation of a minimalist DirProvider called FlatFileDirMgr, that only loads the directory from cache files. Hidden behind a lightarti feature.

  • Changes in TorClient to use FlatFileDirMgr under a lightarti feature.

  • Addition of a couple of methods to UnvalidatedMdConsensus, to help manipulating the relays in FlatFileDirMgr.

  • Addition of FromStr for Authority, to allow us to load an authority from a file. The use of it is actually outside of Arti, in our client code, feeding the authorities to TorClient through configuration.

All comments and suggestions for improvements welcome. :smile: (Given that I am very new to Rust, I am sure I wrote a lot of "unrusty" things :upside_down: ).

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Nick Mathewson
  • Nick Mathewson
    Nick Mathewson @nickm started a thread on an outdated change in commit 5bdb537c
  • 99 99 /// A Result as returned by this crate.
    100 100 pub type Result<T> = std::result::Result<T, Error>;
    101 101
    102 /// Trait for DirMgr implementations
    103 pub trait DirProvider<R: Runtime> {
    • I wonder if it makes sense to split this trait into two traits: one for an actual provider of directories (it would get the latest_netdir and events functions), and one for the common setup and configuration API (it would get the bootstrap and reconfigure functions). Having separate traits there could help us down the road if we want to expose latest_netdir to other crates without giving them free rein to reconfigure the manager.

    • cgrigis changed this line in version 3 of the diff

      changed this line in version 3 of the diff

    • Contributor

      Having separate traits there could help us down the road if we want to expose latest_netdir to other crates without giving them free rein to reconfigure the manager.

      Personally I'm not sure that usecase warrants a separate trait; we could just implement that with our own struct that only let you access those two methods.

      If we think it's feasible that the thing providing a netdir might be a different thing to the thing providing the ability to bootstrap, splitting would make sense, but I don't see that happening (?)

    • Please register or sign in to reply
  • Nick Mathewson
  • Nick Mathewson
  • Nick Mathewson
    Nick Mathewson @nickm started a thread on an outdated change in commit 3dd695af
  • 20 use std::collections::HashSet;
    21 use std::fs;
    22 use std::path::Path;
    23 use std::sync::Arc;
    24
    25 use crate::config::DirMgrConfig;
    26 use crate::err::Error;
    27 use crate::event::DirEvent;
    28 use crate::{DirProvider, Result};
    29
    30 // 1/CHURN_FRACTION is the threshold of the consensus relays that we can remove with the churn
    31 const CHURN_FRACTION: usize = 6;
    32
    33 /// A directory manager that loads the directory information from flat files read from the cache
    34 /// directory.
    35 pub struct FlatFileDirMgr<R: Runtime> {
    • Would it be possible for this type to exist as a freestanding provider outside of the arti source code? Possibly we could make a mechanism in arti-client for setting up a TorClient with a nonstandard DirMgr, rather than having the code in arti-client have to always know about every available DirProvider that a developer might want to use.

      I ask because my big motivation behind this Provider architecture is to give a way for projects with unusual needs to override arti's default behavior without making them upstream their code into the Arti codebase.

      (If it would help, I'd be happy to export SharedMutArc and FlagPublisher from tor-dirmgr if an appropriate feature (maybe something like expose-helpers) is enabled.)

    • Author Contributor

      That would be ideal, so that we don't give extra burden to the Arti maintainers with our special use case, and also so that we can be more agile in updating it.

      If I understand correctly, we would need a method to create a TorClient passing it a DirProvider. Is that what you had in mind?

    • Oh hm. I think that would work, but it might be a good idea to think about a builder-style mechanism. (There is a parallel conversation happening here in #350 (closed))

    • cgrigis changed this line in version 9 of the diff

      changed this line in version 9 of the diff

    • Author Contributor

      I like the builder implementation that is in progress in !337 (merged)!

      To advance a bit on my end, I worked on an implementation that temporarily uses an extra paremeter, just to see what it entails. It should be easily adapted to the builder once !337 (merged) is merged. This allowed me to externalize our whole FlatFileDirMgr, at the expense of some extra exports.

      Here are some notes:

      • I did not find an easy way to pass a DirProvider instance, due to the fact that the way it is constructed within create_inner() uses freshly created objects not available before (namely, the circmgr). Instead, I pass a closure, which is a constructor that will be used when appropriate to create the dirmgr. Do you see an easier way?
      • I limited the exposed APIs to the ones I needed to compile our FlatFileDirMgr from an external crate. This probably will need to be adjusted, and hidden behind a feature as you suggested.
      • The CI fails because I did not adapt all the other TorClient call sites; I wanted to have your opinion first.

      Can you have a look and let me know what you think? Thanks!

    • Author Contributor

      Just pushed updates that use the new TorClientBuilder.

    • Please register or sign in to reply
  • Okay, I've looked over the code, and left some comments. The first commit seems pretty reasonable; for the others I'd like to talk about how to structure things.

    Also I'd like it if @eta and/or @Diziet can offer opinions about the architecture here.

  • cgrigis added 3 commits

    added 3 commits

    • 75df25ae - Add basic trait, use it in client
    • abecbc19 - Implement minimalist FlatFileDirMgr
    • 7bc606ce - Add `FromStr` for `Authority`

    Compare with previous version

  • cgrigis added 2 commits

    added 2 commits

    • 0153ce93 - Add basic trait, use it in client
    • aaf899ff - Implement minimalist FlatFileDirMgr

    Compare with previous version

    • @nickm drew my attention to this. I think this is an interesting problem that deserves a good solution.

      I read your docs about why you're doing this, especially https://github.com/c4dt/lightarti-rest#custom-tor-consensus , and I have some questions. I decided to pose them here but feel free to redirect me.

      Some of my questions are probably daft - please forgive me. Also please forgive my tendency to ask architectural "why have you done it like this" questions well after implementation...

      I didn't (with a cursory search) find a document describing who signs what and why, in your new scheme. Obviously you sign your mobile phone app. Then presumably the mobile phone app contains public key(s) - presumably yours, with which you sign your reduced-size directory download. I think from reading the MR that you are using the Tor directory consensus signing protocol, but feeding your own public keys in via the configuration mechanism?

      I'm not sure I much like the "churn" design. Firstly, not signing this seems a clear oversight. But, this is just a bandwidth optimisation. Often it is better to apply general tools than to invent an ad-hoc compression/update scheme. So here, maybe what you ought to do is sign a new complete new consensus and distribute updates via something like the rsync protocol? (rsync "offline" updates distributed via https, maybe) Secondly, the approach you give here seems to risk giving different clients different effective consensuses, which might make them more distinguishable than ideal.

      I don't see a discussion of the distribution mechanism for these consensuses (and the churn updates). I guess you're just using plain https? So your lightarti consensus distribution service is a spof for your application?

      I didn't see a mechanism for updating the directory in a running TorClient. Is the idea that you recreate a TorClient for every request? That seems like it might be more expensive than ideal.

      As for the code here in this arti MR: There are some slightly awkward interface splits: eg, this arrangement seems to imply exposing the arti directory cache format as the external interface? And the "churn file" format becomes part of the interface to arti here too. The generator for this is in the lightarti-rest tree. I'm not sure these things are a showstopper but it's a thing that we need to be aware of. ICBW but I don't think we (Arti devs) promise not to change these formats.

    • Author Contributor

      @wouter I believe this one is for you. :)

    • Hi Ian,

      As Wouter is quite busy, I'll have a first go at explaining what we're doing here and give some background. I hope it makes it easier for Wouter and not more complicated to navigate your questions and my (potentially wrong) answers.

      First of all, as to the source of all of this: the goal was to add an anonymization layer for the SwissCovid app that would make it possible to share some statistics. So once a day the app would connect through Tor to the Swiss server, and send the statistics. Using tor, the Swiss server would not be able to know who it is.

      As this was a 'once per day' data transfer, having to download 10MB of directory information from the tor servers every day was not acceptable for the SwissCovid app: as some people have limited data plans, the goal was to reduce traffic to a minimum, so that this could be accepted.

      So the idea came up to have a reduced directory information that can be downloaded once per week.

      The next problem was that even by creating the directory information only from the 100 most reliable servers, some of them would still drop out. So as an optimisation exercice we added this churn file that only indicates which servers from the current directory information are not valid anymore.

      Unfortunately, the Swiss government was not interested in all of this. So development stopped after an initial proof-of-concept. However, at C4DT.org we had some spare cycles, so we decided to revive the project and make it so that it runs with the latest upstream code. Hoping that it would also be useful to other people.

      Which lets me answer your questions:

      I didn't (with a cursory search) find a document describing who signs what and why, in your new scheme. Obviously you sign your mobile phone app. Then presumably the mobile phone app contains public key(s) - presumably yours, with which you sign your reduced-size directory download. I think from reading the MR that you are using the Tor directory consensus signing protocol, but feeding your own public keys in via the configuration mechanism?

      You got most of what we're doing. I also added https://github.com/c4dt/lightarti-rest/issues/89 because that's an important part. Even though I'm not really sure it will add much to the security, as we're downloading it over https, and we have to trust the server anyway - see also below.

      In our first PoC we trust the Swiss server - or currently the https://github.com/c4dt/lightarti-directory. If there is interest, we would also very much like to give the user of the library a choice to use either the prepared directory information, or download it directly from the tor nodes.

      I'm not sure I much like the "churn" design. Firstly, not signing this seems a clear oversight. But, this is just a bandwidth optimisation. Often it is better to apply general tools than to invent an ad-hoc compression/update scheme. So here, maybe what you ought to do is sign a new complete new consensus and distribute updates via something like the rsync protocol? (rsync "offline" updates distributed via https, maybe) Secondly, the approach you give here seems to risk giving different clients different effective consensuses, which might make them more distinguishable than ideal.

      If the server is malicious, signing the churn will not really help, will it? Because the server can sign a wrong, or different, churn anyway. What would probably be the best is to publish the hash of the files to a reliable bulletin-board (:cough: blockchain :cough:). So that at least all clients can be sure that they get the same files and are not deanonymized by a special churn file.

      I don't see a discussion of the distribution mechanism for these consensuses (and the churn updates). I guess you're just using plain https? So your lightarti consensus distribution service is a spof for your application?

      Yes - as you'll understand from the history of the project, that seemed reasonable to do. The only reason we do sign the directory information was to being able to feed it to arti. From my understanding of security we would not lose much security if we wouldn't sign it at all. But that depends a lot on how the signing is implemented on the server side: if the signing is implemented in a way that doesn't allow somebody owning the server to sign files, then it would be better to sign.

      I didn't see a mechanism for updating the directory in a running TorClient. Is the idea that you recreate a TorClient for every request? That seems like it might be more expensive than ideal.

      Absolutely. As the first PoC was supposed to make 1 GET request per day, that didn't seem to be a problem. Again, if we get enough interest, I'm more than happy to make it better in that regards.

      As for the code here in this arti MR: There are some slightly awkward interface splits: eg, this arrangement seems to imply exposing the arti directory cache format as the external interface? And the "churn file" format becomes part of the interface to arti here too. The generator for this is in the lightarti-rest tree. I'm not sure these things are a showstopper but it's a thing that we need to be aware of. ICBW but I don't think we (Arti devs) promise not to change these formats.

      Just thinking out loud: could we take our implementation of the DirMgr in our own repo, lightarti-rest? Would that remove part of the pain? Then we would only ask you to accept the changes to have a trait, but the implementation would not be needed in here.

      I hope that helped a bit, and I hope @wouter will not have to correct me on too many points...

      Linus

    • Please register or sign in to reply
  • cgrigis added 1 commit

    added 1 commit

    • a235880e - Implement minimalist FlatFileDirMgr

    Compare with previous version

  • cgrigis added 2 commits

    added 2 commits

    • 10da9bf8 - Add basic trait, use it in client
    • 6c7d1011 - Implement minimalist FlatFileDirMgr

    Compare with previous version

  • cgrigis added 83 commits

    added 83 commits

    • 6c7d1011...1572fc52 - 80 commits from branch tpo/core:main
    • 3038ef94 - Add basic trait, use it in client
    • 934141f9 - Implement minimalist FlatFileDirMgr
    • c94e612a - Silence some "unused import" and "dead code" warnings.

    Compare with previous version

  • cgrigis added 2 commits

    added 2 commits

    • 00c59a8f - Implement minimalist FlatFileDirMgr
    • b84b43e8 - Silence some "unused import" and "dead code" warnings.

    Compare with previous version

  • cgrigis added 2 commits

    added 2 commits

    • 9cb68dce - Implement minimalist FlatFileDirMgr
    • 199871bd - Silence some "unused import" and "dead code" warnings.

    Compare with previous version

  • mentioned in issue #350 (closed)

  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Please register or sign in to reply
    Loading