doc/CONFIG-NOTES.md: config proposal
This is my draft proposal to address #285 (closed). The text is identical to that in #285 (comment 2770848)
Merge request reports
Activity
mentioned in issue #285 (closed)
assigned to @Diziet
- doc/CONFIG-NOTES.md 0 → 100644
71 ``` 72 (borrowing notation from `fehler` for clarity; actual code will be `-> Result`) 73 74 For embedders, `arti-client` should provide a method a bit like this 75 (name TBD) 76 ``` 77 #]throws] 78 pub fn tor_and_caller_config_from_usual_files<T>() -> (TorClientCfg, T) 79 where T: Deserialize 80 { 81 #[derive(Deserialize)] 82 struct CombinedConfig { 83 #[flatten] tor: TorConfig, 84 #[flatten] rest: T, 85 } 86 ... load a CombinedCnfig from the usual Tor config files ... Sure it does, and you can re-serialize: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=56529f2371f3c982850784b54e3bb412
And yes, there is a namespacing issue: you can't have sections that are provided by our code which have the same names as the caller's sections. serde's behaviour is to take the first when there are clashes. So the suggestion I am making causes our sections to override the caller's. That may not be right. But I think whichever way we choose for that we can address that with appropriate documentation.
- doc/CONFIG-NOTES.md 0 → 100644
93 I suggest that `TorClientCfg` should be a non-exhaustive but otherwise transparent struct (i.e. with `pub` fields). 94 95 The "sections" in it (ie, the sub-struct names and types) are already public, since they are part of the config file format. Restructuring this would be a breaking change regardless of the Rust API. So the current scheme does not add any additional API abstraction - it merely demands boilerplate code. 96 97 ### `config` vs `figment` 98 99 Currently we use `config`. I looked at others briefly and rather more closely at `figment`. Certainly `figment is a lot richer. I think it might be able to produce better errors. I have to say though that when ended up using a library that used `figment` I found it a bit awkward and confusing. 100 101 For now I suggest retaining `config`. At least, the code that knows we're using it will be fairly self-contained. 102 103 ## Miscellaneous 104 105 * `TorClientCfg`. Suggest renaming to `TorClientConfig`. If we do things right, users won't normally need to type this name. 106 107 * Default config file location is currently `./arti-config.toml`. This should 108 be changed (to XDG dirs probably) - Comment on lines +107 to +108
I don't think that's what the default is now: it's currently in arti-config:
/// Return a filename for the default user configuration file. pub fn default_config_file() -> Option<PathBuf> { CfgPath::new("${ARTI_CONFIG}/arti.toml".into()).path().ok() }
So that should in theory use XDG dirs on XDG platforms.
- Resolved by Ian Jackson
- doc/CONFIG-NOTES.md 0 → 100644
42 43 ### Discussion 44 45 To make our config loading code reuseable, in everything from a slightly extended version of `arti`, to some entirely different embedding of our Tor, code, we need to separate out: 46 47 * Code that knows about configuration sources (how to read a file, default locations of files, how to access the command line) 48 49 * Code that knows about configuration *settings* 50 51 (Separate in the sense that most entrypoints should be one of these or the other, and they should be brought together only in convenience methods.) 52 53 ### Proposal 54 55 All of the `FooConfig` structs should be made directly `Deserialize` and `Serialize` (subject to serde cargo feature of course). (See above re abolition of `FooConfigBuilder` as a separate public type.) This means that much of the boilerplate field-copying (and conversions) in arti-config can be deleted. 56 57 The knowledge of the default config file location(s) and sources should be (exposed) in `arti-client` (not `arti`), with the implication that we hope that most embedders will use it. So, right now it's in
arti-config
, on the theory thatarti-config
is where you look if you want your application to support the same configuratoin format as arti, and not otherwise. It sounds like in this proposal you're lowering part of arti-config intoarti-client
, but I'm not sure exactly how much.Everything that knows about what's in
TorClientCfg
at least, I think, since we want all of that logic to be shared with everyone who usesarti-client
.I think what is left in
arti-config
is mostly the cmdline handling, and maybe some of the logging. If we want a principled distinction I think it's probably thatarti-config
is "stuff that is specific to the arti command line program, including its logging config, and parsing of config from the command line". So reuseable by an embedder other than arti, but we don't much mind if embedders do this stuff differently, so it can be off in another crate.But, I am not wedded to the precise crate structure at this level. What I do think we want is that if an embedder has configurable Tor client, the easy path for them ends up with their Tor client configurable using the same files that arti uses. Whether the rest of their program's config is their own section in that config, or in a different file, or whatever, is something I don't think we need to have an opinion aobut.
- doc/CONFIG-NOTES.md 0 → 100644
10 11 We should make client construction using default config file locations easy (at leasat as easy as construction with hardcoded defaults). That will make it easy for embedders to provide user-tweakable config. 12 13 We should add some warnings to the docs for configuration settings which shrink anonmity sets. (Or possibly blanket warnings.) This should aim to discourage not only users but also embedders from making poor choices. 14 15 ## Validation (startup API simplification) 16 17 ### Discussion 18 19 Few callers will want to validate the configuration separately from using it to make a Tor client (or running copy of the arti executable). The separation between `FooConfigBuilder` (partially-validated[1] configuration read from file(s) and/or set up with builder pattern methods) and `FooConfig` (validated configuration for Foo) doubles the number of config structs in our API. ([1] I say "partially" because the types of individual fields are right, although their values may be inappropriate.) 20 21 This distinction prevents (typestate-style) a programmer from writing code which directly constructs a client from an only-partially-validated config, but a program which does this is not buggy (if we make the validation happen at construction time) - and actually providing that API is easier for the caller. And because the `FooConfig` is already partially validated, there is 22 23 ### Proposal 24 25 We rename `FooConfigBuilder` to `FooConfig`. The current `FooConfig` type will become a private `foo::ValidatedConfig` type. (The two types will usually still need to be separate because defaulting means the builder ought usually to contain many `Option`, whereas the operational code doesn't want to see those.) - doc/CONFIG-NOTES.md 0 → 100644
16 17 ### Discussion 18 19 Few callers will want to validate the configuration separately from using it to make a Tor client (or running copy of the arti executable). The separation between `FooConfigBuilder` (partially-validated[1] configuration read from file(s) and/or set up with builder pattern methods) and `FooConfig` (validated configuration for Foo) doubles the number of config structs in our API. ([1] I say "partially" because the types of individual fields are right, although their values may be inappropriate.) 20 21 This distinction prevents (typestate-style) a programmer from writing code which directly constructs a client from an only-partially-validated config, but a program which does this is not buggy (if we make the validation happen at construction time) - and actually providing that API is easier for the caller. And because the `FooConfig` is already partially validated, there is 22 23 ### Proposal 24 25 We rename `FooConfigBuilder` to `FooConfig`. The current `FooConfig` type will become a private `foo::ValidatedConfig` type. (The two types will usually still need to be separate because defaulting means the builder ought usually to contain many `Option`, whereas the operational code doesn't want to see those.) 26 27 Config validation errors will be reported during client construction. 28 29 If desired we could provide a `validate()` method for callers that want to double check a config without trying to instantiate it. Or maybe the "create without bootstrapping" entrypoint will be good for this. 30 31 The `Deserialize` impl on configs will be applied to the builder struct (now `FooConfig`) rather than the "built" struct, via `#[builder(derive(Deserialize))]`. (Defaults will not need to be specified to serde, since the fields of the builder are `Option`.) - doc/CONFIG-NOTES.md 0 → 100644
24 25 We rename `FooConfigBuilder` to `FooConfig`. The current `FooConfig` type will become a private `foo::ValidatedConfig` type. (The two types will usually still need to be separate because defaulting means the builder ought usually to contain many `Option`, whereas the operational code doesn't want to see those.) 26 27 Config validation errors will be reported during client construction. 28 29 If desired we could provide a `validate()` method for callers that want to double check a config without trying to instantiate it. Or maybe the "create without bootstrapping" entrypoint will be good for this. 30 31 The `Deserialize` impl on configs will be applied to the builder struct (now `FooConfig`) rather than the "built" struct, via `#[builder(derive(Deserialize))]`. (Defaults will not need to be specified to serde, since the fields of the builder are `Option`.) 32 33 Since the "built" config is now private, the builder-generated `build` method becomes private. If we want to do complicated defaulting (eg, defaulting that depends on other options) or the like, we may need to hand-roll that (or parts of it). 34 35 ## Visibility of `FooConfig` structs 36 37 We have decided that `tor-*` crates have much more unstable APIs than `arti-*`. Configuration is exposed via the config file format, and therefore a semantically equivalent view should be avialable via the APIs. 38 39 So each `FooConfig` needs to be re-exported by `arti-client`, and semver-breaking changes to those are semver breaks for arti. I believe we do that now; I still think it's a good idea.
(This is apropros re-exporting.) I confess I didn't check. I'm not sure if all the crates are handled the same way, either.
(I assume when you say FooConfig you only mean the tor-* FooConfig structures, not the ones currently in arti-config.)
Yes.
- doc/CONFIG-NOTES.md 0 → 100644
80 { 81 #[derive(Deserialize)] 82 struct CombinedConfig { 83 #[flatten] tor: TorConfig, 84 #[flatten] rest: T, 85 } 86 ... load a CombinedCnfig from the usual Tor config files ... 87 ``` 88 89 On the principle of not having important knowledge only entangled in some more complicated machinery, there should probably be a ```fn usual_config_files() -> Vec<PathBuf>``` (or something). 90 91 ## Transparency of `TorClientCfg` 92 93 I suggest that `TorClientCfg` should be a non-exhaustive but otherwise transparent struct (i.e. with `pub` fields). 94 95 The "sections" in it (ie, the sub-struct names and types) are already public, since they are part of the config file format. Restructuring this would be a breaking change regardless of the Rust API. So the current scheme does not add any additional API abstraction - it merely demands boilerplate code. - doc/CONFIG-NOTES.md 0 → 100644
95 The "sections" in it (ie, the sub-struct names and types) are already public, since they are part of the config file format. Restructuring this would be a breaking change regardless of the Rust API. So the current scheme does not add any additional API abstraction - it merely demands boilerplate code. 96 97 ### `config` vs `figment` 98 99 Currently we use `config`. I looked at others briefly and rather more closely at `figment`. Certainly `figment is a lot richer. I think it might be able to produce better errors. I have to say though that when ended up using a library that used `figment` I found it a bit awkward and confusing. 100 101 For now I suggest retaining `config`. At least, the code that knows we're using it will be fairly self-contained. 102 103 ## Miscellaneous 104 105 * `TorClientCfg`. Suggest renaming to `TorClientConfig`. If we do things right, users won't normally need to type this name. 106 107 * Default config file location is currently `./arti-config.toml`. This should 108 be changed (to XDG dirs probably) 109 110 * `arti_defaults.toml`: this duplicates the default settings baked into the code. This is kind of inevitable if we want to supplyy an example, but there is a big risk of divergence. Either (a) there should be a test case that the defaults match the baked-in ones or (b) there should be no baked-in ones and instead things should be read from this file. Also, it is in danger of being taken as an example config file, which is not great if we ever want to change the defaults. Suggest we comment out every setting. (The test case or run-time defaulting will now need to use a mechanically-uncommented versions.) I think that test-case for "(a)", or something like it, exists as
arti_config::options::test::default_config
.(Another purpose of having
arti_defaults.toml
is to make sure that the defaults in the code match what we think they are, and that the encoding matches what we think it is.)I'm okay with making things commented out by default.
I think I'm good with this design, modulo questions above, though of course I'd also like to see @eta's feedback.
Three things to think about before proceeding:
- Is there a way to do any of this incrementally as a series of smaller MRs? Ideally, we'd do the riskiest parts first, so that we can be sure that all of our ideas work out.
- What example code should we provide? (I think doing some example code for this would be a valuable way to confirm that the API is ergonomic.)
- How does this interact with the
reconfigure()
design?
Is there a way to do any of this incrementally as a series of smaller MRs?
There are a number of different things going on here and yes I think they can be split up. The most risky parts are (i) conflating the builder structs with what the API thinks of the config and (ii) the high-level changes about how the files are read. I think (i) can be tried out with one crate first without causing anything other than an API break. (ii) can be done separately from the rest.
What example code should we provide? (I think doing some example code for this would be a valuable way to confirm that the API is ergonomic.)
I'm hoping that
arti
will become even more trivial than it is already :-). arti then becomes the prototypical embedder.How does this interact with the reconfigure() design?
I confess I didn't think about this properly but I don't think there are fundamental problems. I will go and do some more reading and update my proposal.
requested review from @eta
- doc/CONFIG-NOTES.md 0 → 100644
1 # Configuration of tor-* and arti (API) 2 3 *Discussion proposal* 4 5 ## Overall philosophy 6 7 There should be one coherent schema/data model for our configuration. It should be extensible enough that the arti executable can reuse the machinery used in arti-client, and that other embedders can also use it. 8 9 The schema should be reified in serde implementations, our config reader, and builder pattern methods etc. - i.e., all these methods should be equivalent. There shouldn't be different names or semantics depending how the config is fed into our code. 10 11 We should make client construction using default config file locations easy (at leasat as easy as construction with hardcoded defaults). That will make it easy for embedders to provide user-tweakable config. 11 We should make client construction using default config file locations easy (at leasat as easy as construction with hardcoded defaults). That will make it easy for embedders to provide user-tweakable config. 11 We should make client construction using default config file locations easy (at least as easy as construction with hardcoded defaults). That will make it easy for embedders to provide user-tweakable config.
- Resolved by Ian Jackson
- doc/CONFIG-NOTES.md 0 → 100644
6 7 There should be one coherent schema/data model for our configuration. It should be extensible enough that the arti executable can reuse the machinery used in arti-client, and that other embedders can also use it. 8 9 The schema should be reified in serde implementations, our config reader, and builder pattern methods etc. - i.e., all these methods should be equivalent. There shouldn't be different names or semantics depending how the config is fed into our code. 10 11 We should make client construction using default config file locations easy (at leasat as easy as construction with hardcoded defaults). That will make it easy for embedders to provide user-tweakable config. 12 13 We should add some warnings to the docs for configuration settings which shrink anonmity sets. (Or possibly blanket warnings.) This should aim to discourage not only users but also embedders from making poor choices. 14 15 ## Validation (startup API simplification) 16 17 ### Discussion 18 19 Few callers will want to validate the configuration separately from using it to make a Tor client (or running copy of the arti executable). The separation between `FooConfigBuilder` (partially-validated[1] configuration read from file(s) and/or set up with builder pattern methods) and `FooConfig` (validated configuration for Foo) doubles the number of config structs in our API. ([1] I say "partially" because the types of individual fields are right, although their values may be inappropriate.) 20 21 This distinction prevents (typestate-style) a programmer from writing code which directly constructs a client from an only-partially-validated config, but a program which does this is not buggy (if we make the validation happen at construction time) - and actually providing that API is easier for the caller. And because the `FooConfig` is already partially validated, there is changed this line in version 3 of the diff
- doc/CONFIG-NOTES.md 0 → 100644
4 5 ## Overall philosophy 6 7 There should be one coherent schema/data model for our configuration. It should be extensible enough that the arti executable can reuse the machinery used in arti-client, and that other embedders can also use it. 8 9 The schema should be reified in serde implementations, our config reader, and builder pattern methods etc. - i.e., all these methods should be equivalent. There shouldn't be different names or semantics depending how the config is fed into our code. 10 11 We should make client construction using default config file locations easy (at leasat as easy as construction with hardcoded defaults). That will make it easy for embedders to provide user-tweakable config. 12 13 We should add some warnings to the docs for configuration settings which shrink anonmity sets. (Or possibly blanket warnings.) This should aim to discourage not only users but also embedders from making poor choices. 14 15 ## Validation (startup API simplification) 16 17 ### Discussion 18 19 Few callers will want to validate the configuration separately from using it to make a Tor client (or running copy of the arti executable). The separation between `FooConfigBuilder` (partially-validated[1] configuration read from file(s) and/or set up with builder pattern methods) and `FooConfig` (validated configuration for Foo) doubles the number of config structs in our API. ([1] I say "partially" because the types of individual fields are right, although their values may be inappropriate.) So I don't think having builders is actually bad per se, although I've never understood the need to "validate" configuration when using them (this is more of a complaint against
derive_builder
). An ideal design would reject invalid values when calling the setter method, instead of when callingbuild()
at the end, i.e.:Foo::builder().field(...)?.build()
instead of
Foo::builder().field(...).build()?
In a lot of cases, though, I don't think there even are invalid values (and we might be able to constrain those that exist using the type system, so they're catchable at compile time) — which would let us get rid of the need to do this at all.
One issue is that there are sometimes settings that are individually allowable, but not mutually permissible. For example, there are some settings that we should only allow you to reconfigure if you are on a nonstandard Tor network (not with the default authorities).
(This isn't the case for most options, but it does come up.)
not mutually permissible
Right, so the thing with the builder methods necessarily has "invalid" states. I think that is OK. And given that, it is not necessary to completely validate the valid range (say) of every config parameter.
I guess I'm saying validating individual values early is rather more work and probably doesn't buy very much. I'm not against it but I'm not convinced that it is worth making every setter method on the builder fallible.
- doc/CONFIG-NOTES.md 0 → 100644
15 ## Validation (startup API simplification) 16 17 ### Discussion 18 19 Few callers will want to validate the configuration separately from using it to make a Tor client (or running copy of the arti executable). The separation between `FooConfigBuilder` (partially-validated[1] configuration read from file(s) and/or set up with builder pattern methods) and `FooConfig` (validated configuration for Foo) doubles the number of config structs in our API. ([1] I say "partially" because the types of individual fields are right, although their values may be inappropriate.) 20 21 This distinction prevents (typestate-style) a programmer from writing code which directly constructs a client from an only-partially-validated config, but a program which does this is not buggy (if we make the validation happen at construction time) - and actually providing that API is easier for the caller. And because the `FooConfig` is already partially validated, there is 22 23 ### Proposal 24 25 We rename `FooConfigBuilder` to `FooConfig`. The current `FooConfig` type will become a private `foo::ValidatedConfig` type. (The two types will usually still need to be separate because defaulting means the builder ought usually to contain many `Option`, whereas the operational code doesn't want to see those.) 26 27 Config validation errors will be reported during client construction. 28 29 If desired we could provide a `validate()` method for callers that want to double check a config without trying to instantiate it. Or maybe the "create without bootstrapping" entrypoint will be good for this. 30 As for
Deserialize
, I think theDeserialize
implementation should reject invalid values (possible viaserde::de::Error::custom
, probably by usingdeserialize_with
on fields that require checking).
- doc/CONFIG-NOTES.md 0 → 100644
54 55 All of the `FooConfig` structs should be made directly `Deserialize` and `Serialize` (subject to serde cargo feature of course). (See above re abolition of `FooConfigBuilder` as a separate public type.) This means that much of the boilerplate field-copying (and conversions) in arti-config can be deleted. 56 57 The knowledge of the default config file location(s) and sources should be (exposed) in `arti-client` (not `arti`), with the implication that we hope that most embedders will use it. 58 (`tor-config` can continue to be the actual implementation of env vars, default path lookup, etc.) 59 60 Individual `tor-*` crates will retain their knowledge of their own configuration. arti-config will retain the the knowledge of executable-sepcific config settings (notably logging), and can be reused by shallow embedders if they like. 61 62 ### API suggestions 63 64 The most simple ones that just read a Tor config should look like this: 65 66 ``` 67 impl TorClientCfg { 68 #[throws] pub fn from_files() -> Self; 69 #[throws] pub fn builtin_defaults() -> Self; - doc/CONFIG-NOTES.md 0 → 100644
57 The knowledge of the default config file location(s) and sources should be (exposed) in `arti-client` (not `arti`), with the implication that we hope that most embedders will use it. 58 (`tor-config` can continue to be the actual implementation of env vars, default path lookup, etc.) 59 60 Individual `tor-*` crates will retain their knowledge of their own configuration. arti-config will retain the the knowledge of executable-sepcific config settings (notably logging), and can be reused by shallow embedders if they like. 61 62 ### API suggestions 63 64 The most simple ones that just read a Tor config should look like this: 65 66 ``` 67 impl TorClientCfg { 68 #[throws] pub fn from_files() -> Self; 69 #[throws] pub fn builtin_defaults() -> Self; 70 } 71 ``` 72 (borrowing notation from `fehler` for clarity; actual code will be `-> Result`) - doc/CONFIG-NOTES.md 0 → 100644
63 64 The most simple ones that just read a Tor config should look like this: 65 66 ``` 67 impl TorClientCfg { 68 #[throws] pub fn from_files() -> Self; 69 #[throws] pub fn builtin_defaults() -> Self; 70 } 71 ``` 72 (borrowing notation from `fehler` for clarity; actual code will be `-> Result`) 73 74 For embedders, `arti-client` should provide a method a bit like this 75 (name TBD) 76 ``` 77 #]throws] 78 pub fn tor_and_caller_config_from_usual_files<T>() -> (TorClientCfg, T) I'm really unsure about this whole "the caller will want to merge their config file format with ours and load them from the same file" idea: it strikes me that almost no serious embedder would want to use such an API.
What might be more useful is an API that let you make a
TorClientCfg
from aserde_json::Value
or similar.What might be more useful is an API that let you make a TorClientCfg from a serde_json::Value or similar.
In my proposal
TorClientCfg
isDeserialize
so you can indeed convert it (fallibly) fromserde_json::Value
.I'm really unsure about this whole "the caller will want to merge their config file format with ours and load them from the same file" idea: it strikes me that almost no serious embedder would want to use such an API.
The arti executable itself wants it. Anyone who wants to do something "vaguely like arti" ought to be provided with the same facility. This is just a consequence of the general API principle that functionalities which people might want to use separately shouldn't be provided only via an API which welds them together.
Eg, maybe you want a version of arti which is also a web reverse proxy for onion services, so it will be "just like arti but with this extra config stuff for the web routing"
- Resolved by Ian Jackson
I think the intent behind this design doc is generally very good, although I think my ideal vision for what config should look like in Arti differs a bit.
Here are some general summary points I have of the proposal, as is (in addition to the more specific inline review comments):
- I agree with @Diziet that it's confusing having separate unvalidated and validated config formats.
- However, I think the way to solve this is to do away with the entire concept of an unvalidated / incorrect configuration, at the type level.
- For example, the
Deserialize
implementation for a configuration type should fail if presented with values of the correct type that are otherwise incorrect. - This could be implemented in practice by having a bunch of newtype wrappers, or by using
#[serde(deserialize_with)]
.
- I'm not so sure how great the
derive_builder
crate is. Currently, I'm leaning towards the idea that replacing it with a bunch of manual boilerplate might be a good thing in the long run.- One specific problem: the fact that
build()
is fallible, instead of the individual setter methods being fallible. (See the relevant review thread about this.) - More generally, though, having that crate codegen a bunch of builder stuff for us that's hard for us to modify / tweak to our specific needs seems like a pretty large downside, especially given we care about API stability / design.
- Autogenerating documentation also doesn't lead to the best results, either.
- One specific problem: the fact that
As for how I'd propose improving things myself: I think it could be a nice idea to get rid of the builder pattern and just move to regular old setter methods — instead of starting out with a blank slate and constructing things from there, users would start with a default config and call
set_xyz
to change things.In addition, it might be nice to have a "merge configurations" function (as in, merge A with B, preferring A's settings if they both differ from the defaults, otherwise use the one that isn't the default), which might even replace a lot of what the
config
crate does for us.- I agree with @Diziet that it's confusing having separate unvalidated and validated config formats.
- However, I think the way to solve this is to do away with the entire concept of an unvalidated / incorrect configuration, at the type level.
This might be possible, though I'm not sure.
(It won't do away with the need for validation at reconfiguration time, though, since not all transitions are going to be supported.)
- I'm not so sure how great the
derive_builder
crate is. Currently, I'm leaning towards the idea that replacing it with a bunch of manual boilerplate might be a good thing in the long run.
I'd lean towards this as a "long run" idea rather than a "right now" idea. One of the challenges with manual boilerplate is that it's easier to mess it up: setting the wrong option, or having the wrong return type, for instance.
Another possibility is to improve
derive_builder
to meet our needs, by adding support for manual documentation overrides and infalliblebuild()
methods. (I think it might already support fallible setters, but I could be wrong there.)In addition, it might be nice to have a "merge configurations" function (as in, merge A with B, preferring A's settings if they both differ from the defaults, otherwise use the one that isn't the default), which might even replace a lot of what the
config
crate does for us.It's actually important that we distinguish "has the default value" from "is explicitly set to the default value" here: If we want to apply configuration A on top of configuration B, then it matters whether A has left option X unset, or whether A has explicitly set it to the default value.
do away with the entire concept of an unvalidated / incorrect configuration, at the type level
I think this is the most fundamental point here. I agree that making invalid states unrepresentable is generally a good idea.
But, here, I don't think it is practical or adviseable.
Firstly, it's not practical because of the point @nickm makes about combinations of settings possibly being invalid. If invalid combinations are unrepresentable then either setter methods would have to be called in a particular order (and, depending, there might not even be a possible order), or we would have to provide multi-setter methods, possibly in multiple combinations.
Secondly, it is actually useful to be able to represent a possibly-invalid configuration. That allows a program to read a configuration from disk and fix it up programmatically, for example. I think it will be easier to produce good error messages if we can have an in-memory representation of what everything is and where it came from - and if we do more of the validation in our own bespoke code, rather than in a
Deserialize
callback.Stepping back: the principle of making invalid states unrepresentable is great when the invalidity might come from a bug (including a failure to check something). It allows the compiler to spot these bugs and render them statically impossible. But the scheme I propose already has separate types for unvalidated and validated config, so this is already the case within our own code. But at the API level, config data is not a complex data structure that is constructed programatically: it's very data-like, and often comes in from the outside. There is nothing wrong with having an ability to represent unvalidated data if you can be sure not to use it by mistake.
improve derive_builder
It's true that my proposal doesn't automatically give us good documentation for our config. The result will be tolerable, but it will leave people inferring the necessary TOML syntax by extrapolating from setter method names and types. Really we want a bespoke doc which describes the TOML format, but which is generated from the doc comments we write in-code (or some other single source).
derive_builder
can't do that right now. We might need to find or write a tool for this. But I don't think this is on a critical path for 0.1.0. UnlikeDeserialize
, the use ofderive_builder
doesn't become part of our API.As for the
build
method being fallible or not: my suggestion is to hide this method entirely. Users would pass in a hopefully-validTorClientCfg
and it would be converted internally to a known-valid representation, using whatever combination of autogenerated and handwritten code is most convenient. Right now that would be a combination of the autogeneratedbuild
plus hand-rolled validation.mentioned in commit 4aa442de