At the moment, the descriptor is posted MIN_REND_INITIAL_POST_DELAY (30) seconds after onion service initialization.
For real-time, one-time services (like OnionShare), one has to wait 30 seconds until the onion service can be reached. Besides, if a client tries to reach the service before its descriptor has ever been published, the tor client gets stuck, which prevents the user from reaching the service even after the descriptor is published. The client logs:
Could not pick one of the responsible hidden service directories to fetch descriptors, because we already tried them all unsuccessfully.
I propose lowering MIN_REND_INITIAL_POST_DELAY to 3-5 seconds for ephemeral services. That seems to be enough for one-shot services to stabilize.
Not sure if it's really bad to do so - tell me if it is. If it's not a good idea to use such a short delay for all ephemeral services, we can pass the delay as a parameter to the ADD_ONION command so that applications which need a low delay can tune it.
Please see a patch below for making this delay as short as 3 seconds for ephemeral services.
Have you tested that the actual delay here is about 30 seconds? I remember people saying that the whole rend_consider_services_upload() function is borked. I think that would be nice to verify.
Now, if we believe that this delay actually offers security and we reduce it for ADD_ONION services, why not reduce it for all services? We don't really know the threat model of all the people who use ADD_ONION, so I'm not sure if we should take such a global decision.
Personally, I feel this delay can indeed increase security in some use cases, but I also don't like the reachability effect that you mentioned.
I think that your suggestion of making this a parameter of ADD_ONION might be a good approach. Although this assumes that all the people who use ADD_ONION actually understand the security threats here, which is quite doubtful...
> Have you tested that the actual delay here is about 30 seconds? I remember people saying that the whole rend_consider_services_upload() function is borked. I think that would be nice to verify.
I did, and it takes exactly 30 seconds. Yes, it is kind of unclear from the code that this delay will actually be 30 seconds.
> Now, if we believe that this delay actually offers security and we reduce it for ADD_ONION services, why not reduce it for all services? We don't really know the threat model of all the people who use ADD_ONION, so I'm not sure if we should take such a global decision.
But we don't know the actual security benefit of having it 30sec either.
> I think that your suggestion of making this a parameter of ADD_ONION might be a good approach. Although this assumes that all the people who use ADD_ONION actually understand the security threats here, which is quite doubtful...
Yes, but it requires more code. :) I proposed this approach to leave all the crazy logic only for those who need it.
I think one should understand what delay should be set and why (maybe someone does and I don't). But if that is still unclear, it's better to go with an ADD_ONION flag.
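To make the ADD_ONION option discussed above concrete, here is a minimal sketch of how the delay could be chosen per service. This is not the patch attached to this ticket and not tor's code: `initial_post_delay()`, `EPHEMERAL_INITIAL_POST_DELAY` and `POST_DELAY_UNSET` are hypothetical names invented for illustration. Only the value 30 for MIN_REND_INITIAL_POST_DELAY and the 3-second figure come from this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Value quoted in this ticket for the current initial delay (seconds). */
#define MIN_REND_INITIAL_POST_DELAY 30
/* Hypothetical: shorter default proposed here for ephemeral services. */
#define EPHEMERAL_INITIAL_POST_DELAY 3
/* Hypothetical sentinel: the application passed no delay to ADD_ONION. */
#define POST_DELAY_UNSET (-1)

/* Hypothetical helper, not a tor function: pick the initial descriptor
 * post delay, in seconds, for one service. */
static int
initial_post_delay(bool is_ephemeral, int requested_delay)
{
  assert(requested_delay >= POST_DELAY_UNSET);
  if (!is_ephemeral)
    return MIN_REND_INITIAL_POST_DELAY;  /* regular services unchanged */
  if (requested_delay != POST_DELAY_UNSET)
    return requested_delay;              /* application tuned the delay */
  return EPHEMERAL_INITIAL_POST_DELAY;   /* short default for one-shot use */
}
```

With this shape, applications that pass nothing get the short default, and only those that understand the trade-off need to touch the parameter.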
That 30-second delay pre-dates my knowledge of hidden services, so I have no clue why it was chosen. I doubt very much it was about mitigating any kind of "startup time correlation" attack: a random delay was added to it, but then we realized it was borked, so in the end it has always been 30 seconds...
So indeed... why keep a delay at all for services? Descriptor publication to a directory only happens once all intro point circuits are ready, and those are established at startup. If you think about it, it's actually a very specific pattern for detecting a service startup. As a guard you see 5 circuits being established at once (yes, we need 3 IPs but we launch 2 extras for better luck), then 2 of them die quickly, and you have a 6th circuit almost 30 seconds after the initial launch of those... I'm not saying that removing the delay will make that pattern go away, but I really don't see how the delay hides anything. Thus, I'm all for removing it unless armadev had a reason to add that delay :) 10 years ago :).
Also let's keep in mind that this would be very useful to answer for prop224.
> So indeed... why keep a delay at all for services? Descriptor publication to a directory only happens once all intro point circuits are ready, and those are established at startup. If you think about it, it's actually a very specific pattern for detecting a service startup. As a guard you see 5 circuits being established at once (yes, we need 3 IPs but we launch 2 extras for better luck), then 2 of them die quickly, and you have a 6th circuit almost 30 seconds after the initial launch of those... I'm not saying that removing the delay will make that pattern go away, but I really don't see how the delay hides anything. Thus, I'm all for removing it unless armadev had a reason to add that delay :) 10 years ago :).
Being more specific, this lower bound of 30s was introduced by commit b3f846b313b3cf3191e3a9a54ec1c97227393d3d which reads:
In very rare situations new hidden service descriptors were published earlier than 30 seconds after the last change to the service, with the 30 seconds being the current voodoo saying that a descriptor is stable.
So I don't see any reason to trust the voodoo and thus have this delay. :)
Also, this delay makes it easier for a passive adversary (e.g. an ISP) to tell whether a client has just set up an onion service.
> Being more specific, this lower bound of 30s was introduced by commit b3f846b313b3cf3191e3a9a54ec1c97227393d3d which reads:
Sorry for a typo, it's 33f846b3.
Actually, I was wrong: it was introduced even earlier.
It jumped from 5s to 30s because of "load on authorities".
11d89141:
{{{
o Minor bugfixes (hidden services):
Upload hidden service descriptors slightly less often, to reduce
load on authorities.
}}}
"Load on authorities" is no longer relevant because we haven't used V0 since 0.2.2.1-alpha. Thus I think it's safe to drop it back to 5s (or 3s?) for all services. Or even remove it altogether?
I think ideally any such change should be accompanied by a prop224 patch and a mailing list discussion. It would be great if someone added a section to prop224 specifying this behavior, and made a [tor-dev] thread introducing the patch.
> I think ideally any such change should be accompanied by a prop224 patch and a mailing list discussion. It would be great if someone added a section to prop224 specifying this behavior, and made a [tor-dev] thread introducing the patch.
Agreed, removing this delay is too "radical" and should be moved to prop224.
Anyway, I think that it's safe to restore it to the 5s level and enjoy plain old services without the useless 30s delay now.
This would also be appreciated by Single Onion Service operators (legacy/trac#17178 (moved)): I've had complaints from those using the test code that descriptor upload takes a while.
However, the threat here is that hidden services that have unstable introduction points now upload their descriptors 6x more often.
Why don't we make the initial upload 5s, and every upload after that 30s?
Or even better, some kind of exponential backoff to a few minutes - if you've changed your intro points ten times, we really don't want your eleventh descriptor any time soon.
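A rough sketch of what such a schedule could look like, assuming the delay doubles after the second upload and is capped at five minutes. The function name, the doubling factor and the cap are all assumptions made for illustration; only the 5s and 30s values come from the suggestion above.

```c
#include <assert.h>

#define INITIAL_POST_DELAY 5      /* first upload, as suggested above */
#define REPEAT_POST_DELAY 30      /* second upload, as suggested above */
#define MAX_POST_DELAY (5 * 60)   /* assumed cap for "a few minutes" */

/* Hypothetical helper, not a tor function: delay in seconds before the
 * next descriptor upload, given how many uploads were already made. */
static int
post_delay_for_upload(unsigned n_prior_uploads)
{
  if (n_prior_uploads == 0)
    return INITIAL_POST_DELAY;
  int delay = REPEAT_POST_DELAY;
  /* Double once per additional prior upload, stopping at the cap. */
  for (unsigned i = 1; i < n_prior_uploads; i++) {
    if (delay >= MAX_POST_DELAY / 2)
      return MAX_POST_DELAY;
    delay *= 2;
  }
  return delay;
}
```

Under these assumptions the sequence of delays is 5, 30, 60, 120, 240, and then 300 seconds for every later upload, so a service that keeps changing its intro points is slowed down while the first publication stays fast.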
> However, the threat here is that hidden services that have unstable introduction points now upload their descriptors 6x more often.
> Why don't we make the initial upload 5s, and every upload after that 30s?
> Or even better, some kind of exponential backoff to a few minutes - if you've changed your intro points ten times, we really don't want your eleventh descriptor any time soon.
Yes, the rend_consider_services_upload() function is borked and it's hard to tell what's going on. It's not 6x more often. 30 seconds is the initial delay (after the descriptor becomes dirty). The actual upload period seems (sic!) to be [30s, 30s + rand(2*1h)]. The lower bound is the fixed part, and it is fixed high for no actual benefit or security reason, IMO (see comment:8).
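To illustrate one reading of that interval (this is not the actual logic of rend_consider_services_upload(), which is hard to follow), here is a small sketch where the lower bound is a parameter and the random part is bounded by two hours. `next_upload_delay()` and `MAX_EXTRA_UPLOAD_DELAY` are hypothetical names, and the caller is assumed to supply the random number.

```c
#include <assert.h>
#include <stdint.h>

/* "rand(2*1h)" from the comment above, read as up to two hours (seconds). */
#define MAX_EXTRA_UPLOAD_DELAY (2 * 60 * 60)

/* Hypothetical helper, not a tor function: seconds to wait before the
 * next upload. `min_delay` is the fixed lower bound (30 today);
 * `rand_value` is any random number supplied by the caller. */
static int64_t
next_upload_delay(int min_delay, uint32_t rand_value)
{
  assert(min_delay >= 0);
  /* The random part only ever adds to the fixed lower bound. */
  return (int64_t)min_delay + (int64_t)(rand_value % MAX_EXTRA_UPLOAD_DELAY);
}
```

In this reading, lowering the constant from 30 to 3 only moves the earliest possible upload; the random two-hour spread is untouched.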
> However, the threat here is that hidden services that have unstable introduction points now upload their descriptors 6x more often.
> Why don't we make the initial upload 5s, and every upload after that 30s?
> Or even better, some kind of exponential backoff to a few minutes - if you've changed your intro points ten times, we really don't want your eleventh descriptor any time soon.
There is a retry timeout for IP circuits if too many fail (see INTRO_CIRC_RETRY_PERIOD). We rely on that as an upper limit on descriptor uploads. If an IP keeps failing within a short period (5 minutes), then the IP circuit building retry timeout mechanism kicks in and thus you won't see a zillion descriptor publications. Maybe that's not perfect, but that's imo something different from the 30-second delay added at startup time.
Now, if your IPs keep failing after the 5-minute retry window (e.g. the circuit is closed because of a bad network), well, you indeed need to build a new descriptor with a new IP and publish it, but that's OK imo. Adding a delay to publication won't help here because we already have that 5-minute "wait period" in the first place to avoid too many tries.
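For readers unfamiliar with that mechanism, the idea can be sketched roughly as a per-period launch counter. This is a simplified illustration, not tor's implementation: the struct, the function name and the limit of 10 launches per period are assumptions. Only INTRO_CIRC_RETRY_PERIOD and its 5-minute value come from the comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define INTRO_CIRC_RETRY_PERIOD (60 * 5)  /* 5 minutes, per the comment above */
#define MAX_LAUNCHES_PER_PERIOD 10        /* assumed limit, for illustration */

/* Hypothetical per-service bookkeeping. */
typedef struct {
  time_t period_started;        /* when the current retry period began */
  unsigned launched_in_period;  /* intro circuits launched since then */
} intro_retry_state_t;

/* Hypothetical helper: returns true and counts the launch if another intro
 * circuit may be launched now, false if the period's budget is spent. */
static bool
may_launch_intro_circuit(intro_retry_state_t *st, time_t now)
{
  assert(st);
  if (now > st->period_started + INTRO_CIRC_RETRY_PERIOD) {
    /* The period has elapsed: start a new one with a fresh budget. */
    st->period_started = now;
    st->launched_in_period = 0;
  }
  if (st->launched_in_period >= MAX_LAUNCHES_PER_PERIOD)
    return false;
  st->launched_in_period++;
  return true;
}
```

Because a new descriptor is only needed when intro points change, a cap like this on circuit launches also bounds how often descriptors get republished, independently of the startup delay.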
I have the feeling that we might be at the point of taking this discussion to tor-dev@, because some of us want to keep an initial delay, some want to get rid of it, and some want to change it only in prop224.
My impression of the tor-dev discussion is that we wanted to lower the initial post delay for all hidden services, not just ephemeral hidden services.
Please revise the patch to make this happen, or let me know that's not what we agreed on tor-dev.
Sure, done. Just not sure that we have really agreed upon it.
By the way, when I was testing this patch I found out that 3s is enough for an onion service to stabilize and is much better for UX compared to 5s.