It seems that some HSes are getting DoSed by an attacker who establishes a few circuits and then sends thousands of RELAY_BEGIN cells through them. It's basically a variant of legacy/trac#15515 (moved), but fortunately this can be fixed by patching the HS.
Here are some fix ideas from Yawning:
a) Some variation on "implement a hard cap on the number of simultaneous streams that can be associated with a given circuit before we start rejecting RELAY_BEGINs". This will break apps if the hard cap is too low, since web browsers want to open tons of TCP/IP connections (limiting it to something like... 16/32/64 with it being tunable may be ok, consult Mike?). A sketch follows this list.
b) Apply throttling to RELAY_BEGINs over a given circuit. Something like "allow up to N streams to complete immediately, then each subsequent stream will be delayed for x seconds, doubling for each additional RELAY_BEGIN, resetting after y seconds". Annoying since queuing is required (OOM hazard here?).
c) "If you want to be more resilient to this, use an AF_UNIX backed HS". This should shift the part that crumples to the app code, at which point it's kind of not our problem (the HS code might fall down for other reasons in this case, so I don't see this as being a real solution...)
Personally I'm in favor of "a" since it's the easiest to implement, and if it's tunable, people can choose the right number for their HS. Something that's more automagic like "b" would be nicer, but the implementation gets more complicated.
As far as I can tell the impact of this is only a DoS, either through totally bogging the application/tor down, or by causing exhaustion of file descriptors/unused ports on the loopback interface (which is why I suggest "c", though realistically I suspect that the app or tor will die before most systems manage to exhaust the available port space).
Trac: Cc changed from N/A to yawning; Description edited (typo fixed in item c, and a parenthetical added noting that the HS code might fall down for other reasons, so AF_UNIX isn't a real solution).
Not a very sophisticated attack; this is the kind of port/fd exhaustion thing that ought to work against any webserver. (RELAY_BEGIN is to Tor approximately as SYN is to TCP: welcome to the 1990s.)
The only HS-specific wrinkles here are:
* We can't do IP-based filtering.
* We can do circuit-aware mitigations.
I like 'a' as an immediate countermeasure, especially with a configurable limit. I like 'c' as a recommendation. Let me also suggest:
d) do something similar to our OOM prevention code, where when we run low on sockets, we kill certain circuits and their connections based on some reasonable metric (a sketch follows this list).
e) think of some way to signal to the application which requests are arriving on the same circuit as other requests.
f) just to be sure, we should try this on a test network with profiling enabled, to make sure there isn't something superlinear going on here in our own code.
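To make (d) concrete, here is a hedged sketch of victim selection, again with invented types. "Most attached streams" is only one candidate metric, and as noted further down in this thread, the right metric is protocol-dependent:

```c
#include <stddef.h>

/* Illustrative stand-in for a real circuit structure. */
typedef struct circuit {
  unsigned n_open_streams;
} circuit_t;

/* When fd pressure is detected, pick a victim circuit. The caller
 * then closes that circuit and all streams attached to it. */
static circuit_t *
pick_victim_circuit(circuit_t **circs, size_t n)
{
  circuit_t *victim = NULL;
  for (size_t i = 0; i < n; i++) {
    if (!victim || circs[i]->n_open_streams > victim->n_open_streams)
      victim = circs[i];
  }
  return victim;
}

int
main(void)
{
  circuit_t a = { 3 }, b = { 4096 }, c = { 17 };
  circuit_t *all[] = { &a, &b, &c };
  circuit_t *v = pick_victim_circuit(all, 3);
  return v == &b ? 0 : 1; /* the flooded circuit gets picked */
}
```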
Trac: Keywords: "tor-hs dos" deleted; "tor-hs dos SponsorR SponsorU" added. Summary changed from "Hidden service DoS by hammering RELAY_BEGIN" to "Hidden service socket exhaustion by opening many connections".
In general, DoSes and DDoSes can be mitigated by making sure the server has more capacity than the client. From this follow my two suggestions:
Donncha's summer of privacy project (to allow load balancing HSes) would allow server owners to just add more servers to the HS.
Introducing a Hashcash (proof-of-work) based system into HS connections should significantly decrease client capacity. This would ensure that the attacker needs significantly more hardware than the HS operator. Of course, it's important that the hashcash settings are tunable. (This is a very long-term solution, as it would require all users of the HS to upgrade to whatever version first introduces it.)
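To make the hashcash asymmetry concrete, here is a toy sketch (not an actual Tor design) using OpenSSL's SHA256; the framing of challenge and nonce is invented for illustration:

```c
#include <openssl/sha.h> /* link with -lcrypto */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Count leading zero bits of a SHA-256 digest. */
static int
leading_zero_bits(const unsigned char *d)
{
  int bits = 0;
  for (int i = 0; i < SHA256_DIGEST_LENGTH; i++) {
    for (int b = 7; b >= 0; b--) {
      if (d[i] & (1u << b))
        return bits;
      bits++;
    }
  }
  return bits;
}

/* Client side: grind a nonce until SHA256(challenge || nonce) has at
 * least `difficulty` leading zero bits. Assumes len fits in buf. */
static uint64_t
solve_puzzle(const unsigned char *challenge, size_t len, int difficulty)
{
  unsigned char buf[64], digest[SHA256_DIGEST_LENGTH];
  uint64_t nonce;
  for (nonce = 0;; nonce++) {
    memcpy(buf, challenge, len);
    memcpy(buf + len, &nonce, sizeof(nonce));
    SHA256(buf, len + sizeof(nonce), digest);
    if (leading_zero_bits(digest) >= difficulty)
      return nonce;
  }
}

int
main(void)
{
  const unsigned char challenge[16] = "per-conn-nonce!";
  uint64_t nonce = solve_puzzle(challenge, sizeof(challenge), 16);
  printf("solved with nonce %llu\n", (unsigned long long)nonce);
  return 0;
}
```

The asymmetry is the point: the client burns roughly 2^difficulty hashes per connection attempt, the service verifies in a single hash, and difficulty is the tunable.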
Re Nick's 'e', see also legacy/trac#4700 (moved). I think we definitely do want to build this somehow. We could imagine doing it over the control port; we could imagine designing an exthsport or some godawful thing similar to the extorport; or we could imagine doing some sort of in-band signaling where the first k bytes of the connection are the identifier, and then we give people a little shim that does access control before passing on connections to the 'real' service. A fourth, better, option would be great too.
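For flavor, a toy sketch of that in-band-signaling shim; everything here, including the 20-byte identifier length and the policy hook, is invented for illustration:

```c
#include <sys/types.h>
#include <unistd.h>

#define ID_LEN 20 /* hypothetical: first k = 20 bytes name the circuit */

/* Placeholder policy hook: a real shim would keep per-identifier state
 * (rate limits, denylists, token checks) behind this. */
static int
identifier_allowed(const unsigned char *id)
{
  (void)id;
  return 1;
}

/* Read the k-byte identifier that tor (in this imagined design) would
 * prepend to each connection, apply access control, then relay the
 * rest of the stream to the real service. */
void
shim_handle(int conn, int backend)
{
  unsigned char id[ID_LEN];
  ssize_t got = 0;
  while (got < ID_LEN) {
    ssize_t n = read(conn, id + got, (size_t)(ID_LEN - got));
    if (n <= 0) {
      close(conn);
      return;
    }
    got += n;
  }
  if (!identifier_allowed(id)) {
    close(conn); /* deny before the application ever sees the request */
    return;
  }
  /* ...from here, copy bytes between conn and backend as usual... */
  (void)backend;
}
```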
More broadly, this particular connection flooding attack is just one example out of many in the DoS field. Another variation is simply sending a whole lot of DROP cells, using up the bandwidth of the HS or of its entry guard.
We should keep two principles in mind while we're pondering solutions here:
* We should think of ways to introduce or exploit asymmetry between the attacker and the defender. For example, if our response to the attacker doing k things is to make sure the defender can handle n > k things, the defenders are going to be constantly sad.
* One defense that we haven't brought to bear as much as we might is the initial rendezvous decision: when the service gets the intro2 cell, it can decide whether to perform the rendezvous. For a service that's under too much load, that's the right point for opting out of further load, because it controls all levels of interaction. (The corollary is that bugs like legacy/trac#15515 (moved) are really bad news, because they threaten the service's control at this stage of the interaction.)
Of course, ultimately, if all of the users are anonymous, and you want to serve all of them, and some of them are noisy, you will be unable to serve all of them. We can offer heuristics to the service operators so they can specify their preference on which ones to not serve. I'm having trouble seeing a future where we have an on-by-default one-size-fits-many heuristic though.
For this particular situation, if I'm reading the logs right, the attacker sends 100 or so begin requests, and then pauses until he receives a circuit-level sendme cell giving him permission to send the next batch. So if our hack is of the form "hang up if there are 2000 concurrent connection requests", we'll never get to that state because of our flow control. That said, offering an off-by-default knob for "if you get totally flooded, close the circuit", that the people who are experiencing this attack in the wild can try, is a fine idea for a short-term hack. It's not yet clear to me what exactly should be the trigger for this knob though.
"d" is problematic, because we'd have to pick an appropriate metric. A HTTP server might want to discard the oldest connections, but an IRC server would want to stop taking new ones (or it becomes possible to kick users off).
In general, I'm skeptical of proof of work systems. I don't think they make much sense when your legitimate users are on mobile phones and your illegitimate users have botnets. Making the attack more expensive for clients is a compelling idea if it could be tuned to be useful. That is probably something we'd want to do at the circuit (not stream) level.
I'll also propose:
g) Add a "rendezvous approver" control API, which gives an opted-in controller the chance to approve or deny all rendezvous circuit and stream requests before they're acted upon. This would allow us to make more complex and useful mitigations as third party software.
g) Add a "rendezvous approver" control API, which gives an opted-in controller the chance to approve or deny all rendezvous circuit and stream requests before they're acted upon. This would allow us to make more complex and useful mitigations as third party software.
Can you flesh out the decision process that such a mitigator might use? If you have anything non-trivial, a new ticket is probably best here.
I agree that trying to hard-code a complicated rendezvous-level mitigator inside Tor is too ambitious for now, since it's not clear what we would want it to do. So I find the idea of delegating it to the controller appealing.
Speaking of the mitigator, the original HS design had the services giving out tokens to preferred users, who then use the token to get access during times of high load. There's lots of work to do for making sure the tokens are blinded, figuring out the usability aspects for normal Tor Browser users, etc, but I just wanted to remind us of the original vision. Which I guess underscores how little of the original vision we ever got around to building.
g) Add a "rendezvous approver" control API, which gives an opted-in controller the chance to approve or deny all rendezvous circuit and stream requests before they're acted upon. This would allow us to make more complex and useful mitigations as third party software.
Can you flesh out the decision process that such a mitigator might use? If you have anything non-trivial, a new ticket is probably best here.
I created legacy/trac#16059 (moved), including a list of potential use cases. Most of them are not very helpful against DoS on a HTTP-based service, but others may have ideas.
For this particular situation, if I'm reading the logs right, the attacker sends 100 or so begin requests, and then pauses until he receives a circuit-level sendme cell giving him permission to send the next batch. So if our hack is of the form "hang up if there are 2000 concurrent connection requests", we'll never get to that state because of our flow control. That said, offering an off-by-default knob for "if you get totally flooded, close the circuit", that the people who are experiencing this attack in the wild can try, is a fine idea for a short-term hack. It's not yet clear to me what exactly should be the trigger for this knob though.
I asked Mike about this and got: "If tbb is making more than 6 concurrent hs streams, that's a tbb bug.", with the caveat that "Some of these hs chat protocols may be dumber.. I think the early torchat made one http post per message".
So I still think providing a tunable (default-off) max-concurrent-connections knob makes sense here, since it forces the attacker into doing more work.
I know this may be a slightly uninformed answer, but isn't it strange for any client to be sending that many (~100) begin requests in one batch? Wouldn't this in itself identify a potential attacker?
What if some mitigating code raised an exception when the begin request count is >100, and then withheld the SENDME cell?
Is there a middle ground where the server can identify a potentially risky number of begin requests and refuse on that basis, withholding the circuit-level SENDME and leaving the connection to time out? That would let regular user traffic through at normal request rates, while risky levels of requests would have to pass some form of authentication step acting as a buffer for offending connections, be that a captcha or just a dead-link redirect.
That way flow control would still allow regular traffic, but the mitigation haphazardly sketched above would narrow the band of successive or concurrent requests from one client within a time period.
Would there be any case in which ~100 begin requests would be sent at once to a regular HS by the average client?
Disallowing bulk requests when they are irregular should not present many issues, if I'm not mistaken (I probably am; I'm quite a new researcher in the field).
I'm clearly by no means on the same level of understanding as the previous posters, but sometimes the naive guy says something that makes you rethink, and maybe I'll do that here.
Refactor the code that touches origin_circuit_t.p_streams to use TOR_LIST_* so it's easier to audit the stream counter code that got added (per discussion with nickm); a sketch of the list-based layout follows after this comment.
Test it.
Add a changes file.
Update the man page.
The code does the simplest thing that will not break in conjunction with browsers using multiple connections to minimize load latency (the "6" figure per mike), taking into account that circuit lifespan can be quite long now with the legacy/trac#15482 (moved) sketch patch being shipped in Tor Browser.
More complicated behavior is indeed possible, but people who want such things are free to figure out what exactly Firefox does with respect to keeping connections around, opening new ones, etc., so they can implement the different behavior themselves.
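For anyone following along, the refactor item boils down to something like the following. This sketch uses the standard BSD <sys/queue.h> LIST_* macros, which the TOR_LIST_* wrappers mirror, and invented stand-in types rather than tor's real ones:

```c
#include <stdio.h>
#include <sys/queue.h> /* TOR_LIST_* mirrors these BSD LIST_* macros */

/* Illustrative stand-ins for edge_connection_t / origin_circuit_t. */
typedef struct edge_conn {
  LIST_ENTRY(edge_conn) next_stream; /* intrusive list linkage */
} edge_conn_t;

typedef struct origin_circuit {
  LIST_HEAD(stream_list, edge_conn) p_streams;
  unsigned n_open_streams; /* the counter under audit */
} origin_circuit_t;

/* With list macros, attach/detach sites are easy to grep, so every
 * counter update can be checked to pair with a list operation. */
static void
attach_stream(origin_circuit_t *circ, edge_conn_t *conn)
{
  LIST_INSERT_HEAD(&circ->p_streams, conn, next_stream);
  ++circ->n_open_streams;
}

static void
detach_stream(origin_circuit_t *circ, edge_conn_t *conn)
{
  LIST_REMOVE(conn, next_stream);
  --circ->n_open_streams;
}

int
main(void)
{
  origin_circuit_t circ = { LIST_HEAD_INITIALIZER(circ.p_streams), 0 };
  edge_conn_t a, b;
  attach_stream(&circ, &a);
  attach_stream(&circ, &b);
  detach_stream(&circ, &a);
  printf("open streams: %u\n", circ.n_open_streams); /* prints 1 */
  return 0;
}
```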
Tried it, I get this error: Failed to parse/validate config: Unknown option 'HiddenServiceMaxStreams'. Failing.
Tried it, I get this error: Failed to parse/validate config: Unknown option 'HiddenServiceMaxStreams'. Failing.
I forgot to change config.c, I just pushed a fixup commit to my branch, but this still isn't something that most people should try unless they're willing to debug it.
OK, I tried to reproduce the attack. The naive attack of sending 10k RELAY_BEGIN cells on a single circuit seems to overwhelm Tor for a few seconds, and it gets worse depending on the underlying application. I imagine that with a web server, the whole system would be overwhelmed.
Then, I did a bit of testing with Yawning's branch. With the naive attack, it seems that Yawning's branch works as intended (ignores superfluous RELAY_BEGIN cells) but it doesn't stop the DoS. That is, the whole system still goes at 100% CPU just because of cell processing (I think).
If we change Yawning's patch to tear down the circuit after the max number of streams have been encountered, then it seems to work better.
We discussed making this behavior more configurable by having two switches:
HiddenServiceMaxStreams: The maximum number of simultaneous streams on an HS circuit.
HiddenServiceMaxStreamsCloseCircuit: If set, then when HiddenServiceMaxStreams is triggered, we close the respective circuit. If not set, we just ignore requests for superfluous streams. (Default: off)
(The positive thing of not killing the circuit above, is that the circuit will recover once the number of streams goes below the threshold)
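For reference, a torrc stanza using the two proposed knobs might look like this (assuming they ship under these names, alongside the usual HiddenService options):

```
HiddenServiceDir /var/lib/tor/hs_example/
HiddenServicePort 80 127.0.0.1:8080
# Ignore RELAY_BEGINs beyond 100 concurrent streams per circuit...
HiddenServiceMaxStreams 100
# ...and additionally tear down the offending circuit when tripped.
HiddenServiceMaxStreamsCloseCircuit 1
```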
Then, I did a bit of testing with Yawning's branch. With the naive attack, it seems that Yawning's branch works as intended (ignores superfluous RELAY_BEGIN cells) but it doesn't stop the DoS. That is, the whole system still goes at 100% CPU just because of cell processing (I think).
Well, it stops the DoS in that, despite the load being at 100%, new connections (and existing ones) will continue to get serviced (if they weren't it would be a sign that our circuit level scheduling is broken, which it isn't).
If the goal here is to keep CPU load down when bad guys can send arbitrary traffic down one circuit, then a trivial way for the bad guys to drive load up to 100% would be to spam DROP cells. There's ever so slightly less processing involved for DROP cells than ignored BEGIN cells, but not enough that that variant of the attack isn't possible to mount.
If we change Yawning's patch to tear down the circuit after the max number of streams have been encountered, then it seems to work better.
We discussed making this behavior more configurable by having two switches:
HiddenServiceMaxStreams: The maximum number of simultaneous streams on an HS circuit.
HiddenServiceMaxStreamsCloseCircuit: If set, then when HiddenServiceMaxStreams is triggered, we close the respective circuit. If not set, we just ignore requests for superfluous streams. (Default: off)
(The positive thing of not killing the circuit above, is that the circuit will recover once the number of streams goes below the threshold)
Some people on IRC seem to think that once a circuit trips the threshold we should stop sending SENDME cells. I don't particularly think that the behavior with my branch is broken though, and I'm still refactoring the linked list code so I'll think about that later.
Note to the peanut gallery: There is still NO guarantee that I got the stream accounting I added correct, and using my branch is still not advised.
Here is some profiling output of the attack without Yawning's patch:
+ 58.34% tor circuit_receive_relay_cell
+ 5.11% tor logv
+ 4.93% tor smartlist_contains
+ 1.43% tor connection_bucket_refill
+ 1.15% tor curve25519_donna
+ 0.94% tor assert_circuit_ok
+ 0.91% tor scan_unsigned
+ 0.89% tor smartlist_remove
+ 0.88% tor log_fn_
+ 0.74% tor tor_vsnprintf
+ 0.64% tor tor_snprintf
+ 0.64% tor tor_memeq
+ 0.62% tor node_get_prim_orport
+ 0.62% tor tor_mutex_release
+ 0.57% tor write_all
+ 0.45% tor buf_datalen
+ 0.42% tor siphash24
+ 0.41% tor connection_exit_begin_conn
+ 0.39% tor assert_cpath_layer_ok
+ 0.35% tor run_connection_housekeeping
+ 0.34% tor relay_send_command_from_edge_
+ 0.34% tor assert_connection_ok
+ 0.33% tor compute_weighted_bandwidths
+ 0.31% tor connection_handle_read
+ 0.30% tor tor_gettimeofday
+ 0.28% tor node_is_dir
+ 0.27% tor command_process_cell
+ 0.26% tor connection_or_process_cells_from_inbuf
+ 0.26% tor tor_localtime_r
+ 0.25% tor rend_service_set_connection_addr_port
+ 0.25% tor circuitmux_find_map_entry
whereas with Yawning's patch and max streams set to 100:
+ 18.20% tor logv
+ 4.52% tor curve25519_donna
+ 3.82% tor tor_mutex_release
+ 3.46% tor scan_unsigned
+ 2.49% tor node_get_prim_orport
+ 2.03% tor assert_circuit_ok
+ 2.03% tor tor_memeq
+ 1.89% tor tor_vsnprintf
+ 1.89% tor log_fn_
+ 1.80% tor tor_snprintf
+ 1.43% tor compute_weighted_bandwidths
+ 1.29% tor relay_send_command_from_edge_
+ 1.29% tor write_all
+ 1.24% tor tor_localtime_r
+ 1.24% tor siphash24
+ 1.06% tor node_is_dir
+ 1.01% tor channel_update_xmit_queue_size
+ 0.97% tor get_options
+ 0.92% tor connection_edge_process_relay_cell
+ 0.92% tor circuit_receive_relay_cell
+ 0.92% tor assert_cpath_layer_ok
+ 0.92% tor tor_mutex_acquire
+ 0.88% tor connection_or_process_cells_from_inbuf
+ 0.88% tor tor_addr_copy
+ 0.83% tor tor_gettimeofday
+ 0.78% tor crypto_cipher_crypt_inplace
+ 0.69% tor node_get_mutable_by_id
+ 0.69% tor relay_header_unpack
+ 0.69% tor circuitmux_find_map_entry
+ 0.65% tor router_add_running_nodes_to_smartlist
+ 0.65% tor smartlist_contains
+ 0.65% tor tor_malloc_zero_
+ 0.60% tor connection_exit_begin_conn
+ 0.60% tor base32_encode
+ 0.55% tor relay_crypt
+ 0.55% tor tor_vsscanf
+ 0.51% tor node_get_nickname
+ 0.51% tor rend_service_set_connection_addr_port
+ 0.51% tor assert_buf_ok
+ 0.51% tor smartlist_remove
+ 0.46% tor scheduler_adjust_queue_size
+ 0.46% tor fetch_var_cell_from_buf
+ 0.46% tor ewma_notify_xmit_cells
+ 0.46% tor tor_addr_compare_masked
+ 0.46% tor crypto_digest_get_digest
+ 0.46% tor aes_crypt_inplace
+ 0.41% tor channel_tls_handle_cell
+ 0.41% tor get_options_mutable
+ 0.41% tor assert_connection_ok
Refactor the code that touches origin_circuit_t.p_streams to use TOR_LIST_* so it's easier to audit the stream counter code that got added (per discussion with nickm). (Moved to legacy/trac#16134 (moved))
Test it. (Behaves as expected per asn, though CPU consumption is still high, and exact behavior not finalized yet.)
Argue over what exactly should happen when the threshold is tripped (what I do now? kill the circuit? mess with circuit level flow control? Display MLP ascii art in the logs?).
Add a changes file.
Update the man page.
Squash and split into a commit that adds the feature and one that handles the ADD_ONION integration.
When working on legacy/trac#16134 (moved), I found that I'm not doing bookkeeping in circuit_mark_for_close(), but I view that as acceptable (there's no point in zeroing out the number of streams when the circuit is getting torn down anyway).