Commit 577e6a02 authored by Hiro 🏄

Update overloaded state support article

parent 19223d97
@@ -44,7 +44,12 @@ timeout:n
```
Check ``$ man resolv.conf`` for more information.
3\. Consider enabling ``MetricsPort`` to understand what is happening.
MetricsPort data for relays was introduced in version 0.4.7.1-alpha,
while the overload data has been part of the relay descriptor since the 0.4.6 series.
Please be careful.
It's important to understand that exposing tor metrics publicly is dangerous to Tor network users.
Please take extra precaution and care when opening this port.
Set a very strict access policy with ``MetricsPortPolicy`` and consider using your operating system's firewall features for defense in depth.
@@ -128,16 +133,63 @@ Let's find out what some of these lines actually mean:
When a relay starts seeing "dropped", it is usually a CPU/RAM problem.
Tor is sadly single threaded _except_ for when the "onion skins" are processed.
The "onion skins" are the cryptographic work that needs to be done on the famous
"onion layers" in every circuit.
When tor processes the layers, it uses a thread pool and outsources all of that work
to that pool.
It can happen that this pool starts dropping work due to memory or CPU pressure,
and this will trigger an overload state.
If your server is running at capacity, this will likely be triggered.
```tor_relay_exit_dns_error_total{...}```
Any counter in the "*_dns_error_total" realm indicates a DNS problem.
DNS timeout issues mainly concern Exit nodes. If tor starts noticing DNS timeouts,
you'll get the overload flag. This might not be because your relay is overloaded
in terms of resources, but it signals a problem on the network.
DNS timeouts at the Exits are a _huge_ UX problem for tor users. Therefore, Exit
operators really need to take care of those to help.
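
As a rough illustration of how these counters can be watched, the sketch below sums every `*_dns_error_total` sample in a dump of MetricsPort output. It assumes the Prometheus text format; the function name, the sample lines and their label values are made up for the example and are not real relay output.

```python
import re

# Matches one sample line: metric name, optional {labels}, then a value.
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')


def sum_dns_errors(metrics_text):
    """Sum the values of all counters whose name ends in _dns_error_total."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore lines this simple pattern cannot read
        name, _labels, value = match.groups()
        if name.endswith("_dns_error_total"):
            total += float(value)
    return total


# Hypothetical sample input, not real relay output.
SAMPLE_METRICS = """\
# TYPE tor_relay_exit_dns_error_total counter
tor_relay_exit_dns_error_total{reason="timeout"} 12
tor_relay_exit_dns_error_total{reason="nxdomain"} 3
tor_relay_load_oom_bytes_total{subsys="cell"} 0
"""

print(sum_dns_errors(SAMPLE_METRICS))
```

Since these are counters, what matters is whether the sum keeps growing between two readings, not its absolute value.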
```tor_relay_load_oom_bytes_total{...}```
An Out-Of-Memory invocation indicates a RAM problem.
The relay might need more RAM or it is leaking memory.
If you notice that the tor process is leaking memory, please either report the issue via [GitLab](https://gitlab.torproject.org) or send an email to the [tor-relays mailing list](https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays).
Tor has its own OOM handler, and it is invoked when 75% of the total memory tor thinks it
can use is reached. Thus, let's say tor thinks it can use 2GB in total; then at
1.5GB of memory usage, it will start freeing memory. That is considered an
overload state.
To estimate the amount of memory it has available, when tor starts, it will use
MaxMemInQueues or, if that is not set, will look at the total RAM available on the
system and apply this algorithm:
```
if RAM >= 8GB {
  memory = RAM * 40%
} else {
  memory = RAM * 75%
}
/* Capped: 8GB on 64bit, 2GB on 32bit. */
memory = min(memory, cap)
/* Minimum value. */
memory = max(250MB, memory)
```
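
The calculation above, together with the 75% OOM threshold described earlier, can be sketched in Python. The function names and the choice of 1GB = 1024^3 bytes are assumptions made for illustration; this is not tor's actual code.

```python
GB = 1024 ** 3  # assumption: binary units
MB = 1024 ** 2


def estimate_memory_limit(total_ram, is_64bit=True):
    """Estimate the memory tor thinks it can use, following the pseudocode above."""
    if total_ram >= 8 * GB:
        memory = total_ram * 40 // 100
    else:
        memory = total_ram * 75 // 100
    # Capped: 8GB on 64bit, 2GB on 32bit.
    cap = 8 * GB if is_64bit else 2 * GB
    memory = min(memory, cap)
    # Minimum value.
    return max(250 * MB, memory)


def oom_threshold(memory_limit):
    """Usage at which tor's own OOM handler starts freeing memory (75% of the limit)."""
    return memory_limit * 75 // 100


for ram_gb in (1, 2, 4, 16, 32):
    limit = estimate_memory_limit(ram_gb * GB)
    print(f"{ram_gb}GB RAM: limit {limit / GB:.2f}GB, "
          f"OOM handler at {oom_threshold(limit) / GB:.2f}GB")
```

With 2GB of RAM, for instance, the estimate comes out at 1.5GB, so the OOM handler would already start freeing memory at a little over 1.1GB of usage.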
To avoid an overloaded state, we recommend running a relay with more than 2GB of RAM on
64bit. 4GB is advised.
One might notice that tor could be killed by the OS OOM killer itself.
Because tor takes the total memory on the system when it starts, if the overall
system has many other applications running using RAM, it ends up eating too much
memory. In this case the OS could OOM-kill tor without tor even noticing memory
pressure.
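
One rough way to spot this situation on Linux is to compare the limit tor would settle on with the memory the system currently reports as available. The sketch below assumes the `/proc/meminfo` text format; the helper names and the sample values are hypothetical, and the comparison is only a heuristic, not something tor does itself.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of byte counts."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields:
            continue  # not a "Key: value" line
        amount = int(fields[0])
        if len(fields) > 1 and fields[1] == "kB":
            amount *= 1024
        values[key.strip()] = amount
    return values


def os_oom_risk(meminfo, tor_limit_bytes):
    """True if tor's self-imposed limit exceeds what is currently available."""
    return tor_limit_bytes > meminfo["MemAvailable"]


# Hypothetical sample; on a real system, read the text from /proc/meminfo.
SAMPLE_MEMINFO = """\
MemTotal:        4000000 kB
MemAvailable:    1000000 kB
"""

info = parse_meminfo(SAMPLE_MEMINFO)
print(os_oom_risk(info, 3 * 1024 ** 3))
```

If the check is true, setting MaxMemInQueues to a lower value or moving other applications off the machine are the obvious options to consider.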
```
tor_relay_load_socket_total
......