The Tor Project issues
https://gitlab.torproject.org/groups/tpo/-/issues

# MetricsPort: inbound ORPort connections: relays vs. non-relay connections

https://gitlab.torproject.org/tpo/core/tor/-/issues/40715 (cypherpunks, 2023-09-22)

This was previously submitted on 2022-10-24 in https://gitlab.torproject.org/tpo/core/tor/-/issues/40194#note_2849481,
but that issue was closed with a request to open a new, specific ticket for each new metric:
From last week's relay meetup we know that tor knows whether an incoming OR connection is from a client or from a relay without looking at the source IP address.
https://pad.riseup.net/p/tor-relay-op-meetup-o22-keep
From the metrics added in !625 (merged) we know that increased CPU load on exits correlates with an increase in the rate of new inbound OR connections:
```
rate(tor_relay_connections{type="OR",state="created",direction="received"}[$__rate_interval])
```
Could you please add a label for OR connections coming from clients vs. OR connections coming from other relays?
This would allow us to confirm that exits get more new inbound connections from clients when CPU load increases.
That new label could be `src`:
```
tor_relay_connections_total{type="OR",state="created",direction="received",src="relay"}
tor_relay_connections_total{type="OR",state="created",direction="received",src="non-relay"}
tor_relay_connections{type="OR",state="opened",direction="received",src="relay"}
tor_relay_connections{type="OR",state="opened",direction="received",src="non-relay"}
```

# Try to recover our unit test coverage post arti-1.1.0

https://gitlab.torproject.org/tpo/core/arti/-/issues/682 (Nick Mathewson, 2023-04-25)

With 1.1.0, our unit test coverage has dropped to 78.06%. This is the [lowest unit test coverage](https://gitlab.torproject.org/tpo/core/arti/-/wikis/ReleaseHistory) level on ANY release of arti that we have ever done: ouch!
It's time to look at our [coverage results](https://tpo.pages.torproject.net/core/arti/coverage/unit/) and see where we need to write tests. Let's not scrimp here: our test coverage has saved us from bugs in the past!

(Milestone: Arti 1.1.0: Anticensorship ready)

# AbstractSpec SupportedCircUsage restrict_mut vs supports

https://gitlab.torproject.org/tpo/core/arti/-/issues/641 (Ian Jackson <iwj@torproject.org>, 2023-01-24)

In `impl crate::mgr::AbstractSpec for SupportedCircUsage`, `supports` and `restrict_mut` contain extremely similar code.
1. This should be unified.
2. Note that `supports` checks `Exit.ports` but `restrict_mut` does not modify `ports`:
```
16:07 <+Diziet> nickm: I see that supports does something wioth
TargetCircUsage::Exit.ports and restrict_mut doesn't. Is that
correct ?
16:26 <+nickm> Diziet: yeah, arguably restrict_mut should do that check too.
16:27 <+nickm> (Diziet: I say "arguably" because the "requirements" section on
restrict_mut says that it shouldn't be called in that case.)
```
I'm not sure that's right, but we are too busy now to look into this properly. (CC @nickm)

Assignee: Ian Jackson <iwj@torproject.org>

# Implement conflux for onion services

https://gitlab.torproject.org/tpo/core/tor/-/issues/40716 (Mike Perry, 2022-11-28)

Conflux is traffic splitting, and will result in increased throughput and reduced latency for onion services after a connection has been established, by routing traffic over multiple paths, or via the lowest latency path to a service.
This ticket is for the onion service pieces of conflux (https://gitlab.torproject.org/tpo/core/tor/-/issues/40593).
We will not be implementing the onion service pieces of conflux as part of that ticket. They can be done later, if any onion service sponsors care about latency or throughput.
The pieces for onion services are:
- **Negotiation**
- [ ] Protover Advertisement for Onions (24h)
- [ ] Rend circuit linking (40h)
This is specified in https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/proposals/329-traffic-splitting.txt, but we probably want to allow onion services to configure their scheduler by manually choosing either BLEST, or LowRTT, since different kinds of onion services may want to optimize for either throughput or latency.
There may be some additional work wrt making sure linked edge conns work properly, if they are handled differently for the onion service case.
Also, some shadow validation and performance testing will be needed. Maybe 40h or so of dev time (though much longer wall-clock time).

# Additional MetricsPort stats for various stages of the onion service handshake

https://gitlab.torproject.org/tpo/core/tor/-/issues/40717 (Mike Perry, 2023-12-07)

If we export additional onion service metrics such as time measurements on the HSDIR, INTRO, and REND stages of circuit setup for both client and service side, and the number of timeouts/failures there, it would help to uncover the root cause of issues like https://gitlab.torproject.org/tpo/core/tor/-/issues/40570 and related reliability and connectivity issues with onion services.
We can also export congestion control info from https://gitlab.torproject.org/tpo/core/tor/-/issues/40708 to the onion service metrics set, which can help us with tuning congestion control for onion services.
We can then hook up the onionperf onion service instances to our grafana dashboard, and gather more detailed stats that way, as a supplement to the metrics that get graphed on the metrics website.

# Basic acceptance testing for 1.1.0 deliverables

https://gitlab.torproject.org/tpo/core/arti/-/issues/651 (Nick Mathewson, 2023-01-10)

Here's what we should test to make sure that 1.1.0 behaves as advertised.
Please add more cases as needed. In each case, it would be nice to have integration testing, but we should at a minimum test these cases by hand.
## Successful cases
* [x] Configure Arti with a single bridge, not using a pluggable transport.
* [x] Configure Arti with a single bridge, using obfs4. (#332)
* [x] Configure Arti with a single bridge, using snowflake. (#333)
* [ ] Configure Arti with multiple bridges, including a few nonexistent ones.
In each of the successful cases, we should:
* Observe that we can browse and download successfully.
* Observe that the log messages are reasonable-looking at level `info` and higher.
* Observe (using instrumentation, `lsof`, `tcpdump`, or similar) that Arti is only connecting as it is told.
You can get bridges from `bridges.torproject.org`.
## Failing cases
* [ ] Configure arti to use bridges, using only a single nonexistent bridge.
* [ ] Configure arti to use bridges, using multiple nonexistent bridges.
In the failing cases we should:
* Observe that the log messages are reasonable-looking at level `info` and higher.
* Observe (using instrumentation, `lsof`, `tcpdump`, or similar) that Arti is only connecting as it is told.
## Complex cases
* [ ] Configure a list of bridges, and turn enable_bridges on and off. Make sure that Arti behaves correctly.

(Milestone: Arti 1.1.0: Anticensorship ready)

# Bridge descriptor download parameters should be configurable

https://gitlab.torproject.org/tpo/core/arti/-/issues/629 (Ian Jackson <iwj@torproject.org>, 2022-12-12)

`BridgeDescMgr` is configurable via a `struct BridgeDescDownloadConfig`. But right now the only way to construct one of those is its `Default` impl. Instead, this should be configurable via, or derivable from, something that appears, eventually, in `TorClientConfig`.
Open questions:
* Should this be part of `DownloadSchedule` (the `[download_schedule]` config section) or part of `[bridges]`?
* Do we want a way for the user to specify `parallelism` in one place for both directory downloads and bridge downloads - or are these different things that ought not to be mixed up this way?
* Same question for `BridgeDescDownloadConfig.retry` vs `DownloadScheduleConfig.retry_*`?
When this is decided, the actual implementation should be reasonably obvious.

(Milestone: Arti 1.1.0: Anticensorship ready)
Assignee: Ian Jackson <iwj@torproject.org>

# Add alerts for network health relays

https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/issues/9 (Georg Koppen, 2024-03-27)

We have some relays and their metrics scraped via Prometheus (namely akka, ukko, d2d4, and Najdorf). We should work on alerts so we can avoid stumbling into some or all of the metrics not being exposed anymore in our dashboards due to $issue (as witnessed in tpo/network-health/team#281).
/cc @hiro, @dgoulet

Assignee: Georg Koppen

# More debug logs in pluggable transports

https://gitlab.torproject.org/tpo/core/arti/-/issues/677 (Nick Mathewson, 2023-06-20)

Specifically, it would be good to have more debugging logs throughout the PT lifecycle, and to have trace logs for every step of the way along making a connection, negotiating with the proxy, etc. I have some `stashed` junk that I used in debugging, but it isn't production quality.

(Milestone: Arti 1.1.0: Anticensorship ready)

# Resolve remaining comments from !901 and !903

https://gitlab.torproject.org/tpo/core/arti/-/issues/681 (Nick Mathewson, 2023-02-27)

The reviews of !901 and !903 had some suggestions, mainly related to commenting and documentation. We should apply those before we forget about them entirely.

(Milestone: Arti 1.1.0: Anticensorship ready)

# More unit tests for tor-ptmgr crate

https://gitlab.torproject.org/tpo/core/arti/-/issues/702 (Nick Mathewson, 2023-02-27)

Current ptmgr unit test coverage is around 38%.
Let's get that increased :)

(Milestone: Arti 1.1.0: Anticensorship ready)

# Add section about crypto warning popup

https://gitlab.torproject.org/tpo/web/manual/-/issues/137 (donuts, 2022-12-17)

See https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues/41363 for the most recent work happening on this component, and https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues/40899#note_2863965 for discussion about the support URL.
tl;dr we have a `Learn more` link in the wingpanel that currently points to support-dot, which presumably was a temporary measure until dedicated content could be created.

# [WARN] Tried connecting to router ... identity keys were not as expected

https://gitlab.torproject.org/tpo/core/tor/-/issues/40735 (cypherpunks, 2023-11-14)

Background: Tor Browser 12.0, Tor 4.7.12, Windows 7, vanilla bridges.
Repeatedly getting the following log line.
```
[WARN] Tried connecting to router at *address* ID=<none> RSA_ID=*FP1*, but RSA + ed25519 identity keys were not as expected: wanted *FP1* + no ed25519 key but got *FP2* + *edFP*.
```
Ideas of what happened:
* MITM
* Bridge operator reinstalled it in-between me getting the bridge and now.
What is wrong:
* The bridge should be marked as unreachable: either it is not usable anyway and connection attempts are doomed to waste resources for nothing, or it should not be used because something is clearly wrong with it.
* There should be a way to distinguish the first idea from the second - my best guess is building a tunneled directory connection to the bridge authority and asking "Is there a bridge *FP2* and does it listen on *address*?"

# [warn] Possible compression bomb; abandoning stream.

https://gitlab.torproject.org/tpo/core/tor/-/issues/40739 (computer_freak, 2023-11-18)

Relay with `Tor 0.4.7.13`:
```
[warn] Possible compression bomb; abandoning stream.
[warn] Unable to decompress HTTP body (tried Zstandard compressed, on Directory connection (client reading) with 86.59.21.38:80).
```
obfs4 Bridge with `Tor 0.4.7.13`:
```
[warn] Possible compression bomb; abandoning stream.
[warn] Error while uncompressing data: bad input?
[warn] Unable to decompress HTTP body (tried Zstandard compressed, on Directory connection (client reading) with 95.214.53.221:443).
```
Relay with `Tor 0.4.8.0-alpha-dev`:
```
[warn] Possible compression bomb; abandoning stream.
[warn] Unable to decompress HTTP body (tried Zstandard compressed, on Directory connection (client reading) with 199.58.81.140:80).
```
There are no further logs at the `notice` log level.
A user at the [forum](https://forum.torproject.net/t/compression-bomb-in-tor-logs/6226) has the same problem.

Assignee: Alexander Færøy <ahf@torproject.org>

# Write a proxy-plus-socketpair wrapper for DataStream

https://gitlab.torproject.org/tpo/core/arti/-/issues/738 (Nick Mathewson, 2024-02-24)

In order to provide FFI for the most general purposes, we'll need to provide real sockets. That implies a socketpair, one side of which we give to the application, and the other side of which is backed by a `DataStream`.
This is necessary for #737
See [`ExportedApiSketch.md`](https://gitlab.torproject.org/tpo/core/arti/-/blob/main/doc/dev/ExportedApiSketch.md) for early thoughts.

(Milestone: Arti: RPC Support)

# Conflicting logic about whether bridges need descriptors for fetching dir info from them

https://gitlab.torproject.org/tpo/core/tor/-/issues/40746 (Roger Dingledine, 2023-04-12)

If you start your Tor with a pile of configured bridges but nothing cached, your Tor will sample the configured bridges to pick its ordered list of primary entry guards, and launch descriptor fetches to each of them.
But if the descriptor hasn't arrived yet, while trying to bootstrap dir info you get these confusing messages in your logs:
```
Jan 31 18:56:44.928 [notice] Ignoring directory request, since no bridge nodes are available yet.
```
Things do bootstrap eventually, but it takes longer than it should, and the pile of scary log messages is scary.
What's going on here?
The way the log message comes about is that directory_get_from_dirserver() calls
```
const node_t *node = guards_choose_dirguard(dir_purpose, &guard_state);
if (node && node->ri) {
  [...]
} else {
  [...]
  log_notice(LD_DIR, "Ignoring directory request, since no bridge "
             "nodes are available yet.");
}
```
i.e. guards_choose_dirguard had better return a bridge for which we have the descriptor, or we're going to log a complaint and abort the directory fetch attempt.
But in select_primary_guard_for_circuit(), we do
```
const int need_descriptor = (usage == GUARD_USAGE_TRAFFIC);
[...]
SMARTLIST_FOREACH_BEGIN(gs->primary_entry_guards, entry_guard_t *, guard) {
  [...]
  if (guard->is_reachable != GUARD_REACHABLE_NO) {
    if (need_descriptor && !guard_has_descriptor(guard)) {
      log_info(LD_GUARD, "Guard %s does not have a descriptor",
               entry_guard_describe(guard));
      continue;
    }
```
That is, in select_primary_guard_for_circuit() we require that the bridge have a descriptor only for the GUARD_USAGE_TRAFFIC case, but then in directory_get_from_dirserver() we expect that the bridge will always have a descriptor, even in the GUARD_USAGE_DIRGUARD case.
In normal operation this bug isn't a big deal, because it is a race to finish fetching the descriptor before we happen to pick the bridge for a directory request. But with the #40578 fix, where we defer fetching the descriptor if we won't use the bridge for the GUARD_USAGE_TRAFFIC case, the bug becomes more obvious.
I believe the fix is simply to always set need_descriptor in select_primary_guard_for_circuit() -- meaning when we're going to launch a directory fetch, we always choose among the primary guards that have descriptors already.

# Should upgrading *some* dependencies be considered a breaking change?

https://gitlab.torproject.org/tpo/core/arti/-/issues/754 (trinity-1686a, 2023-04-25)

Upgrading some dependencies can break downstream integrators. In particular, going from `rusqlite` 0.27 to 0.28 prevented a seamless upgrade of onionmasq from arti-some-old-version to arti-1.1.0, because 0.27 and 0.28 apparently can't live together.
(Found while working on onionmasq.)

# Should cargo-audit become advisory in CI?

https://gitlab.torproject.org/tpo/core/arti/-/issues/766 (Nick Mathewson, 2023-04-25)

Twice in the last 24 hours, I have had a branch where CI went from passing to failing because a package _which was okay when I opened the branch_ became insecure before the branch was merged.
Maybe we should make this an advisory failure, and not have it make the branch unmergeable? OTOH, if we ignore this warning, it could allow us to merge known-insecure crates.

# Aggregate network totals for metrics with too many data points

https://gitlab.torproject.org/tpo/network-health/metrics/descriptorParser/-/issues/15 (Hiro, 2024-01-16)

We should aggregate network totals when computing metrics that have too many data points to be aggregated by VictoriaMetrics itself.
One example is ``desc_relay_dirreq_v3_responses`` which can be aggregated by relay or a few other flags but it cannot be aggregated over all relays by the timeseries DB.
This metric comes from parsing ExtraInfoDescriptor. I think a good approach would be to create an aggregated metric that collects the total number of dirreq_v3_responses over a run of the parser. We can then aggregate the different aggregated measurements that we take over the course of a range of time.

Assignee: Hiro

# Consider tor_bytes::write/read_counted_u8 etc

https://gitlab.torproject.org/tpo/core/arti/-/issues/781 (Nick Mathewson, 2023-03-24)

We frequently want to encode a number of objects as a u8 or u16, and then encode the objects one by one. We could benefit from a helper method in Reader and/or Writer to do that.
One possible API:
```
impl Reader { ...
    pub fn read_counted_u8<V: Readable>(&mut self) -> Result<Vec<V>> { ... }
}
trait Writer {
    fn write_counted_u8<V: Writeable>(&mut self, slice: &[V]) -> EncodeResult<()> { ... }
}
```
This comes out of a discussion on !1052