The Tor Project issues
https://gitlab.torproject.org/groups/tpo/-/issues
2023-08-24T15:15:04Z

https://gitlab.torproject.org/tpo/anti-censorship/bridgedb/-/issues/40018
Evaluation of bridge statistics
2023-08-24T15:15:04Z
Cecylia Bocovich
See what information we have, what we need, and how we can use these statistics.
meskio, meskio@torproject.org

https://gitlab.torproject.org/tpo/anti-censorship/team/-/issues/112
Reported as offline in metrics, some bridges are online and running
2024-02-29T15:22:53Z
Gus
Since last week, some bridge operators are reporting that their bridge is 'offline' in Metrics, but they are online and running.
I can confirm that this is happening. One of my bridges is marked as [offline](https://metrics.torproject.org/rs.html#details/25A5B3BB5449EC5A0D4AE4DB657899C02C186EBE), but in the tor logs I see:
>Nov 28 12:02:57.000 [notice] Heartbeat: Since last heartbeat message, I have seen 200 unique clients.
Other messages in the logs:
```
Nov 20 12:23:29.000 [notice] Guard bauruine ($5B83DC983406651A0B4F6AE1940793CDD6A6F92E) is failing more circuits than usual. Most likely this means the Tor network is overloaded. Success counts are 198/283. Use counts are 63/63. 227 circuits completed, 0 were unusable, 30 collapsed, and 5 timed out. For reference, your timeout cutoff is 324 seconds.
Nov 20 23:04:10.000 [notice] Our directory information is no longer up-to-date enough to build circuits: We're missing descriptors for 1/3 of our primary entry guards (total microdescriptors: 5983/6034). That's ok. We will try to fetch missing descriptors soon.
Nov 21 03:24:31.000 [notice] Guard rixtyminutes ($01AE2DE314276C82FCCC3603A1C2F3238E6544C9) is failing more circuits than usual. Most likely this means the Tor network is overloaded. Success counts are 109/156. Use counts are 37/37. 132 circuits completed, 0 were unusable, 23 collapsed, and 5 timed out. For reference, your timeout cutoff is 324 seconds.
```
Reddit: https://www.reddit.com/r/TOR/comments/z2o7ro/bridge_metrics_showing_offline/
meskio, meskio@torproject.org

https://gitlab.torproject.org/tpo/network-health/team/-/issues/100
Go over metrics docs and replace references to gitweb with those to Gitlab and move missing code if needed
2024-03-27T09:44:42Z
Georg Koppen
We see Gitlab as the canonical place for metrics code nowadays, yet at least a bunch of docs still point to Gitweb (and it's not clear whether there are still code bases git.tpo-only).
We should go over our code and docs to make sure Gitlab is indeed seen by all of that as the True Location.
/cc @hiro @irl
Georg Koppen

https://gitlab.torproject.org/tpo/network-health/team/-/issues/72
Create an inventory of metric's running services and make quick health checks
2022-06-02T12:47:03Z
irl
Many of the services have been ticking along, but might be building towards disaster with unchecked monitoring (or monitoring that is entirely absent). Let's build a list of all the services and make sure that they are not about to explode:
* [x] CollecTor
* [x] collector.torproject.org
* [x] collector2.torproject.org
* [x] Onionoo
* [x] Updater
* [x] Web
* [x] metrics-web statistics
* [x] Metrics timeline - tpo/metrics/timeline#4
* [x] Relay Search
* [x] Exit Scanner
* [x] TorDNSEL
* [x] Tor Check
* [x] Exonerator
* [x] metrics-lib - dependencies are resolvable with Apache Ivy and test suite passes on master
* [x] OnionPerf - special case maintained by @acute so we know it's good
* [x] Consensus Health - special case maintained by @tom so we know it's good
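A quick health check over an inventory like this one could be sketched as follows. This is only a minimal sketch and not the checks that were actually run: the `SERVICES` mapping, the choice of a plain HTTPS probe, and the helper names `http_status` and `check_services` are assumptions for illustration. Only the two CollecTor host names come from the list above.

```python
import urllib.request
from typing import Callable, Dict

# Hypothetical inventory; only the two CollecTor hosts come from the checklist.
SERVICES: Dict[str, str] = {
    "collector": "https://collector.torproject.org/",
    "collector2": "https://collector2.torproject.org/",
}


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url (raises on network or HTTP errors)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def check_services(
    services: Dict[str, str],
    probe: Callable[[str], int] = http_status,
) -> Dict[str, bool]:
    """Map each service name to True (probe returned 200) or False."""
    results = {}
    for name, url in services.items():
        try:
            results[name] = probe(url) == 200
        except Exception:
            # Any network or HTTP error counts as unhealthy in this quick check.
            results[name] = False
    return results
```

Calling `check_services(SERVICES)` performs real requests; passing a different `probe` keeps the function testable without network access.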
If you spot any services missing from this list, add a comment below!
Hiro

https://gitlab.torproject.org/tpo/tpa/team/-/issues/41515
meronense OOM
2024-02-05T19:52:19Z
anarcat
today, metrics.tpo went down because the OOM killer was invoked. not sure what happened. i restarted both metrics-r and metrics-web.service, pending further investigation.
this happened before, of course. we bumped the memory on that box to 20GB in #41335 and had issues after the bullseye upgrade as well (#40814), both incidents should be investigated. those are just the incidents that pop up in the gitlab "Similar issues", further investigation in other issues probably warranted.
possibly related with the bookworm upgrade, of course (#41252).
anarcat

https://gitlab.torproject.org/tpo/tpa/team/-/issues/40535
colchicifolium disk full
2023-06-07T15:45:23Z
anarcat
colchicifolium's disk is rising steadily, this is the last year:
![image](/uploads/e781feb8a476adcb640ab6a275d25e6b/image.png)
we can see when we added 50G then 200G more.
@hiro is thinking about redesigning this service, but in the meantime, let's give this poor server a break.
anarcat

https://gitlab.torproject.org/tpo/tpa/team/-/issues/41167
connect to postgresql db on new metrics DB via tls
2023-06-27T15:13:20Z
Hiro
Would it be possible to get a read-only user to connect to the PostgreSQL db on metrics-psqlts-01 via TLS?
This would be used to access it via grafana, but also allow metrics developers to query the data.
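From the client side, such a connection could be described with a libpq-style URI that insists on TLS. The sketch below only builds that URI. The database name `metrics` and the role name `metrics_ro` are hypothetical, the host name is used exactly as written above, and `sslmode=verify-full` is the standard libpq setting that requires an encrypted connection with certificate and host name verification. Read-only access itself would have to be enforced on the server through the role's grants, which this sketch does not cover.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def readonly_dsn(
    host: str,
    dbname: str,
    user: str,
    port: int = 5432,
    sslrootcert: Optional[str] = None,
) -> str:
    """Build a libpq connection URI that requires a verified TLS connection."""
    params = {"sslmode": "verify-full"}
    if sslrootcert:
        # Path to the CA certificate used to verify the server certificate.
        params["sslrootcert"] = sslrootcert
    return "postgresql://{}@{}:{}/{}?{}".format(
        quote(user, safe=""), host, port, quote(dbname, safe=""), urlencode(params)
    )


# Hypothetical role and database names, for illustration only.
dsn = readonly_dsn("metrics-psqlts-01", "metrics", "metrics_ro")
```

The resulting string can be handed to any libpq-based client, such as `psql` or a Grafana PostgreSQL data source configured with equivalent settings.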
The people who would possibly access this are:
@hiro
@gk
@mattrighetti
Jérôme Charaoui, lavamind@torproject.org

https://gitlab.torproject.org/tpo/tpa/team/-/issues/40965
Planning how to deploy victoriametrics on metrics-psqlts-01
2022-11-28T22:51:23Z
Hiro
I am thinking of starting to deploy victoriametrics to metrics-psqlts-01.
The quickstart guide suggests using either docker or snap: https://docs.victoriametrics.com/Quick-Start.html.
What would be your take on this? I know there is a debian package but it is also a bit outdated.
Cheers!
cc: @gk
Sponsor 112 : Combating Malicious Relays
Hiro

https://gitlab.torproject.org/tpo/tpa/team/-/issues/41130
Deploy new metrics database stack
2023-07-07T08:00:20Z
Hiro
I have been testing our victoriametrics + postgresql setup on metrics-psqlts-01 for a while, and now that we are close to having a prod deployment of this pipeline I'd like to have things properly in puppet.
I have a branch called metrics-deploy with a tentative setup that I'd like to have your opinion on.
This branch also has support for deploying a python web app to access and query both the postgresql db and victoria metrics.
Victoria metrics runs with docker, but without compose. I am not sure whether you'd prefer a compose setup, since this is a single service.
An alternative would be to run the full stack with compose. Would postgresql backups work in that case?
I am going to be out next week. So maybe we could discuss this in Costa Rica face to face?
Hiro

https://gitlab.torproject.org/tpo/onion-services/onion-support/-/issues/162
Get EOTK stats for Sponsor 123 during November 2022
2022-12-08T19:28:07Z
Silvio Rhatto
* [x] Get statistics for the November 2022 Narrative Report as [per contract](https://nc.torproject.net/apps/onlyoffice/242116?filePath=%2FSponsors%2FS123%20-%20USAGM%2FSubmitted%20Documents%2F2021-09-17%20Tor%20Secure%20Access%20-%20OTF%20TaS%20Narrative-1-20-2022_203_PM.docx):
* [x] Uptime of USAGM .onion addresses.
* [x] Number of visitors to USAGM .onion addresses (page hits, if `HiddenServiceExportCircuitID` data is not available).
* [x] Make sure that Onionprobe data is filtered out.
* [x] [Progress Tracker](https://nc.torproject.net/apps/onlyoffice/296583?filePath=%2FOnion%20Services%2FOnion%20Support%2FS123%20Progress%20Tracker.ods) (or a standalone sheet): statistics subsheet built from [eotk-log-parser](https://gitlab.torproject.org/tpo/onion-services/eotk-log-parser).
* [x] Create the ticket for the next stats gathering.
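Counting page hits while leaving out Onionprobe traffic could look roughly like the sketch below. It is not the eotk-log-parser implementation. It assumes access logs in the common "combined" format (request, status, size, referer, user agent) and assumes that probe requests can be recognized by a marker string in the user agent; both the format and the `onionprobe` marker are assumptions made for illustration.

```python
import re

# Combined log format tail: "request" status size "referer" "user agent"
LINE_RE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$'
)


def count_page_hits(lines, probe_marker="onionprobe"):
    """Count log lines that parse and whose user agent lacks the probe marker."""
    hits = 0
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # Skip lines that do not look like access log entries.
        if probe_marker in match.group("agent").lower():
            continue  # Leave out monitoring traffic (assumed marker).
        hits += 1
    return hits


sample = [
    '127.0.0.1 - - [01/Nov/2022:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '127.0.0.1 - - [01/Nov/2022:00:00:02 +0000] "GET / HTTP/1.1" 200 512 "-" "Onionprobe/1.0"',
    "not an access log line",
]
hits = count_page_hits(sample)
```

Matching on the user agent only works if the probe sends a recognizable one, so the marker has to agree with whatever the monitoring side is configured to send.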
/cc @raya
Sponsor 123: Tor Secure Access Package for USAGM [First Phase]
Silvio Rhatto
2022-12-01

https://gitlab.torproject.org/tpo/onion-services/onion-launchpad/-/issues/66
Frontend support for Matomo analytics
2023-05-16T14:34:32Z
Silvio Rhatto
Onion Launchpad implementation of Matomo analytics according to [this proposal][].
## Requirements
1. [x] The feature MUST be *disabled* by default, and enabled only if some
environment variables are set (like the analytics endpoint and a site ID/key).
2. [x] There MUST be documentation stating that this feature, even with a
privacy-friendly configuration, could still be a point where access data is
collected without passing through the Tor network for better anonymization,
and that it relies on additional JavaScript code embedded in the landing page.
3. [x] Service operators MUST be advised to host the backend only behind an
HTTPS proxy without IP logging (and without passing the source IP to the
backend, so that a backend vulnerability would not allow
attackers to discover users' IP addresses). ___Or even better: leave the
backend behind an Onion Service___.
4. [x] There MUST be a [consent UX][] informing users what is collected and how,
and asking for authorization. No cookies should reside on the client machine.
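The first requirement (disabled by default, enabled only when configuration is present) can be illustrated with a small gate function. This is a sketch of the idea, not the Onion Launchpad code: the variable names `ANALYTICS_ENDPOINT` and `ANALYTICS_SITE_ID` and the function name are hypothetical.

```python
def analytics_config(env):
    """Return analytics settings, or None when the feature must stay disabled.

    env is any mapping of environment variables, e.g. os.environ.
    The variable names used here are hypothetical.
    """
    endpoint = env.get("ANALYTICS_ENDPOINT", "").strip()
    site_id = env.get("ANALYTICS_SITE_ID", "").strip()
    if not endpoint or not site_id:
        # Missing or empty configuration: analytics stays off by default.
        return None
    return {"endpoint": endpoint, "site_id": site_id}
```

Requiring every variable, instead of treating any single one as an on switch, keeps a partially configured build from shipping tracking code by accident.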
## Implementation details
* [x] Plug the [Clean Insights JS SDK][] or the [Matomo JS SDK][] into [Onion Launchpad][].
* [x] Enable the metrics collection only if explicitly set by an environment variable during build time.
* [x] Implement a [consent UX][].
* [x] Implement the page hits collection.
* [x] Document the [analytics collection threat model][] (subsection "Landing page metrics" of this link/comment).
[analytics collection threat model]: https://gitlab.torproject.org/tpo/onion-services/onion-launchpad/-/issues/39#note_2854486
## Scope
This ticket covers:
* Basic functionality without censorship protection for the metrics system.
* Description: in this phase, the whole implementation is completed.
This ticket does not cover:
* The backend development.
* Implementing censorship protection for the metrics system.
* Content and styling for the consent UX (handled in a [distinct ticket][]).
[this proposal]: https://gitlab.torproject.org/tpo/onion-services/onion-launchpad/-/issues/39#note_2854486
[Onion Launchpad]: https://gitlab.torproject.org/tpo/onion-services/onion-launchpad
[distinct ticket]: tpo/onion-services/onion-launchpad#67
[Clean Insights JS SDK]: https://gitlab.com/cleaninsights/clean-insights-js-sdk
[Matomo JS SDK]: https://developer.matomo.org/guides/tracking-javascript-guide
[consent UX]: https://okthanks.com/blog/2021/5/14/clean-consent-ux
Sponsor 123: Tor Secure Access Package for USAGM [First Phase]
Silvio Rhatto
2023-01-20

https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/49
Tests and metrics for X.509 certificates
2023-06-14T20:48:53Z
Silvio Rhatto
Onionprobe should check and have metrics for X.509 certificates:
* [x] Expiry date.
* [x] Fingerprint.
* [x] Issuer.
It also needs:
* [x] CommonName and SubjectAltName matching.
* [x] A Prometheus alert for certificates about to expire.
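The expiry check behind such a metric can be sketched with the Python standard library alone. This is not Onionprobe's implementation: the function names and the 30-day default threshold are assumptions, and the input is the `notAfter` string in the format returned by `ssl.SSLSocket.getpeercert()`.

```python
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left before a certificate expires.

    not_after uses the getpeercert() format, e.g. "Jan  5 09:34:43 2018 GMT".
    now is a Unix timestamp; it defaults to the current time.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400.0


def expiring_soon(not_after, threshold_days=30, now=None):
    """True when fewer than threshold_days remain (hypothetical threshold)."""
    return days_until_expiry(not_after, now) < threshold_days
```

A value like the one from `days_until_expiry` could be exported as a gauge, leaving the actual threshold to the alerting rule instead of hard-coding it in the probe.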
That enables monitoring of certificate health and even alerting when certificates are about to expire.
Silvio Rhatto
2023-03-31

https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/71
Tor metrics Prometheus exporter
2023-03-30T17:04:06Z
Silvio Rhatto
Set up a Tor metrics Prometheus exporter, so Onionprobe could have two distinct
Prometheus exporters:
1. The default, existing one with Onionprobe metrics.
2. Another with the Tor process metrics.
Tasks:
* [x] Add an option to set up a `MetricsPort` and `MetricsPortPolicy`
on the spawned Tor process, but disabled by default.
* [x] Include it also on the Prometheus collection and make it
available as a Grafana dashboard, but disabled by default.
* [x] Add a warning/document somewhere that this setting should be used with care.
* [x] Document how to enable the feature in the standalone monitoring node,
including again the warnings about this setting.
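The opt-in behaviour from the first task can be sketched as a helper that emits the extra torrc lines only when asked to. `MetricsPort` and `MetricsPortPolicy` are real tor options; the function name, its parameters, and the loopback defaults are assumptions for illustration and do not reflect Onionprobe's actual option names.

```python
def metrics_torrc_lines(enabled=False, address="127.0.0.1", port=9035, allow="127.0.0.1"):
    """Return torrc lines for the metrics port, or nothing when disabled.

    Defaults are hypothetical: the port listens on loopback only and the
    policy accepts connections from loopback only.
    """
    if not enabled:
        return []  # Disabled by default: no MetricsPort is opened.
    return [
        "MetricsPort {}:{}".format(address, port),
        "MetricsPortPolicy accept {}".format(allow),
    ]
```

Keeping both the listener and the policy restricted to loopback unless the operator says otherwise fits the warning that this setting should be used with care.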
Documentation: https://gitlab.torproject.org/tpo/core/tor/-/issues/40762
Sponsor 123: Tor Secure Access Package for USAGM [First Phase]
Silvio Rhatto
2023-04-04

https://gitlab.torproject.org/tpo/onion-services/onionspray-log-parser/-/issues/7
Add a flags on eotk-get-logs-from-s3 to select from/to dates
2023-06-05T16:09:03Z
Silvio Rhatto
* [x] Add a flag to `eotk-get-logs-from-s3` to allow filtering logs by a date range or a single month. Only logs in that range (or in that month) should be copied.
* [x] Inform S123 analytics when this flag is ready to be tested.
Silvio Rhatto
2023-05-31

https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/55
Setup Onionprobe meeting to evaluate next steps
2022-08-12T23:38:06Z
Silvio Rhatto
As discussed with @gus in our 2022-06-01 1:1 meeting, we could set up a meeting to discuss:
1. What we've done so far with Onionprobe. What it can do (now and in the future).
2. Gather new ideas.
3. Think about how to integrate with metrics, like:
* Monitoring the health of Onion Service descriptors (reachability and latency).
Who to invite:
* Someone from the Network Team.
* Someone from the Network Health Team.
* Someone from TPA.
Sponsor 123: Tor Secure Access Package for USAGM [First Phase]
Silvio Rhatto

https://gitlab.torproject.org/tpo/web/community/-/issues/72
[Content][Community Resources] GoodBad ISPs page improvements
2023-11-08T23:50:48Z
Gus
* [x] Create link anchors for the countries
* [ ] Create Table of Contents
Jérôme Charaoui, lavamind@torproject.org

https://gitlab.torproject.org/tpo/onion-services/onionprobe/-/issues/56
Support for User-Agent HTTP Header
2023-08-09T00:29:57Z
Silvio Rhatto
Implement configurable support for the `User-Agent` HTTP header, so users can choose whether (and which) user agent information they want to send with each request.
As an example, this is useful for filtering out Onionprobe requests when gathering page count statistics on Onion Service sites.
Silvio Rhatto

https://gitlab.torproject.org/tpo/network-health/team/-/issues/75
Merge exit scanner/TorDNSEL/check documentation
2021-08-26T12:24:07Z
irl
Exit scanner documentation exists in its own wiki. To make editing and maintaining it easier, it should get merged in here.
irl

https://gitlab.torproject.org/tpo/onion-services/onion-support/-/issues/93
Setup Onionprobe visualizations for TPO
2023-10-26T10:25:14Z
Silvio Rhatto
Set up Onionprobe visualizations for TPO in the Prometheus or Grafana dashboard.

https://gitlab.torproject.org/tpo/network-health/team/-/issues/106
Handle bad input more consistently in metrics code bases
2022-06-02T12:47:53Z
Karsten Loesing
We started thinking about handling bad input while sanitizing bridge descriptors in CollecTor (legacy/trac#19834) and while reading relay descriptors in Onionoo (legacy/trac#20412). But before we implement any changes we should generalize our strategies for handling bad input to avoid solving the same problem over and over. The result of this discussion can also serve as a guide for future code.
What we're not covering here (but what we should think about anyway) is how we're handling issues during processing that are not directly related to bad input, like problems with sanitizing bridge IP addresses.
So, we can distinguish a couple of use cases where we're handling descriptors as input:
1. CollecTor downloads descriptors from the directory or TorDNSEL or Torperf or processes previously downloaded descriptors.
2. CollecTor synchronizes descriptors from other CollecTor instances.
3. CollecTor reads previously uploaded bridge descriptors and produces sanitized versions of them.
4. Metrics, Onionoo, ExoneraTor, and other applications download descriptors from CollecTor and use them locally.
The requirements on input data are quite different for these four use cases. Let's go through them:
1. As of a few weeks ago we're storing and serving descriptors even if metrics-lib cannot parse them. The idea is that CollecTor shouldn't decide what goes into the archive, but the directory authorities (or TorDNSEL and Torperf) should. As long as we can detect the descriptor type, extract the publication date, and possibly calculate the digest, we can accept a descriptor.
2. The requirements are pretty much the same as for 1.
3. We need to be very picky about bridge descriptors, in particular about unknown parts, because those might contain sensitive information that we'd rather not copy over to sanitized bridge descriptors.
4. Most applications would want to skip bad descriptors and not bother much about it. As of a few days ago, they all use metrics-lib for parsing descriptors.
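The lenient acceptance rule from item 1 (detect the type, extract the publication date, possibly compute a digest) could be sketched as below. This is not CollecTor or metrics-lib code. It assumes an `@type` annotation line and a `published` line are present in the input, and it hashes the whole input as a stand-in digest, whereas real descriptor digests cover a specific part of the descriptor. The function and key names are hypothetical.

```python
import hashlib
import re
from datetime import datetime

TYPE_RE = re.compile(r"^@type (\S+) (\d+)\.(\d+)$", re.MULTILINE)
PUBLISHED_RE = re.compile(
    r"^published (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})$", re.MULTILINE
)


def acceptance_info(raw):
    """Return minimal archive metadata for raw bytes, or None if unusable.

    Unknown or unparseable lines elsewhere in the descriptor are ignored on
    purpose: acceptance only needs the type and the publication time.
    """
    text = raw.decode("utf-8", errors="replace")
    type_match = TYPE_RE.search(text)
    published_match = PUBLISHED_RE.search(text)
    if type_match is None or published_match is None:
        return None
    try:
        published = datetime.strptime(published_match.group(1), "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None  # Looks like a timestamp but is not a valid date.
    return {
        "type": type_match.group(1),
        "published": published,
        # Simplification: hash of the whole input, not the real descriptor digest.
        "sha256_of_input": hashlib.sha256(raw).hexdigest(),
    }
```

The sanitizing case in item 3 would need the opposite default, rejecting anything that contains a line the code does not explicitly understand.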
So much for the differences. Let's also list the commonalities or possible goals for common behavior:
- Regardless of the chosen strategy, we should apply it to all variants of descriptor badness. This may sound obvious, but we're currently not doing this. For example, if we encounter an invalid "a" line in a bridge descriptor, we skip that line, whereas an invalid "r" line makes us skip the entire descriptor. This is part of what we're trying to streamline in legacy/trac#19834, even though this case is not explicitly listed there.
- Whenever we encounter an error in processing a descriptor, we should attempt to recover and continue with subsequent good descriptors. The reason is that descriptors in a common source can come from different original sources, and we cannot blame valid descriptors for following a faulty descriptor. This is the cause for legacy/trac#20412.
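Both goals can be illustrated with a toy example: one uniform rule for every kind of badness (here, any invalid line rejects the whole descriptor) and a loop that records the failure and carries on with the next descriptor. This is a sketch under simplified assumptions, not metrics-lib behavior: the line format, the accepted keywords, and the function names are made up for illustration, and rejecting the whole descriptor is just one of the strategies under discussion.

```python
def parse_strict(chunk):
    """Toy parser: any invalid line rejects the whole descriptor."""
    fields = {}
    for line in chunk.splitlines():
        keyword, _, rest = line.partition(" ")
        if keyword not in ("r", "a") or not rest:
            raise ValueError("invalid line: %r" % line)
        fields.setdefault(keyword, []).append(rest)
    if "r" not in fields:
        raise ValueError("missing r line")
    return fields


def parse_all(chunks, parse_one=parse_strict):
    """Parse every descriptor; collect failures instead of stopping at them."""
    good, bad = [], []
    for index, chunk in enumerate(chunks):
        try:
            good.append(parse_one(chunk))
        except ValueError as error:
            # A faulty descriptor must not affect the valid ones after it.
            bad.append((index, str(error)))
    return good, bad


good, bad = parse_all(["r bridge1\na 1.2.3.4:443", "r\na x", "r bridge3"])
```

Returning the failures alongside the results lets each application decide whether to log, count, or ignore them without changing the parsing strategy itself.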
This is just the start. Let's add more thoughts to this ticket and assemble a guide that we can apply to existing tickets and future code changes.