Commit b2d61020 authored by anarcat

dump pad in git

parent dc989cfd
************************************************************************************
************************ TOR HACKWEEK MARCH 2021 **********************************
************************************************************************************
PROJECT: Prometheus alerts for anti-censorship metrics
This pad: https://pad.riseup.net/p/2021-hackweek-prometheus-alerts
BBB Room to meet: https://tor.meet.coop/ana-is6-lrj-q8k (for https://pad.riseup.net/p/2021-hackweek-metrics-viz see https://tor.meet.coop/gab-vwe-8wh-un1)
Summary: BridgeDB already exports Prometheus metrics, and we could implement the same for Snowflake. It would be great to get alerts when usage changes, to notify us of possible censorship events. Somewhat related, it would also be nice to get alerts when default bridge usage, or the number of directly connecting Tor users from a given region, drops off suddenly.
Skills Needed: Maybe Go (for changes to snowflake), maybe Python for other services, some sysadmin experience to figure out how to do the alerts, metrics pipeline experience.
Team:
cecylia (cohosh) (UTC -4)
tara(?) (UTC +1)
agix (UTC +1)
anarcat (UTC -4)
Main objectives:
1) Documentation!
- document what prometheus2 is doing https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/prometheus#monitored-services
- document all of our anti-censorship alerts in one place (where?) https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/prometheus
2) Expand our prometheus metrics for anti-censorship services
- export existing snowflake metrics for prometheus - see https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/prometheus#adding-metrics-for-users
- add disk space/RAM/CPU monitoring for anti-censorship services (isn't this already covered? I'm not sure. Not for snowflake, which is why documentation is the first step, I guess. Just install a node exporter and tell me the endpoint.)
- expand the metrics tor exports for prometheus
3) Play around with prometheus alert rules to recognize both outages and trends
- tor exports prometheus data out of the metrics port now!
4) Figure out where to send all of our alerts
- We could send emails to our existing anti-censorship alerts mailing list: https://lists.torproject.org/cgi-bin/mailman/listinfo/anti-censorship-alerts
- Make sure we're also noticing logged errors for our services (we currently only use those for debugging)
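Sketch for objective 3: a rough illustration of the difference between an "outage" check and a "trend" check on a usage metric. This is only a Python toy to pin down the logic; in practice it would presumably be written as Prometheus alerting rules. The function names, the 24-sample window and the 0.5 ratio are all made-up placeholders, not values anyone has agreed on.

```python
from statistics import mean


def is_outage(samples, floor=0):
    """Outage check: the most recent sample is at or below `floor`.

    `floor` is a hypothetical threshold; 0 means "no users at all".
    """
    return bool(samples) and samples[-1] <= floor


def is_sudden_drop(samples, window=24, ratio=0.5):
    """Trend check: the latest sample fell below `ratio` times the trailing mean.

    `samples` is a list of usage counts, oldest first. The last entry is the
    current value; up to `window` entries before it form the baseline.
    `window` and `ratio` are illustrative assumptions.
    """
    if len(samples) < 2:
        return False  # not enough history to form a baseline
    baseline = samples[-(window + 1):-1]
    return samples[-1] < ratio * mean(baseline)


if __name__ == "__main__":
    steady = [100] * 24 + [95]
    dropped = [100] * 24 + [30]
    print(is_sudden_drop(steady), is_sudden_drop(dropped))
```

The outage check only looks at the current value, while the trend check needs history, so the second kind of alert depends on how long the metrics are retained and on choosing a baseline window that does not fire on normal daily variation.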