= Network Health =
We will organize potential network health team work according to priorities. This will involve:

1. Getting a rough understanding of what is in scope and where items overlap with other teams' work
1. Prioritizing the genuine network health tasks
1. Thinking about interactions with other teams that work on related topics (Do we need new channels for that? Do we need dedicated contact persons in the respective teams? ...)
**Material:**

We'll start with arma's mail about the roadmap for a potential network health team: https://lists.torproject.org/pipermail/tor-project/2018-December/002138.html. (Re-)reading that thread with the session scope outlined above in mind is a good idea.

**Facilitator(s):** GeKo + arma

**Audience:** Anyone interested in Tor network health

**Duration:** 1 hour
== Prep ==
No preparation is required to attend this session. (Re-)reading the mail linked in the Material section above is recommended, though.
== Desired outcomes ==
* Having a prioritized list of things a network health team/person would work on
* Having a clear understanding of where the network health work overlaps with other areas and how to avoid duplication of work
== Notes ==
{{{
1. Overview
  - Standards for good relays (community team owns most of it)
    - Documenting community standards about good relays [gus has done some work - community team ownership]
    - Best practices for relay families [nobody currently; phase 2]
    - Detecting and resolving bad relays [bad relay team has been working on that]
  - Anomaly analysis (nobody; Network Health engineer would own)
    - Baselines for performance, usage, load, etc. [bad-relays team]
    - Finding DoS issues [network team]
    - Tracking relay connectivity [nobody currently; phase 2; phase 1 - examine current data]
    - Looking for resource limit issues [nobody currently; phase 1 - examine current data]
    - Looking for default bridges hitting resource limits [nobody currently]
    - Exception reports (maybe relay operators should get them separately) [nobody currently]
    - Bandwidth authorities' behavior [nobody currently]
  - Ensure current usage stats are accurate (nobody owns; network health should)
    - Tracking users, performance, relays by various metrics [metrics team; do we have all the metrics we need/want?]
    - Counting users [with network team and metrics team; but currently nobody actively]
    - Monitoring bridge growth and usage [with censorship team, metrics team]
  - Relay advocacy; creating a community (nobody owns; community team should)
    - Maintain docs for setting up and running relays and bridges [gus - community team (check whether it gets maintained)]
    - Grow a cohesive community of relay operators so they have peers [nobody currently]
    - Keeping relays on the right Tor version [nobody currently]
    - Gamification/rewards/incentives for relays [nobody currently; could be a fellow or intern]
    - Strengthen non-profits that run relays [nobody currently]
    - Communicate with companies that make heavy use of the Tor network [nobody currently]
  - Maintain components of the network (nobody; network health/network team should)
    - Maintain directory authority relationships [okay]
    - Keep bandwidth authorities working (including setting the right balance between speed and location diversity) [juga, pastly, network team]
    - Have enough Tor Browser default bridges, and keep them running smoothly [with censorship team; s30]
    - Update the fallbackdirs list [teor, gus]

2. What should we do with funding
  - Ticket brainstorming captured on post-its
  - Categories:
    - Active scanning (e.g. DNS errors, Exitmap, OnionPerf)
    - Metrics scripts/apps/APIs for querying/custom reports
    - Analysis of reports (make analysis of data we have easier)
    - Relay reporting
      - Resource use (CPU, memory pressure, ...)
      - Protocol violations
      - Heartbeat log (Getting data out of the relay; what is acceptable here?)
    - Bugfixes to user counting, other metrics
    - Metrics for estimating how relay campaigns go (diff of capacity, numbers...)

3. Who owns what; who should own what?
  - Documented above
}}}