Network Health Team
About us
Welcome to the Network Health page! Our team, along with many dedicated individuals in the Tor community, is committed to ensuring the well-being of the Tor network, its nodes, and the community of operators. Our objectives are network security, functionality, and reliability for all users, and we focus our efforts on five areas within these groups.
Security
Security involves protecting the network from malicious activities, ensuring the integrity and confidentiality of data transmitted across the network.
This area involves setting standards and actively removing threats from the network.
- Track community standards about what makes a good relay
- Publish up-to-date expectations for relay operators
- Set best practices for how to set relay families
- Detect and resolve bad relays
- Exitmap, sybil detection, hsdir traps, etc.
This area focuses on identifying and mitigating anomalies that can pose security risks. It is essential for maintaining a secure operating environment within the Tor network.
- Anomaly analysis / network health engineer [with network team]
- Establish baselines of expected network behavior
- Look for and resolve denial of service issues
- Track connectivity issues between relays
- Look for relays hitting resource limits
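As an illustration only of what "establishing a baseline" and flagging deviations from it can look like, the sketch below compares each new value of a daily metric against the median of a trailing window and scores the deviation using the median absolute deviation (MAD). This is not the team's tooling: the function name `flag_anomalies`, the window size, the threshold, and the sample numbers are all hypothetical.

```python
from statistics import median


def flag_anomalies(values, window=7, threshold=3.5):
    """Return indices whose value deviates strongly from a trailing baseline.

    Hypothetical helper: the baseline is the median of the previous
    `window` values, and the score is a MAD-based robust z-score.
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        base = median(history)
        mad = median(abs(v - base) for v in history)
        if mad == 0:
            # Perfectly flat history: treat any change as a deviation.
            if values[i] != base:
                flagged.append(i)
            continue
        # 0.6745 rescales the MAD so the score is comparable to a z-score.
        score = 0.6745 * abs(values[i] - base) / mad
        if score > threshold:
            flagged.append(i)
    return flagged


# Made-up daily counts with one obvious drop at index 8.
daily_counts = [100, 102, 98, 101, 99, 103, 100, 101, 40, 100]
print(flag_anomalies(daily_counts))  # -> [8]
```

A median-based baseline is used here because a single outlier in the history window barely moves it, so one bad day does not mask the next. A real deployment would need to account for seasonality and for legitimate growth.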
Functionality
Functionality ensures the network performs within healthy baselines, allowing access and usability for its users.
This area supports the functionality of the network by ensuring that growth and usage are monitored and optimized for performance, helping to manage the network efficiently based on actual usage patterns.
- Make sure usage/growth stats are collected and accurate
- Track network performance, relay diversity by various metrics
- Count users [with network team]
- Monitor bridge growth and usage [with censorship team]
These efforts enhance the functionality of the network by providing necessary support and resources to those operating it, ensuring that it runs smoothly and effectively.
- Relay advocacy [with community team]
- Maintain docs for setting up and running relays and bridges
- Grow a cohesive community of relay operators so they have peers
- Keep relays on the right tor versions
- Relaunch a gamification / badge system for lauding good relay progress
- Strengthen relationships with non-profit orgs that run relays
- Help companies that want to offset their Tor network load
Reliability
Reliability refers to the network's ability to consistently perform its intended function under normal and stress conditions, maintaining service continuity.
- Maintain the components of the network
- Maintain directory authority relationships
- Keep bandwidth authorities working (including setting the right balance between speed and location diversity)
- Have enough Tor Browser default bridges, and keep them running smoothly [with censorship team]
- Update the fallbackdirs list
This focus area is critical for reliability as it involves maintaining core network infrastructure, which ensures the network remains operational and robust against various challenges and demands.
Communication Channels
Join #tor-dev on IRC; somebody from the team may be around, or will appear later and get back to you.
We use IRC for our meetings, which take place on the OFTC network.
Team meeting | Day and time | Location |
---|---|---|
Primary team meeting | Monday 12:00 UTC | #tor-meeting |
The Network Health team's asynchronous communication channels are the network-health@, tor-relays@, and tor-dev@ mailing lists, depending on which is most applicable. These lists are public in the sense that anyone can subscribe, send emails, and read archives. Feel free to subscribe and just listen if you want, and feel free to post if you have a question that you think is on topic.
For metrics-related topics, our asynchronous medium of communication is the network-health@ mailing list. This list is public in the sense that anyone can subscribe and read archives. However, it is moderated on the first post: your first post will be reviewed to make sure it is on topic and not spam, and all further posts will go directly to the list. Feel free to subscribe and just listen if you want, and feel free to post if you have a question that you think is on topic.
General Priorities
- Detect and resolve bad relays
- Exitmap, sybil detection, hsdir traps, etc.
- Anomaly analysis / network health engineer [with network team]
- Establish baselines of expected network behavior
- Monitor network disruption or problems
- Relay advocacy [with community team]
- Strengthen relationships with non-profit orgs that run relays
- Maintain docs for setting up and running relays and bridges
- Make sure usage/growth stats are collected and accurate
- Track network performance, relay diversity by various metrics
- Maintain the components of the network to keep it healthy
- Keep bandwidth authorities working (including setting the right balance between speed and location diversity)
PRIORITIES FOR 2025 [Q1-Q2]
Community Advocacy and Support
- Relay Community Engagement
  - Establish community-driven behavioral agreements and consequences for relay operators. (P112-O2)
- Maintain the components of the network
  - Work with directory authority operators to plan transition from C to Arti. (P141)
Network Health Engineering and Anomaly Analysis
- Relay Attacks Mitigation
  - Evaluate and implement solutions to relay attacks. (P112-O3 (minus O3.5))
  - Bandwidth inflation on the Tor network. (P112-O3.5)
- Connectivity Tracking
  - Track relay-to-relay connectivity. (GSOC)
- Onbasca Refactoring
  - Refactor and redesign onbasca.
- Anomaly Analysis
  - Conduct surprise anomaly analysis on the network as needed.
- SBWS maintenance and development
  - Maintain and develop sbws.
- Measure Arti performance
  - Ensure Arti collects performance metrics and delivers them to the metrics pipeline. (P141)
  - Create a test network with authorities, middle nodes, and exit nodes. (P141)
  - Discuss with the Arti team a possible list of metrics we want to have from Arti-based relays. (P141)
Detection and Resolution of Bad Relays
- Tooling Improvements
  - Improve tools for detecting bad relays.
- Detection and Resolution
  - Run bad-relay detection scripts regularly.
Metrics pipeline development and improvement
- Metrics Services Infrastructure
  - Deploy Network Status API for metrics services.
  - Improve monitoring and alerting for metrics service.
- Metrics Website
  - Rebuild the Tor metrics website.
Support for Researchers
- Network Experiments
  - Provide support for researchers conducting network experiments.
- Tor Safetyboard work
  - Evaluate research proposals as needed
- Past priorities and roadmaps
- Priorities discussion pad
Active Projects
- Project 112 - Combating Malicious Relays
- Project 141 - Arti Relays
Network Health
We are concerned with the well-being of the Tor network and its individual relays. There are currently five main areas of work involved in that effort, guided by processes and policies which we have developed over time:
- Bad relay work
- Relay health work
- Tor network health work
- Network data analysis work
- Network experiments
Many tools have been developed over time to support work in those areas, both by Tor Project staff and by external contributors and volunteers. Some of those tools are currently in use, while others are obsolete or unused in our day-to-day work. For an overview see:
Metrics
We provide a set of monitoring and observability software tools and services for the public Tor network.
General Guides
Services and Tools
We list some long-term projects maintained under the metrics umbrella. These services are designed and developed with the flexibility to be mirrored, should the need arise.
- CollecTor is the friendly data-collecting service
- ExoneraTor helps to find out whether an IP address was used as a Tor relay
- Metrics Website is the primary place to learn interesting facts about the Tor network
- metrics-lib is a Java library that fetches and parses Tor descriptors
- Onionoo is a web-based protocol to learn about currently running Tor relays and bridges
- Exit Scanner/TorDNSEL/Tor Check
- OnionPerf
- Metrics Timeline
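To give a feel for how data from a service such as Onionoo might be consumed, here is a minimal sketch that summarizes an already-parsed, Onionoo-style details document. It is an illustration under assumptions: the field names (`relays`, `nickname`, `running`, `observed_bandwidth`, `recommended_version`) reflect our understanding of the Onionoo details document and should be checked against the Onionoo protocol specification. The function `summarize_relays` and the sample data are hypothetical.

```python
def summarize_relays(details):
    """Summarize running relays from a parsed Onionoo-style details document.

    Hypothetical helper; `details` is a dict as produced by json.loads().
    """
    relays = details.get("relays", [])
    running = [r for r in relays if r.get("running")]
    # Relays explicitly reported as running a non-recommended tor version.
    not_recommended = sorted(
        r.get("nickname", "?")
        for r in running
        if r.get("recommended_version") is False
    )
    return {
        "running": len(running),
        "not_recommended": not_recommended,
        "observed_bandwidth": sum(r.get("observed_bandwidth", 0) for r in running),
    }


# Made-up sample data, not real relays.
sample = {
    "relays": [
        {"nickname": "alpha", "running": True,
         "observed_bandwidth": 1000, "recommended_version": True},
        {"nickname": "beta", "running": True,
         "observed_bandwidth": 500, "recommended_version": False},
        {"nickname": "gamma", "running": False,
         "observed_bandwidth": 2000, "recommended_version": False},
    ]
}
print(summarize_relays(sample))
```

The function takes a parsed document rather than fetching one, so it can be tried offline. In practice the input would come from the Onionoo service itself.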
Inventory
Metrics products are hosted on TPA-maintained hardware, except for some OnionPerf installations, which are administered via Ansible.
We maintain a list of metrics VMs and the services they host.
Services Ops
We maintain a list of services and their operational documentation.
How to get involved
There are several areas where you could get involved:
- Contribute to one of our projects
- Get involved with network data analysis
- Help us redesign our metrics data pipeline
- Network Status API (NSA for short ;)
- descriptorParser
- TagTor
- tor_fusion (parse network documents in rust)