Network Health Team

About us

Welcome to the Network Health page! Our team, along with many dedicated individuals in the Tor community, is committed to ensuring the well-being of the Tor network, its nodes, and the community of operators. Our objectives are network security, functionality, and reliability for all users, and we focus our efforts on five areas within these three groups.

Security

Security involves protecting the network from malicious activities, ensuring the integrity and confidentiality of data transmitted across the network.

This area involves setting standards and actively removing threats from the network.

  1. Track community standards about what makes a good relay
    • Publish up-to-date expectations for relay operators
    • Set best practices for how to set relay families
    • Detect and resolve bad relays
      • Exitmap, sybil detection, hsdir traps, etc.

This area focuses on identifying and mitigating anomalies that can pose security risks. It is essential for maintaining a secure operating environment within the Tor network.

  1. Anomaly analysis / network health engineer [with network team]
    • Establish baselines of expected network behavior
    • Look for and resolve denial of service issues
    • Track connectivity issues between relays
    • Look for relays hitting resource limits
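
As a rough illustration of what "establishing a baseline" and flagging deviations from it can mean in practice, here is a minimal sketch. It is not the team's actual tooling: the function names, the sample values, and the threshold are all assumptions chosen for the example. It uses the median and the median absolute deviation (MAD), which are less sensitive to outliers than the mean and standard deviation.

```python
from statistics import median


def mad_baseline(samples):
    """Return (median, median absolute deviation) for numeric samples."""
    center = median(samples)
    mad = median([abs(x - center) for x in samples])
    return center, mad


def find_anomalies(history, recent, threshold=5.0):
    """Return (index, value) pairs from `recent` that lie more than
    `threshold` MADs away from the median of `history`.

    `threshold` is an arbitrary illustrative default, not a tuned value.
    """
    center, mad = mad_baseline(history)
    # Avoid division by zero when the history is perfectly flat.
    scale = mad if mad > 0 else 1.0
    return [
        (i, x)
        for i, x in enumerate(recent)
        if abs(x - center) / scale > threshold
    ]


# Hypothetical daily measurements (for example, a connection count).
history = [100, 102, 98, 101, 99, 103, 97]
recent = [101, 150, 99]
print(find_anomalies(history, recent))
```

With these made-up numbers only the middle value of `recent` is flagged. A real deployment would need per-metric baselines and would have to account for daily and weekly seasonality.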

Functionality

Functionality ensures the network performs within healthy baselines, allowing access and usability for its users.

This area supports the functionality of the network by ensuring that growth and usage are monitored and optimized for performance, helping to manage the network efficiently based on actual usage patterns.

  1. Make sure usage/growth stats are collected and accurate
    • Track network performance, relay diversity by various metrics
    • Count users [with network team]
    • Monitor bridge growth and usage [with censorship team]
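
One possible way to express "relay diversity" as a single number is the Shannon entropy of bandwidth shares across some grouping, such as autonomous system or country. The sketch below is only an illustration under stated assumptions: the record fields (`as`, `bandwidth`) and the function name are hypothetical, and entropy is just one of several metrics that could be used here.

```python
import math
from collections import defaultdict


def diversity_entropy(relays, key):
    """Shannon entropy (in bits) of bandwidth shares grouped by `key`.

    `relays` is a list of dicts with a numeric "bandwidth" field and a
    grouping field named by `key` (both field names are assumptions).
    Higher values mean bandwidth is spread more evenly across groups.
    """
    totals = defaultdict(float)
    for relay in relays:
        totals[relay[key]] += relay["bandwidth"]

    grand_total = sum(totals.values())
    if grand_total <= 0:
        return 0.0

    entropy = 0.0
    for value in totals.values():
        if value > 0:
            share = value / grand_total
            entropy -= share * math.log2(share)
    return entropy


# Hypothetical relay records, not real network data.
relays = [
    {"as": "AS-A", "bandwidth": 50},
    {"as": "AS-B", "bandwidth": 30},
    {"as": "AS-B", "bandwidth": 20},
]
print(diversity_entropy(relays, "as"))
```

Here the two groups carry equal bandwidth, so the entropy is 1 bit. If all bandwidth sat in a single group, the result would be 0.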

These efforts enhance the functionality of the network by providing necessary support and resources to those operating it, ensuring that it runs smoothly and effectively.

  1. Relay advocacy [with community team]
    • Maintain docs for setting up and running relays and bridges
    • Grow a cohesive community of relay operators so they have peers
      • Keep relays on the right tor versions
    • Relaunch a gamification / badge system for lauding good relay progress
    • Strengthen relationships with non-profit orgs that run relays
    • Help companies that want to offset their tor network load

Reliability

Reliability refers to the network's ability to consistently perform its intended function under normal and stress conditions, maintaining service continuity.

  1. Maintain the components of the network
    • Maintain directory authority relationships
    • Keep bandwidth authorities working (including setting the right balance between speed and location diversity)
    • Have enough tor browser default bridges, and keep them running smoothly [with censorship team]
    • Update the fallbackdirs list

This focus area is critical for reliability as it involves maintaining core network infrastructure, which ensures the network remains operational and robust against various challenges and demands.

Communication Channels

Go to #tor-dev: somebody from the team may be around, or may appear later and get back to you.

We hold our meetings on IRC, on the OFTC network.

Team meeting          Time (UTC)        Location
Primary team meeting  Monday 12:00 UTC  #tor-meeting

The Network Health team's asynchronous communication channels are the network-health@, tor-relays@, and tor-dev@ mailing lists, depending on which is most applicable. These lists are public in the sense that anyone can subscribe, send emails, and read archives. Feel free to subscribe and just listen if you want, and feel free to post if you have a question that you think is on topic.

For metrics-related topics, our asynchronous communication channel is the network-health@ mailing list. This list is public in the sense that anyone can subscribe and read archives. However, the first post is moderated: your first post will be reviewed to make sure it is on topic and not spam, and all further posts will go directly to the list. Feel free to subscribe and just listen if you want, and feel free to post if you have a question that you think is on topic.

General Priorities

  1. Detect and resolve bad relays
    • Exitmap, sybil detection, hsdir traps, etc.
  2. Anomaly analysis / network health engineer [with network team]
    • Establish baselines of expected network behavior
    • Monitor network disruption or problems
  3. Relay advocacy [with community team]
    • Strengthen relationships with non-profit orgs that run relays
    • Maintain docs for setting up and running relays and bridges
  4. Make sure usage/growth stats are collected and accurate
    • Track network performance, relay diversity by various metrics
  5. Maintain the components of the network to keep it healthy
    • Keep bandwidth authorities working (including setting the right balance between speed and location diversity)

Priorities for 2024 [Q3-Q4]

Community Advocacy and Support

  1. Relay Community Engagement
    • Establish community-driven behavioral agreements and consequences for relay operators. (S112-O2)
    • Support the OTF fellow on Relay Operators Community Health Research. (S112-O2)
  2. Maintain the components of the network
    • Work with directory authority operators to plan transition from C to Arti

Network Health Engineering and Anomaly Analysis

  1. Relay Attacks Mitigation
    • Evaluate and implement solutions to relay attacks. (S112-O3, excluding O3.5)
    • Mitigate bandwidth inflation on the Tor network. (S112-O3.5)
  2. Connectivity Tracking
    • Track relay-to-relay connectivity.
  3. Onbasca Refactoring
    • Refactor and redesign onbasca.
  4. Anomaly Analysis
    • Analyze unexpected anomalies on the network as needed.
  5. SBWS maintenance and development
    • Maintain and develop sbws
  6. Measure Arti performance
    • Create a test network with authorities, middle nodes, and exit nodes.

Detection and Resolution of Bad Relays

  1. Tooling Improvements
    • Improve tools for detecting bad relays.
  2. Detection and Resolution
    • Run bad-relay detection scripts regularly.

Metrics pipeline development and improvement

  1. Metrics Services Infrastructure
    • Deploy a data store and API for metrics services. (S112-O1)
    • Improve monitoring and alerting for metrics service.
  2. Metrics Website
    • Rebuild the Tor metrics website.

Support for Researchers

  1. Network Experiments
    • Provide support for researchers conducting network experiments.
  2. Tor Safetyboard work
    • Evaluate research proposals as needed

Past priorities and roadmaps

Active Projects

Network Health

We are concerned with the well-being of the Tor network and its individual relays. There are currently five main areas of work involved in that effort, guided by processes and policies which we have developed over time:

To work in those areas, many tools have been developed over time by Tor Project staff, external contributors, and volunteers. Some of those tools are currently in use, while others are obsolete or unused in our day-to-day work. For an overview see:

Metrics

We provide a set of monitoring and observability software tools and services for the public Tor network.

General Guides

Services and Tools

We list some long-term projects maintained under the metrics umbrella. These services are designed and developed with the flexibility to be mirrored, should the need arise.

Inventory

Metrics products are hosted on TPA-maintained hardware, except for some onionperf installations, which are administered via Ansible.

We maintain a list of metrics VMs and the services they host.

Services Ops

We maintain a list of services and their operational documentation.

How to get involved

There are several areas where you could get involved:

Resources

Developer meeting notes

Other