Issue #28276, Closed (moved)
Issue created Nov 01, 2018 by George@gman999

towards an Exception Reports framework

Currently, a number of pulse-checks are conducted on the network and its components. IRL's tickets legacy/trac#24070, legacy/trac#24071 and legacy/trac#24073 raise a few more.

However, I think we have to step back and start looking at an organized framework for this.

Exception reports are basically overviews of significant changes in some routine or activity. We determine some baseline, say, the consensus weight reported by each bandwidth authority, check for drastic changes against it, maybe daily or twice a day, and notify the relevant parties when one shows up.
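As a rough illustration of the idea, here is a minimal Python sketch that compares the latest value against the previous one for each bandwidth authority and flags drastic changes. The function name, the 25% threshold, the bwauth names and the numbers are all invented for illustration; nothing here reflects an existing tool or real measurements.

```python
def drastic_changes(previous, current, threshold=0.25):
    """Return {name: relative_change} for entries whose value moved by more
    than `threshold`, expressed as a fraction of the previous value."""
    flagged = {}
    for name, old in previous.items():
        new = current.get(name)
        if new is None or old == 0:
            # Missing value or zero baseline: no relative change to compute.
            continue
        change = (new - old) / old
        if abs(change) > threshold:
            flagged[name] = change
    return flagged


# Hypothetical consensus-weight totals per bwauth (invented numbers).
yesterday = {"bwauth_a": 1000, "bwauth_b": 2000, "bwauth_c": 1500}
today = {"bwauth_a": 1050, "bwauth_b": 1200, "bwauth_c": 1500}

report = drastic_changes(yesterday, today)
print(report)
```

With these made-up inputs only `bwauth_b` is flagged, since it dropped by 40% while the others moved by 5% or less. A real check would probably want a longer history than a single previous sample.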

The basics would be this:

  • we determine the areas to address, such as public relays, exits-only, dirauths, bwauths, bridges, censorship, guards, etc.

  • we determine the metrics we need to watch, e.g., changes in consensus weight (CW), advertised bandwidth, versions, TTL... and establish a baseline for each, with a normal range of maybe one standard deviation or so.

  • then we figure out who needs to know when something is outside the baseline range.

  • we could also develop some automated or human-driven 'next steps', e.g., call bwauth X and tell them to ping their upstream, file a Trac ticket, or email some alias@ of people.

  • another direction, more interesting yet vital, would be to incorporate the OONI data, which would give a much more detailed baseline of network health.
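The steps above could be sketched roughly as follows: keep a short history per metric, treat "mean plus or minus one standard deviation" as the baseline range, and route anything outside that range to whoever is responsible for the area. This is only a sketch under assumptions: the contact addresses, area and metric names, and sample values are hypothetical, and one sample standard deviation is just the starting point suggested above, not a tuned threshold.

```python
from statistics import mean, stdev

# Hypothetical routing table: which contact hears about which area.
CONTACTS = {
    "bwauths": "bwauth-ops@example.org",
    "bridges": "bridge-ops@example.org",
}


def outside_baseline(history, latest, max_sigma=1.0):
    """True if `latest` lies more than `max_sigma` sample standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate a spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) > max_sigma * sigma


def exception_report(observations, max_sigma=1.0):
    """observations: iterable of (area, metric, history, latest) tuples.
    Returns (contact, message) pairs for metrics outside their baseline."""
    notices = []
    for area, metric, history, latest in observations:
        if outside_baseline(history, latest, max_sigma):
            contact = CONTACTS.get(area, "unassigned")
            notices.append((contact, f"{area}/{metric}: {latest} outside baseline"))
    return notices


# Invented sample data for two areas.
observations = [
    ("bwauths", "consensus_weight", [100, 102, 98, 101, 99], 140),
    ("bridges", "advertised_bandwidth", [50, 52, 48, 51, 49], 50),
]
notices = exception_report(observations)
for contact, message in notices:
    print(contact, message)
```

Here only the bwauths entry produces a notice, because 140 is far from a history centred on 100, while the bridges value sits right on its mean. The 'next steps' (filing a ticket, emailing an alias) would hang off the returned pairs.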
