


Opened Feb 09, 2011 by Karsten Loesing (@karsten)

Change aggregation from daily averages to rolling 24-hour averages for censorship detector

We're currently aggregating most stats by calculating daily means. The main reason for doing so is the smoothing effect that makes it easier to understand trends. Also, daily aggregates make it easier to keep our materialized views up-to-date. In general, daily aggregates are sufficient when we're interested in long-term developments of 1 month or more.

Daily averages have at least two shortcomings. First, we need to wait until at least half a day is over, and preferably the whole day, before displaying data for that day. Second, one data point per day is not enough when looking at short time intervals of, say, one to two weeks.

Instead of daily averages, we could use rolling 24-hour averages. Every data point would be the average (mean) of the 24 hours ending at that data point. The 24-hour rolling average removes intra-day patterns and gives us a smooth curve, too. I attached an example of running relays with the raw data and the 24-hour rolling average. Compare this to our current graph.
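To make the difference concrete, here is a minimal sketch in Python (not the R code behind the actual graphs). It assumes hourly samples sorted by time; the function names and the made-up relay counts are hypothetical and only illustrate the two aggregations. Each rolling value is the mean of all samples in the 24 hours ending at that data point, so the first 23 points are based on a partial window.

```python
from collections import deque
from datetime import datetime, timedelta


def rolling_24h_mean(samples, window=timedelta(hours=24)):
    """Yield (timestamp, mean) for each sample.

    The mean covers all samples in the half-open interval
    (timestamp - window, timestamp]. Samples must be sorted by time.
    """
    recent = deque()  # samples currently inside the window
    total = 0         # running sum of the values in `recent`
    for ts, value in samples:
        recent.append((ts, value))
        total += value
        # Drop samples that are a full window (or more) older than ts.
        while recent[0][0] <= ts - window:
            _, old_value = recent.popleft()
            total -= old_value
        yield ts, total / len(recent)


def daily_mean(samples):
    """Return {date: mean of all samples on that calendar day}."""
    by_day = {}
    for ts, value in samples:
        by_day.setdefault(ts.date(), []).append(value)
    return {day: sum(vals) / len(vals) for day, vals in by_day.items()}


# Made-up data: two days of hourly relay counts with an artificial
# intra-day pattern (higher in the first half of each day).
start = datetime(2011, 2, 8, 0, 0)
samples = [
    (start + timedelta(hours=i), 2000 + (100 if i % 24 < 12 else -100))
    for i in range(48)
]

rolled = list(rolling_24h_mean(samples))
daily = daily_mean(samples)

if __name__ == "__main__":
    print("daily means:", daily)
    for ts, mean in rolled[22:26]:
        print(ts.isoformat(), mean)
```

The daily aggregation yields two data points for these two days, while the rolling aggregation yields one per hour. Once the window is full, the artificial intra-day pattern cancels out in both.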

If we decide we want to try rolling averages, I'll have to fight R some more. We should start with the relay flags graph and add other graphs based on the network status consensus. Graphs based on the bandwidth histories in extra-info descriptors, including our user number estimates, are going to be more difficult.

Reference: tpo/metrics/website#2519