Closed (moved)
Opened Aug 08, 2018 by Karsten Loesing (@karsten)

Reconfigure collector2.tp.o to do less

We have two CollecTor instances: collector.tp.o on colchicifolium and collector2.tp.o on corsicum. Reasons for having two instances instead of one are related to failure tolerance:

  1. Whenever collector.tp.o fails, it cannot fetch consensuses and votes from the directory authorities, and those are only available there for an hour. If collector.tp.o fails for a couple of hours, it can later fetch the missing descriptors from collector2.tp.o.
  2. While collector.tp.o is down, Onionoo can fetch relay descriptors from collector2.tp.o and continue to provide recent data.

However, I think we went a bit too far when configuring collector2.tp.o to also sync descriptors from collector.tp.o. It does that with bridge descriptors and sanitized web logs.

Here's how the two instances are currently configured:

collector.tp.o/colchicifolium: 
RelaySources = Cache, Remote, Sync, Local
BridgeSources = Local
ExitlistSources = Remote
OnionPerfSources = Remote
WebstatsSources = Local

collector2.tp.o/corsicum:
RelaySources = Remote
BridgeSources = Sync
ExitlistSources = Remote
OnionPerfSources = Remote
WebstatsSources = Sync

The entries in question are the two "Sync" entries in the collector2.tp.o configuration, BridgeSources and WebstatsSources. I think we mainly put them in so that the respective sync code gets executed, too, and we would notice any issues with it.

I now believe that these entries are not helpful and potentially harmful, for two reasons:

  1. The sync mode of the bridgedescs module does not clean up the recent/ directory after placing descriptors there. The local mode would do that, but the sync mode does not. The effect is that bridge descriptors in recent/ pile up and fill up disk space. Even worse, Onionoo fetches everything contained in that directory, so that bootstrapping a new Onionoo instance downloads vast amounts of data these days.
  2. I don't yet know what happened in #27055 (moved), but it seems that simplifying the configuration of collector2.tp.o should make that issue at least less likely to happen again.
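
To illustrate the missing cleanup step from the first point, here is a minimal Python sketch of what "cleaning up recent/" could look like: delete files whose modification time is older than some cutoff. This is not CollecTor's actual code (CollecTor is written in Java); the function name `remove_stale_files`, the use of file modification times, and the three-day value in the usage note are assumptions made only for illustration.

```python
import time
from pathlib import Path
from typing import List, Optional


def remove_stale_files(directory: Path, max_age_seconds: float,
                       now: Optional[float] = None) -> List[Path]:
    """Delete regular files below `directory` that are older than the cutoff.

    Hypothetical helper: age is judged by file modification time, which is
    an assumption and not necessarily how CollecTor decides what to remove.
    Returns the list of removed paths, sorted by path.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    removed = []
    for path in sorted(directory.rglob("*")):
        # Only touch regular files; directories are left in place.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

A call such as `remove_stale_files(Path("recent/bridge-descriptors"), 3 * 24 * 60 * 60)` would then drop everything older than three days. Both the path and the retention period are placeholders here. Without a step of this kind, a directory that only ever receives synced files keeps growing, which is the pile-up described above.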

I could imagine reconfiguring collector2.tp.o to only perform the following tasks:

collector2.tp.o/corsicum:
RelaySources = Remote
ExitlistSources = Remote

The effect would be that we'd still keep our failure tolerance properties and nothing more.
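
As a quick sanity check on the proposal, the following Python sketch compares the current and proposed collector2.tp.o listings from this ticket and reports which source settings would go away. The helper `parse_sources` and the variable names are made up for this example, and the parser only handles the simple `Key = A, B` lines shown above, not the full CollecTor properties format.

```python
def parse_sources(text):
    """Parse 'Key = A, B' lines into a dict mapping key to a list of values."""
    config = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition("=")
        config[key.strip()] = [v.strip() for v in value.split(",") if v.strip()]
    return config


# Current collector2.tp.o listing, copied from this ticket.
CURRENT_COLLECTOR2 = """
RelaySources = Remote
BridgeSources = Sync
ExitlistSources = Remote
OnionPerfSources = Remote
WebstatsSources = Sync
"""

# Proposed collector2.tp.o listing, copied from this ticket.
PROPOSED_COLLECTOR2 = """
RelaySources = Remote
ExitlistSources = Remote
"""

current = parse_sources(CURRENT_COLLECTOR2)
proposed = parse_sources(PROPOSED_COLLECTOR2)

# Settings present today that the proposed listing no longer contains.
dropped = {key: values for key, values in current.items() if key not in proposed}
# Settings that survive with identical values.
unchanged = {key: values for key, values in proposed.items()
             if current.get(key) == values}

print("dropped:", dropped)
print("unchanged:", unchanged)
```

The comparison shows that both "Sync" entries (BridgeSources and WebstatsSources) disappear, and that RelaySources, which the failure tolerance arguments above rely on, stays as it is. It also shows that OnionPerfSources is absent from the proposed listing.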

Does that make sense? Did I miss anything important here?

Reference: legacy/trac#27076