Outline of CollecTor Descriptor Distribution
The sync-process will be available for the modules relaydescs, bridgedescs, exitlists, and torperf. The additional functionality should be generalized as far as possible, and module-dependent functionality should be part of the respective module's code.
Configuration
- General settings:
Add the properties SyncRelayDescriptors, SyncBridgeDescriptors, SyncExitLists, and SyncTorperfFiles to the respective properties sections. These properties have the enum type SyncType with the following values: Sync, NoSync, and SyncOnly. The property SyncFolder contains the top path for storing the downloaded descriptors.
- Choice of sync-sources:
The properties SyncSourcesRelayDescriptors, SyncSourcesBridgeDescriptors, and SyncSourcesExitLists are added to the respective properties sections. Each contains an array of strings specifying a source name and source URL for each CollecTor instance to retrieve descriptors from.
- Choice of descriptors:
The entire substructure of 'recent' will be fetched, i.e. recent/exit-lists/* for exitlists, recent/relay-descriptors/**/* for relaydescs, and recent/bridge-descriptors/**/* for bridgedescs.
- Backup of replaced local files:
There won't be a backup of replaced local files.
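The settings above could be read roughly as in the following sketch. The property names and the SyncType values come from this outline; the class names SyncSource and SyncConfig, the "name=url" encoding of a source entry, the comma separator, the https scheme, and the fallback to NoSync for a missing key are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Values named in the outline for the Sync* properties. */
enum SyncType { Sync, NoSync, SyncOnly }

/** Hypothetical holder for one sync source: a source name and the instance URL. */
final class SyncSource {
  final String name;
  final String url;

  SyncSource(String name, String url) {
    this.name = name;
    this.url = url;
  }
}

/** Hypothetical helper that reads the sync-related properties. */
final class SyncConfig {

  /** Reads a Sync* property; a missing key is treated as NoSync (assumption). */
  static SyncType syncType(Properties props, String key) {
    return SyncType.valueOf(props.getProperty(key, "NoSync").trim());
  }

  /** Parses "name=url, name=url" (assumed encoding) into source entries. */
  static List<SyncSource> sources(Properties props, String key) {
    List<SyncSource> result = new ArrayList<>();
    String raw = props.getProperty(key, "").trim();
    if (raw.isEmpty()) {
      return result;
    }
    for (String entry : raw.split(",")) {
      // Split only at the first '=' so that URLs containing '=' stay intact.
      String[] parts = entry.trim().split("=", 2);
      if (parts.length != 2) {
        throw new IllegalArgumentException("Bad sync source: " + entry);
      }
      result.add(new SyncSource(parts[0].trim(), parts[1].trim()));
    }
    return result;
  }
}
```

With a property such as SyncSourcesExitLists set to "main=https://collector.torproject.org", the sources method would yield one entry named "main". The actual encoding of the string array is left open by this outline.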
Fetching and Merging
If Sync* has the value NoSync, nothing is done. With SyncOnly, the module is not started and fetching from the instances configured in SyncSources* begins immediately. With Sync, the module runs first and syncing begins afterwards.
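The three cases can be sketched as a small dispatch on the SyncType value. The class name SyncDispatcher and the use of Runnable for the module run and the sync run are hypothetical. The NoSync branch follows the wording of this outline literally, which does not say whether the module itself still runs in that case.

```java
/** Values named in the outline for the Sync* properties. */
enum SyncType { Sync, NoSync, SyncOnly }

/** Hypothetical dispatcher; module and sync stand in for the real tasks. */
final class SyncDispatcher {

  static void run(SyncType type, Runnable module, Runnable sync) {
    switch (type) {
      case NoSync:
        // As worded in the outline: nothing is done.
        break;
      case SyncOnly:
        // Skip the module and fetch from the configured sources right away.
        sync.run();
        break;
      case Sync:
        // Run the module first, then sync.
        module.run();
        sync.run();
        break;
    }
  }
}
```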
Processing
a. Retrieve descriptors from the CollecTor instances defined in SyncSources*. These descriptors are stored in SyncFolder under the host part of the instance's URL, e.g. my-sync-folder/collector.torproject.org/recent/exit-lists for exitlists from the main instance.
b. Following retrieval the fetched descriptors are examined:
i. Discard descriptor files that do not contain what they should (see comment:11) and log a warning with sync-source info and reason (see criteria).
ii. Copy valid descriptors (see criteria) without a pre-existing local copy to the local *OutDirectory (cf. collector.properties) and 'recent' structure.
iii. If there is a local copy already, decide which copy to keep (see criteria):
I. The local copy is kept: log a debug message with source and reason.
II. Local and fetched copies are identical: log a debug message with source and reason.
III. Maybe later: the fetched copy should replace the local descriptor. Copy the fetched descriptor to the local *OutDirectory and 'recent'. In all cases, log a debug message with source and reason.
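Steps a and b can be sketched as follows: one method derives the storage folder from the host part of the source URL, and another classifies a fetched file against an optional local copy. The names SyncMerge, MergeDecision, localSyncDir, and decide are hypothetical, and the validity check is passed in as a predicate because the criteria are not fixed here. The "maybe later" replacement case is left out, so a differing local copy is always kept.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.function.Predicate;

/** Hypothetical outcomes of examining one fetched descriptor file. */
enum MergeDecision { DISCARD, COPY, IDENTICAL, KEEP_LOCAL }

final class SyncMerge {

  /** Step a: SyncFolder / host part of the instance URL / sub-path below it. */
  static Path localSyncDir(String syncFolder, String instanceUrl, String subPath) {
    String host = URI.create(instanceUrl).getHost();
    if (host == null) {
      throw new IllegalArgumentException("No host in URL: " + instanceUrl);
    }
    return Paths.get(syncFolder, host, subPath);
  }

  /**
   * Step b: classify a fetched file. The local argument is null when no
   * local copy exists. The caller is expected to log a warning for DISCARD
   * and a debug message for IDENTICAL and KEEP_LOCAL.
   */
  static MergeDecision decide(byte[] fetched, byte[] local,
      Predicate<byte[]> isValid) {
    if (!isValid.test(fetched)) {
      return MergeDecision.DISCARD;
    }
    if (local == null) {
      return MergeDecision.COPY;
    }
    if (Arrays.equals(fetched, local)) {
      return MergeDecision.IDENTICAL;
    }
    // Replacing the local copy is not part of the initial behavior.
    return MergeDecision.KEEP_LOCAL;
  }
}
```

Calling localSyncDir with "my-sync-folder", an https URL for collector.torproject.org, and "recent/exit-lists" gives the example folder mentioned in step a.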
Replacement criteria
As the replacement criteria are not fully defined yet and it is very likely that there will be more criteria in the future, a modular/pluggable approach seems useful, i.e.:
- define KeepCriterium and ReplaceCriterium interfaces
- register implementing classes with CollecTor in order to facilitate the selection steps described above.
The only initial ReplaceCriterium will never allow replacing.
The only initial KeepCriterium is that the descriptor file contains a valid descriptor.
For the initial implementation it suffices to hard-code the *Criterium classes, with the option to easily make that configurable later.
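A minimal sketch of the pluggable criteria with a hard-coded registry might look like this. Only the interface names KeepCriterium and ReplaceCriterium come from this outline. The method signatures, the classes NeverReplace, ContainsDescriptor, and CriteriaRegistry, and the rule for combining several criteria are assumptions. The "@type " prefix test is only a stand-in for a real descriptor validity check.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Decides whether a fetched descriptor file is worth keeping at all. */
interface KeepCriterium {
  boolean keep(byte[] fetchedFile);
}

/** Decides whether a fetched file may replace an existing local copy. */
interface ReplaceCriterium {
  boolean replace(byte[] localFile, byte[] fetchedFile);
}

/** Initial ReplaceCriterium from the outline: never allow replacing. */
final class NeverReplace implements ReplaceCriterium {
  @Override
  public boolean replace(byte[] localFile, byte[] fetchedFile) {
    return false;
  }
}

/** Initial KeepCriterium; the prefix test is a placeholder validity check. */
final class ContainsDescriptor implements KeepCriterium {
  @Override
  public boolean keep(byte[] fetchedFile) {
    String text = new String(fetchedFile, StandardCharsets.UTF_8);
    return text.startsWith("@type ");
  }
}

/** Hard-coded registration; could later be driven by configuration. */
final class CriteriaRegistry {
  static final List<KeepCriterium> KEEP =
      Arrays.<KeepCriterium>asList(new ContainsDescriptor());
  static final List<ReplaceCriterium> REPLACE =
      Arrays.<ReplaceCriterium>asList(new NeverReplace());

  /** Assumed rule: every KeepCriterium has to accept the file. */
  static boolean shouldKeep(byte[] fetched) {
    for (KeepCriterium criterium : KEEP) {
      if (!criterium.keep(fetched)) {
        return false;
      }
    }
    return true;
  }

  /** Assumed rule: one ReplaceCriterium allowing it is enough. */
  static boolean mayReplace(byte[] local, byte[] fetched) {
    for (ReplaceCriterium criterium : REPLACE) {
      if (criterium.replace(local, fetched)) {
        return true;
      }
    }
    return false;
  }
}
```

Keeping the two lists in one place means that a later switch to configurable criteria would only touch CriteriaRegistry.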