[[TOC]]

[[Image(https://collector.torproject.org/images/collector-logo.png, link=https://collector.torproject.org, valign=middle, height=70)]] [[Image(https://collector.torproject.org/images/collector-wordmark.png, link=https://collector.torproject.org, valign=middle, height=20)]]

= CollecTor Development =

This living document accompanies the current project to improve [https://CollecTor.torproject.org CollecTor].

== Areas of Work

Over the course of this project, the following sections will gradually turn into descriptions and documentation.
Currently, they are a mixture of well-defined improvements, sketches, wishes, and questions.

=== Analyze Descriptor Completeness

The analysis will be based on log files and the downloaded files, and will address the following questions:

==== How many descriptors are missing?

* Details about missing referenced descriptors: [wiki:doc/CollecTor/AnalysisDescriptorCompleteness Analysis Part 1]
* Details about missing consensuses and votes: [wiki:doc/CollecTor/AnalysisVotesAndConsensusCompleteness Analysis Part 2]
* Analysis of missing referenced descriptors on the current development CollecTor mirror: [wiki:doc/CollecTor/AnalysisDescriptorCompletenessFromScratch Analysis of pure download mirror]
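
One way to frame the "missing referenced descriptors" question is as a set difference: the digests referenced by a document (for example a consensus) minus the digests that were actually stored. The following is only a minimal Java sketch of that idea. The class `MissingDescriptors`, the method `missing`, and the sample digests are hypothetical and are not taken from CollecTor's code or from the linked analyses.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of CollecTor.
final class MissingDescriptors {

  /** Returns the digests that are referenced but not stored, in sorted order. */
  static Set<String> missing(Set<String> referenced, Set<String> stored) {
    Set<String> result = new TreeSet<>(referenced);
    result.removeAll(stored);
    return result;
  }

  public static void main(String[] args) {
    // Made-up digests for illustration only.
    Set<String> referenced = Set.of("a1", "b2", "c3");
    Set<String> stored = Set.of("a1", "c3");
    Set<String> missing = missing(referenced, stored);
    System.out.printf("%d of %d referenced descriptors missing: %s%n",
        missing.size(), referenced.size(), missing);
  }
}
```

In a real analysis the two sets would have to be extracted from the log files and the downloaded files; how that is done is outside the scope of this sketch.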
==== How could this loss be avoided?

* Actively monitor resources such as available storage space (discussion in ticket #18865).
* Verify and improve runtime statistics to get a clearer picture (discussion in ticket #19169).
* Extra-info descriptors that are dropped because of parsing problems are currently counted as missing. This should be avoided (ticket #19170).
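
As an illustration of the storage monitoring item above, the following Java sketch checks whether the usable space on the partition holding a directory has dropped below a fraction of its total size. It is an assumption-laden sketch, not the approach agreed on in #18865: the class `StorageMonitor`, the 10 % threshold, and the default directory are all hypothetical. Only `File.getUsableSpace()` and `File.getTotalSpace()` are standard Java API.

```java
import java.io.File;

// Hypothetical helper, not part of CollecTor.
final class StorageMonitor {

  /**
   * Returns true if the free fraction is below minFreeFraction.
   * A total of 0 (unknown partition) is reported as "not low".
   */
  static boolean isLow(long usableBytes, long totalBytes, double minFreeFraction) {
    return totalBytes > 0 && (double) usableBytes / totalBytes < minFreeFraction;
  }

  /** Checks the partition that holds the given directory. */
  static boolean isLow(File dir, double minFreeFraction) {
    return isLow(dir.getUsableSpace(), dir.getTotalSpace(), minFreeFraction);
  }

  public static void main(String[] args) {
    // The directory to check is passed as an argument; "." is only a placeholder.
    File dir = new File(args.length > 0 ? args[0] : ".");
    if (isLow(dir, 0.10)) {
      System.err.println("WARNING: less than 10% of storage left on " + dir);
    } else {
      System.out.println("Storage ok on " + dir);
    }
  }
}
```

Such a check could be run periodically and its result written to the log, so that missing descriptors can later be correlated with storage shortages.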
==== Next Steps ====

Continue the analysis once the sync process is deployed.

=== Provide Guide Documents

These guides should be based on the previous work in [https://onionoo.torproject.org Onionoo] and metrics-lib. In detail:

* Contributor's Guide: create it as detailed in #18733 and place it in a central location, which still needs to be identified. This could be a large document in the central location plus a small document in CollecTor that references it (detailed discussion in #18730).
* Release Process (defined in #18732)
* Installation Guide for Operators (adapt the [https://gitweb.torproject.org/collector.git/tree/INSTALL.md existing document]), ticket #18734

=== Implement the Release Process

(according to the guide above)

== Design Changes

This section describes improvements intended to make CollecTor more maintainable, more testable, and more efficient.

1. Run CollecTor with an internal scheduler instead of relying on external scheduling (e.g. crontab), #19018.
1. Add a shutdown hook to provide a controlled way of stopping (discussion in #19016).
1. Some parts of CollecTor's data processing are provided by bash scripts run via crontab. These should be integrated into the Java application.
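
The first two items can be sketched together using only the Java standard library: a `ScheduledExecutorService` replaces the crontab entries, and a JVM shutdown hook lets running modules finish before the process exits. This is a sketch under stated assumptions, not the design decided in #19018 or #19016. The class `SchedulerSketch`, the one-hour period, the ten-minute grace period, and the placeholder module are all hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not CollecTor's actual scheduler.
final class SchedulerSketch {

  private final ScheduledExecutorService executor =
      Executors.newScheduledThreadPool(1);

  /** Runs the module periodically; an uncaught exception in the module stops its future runs. */
  void schedule(Runnable module, long initialDelay, long period, TimeUnit unit) {
    executor.scheduleAtFixedRate(module, initialDelay, period, unit);
  }

  /**
   * Cancels future runs and waits for a run in progress to finish.
   * Returns true if everything terminated within the timeout.
   */
  boolean shutdown(long timeout, TimeUnit unit) {
    executor.shutdown();
    try {
      if (!executor.awaitTermination(timeout, unit)) {
        executor.shutdownNow(); // interrupt a module that takes too long
        return false;
      }
      return true;
    } catch (InterruptedException e) {
      executor.shutdownNow();
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    SchedulerSketch scheduler = new SchedulerSketch();
    // Placeholder module; a real module would download and store descriptors.
    scheduler.schedule(() -> System.out.println("collecting descriptors"),
        0, 1, TimeUnit.HOURS);
    // Invoked on normal JVM termination, e.g. after SIGTERM.
    Runtime.getRuntime().addShutdownHook(
        new Thread(() -> scheduler.shutdown(10, TimeUnit.MINUTES)));
  }
}
```

Because the executor thread is not a daemon thread, the process keeps running after `main` returns until it is asked to terminate, at which point the hook performs the controlled stop.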
=== Improve CollecTor Operation and Setup

Once an executable JAR that includes the shutdown hook implementation is available, CollecTor should be started as a Linux service, i.e., an appropriate shell script needs to be provided.

=== Further Sketches of Areas for Improvement

* store unparsable descriptors rather than discarding them
  - add local storage for descriptors that cannot be parsed, so that the service operator can review them and reprocess them later
* synchronization between CollecTor instances (see #18910 and DescriptorDistribution)
* improve the process of creating tarballs
  - reduce memory consumption throughout
* consider using an embedded HTTP server in order to reduce operating complexity
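
The first item, keeping unparsable descriptors for later review, could look roughly like the following Java sketch. It writes the raw bytes to a dedicated directory, using the SHA-256 hex digest of the content as the file name so that the same broken descriptor is stored only once. The class `UnparsableStore`, the directory layout, and the naming scheme are assumptions made for illustration and do not describe an existing CollecTor feature.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch, not part of CollecTor.
final class UnparsableStore {

  private final Path dir;

  UnparsableStore(Path dir) {
    this.dir = dir;
  }

  /** Writes the raw descriptor bytes to dir/<sha256-hex> and returns that path. */
  Path keep(byte[] rawDescriptor) throws IOException {
    Files.createDirectories(dir);
    Path target = dir.resolve(sha256Hex(rawDescriptor));
    Files.write(target, rawDescriptor);
    return target;
  }

  static String sha256Hex(byte[] data) {
    try {
      byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
      StringBuilder sb = new StringBuilder();
      for (byte b : digest) {
        sb.append(String.format("%02x", b));
      }
      return sb.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 must be supported by every Java platform implementation.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    UnparsableStore store = new UnparsableStore(Files.createTempDirectory("unparsable"));
    // Made-up content standing in for a descriptor that failed to parse.
    byte[] raw = "extra-info broken descriptor".getBytes(StandardCharsets.UTF_8);
    System.out.println("kept unparsable descriptor at " + store.keep(raw));
  }
}
```

A caller would invoke `keep` from the place where a parse failure is caught, instead of dropping the bytes.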
== Releases

=== Release 1.1.0

Release date: tbd

[[TicketQuery(milestone=CollecTor 1.1.0,format=table,order=status,desc=true,col=id|summary|status|max=20)]]

=== Release 1.2.0

Release date: tbd

[[TicketQuery(milestone=CollecTor 1.2.0,format=table,order=status,desc=true,col=id|summary|status|max=20)]]

=== Release 2.0.0

Release date: tbd

[[TicketQuery(milestone=CollecTor 2.0.0,format=table,order=status,desc=true,col=id|summary|status|max=20)]]

== Past Releases

=== Release 1.0.2, October 7, 2016

[[TicketQuery(milestone=CollecTor 1.0.2,format=table,order=status,desc=true,col=id|summary|severity|max=20)]]

=== Bugfix Release 1.0.1, August 22, 2016

Prevent out-of-memory error, cf. #19913.

=== First Release 1.0.0, August 11, 2016

[[TicketQuery(milestone=CollecTor 1.0.0,format=table,order=status,desc=true,col=id|summary|severity|max=20)]]

== All Tasks in Trac

=== Active Tasks

[[TicketQuery(component=Metrics/CollecTor,status=!closed,format=table,order=changetime,desc=true,col=id|summary|status|priority|severity|reporter|changetime,max=10)]]

=== Completed Tasks

[[TicketQuery(component=Metrics/CollecTor,status=closed,format=table,order=changetime,desc=true,col=id|summary|priority|severity|reporter|changetime,max=10)]]