This is the main ticket for GetTor logging. Let's try to discuss everything here; if you open new tickets, please reference this one as the parent ticket.
GetTor's logging is important so that we can estimate how many users use it, which bundles are most requested, etc. Note that we will not be storing any information that can identify users of the service; our intent is to store only counters so that we know how many requests we had.
Here is what we will be storing (counters):
- Number of requests for the email bot. (We count a "request" whenever we reply to an email with the links to the bundles.)
- Number of requests for the other distribution channels: Twitter and XMPP. (We count a "request" whenever we reply to a query with the links to the bundles.)
- OS: Windows, Linux or OS X.
- Locale: Language of the request (en, es, etc.)
- Requests per day: this is useful in the event of censorship, to see whether the number of requests increased on a given day.
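To make the list above concrete, here is a rough sketch of how these counters could be kept. It is only an illustration: the class name `RequestStats`, the method `record`, and the lowercase channel and OS labels are all made up for this ticket and are not taken from the GetTor code. Nothing in it refers to an address or any other user identifier.

```python
from collections import Counter
from datetime import date

# Hypothetical labels for the channels and platforms listed above.
CHANNELS = {"email", "twitter", "xmpp"}
OSES = {"windows", "linux", "osx"}


class RequestStats:
    """Anonymous counters only: no addresses, no message contents."""

    def __init__(self):
        self.by_channel = Counter()
        self.by_os = Counter()
        self.by_locale = Counter()
        self.by_day = Counter()

    def record(self, channel, os_name, locale, day):
        """Count one request, i.e. one reply that contained bundle links."""
        if channel not in CHANNELS:
            raise ValueError("unknown channel: %r" % (channel,))
        if os_name not in OSES:
            raise ValueError("unknown OS: %r" % (os_name,))
        self.by_channel[channel] += 1
        self.by_os[os_name] += 1
        self.by_locale[locale] += 1
        # Daily totals are keyed by ISO date so spikes are easy to spot.
        self.by_day[day.isoformat()] += 1


stats = RequestStats()
stats.record("email", "windows", "en", date(2015, 3, 1))
stats.record("xmpp", "linux", "es", date(2015, 3, 1))
stats.record("email", "osx", "en", date(2015, 3, 2))
```

Keeping one counter per dimension (rather than per request) means there is no row that could be tied back to an individual request.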
I talked with ilv, who described how we are currently storing user data.
- In the SQLite database, we have a table that stores the SHA-256 hash of each address so that we can prevent GetTor from being spammed. Let's clear this table after a day so that we don't keep the hashed email addresses for long. Also, since we are not actually sending out the bundles (only links to them), we shouldn't enforce harsh limits when blacklisting addresses.
On a related note, after how many requests does an email address get blacklisted?
- In the request table we only store the counter for the requests. This is fine. From the log files, we should extract the other information, update this request table and then use that to generate the automatic reports.
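The two points above could look roughly like the following. This is a sketch under assumptions, not the current GetTor schema: the table names `seen_addresses` and `requests`, their columns, and the function names are all hypothetical, and the request table here is broken down by day, channel, OS and locale, which is more than the single counter we store today. The blacklist threshold is deliberately left out, since that question is still open.

```python
import hashlib
import sqlite3
import time

DAY_SECONDS = 24 * 60 * 60


def open_db(path=":memory:"):
    """Create the two (hypothetical) tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen_addresses ("
        "hash TEXT PRIMARY KEY, hits INTEGER NOT NULL, "
        "first_seen INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS requests ("
        "day TEXT, channel TEXT, os TEXT, locale TEXT, "
        "total INTEGER NOT NULL, "
        "PRIMARY KEY (day, channel, os, locale))"
    )
    return conn


def note_address(conn, address, now=None):
    """Store only the SHA-256 digest of the address, never the address."""
    now = int(time.time()) if now is None else now
    digest = hashlib.sha256(address.encode("utf-8")).hexdigest()
    cur = conn.execute(
        "UPDATE seen_addresses SET hits = hits + 1 WHERE hash = ?", (digest,)
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO seen_addresses (hash, hits, first_seen) "
            "VALUES (?, 1, ?)",
            (digest, now),
        )
    conn.commit()


def purge_old_hashes(conn, now=None):
    """Delete digests first seen more than a day ago; return rows removed."""
    now = int(time.time()) if now is None else now
    cur = conn.execute(
        "DELETE FROM seen_addresses WHERE first_seen < ?",
        (now - DAY_SECONDS,),
    )
    conn.commit()
    return cur.rowcount


def count_request(conn, day, channel, os_name, locale):
    """Increment the counter for one (day, channel, os, locale) bucket."""
    key = (day, channel, os_name, locale)
    cur = conn.execute(
        "UPDATE requests SET total = total + 1 "
        "WHERE day = ? AND channel = ? AND os = ? AND locale = ?",
        key,
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO requests (day, channel, os, locale, total) "
            "VALUES (?, ?, ?, ?, 1)",
            key,
        )
    conn.commit()
```

`purge_old_hashes` could be run from a daily cron job, and `count_request` would be called by whatever script extracts the per-request details from the log files before the reports are generated.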
ilv, seems fine? Let's finalize this before the implementation.