The easiest way to understand how !GetTor works is to enumerate the
different steps involved in the problem we want to solve. Consider the
following steps:
1. Receive requests from users via different channels.
2. Process the received requests, extracting the information needed to provide a useful response, namely: source address/user, operating system and language.
3. Construct a response according to the information extracted.
4. Create an anti-flood mechanism that allows blacklisting specific users.
5. Verify that the source address/user of the request is not permanently or temporarily blacklisted.
6. Send back a reply with the links to download Tor Browser from some popular ''non-blocked'' cloud service.
7. Keep track of the number of requests received by !GetTor.
8. Upload Tor Browser to popular ''non-blocked'' cloud services.
The current design of !GetTor consists of a series of modules, each one
intended for a specific task. There are two big groups: the main modules
and the service modules. The main modules are Core,
Blacklist and Database, which cover points 3), 4), 5) and 7).
The service modules are SMTP, XMPP and Twitter, which cover points
1), 2), 5) and 6).
Whenever a request is received, it is handled by one of the service modules
according to the channel the user sent it through: email
for SMTP, chat for XMPP, and direct messages (DM) for Twitter. The corresponding
module processes the request, collecting all the data needed to provide
a useful reply, namely: operating system, language and source address/user.
It also makes sure that the source address/user is not blacklisted (see
Blacklisting for details). If no valid data is found, a help message
is sent back to the user. Otherwise, the service module asks the Core
module for the links and then replies to the user.
In both cases, the Core module increments the number of requests received
in the database. (Diagram of the module interaction not included.)
For more details on these methods please check the code repository and/or see the cloud service scripts section.
=== Distribution Channels
Ideally, a user should have various ways to contact !GetTor and receive
the Tor Browser. These ''distribution channels'' should parse a request,
get the user's OS and language, ask the Core module for the links and
then send this information back to the user. Each distribution channel
should be handled by a separate module. Currently, there is one distribution
channel deployed (SMTP), one implemented but not deployed (XMPP), and one
not finished (Twitter).
==== SMTP
This module is in charge of receiving and replying to requests via email. Back
in 2008, when !GetTor was conceived, SMTP was the main and only distribution
channel. Requests were answered with the actual bundle as an attachment
instead of links. This approach worked well, but the bundles grew
in size to the point where it was no longer feasible to send them as
attachments (the current size of Tor Browser is ~40 MB).
There are three steps involved in sending links via email:
* Listen for users' emails directed to the !GetTor robot.
* Determine the type of request and get the data needed to reply to it.
* Send back a reply to the user.
The first point is handled by the mail server provided by the Tor Project.
In addition, we use email forwarding to make sure we get all the emails
directed to the !GetTor robot. For this, a .forward file like the following is used:
{{{#!bash
|"python2.7 /path/to/gettor/smtp_process.py"
}}}
With this, the only concern of the smtp_process.py script is to receive
emails from the standard input and pass them to the SMTP module for processing.
The SMTP module has only one public method:
{{{#!python
process_email(raw_msg)
}}}
Where:
'''raw_msg''': String that represents the email received.
A basic script for communicating with the SMTP module should look like this:
{{{#!python
#!/usr/bin/env python
import sys
import gettor.smtp
service = gettor.smtp.SMTP()
incoming = sys.stdin.read()
service.process_email(incoming)
}}}
The other two points are handled by the SMTP module. The first step after
receiving a request is to determine whether the address is blacklisted (see the
Blacklisting section for the current process). The
next step is to determine the type of request received. For now, there are
only two types of request that can be received: help and links. The decision
process to determine which type we have received is the following:
* Does the body of the message include the words ''windows'', ''linux'', or ''osx''?
If so, we have received a links request.
* Any other case should be considered as a help request, including blank
emails.
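The decision process above can be sketched as follows. This is a minimal illustration, not the actual code of the SMTP module: the {{{classify_request}}} name and its return values are hypothetical, and matching case-insensitively on whole words is an assumption about how the body is scanned.

{{{#!python
import re

# Operating systems named in the decision process above.
SUPPORTED_OS = ("windows", "linux", "osx")


def classify_request(body):
    """Return ('links', os_name) or ('help', None) for an email body.

    Hypothetical helper: matching is case-insensitive, on whole words.
    """
    words = re.findall(r"[a-z]+", body.lower())
    for word in words:
        if word in SUPPORTED_OS:
            return ("links", word)
    # Any other case, including a blank email, is a help request.
    return ("help", None)
}}}

Returning the operating system together with the request type saves a second pass over the body when the reply is constructed.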
For both types of request, the language is obtained from the address the email
was sent to: gettor+lc@torproject.org, where lc stands for one of the locales
supported by Tor Browser. Currently, the only locale supported is English.
If no locale is specified, we assume English by default.
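A minimal sketch of the locale lookup, using only the standard library, could look like this. The {{{get_locale}}} function and the {{{supported}}} parameter are hypothetical names, and falling back to English for an unsupported locale (not only for a missing one) is an assumption.

{{{#!python
from email.utils import parseaddr

SUPPORTED_LOCALES = ("en",)  # English is the only locale supported for now
DEFAULT_LOCALE = "en"


def get_locale(to_header, supported=SUPPORTED_LOCALES):
    """Extract the locale from an address like gettor+lc@torproject.org."""
    _, address = parseaddr(to_header)
    local_part = address.split("@", 1)[0]
    if "+" in local_part:
        locale = local_part.split("+", 1)[1].lower()
        if locale in supported:
            return locale
    # No locale given (or, by assumption, an unsupported one): use English.
    return DEFAULT_LOCALE
}}}

For example, {{{get_locale("gettor@torproject.org")}}} falls back to the default locale because the address has no {{{+lc}}} suffix.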
Knowing the type and language of the request is enough to construct a reply
and send it to the user. Every time a reply is sent, the number of requests
received is increased in the database. See Database to check the current
DB schema.
==== XMPP
To be written.
==== Twitter
To be written.
=== Database
The database module, as its name suggests, is in charge of interacting
with the !GetTor database. The current design is quite simple and satisfies
two needs:
'''Add a request'''. For now this consists only of counting how many requests we have received so far. No other data is collected.
'''Add/delete/update a user'''. This allows us to know how many requests a single user has made and thus avoid any type of flood (see Blacklisting). For this purpose we collect the following data:
* ''user'': SHA-256 hash of the user address/account.
* ''service'': string that represents the distribution method used by the user
(e.g. SMTP).
* ''times'': number of requests received from the same user.
* ''blocked'': boolean flag to know if user is permanently blacklisted.
* ''last_request'': timestamp that represents the last time a given user
made a request from the same distribution channel.
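The fields listed above could map to a SQLite table along the following lines. This is an illustrative sketch, not the actual schema (see create_db.py and the sample gettor.db in the repository for that): the column types, the {{{requests}}} counter table, the helper names and the use of SHA-256 hex digests for the ''user'' field are all assumptions.

{{{#!python
import hashlib
import sqlite3
import time


def create_tables(con):
    # Column names follow the field list above; the types are assumed.
    con.execute(
        "CREATE TABLE users (user TEXT, service TEXT, times INTEGER,"
        " blocked INTEGER, last_request INTEGER,"
        " PRIMARY KEY (user, service))"
    )
    # A single counter for the total number of requests (assumed layout).
    con.execute("CREATE TABLE requests (id INTEGER PRIMARY KEY, counter INTEGER)")
    con.execute("INSERT INTO requests VALUES (1, 0)")


def hash_user(address):
    # Only a hash of the address/account is stored, never the address itself.
    return hashlib.sha256(address.encode("utf-8")).hexdigest()


def add_request(con):
    con.execute("UPDATE requests SET counter = counter + 1 WHERE id = 1")


def add_user(con, address, service):
    # First request from this user on this service: times=1, blocked=0.
    con.execute(
        "INSERT INTO users VALUES (?, ?, 1, 0, ?)",
        (hash_user(address), service, int(time.time())),
    )


con = sqlite3.connect(":memory:")
create_tables(con)
add_user(con, "alice@example.com", "SMTP")
add_request(con)
}}}

Keying the table on the pair (''user'', ''service'') mirrors the fact that requests are counted per user and per distribution channel.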
The initial design of the database module (during the revamp) considered a
lot of data to be collected (type of request, language, OS, etc.), but
eventually we decided to keep just the data needed to know how many
requests !GetTor has received and to avoid floods. The type of database
chosen for this purpose was SQLite. You can check a sample database in
the code repository (gettor.db).
=== Blacklisting
The current blacklisting mechanism is quite simple. It is based on the data
collected in the 'users' table of !GetTor's database, plus some
extra parameters defined in blacklist.cfg, which help us establish limits
to avoid floods. The current mechanism depends on four parameters:
* '''user''': Hashed address/account of the user. It helps to identify malicious users.
* '''service''': Service or distribution channel used by the user trying to contact !GetTor.
* '''max_req''': Maximum number of requests per user ''and'' service allowed at the moment.
* '''wait_time''': Number of minutes a user must wait after reaching '''max_req''' before making a new request.
Both the '''user''' and '''service''' parameters are obtained in real time
when !GetTor receives a request. The other two, '''max_req''' and '''wait_time''',
are specified in blacklist.cfg. Each service module (e.g. SMTP) should be
in charge of specifying the path to this configuration file and interact
with the !Blacklisting module according to that information. The current
mechanism also depends on the '''last_request''', '''times''', and '''blocked'''
fields of the database for the record associated with '''user''' and '''service'''.
With all of this, the decision algorithm can be described as follows:
{{{#!python
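# 'blocked', 'times' and 'last_request' come from the database record of
# (user, service); 'max_req' and 'wait_time' come from blacklist.cfg.
if blocked:
    # permanently blacklisted user
    update_user_on_db(user, service, times + 1, 1)
    raise BlacklistError("Blocked user")
elif times >= max_req:
    # wait_time must be expressed in the same units as the timestamps
    next_allowed = last_request + wait_time
    if now < next_allowed:
        # too many requests from the same user
        update_user_on_db(user, service, times + 1, 0)
        raise BlacklistError("Too many requests")
    else:
        # fresh user again!
        update_user_on_db(user, service, 1, 0)
else:
    # adding up a request for user
    update_user_on_db(user, service, times + 1, 0)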
}}}
This simple mechanism helps us prevent malicious users from flooding one or
more services/distribution channels with endless requests. As you may
notice, if a user makes a request before the '''wait_time''' has passed, then
the user must wait another '''wait_time''' to make a request again, and
if a user makes a request after she has reached the maximum number of requests
and waited '''wait_time''', then her request counter is reset to
one. You can check the {{{_is_blacklisted}}} method of the SMTP module to
see how a service should interact with the Blacklisting module.
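As a self-contained illustration of this algorithm, the following sketch keeps the user records in an in-memory dictionary instead of the database. The {{{check_user}}} function, the {{{records}}} dictionary and the explicit {{{now}}} argument are hypothetical. It assumes that timestamps are in seconds and that every update, including a rejected request, refreshes '''last_request''', which is what makes a user wait another '''wait_time''' after an early request.

{{{#!python
class BlacklistError(Exception):
    pass


def check_user(records, user, service, max_req, wait_time, now):
    """Apply the decision algorithm; wait_time in minutes, now in seconds."""
    rec = records.setdefault(
        (user, service), {"times": 0, "blocked": False, "last_request": now}
    )
    if rec["blocked"]:
        rec["times"] += 1
        raise BlacklistError("Blocked user")
    if rec["times"] >= max_req:
        next_allowed = rec["last_request"] + wait_time * 60
        if now < next_allowed:
            # Too many requests: count it and restart the waiting period.
            rec["times"] += 1
            rec["last_request"] = now
            raise BlacklistError("Too many requests")
        # The user waited long enough: start counting again.
        rec["times"] = 1
    else:
        rec["times"] += 1
    rec["last_request"] = now


# Three requests in a row are accepted when max_req is 3.
records = {}
for t in (0, 10, 20):
    check_user(records, "abc123", "SMTP", max_req=3, wait_time=20, now=t)
}}}

A fourth call before the 20 minutes have passed raises {{{BlacklistError}}}, and the waiting period then starts again from that rejected request.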
This mechanism could certainly be improved. If you have any ideas/comments
about it, please tell us (ideally by filing a ticket :)
=== Cloud Services
For each service used by !GetTor to distribute the Tor Browser files there
should be a script in charge of uploading such files according to the methods
provided by the service used. Each one of these scripts must assume that the
latest Tor Browser files have been downloaded (see Other Scripts) and perform the
following tasks (in order):
1. Get the fingerprint from the key used to sign the Tor Browser.
2. Use the Core module to create a new links file (core.create_links_file).
3. Obtain the sha256 checksum of each {tar.xz, exe, dmg} file to be uploaded.
4. Check that the corresponding .asc signature exists for each {tar.xz, exe, dmg} file to be uploaded.
5. Identify the architecture, language and operating system associated with each {tar.xz, exe, dmg} file to be uploaded.
6. Create a string describing a new link, using the information identified before.
7. Use the Core module to add a link to the new links file created (core.add_link), specifying the service, the operating system, and the language (locale).
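Steps 3 and 4, and part of step 5, can be sketched with the standard library alone. This is an illustration, not the code of the existing Dropbox or Google Drive scripts: the helper names are hypothetical, and mapping the file extension to an operating system (tar.xz to linux, exe to windows, dmg to osx) is an assumption made for the example. Architecture and language detection, which depend on the file naming scheme, are left out.

{{{#!python
import hashlib
import os

# Assumed mapping from file extension to operating system.
OS_BY_EXTENSION = {".tar.xz": "linux", ".exe": "windows", ".dmg": "osx"}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_bundle(path):
    """Return (os_name, checksum) for a bundle, or None if it is skipped."""
    for ext, os_name in OS_BY_EXTENSION.items():
        if path.endswith(ext):
            break
    else:
        return None  # not a {tar.xz, exe, dmg} file
    if not os.path.isfile(path + ".asc"):
        return None  # no .asc signature next to the file
    return (os_name, sha256_of(path))
}}}

Reading the file in chunks keeps memory use constant regardless of the bundle size.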
You can check the existing scripts for Dropbox and Google Drive to see the current
methods used to carry out the steps listed above, especially 1, 3, 5, and 6. For more details
on how the links files are created and how the links are stored, check the documentation
about the Core module. Below is a list of the current services/providers integrated
with !GetTor:
* '''Dropbox''': Deployed. In use for a long time.
* '''Google Drive''': Implemented, but not yet deployed.
* '''GitHub''': Implemented, but not yet deployed. This one should be especially useful to distribute the Tor Browser
in places where Dropbox and Google Drive are blocked (e.g. China).
If you have an idea for a new service that could be used (even if you don't know how to implement it),
please contact us (ideally by filing a ticket :).
=== Other Scripts
Below is a list of scripts used for diverse and "smaller" tasks:
* '''blacklist.py''': Handle blacklisting of users. Execute blacklist.py -h for more details.
* '''create_db.py''': Handle the creation of the SQLite database used by !GetTor for managing the blacklisting of users and keeping track of basic stats. Execute create_db.py -h for more details.
* '''stats.py''': Handle basic stats according to the information stored in the SQLite database. Execute stats.py -h for more details.
* '''fetch_latest_torbrowser.py''': Automate the download of Tor Browser files from Tor Project's website and upload of these files to the services used by !GetTor every time a new stable version of Tor Browser is available. Implemented, but not yet deployed. See the source file in the repository for more details.