title: TPA-RFC-3: tools
Summary: we try to restrict the number of tools users and sysadmins need to learn to operate in our environment. This policy documents which tools we use.
Background
A proliferation of tools can easily creep into an organisation. By limiting the number of tools in use, we can keep training and documentation to a more reasonable size. If the set is smaller and standard, there is also a chance that someone might already know all or a large proportion of the tools currently in use.
Proposal
This proposal formally defines which tools are used and offered by TPA for various things inside of TPO.
We try to have one and only one tool for a given service, but some services currently have several. In that case, we try to deprecate the redundant tools in favor of a single one.
Scope
This applies to services provided by TPA, but not necessarily to all services available inside TPO. Service admins, for example, might make different decisions than the ones described here for practical reasons.
Tools list
This list covers the tool policies we have established so far.
- version control: git, gitolite
- operating system: Debian packages (official, backports, third-party and TPA)
- host installation: debootstrap, FAI
- ad-hoc tools: SSH, Cumin
- directory servers: OpenLDAP, BIND, ud-ldap, Hiera
- authentication servers: OpenLDAP, ud-ldap
- time synchronisation: NTP (ntp Debian package, from ntp.org)
- Network File Servers: DRBD
- File Replication Servers: static mirror system
- Client File Access: N/A
- Client OS Update: unattended-upgrades, needrestart, dsa nagios checks, multiple reboot tools (ticket #33406)
- Client Configuration Management: Puppet
- Client Application Management: Debian packages, systemd lingering, cron @reboot targets (deprecated)
- Mail: SMTP/Postfix, Mailman, ud-ldap, dovecot (on gitlab)
- Printing: N/A
- Monitoring: syslog-ng central host, Nagios, Prometheus, Grafana, no paging
- password management: pwstore
- help desk: Trac, email, IRC
- backup services: bacula, postgresql hot sync
- web services: Apache, Nginx, Varnish (deprecated), haproxy (deprecated)
- documentation: ikiwiki, Trac wiki
- datacenters: Hetzner cloud, Hetzner robot, Cymru, Sunet, Linaro, Scaleway (deprecated)
- Programming languages: Python, Perl (deprecated), shell (for short programs)
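As an illustration only, the "one tool per service" goal from the proposal can be checked mechanically if the list is kept as structured data. The sketch below is hypothetical: the `TOOLS` mapping and both helper functions are assumptions made for this example, not an existing TPA tool, and only three categories from the list above are encoded.

```python
# Hypothetical encoding of a few entries from the tools list above.
# Each category maps tool names to True when the list marks them deprecated.
TOOLS = {
    "web services": {"Apache": False, "Nginx": False, "Varnish": True, "haproxy": True},
    "password management": {"pwstore": False},
    "programming languages": {"Python": False, "Perl": True, "shell": False},
}


def active_tools(category):
    """Return the sorted names of tools in a category not marked deprecated."""
    return sorted(name for name, deprecated in TOOLS[category].items() if not deprecated)


def categories_with_overlap():
    """Return the categories that still have more than one non-deprecated tool."""
    return sorted(c for c in TOOLS if len(active_tools(c)) > 1)


if __name__ == "__main__":
    for category in categories_with_overlap():
        print(f"{category}: {', '.join(active_tools(category))}")
```

A category reported by this check is not necessarily a problem (Apache and Nginx may both be kept deliberately); it only points at places where a deprecation decision might be worth discussing.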
TODO
- figure out scope... list has grown big already
- are server specs part of this list?
- software raid?
- add Gitlab issues to help desk, deprecate Trac
- add Fabric to host installs and ad-hoc tools
- consider Gitlab wiki as an ikiwiki replacement?
- add RT to help desk?
Examples
- all changes to servers should be performed through Puppet, as much as possible...
- ... except for services not managed by TPA ("service admin stuff"), which can be deployed by hand, Ansible, or any other tool
Deadline
No deadline set yet, still drafting.
Status
This proposal is currently in the draft state.
References
Drafting this policy was inspired by the limiting tool dev choices blog post by Chris Siebenmann of the University of Toronto Computer Science department.
The tool classification is a variation of the infrastructures.org checklist, with item 2 changed from "Gold Server" to "Operating System". The naming change is rather dubious, but I felt that "Gold Server" didn't really apply anymore in the context of configuration management tools like Puppet (which is documented in item 13). Debian is a fundamental tool at Tor and it feels critical to put it first and ahead of everything else, because it's one thing that we rely on heavily. It also somewhat acts as a "Gold Server" in that it's a static repository of binary code. We also do not have uniform "Client file access" (item 10) and "Printing" (item 16). Item 18 ("Password management") was also added.