Commit 9b2eacd2 authored by Iain R. Learmonth

metrics markdown

parent c3253e09
# metrics-cloud
# Overview
[[!toc levels=3]]
## Overview
The metrics-cloud framework aims to enable:
......@@ -17,7 +20,7 @@ The CloudFormation templates are relevant only to testing and development, while
to both environments.
# Usage of AWS for Tor Metrics Development
## Usage of AWS for Tor Metrics Development
Each member of the Tor Metrics team has a standing allowance of 100USD/month for development using AWS. In practice,
we have not used more than 50USD/month for the team in any one month and generally sit around 25USD/month. It is
......@@ -25,7 +28,7 @@ still important to minimize costs when using AWS and the use of CloudFormation t
rapid creation, provisioning and destruction should help with this.
## CloudFormation Templates
### CloudFormation Templates
CloudFormation is an AWS service allowing the definition of *stacks*. These stacks describe a series of AWS services
using a domain-specific language and allow for the easy creation of a number of interconnected resources. All resources
......@@ -41,7 +44,7 @@ tracking of spending in the billing portal through the tags.
Documentation for CloudFormation, including an API reference, can be found at: <https://docs.aws.amazon.com/cloudformation/>.
### Quickstart: Deploying a template
#### Quickstart: Deploying a template
Each template begins with comments with any relevant notes about the template, and a deployment command that will upload
and deploy the template on AWS. The commands will look something like:
......@@ -56,7 +59,7 @@ Once the stack has been deployed from the template, you can view its resources a
the [CloudFormation management console](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks?filteringText=&filteringStatus=active&viewNested=true&hideStacks=false).
### SSH Key Selection
#### SSH Key Selection
The [identify\_user.sh](https://gitweb.torproject.org/metrics-cloud.git/tree/cloudformation/identify_user.sh) script prints out the name of the SSH public key to be used based on either:
......@@ -70,7 +73,7 @@ If you change the default key you would like to use, update the mapping in this
SSH keys are managed through the [EC2 management console](https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#KeyPairs:) and are not (currently) managed by a CloudFormation template.
## Templates and Stacks
### Templates and Stacks
There is no directory hierarchy for the templates in the `cloudformation` folder of the repository. There are a couple of naming
conventions used though:
......@@ -79,13 +82,13 @@ conventions used though:
- Long-term and shared templates/stacks start with `metrics-`
### `billing-alerts`
#### `billing-alerts`
The [`billing-alerts` template](https://gitweb.torproject.org/metrics-cloud.git/tree/cloudformation/billing-alerts.yml) sends notifications to the subscribed individuals whenever the predicted spend for the month will be
over 50USD. Email addresses can be added here if other people should be notified too.
### `metrics-vpc`
#### `metrics-vpc`
The [`metrics-vpc` template](https://gitweb.torproject.org/metrics-cloud.git/tree/cloudformation/metrics-vpc.yml) contains shared resources for Tor Metrics development templates. This includes:
......@@ -140,7 +143,7 @@ The [`metrics-vpc` template](https://gitweb.torproject.org/metrics-cloud.git/tre
These domain names should **never** appear on anything user facing and are for **development purposes only**.
### Typical Dev/Testing Stacks
#### Typical Dev/Testing Stacks
A typical test/dev stack will consist of an EC2 instance and a DNS name. Some services store a lot of data and may have
a second volume attached for the data storage.
......@@ -201,7 +204,7 @@ It's not common to use other AWS services as part of these templates as the goal
deployed on TPA managed hosts.
## Linting
### Linting
[`cfn-lint`](https://github.com/aws-cloudformation/cfn-python-lint) is used to ensure we are complying with best practices. None of the team have formal training in the use of CloudFormation
so we are really making it up as we go along. Other tools may be used in the future, as we learn about them, to make sure we are using
......@@ -210,13 +213,13 @@ things efficiently and correctly.
This is also run as part of the [continuous integration checks](https://travis-ci.org/github/torproject/metrics-cloud/) on Travis CI.
# Ansible Playbooks
## Ansible Playbooks
Ansible is an open-source software provisioning, configuration management, and application-deployment tool. It's written in Python,
is mature, and has an extensive selection of modules for almost everything we could need.
## Inventories and site.yml
### Inventories and site.yml
In general, there are two inventories: [production](https://gitweb.torproject.org/metrics-cloud.git/tree/ansible/production) and dev. Only the production inventory is committed to git; the dev inventory will
vary between members of the team, referencing their own dev instances as created by CloudFormation. We do not specify a default
......@@ -229,7 +232,7 @@ Inside the inventory, hosts are grouped by their purpose. For each group there i
allow multiple hosts to be provisioned together.
## `metrics-common`
### `metrics-common`
The [`metrics-common`](https://gitweb.torproject.org/metrics-cloud.git/tree/ansible/roles/metrics-common) role allows us to have a consistent environment between services, and closely matches the environment that
would be provided by a TSA managed machine. The role handles:
......@@ -249,10 +252,10 @@ This is all configured via group variables in the [`ansible/group_vars/`](https:
these work. These override the [defaults](https://gitweb.torproject.org/metrics-cloud.git/tree/ansible/roles/metrics-common/defaults/main.yml) set in the role.
## Service roles
### Service roles
## Linting
### Linting
[`ansible-lint`](https://docs.ansible.com/ansible-lint/) is used to ensure we are complying with best practices. None of the team have formal training in the use of Ansible
so we are really making it up as we go along. Other tools may be used in the future, as we learn about them, to make sure we are using
......@@ -261,14 +264,14 @@ things efficiently and correctly.
This is also run as part of the [continuous integration checks](https://travis-ci.org/github/torproject/metrics-cloud/) on Travis CI.
# Common Tasks
## Common Tasks
## Add a new member to the team
### Add a new member to the team
## Update an SSH key for a team member
### Update an SSH key for a team member
## Deploy and provision a development environment for a service
### Deploy and provision a development environment for a service
......@@ -8,11 +8,11 @@ This is a TSA host so already has a bunch of ping and NRPE checks. Application
specific checks are mostly looking at the index file:
* That there is an index file that parses and:
* it was recently updated
* it contains a recent run for:
* bridge descriptors
* relay descriptors
* exit lists
** it was recently updated
** it contains a recent run for:
*** bridge descriptors
*** relay descriptors
*** exit lists
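
The index checks listed above could be sketched in plain Python, working directly on the parsed JSON. This is only an illustration, not the existing check. It assumes the index is a nested structure of `directories` and `files` entries, each with a `path`, that files carry a `last_modified` timestamp and the top level an `index_created` timestamp, both formatted as `YYYY-MM-DD HH:MM` in UTC. The directory paths and the age threshold in the usage note are likewise assumptions, and all function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed timestamp format of the index (not confirmed by this document).
INDEX_TIME_FORMAT = "%Y-%m-%d %H:%M"


def parse_time(value):
    """Parse an index timestamp, treating it as UTC."""
    parsed = datetime.strptime(value, INDEX_TIME_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def find_directory(index, path):
    """Walk the nested 'directories' lists and return the node at path, or None."""
    node = index
    for part in path.split("/"):
        children = node.get("directories", [])
        node = next((d for d in children if d["path"] == part), None)
        if node is None:
            return None
    return node


def newest_file_time(directory):
    """Return the newest 'last_modified' found anywhere below directory, or None."""
    times = [parse_time(f["last_modified"]) for f in directory.get("files", [])]
    for sub in directory.get("directories", []):
        newest = newest_file_time(sub)
        if newest is not None:
            times.append(newest)
    return max(times) if times else None


def check_index(index, now, max_age, required_paths):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # Check 1: the index itself was recently updated.
    if now - parse_time(index["index_created"]) > max_age:
        problems.append("index is stale")
    # Check 2: each required directory contains a recent file.
    for path in required_paths:
        directory = find_directory(index, path)
        newest = newest_file_time(directory) if directory is not None else None
        if newest is None or now - newest > max_age:
            problems.append(f"no recent run for {path}")
    return problems
```

A monitoring wrapper might load the index with `json.load`, call `check_index(index, datetime.now(timezone.utc), timedelta(hours=3), [...])` with one path each for bridge descriptors, relay descriptors and exit lists, and exit non-zero if the returned list is not empty. Failing to parse the JSON at all would surface as an exception before any of these checks run.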
The old check uses bushel's CollecTor index parser, but we could equally hack
up a single python script to do this with the JSON at a lower level. In the
......@@ -51,6 +51,6 @@ Application specific checks would include:
* that a file is available in the webserver root for the last analysis run
* that there is something listening on the tgen connect port
* also on the onion service
** also on the onion service
* that the HTTPS certificate is valid and not about to expire (on port 8443)
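
The certificate check in the last item could be sketched with the Python standard library alone. This is a hedged illustration rather than an existing check: the function names and the 14-day threshold are assumptions, and only the port number comes from the list above.

```python
import socket
import ssl
import time


def fetch_not_after(host, port=8443, timeout=10):
    """Connect over TLS and return the certificate's 'notAfter' string.

    The default context verifies the chain and hostname, so an invalid
    certificate raises ssl.SSLCertVerificationError during the handshake.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def seconds_until_expiry(not_after, now=None):
    """Return the seconds left before the certificate expires (negative if expired)."""
    if now is None:
        now = time.time()
    return ssl.cert_time_to_seconds(not_after) - now


def certificate_ok(not_after, min_days=14, now=None):
    """Return True if the certificate is valid for at least min_days more days."""
    return seconds_until_expiry(not_after, now) >= min_days * 86400
```

A check script would call `certificate_ok(fetch_not_after(host))` for the monitored host and treat either a `False` result or a raised `ssl.SSLError` as a failure. Keeping the date arithmetic separate from the network call means the threshold logic can be tested without a live server.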