Commit 641cba41, authored 4 years ago by anarcat
document how people can add targets to the external prometheus server
parent 7ad38c77
tsa/howto/prometheus.mdwn: 57 additions, 2 deletions
@@ -12,6 +12,8 @@ layer on top (see [[Grafana]]).
# Tutorial
## Looking at pretty graphs
The Prometheus web interface is available at:
<https://prometheus.torproject.org>
@@ -22,6 +24,9 @@ over the last two weeks for the known servers.
[this link]: https://prometheus1.torproject.org/graph?g0.range_input=2w&g0.expr=node_load5&g0.tab=0
The Prometheus web interface is crude: it's better to use [[grafana]]
dashboards for most purposes other than debugging.
# How-to
## Pager playbook
@@ -138,6 +143,56 @@ policies.
[allow scrape job collection]: https://github.com/voxpupuli/puppet-prometheus/pull/304
[Prometheus Puppet module]: https://github.com/voxpupuli/puppet-prometheus/
### Manual node configuration
External services can be monitored by Prometheus, as long as they
comply with the [OpenMetrics][] format, which simply means exposing
metrics like this over HTTP:

    metric{label="label_val"} value

A real-life (simplified) example:

    node_filesystem_avail_bytes{alias="alberti.torproject.org",device="/dev/sda1",fstype="ext4",mountpoint="/"} 16160059392

The above says that the node alberti has the device `/dev/sda1` mounted
on `/`, formatted as an `ext4` filesystem which has 16160059392 bytes
(~16GB) free.
[OpenMetrics]: https://openmetrics.io/
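
To make the format concrete, here is a minimal Python sketch that splits one such line into its metric name, labels and value. It is only an illustration: `parse_sample` is a hypothetical helper, not part of Prometheus or of any TPA tooling, and it deliberately ignores optional timestamps, comment lines and escape sequences inside label values.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
# Simplified on purpose: no timestamps, no unescaping of label values.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(?P<key>[a-zA-Z_][a-zA-Z0-9_]*)="(?P<val>(?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into (name, labels, value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric sample: {line!r}")
    labels = {
        m.group("key"): m.group("val")
        for m in LABEL_RE.finditer(match.group("labels") or "")
    }
    return match.group("name"), labels, float(match.group("value"))
```

Applied to the alberti example above, this should yield the name `node_filesystem_avail_bytes`, a dictionary with the four labels, and the free space as a float.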
System-level metrics can easily be monitored by the secondary
Prometheus server. This is usually done by installing the "node
exporter", with the following steps:
 * On Debian buster and later:

        apt install prometheus-node-exporter

 * On Debian stretch:

        apt install -t stretch-backports prometheus-node-exporter

   ... assuming that backports is already configured. If it isn't,
   a line like this in
   `/etc/apt/sources.list.d/backports.debian.org.list` should suffice:

        deb https://deb.debian.org/debian/ stretch-backports main contrib non-free

   ... followed by an `apt update`, naturally.
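
Once the package is installed, it can be useful to confirm that the exporter actually answers before asking for it to be scraped. The Python sketch below is one way to do that. It assumes the node exporter listens on its default port (9100) and serves its metrics under `/metrics`; `fetch_metrics` and `sample_lines` are illustrative names, not existing tools.

```python
import urllib.request


def fetch_metrics(hostname, port=9100, path="/metrics", timeout=5):
    """Return the exporter's metrics page as text (raises on HTTP or network errors)."""
    url = f"http://{hostname}:{port}{path}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def sample_lines(text):
    """Keep only the sample lines, dropping blanks and '#' comment lines."""
    return [
        line
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
```

Running `sample_lines(fetch_metrics("localhost"))` on the monitored machine should return a non-empty list if the exporter is up. The same call from another host only works once the firewall allows it.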
The firewall on the machine needs to allow traffic on the exporter
port from the server `prometheus2.torproject.org`. Then [open a
ticket][new-ticket] for TPA to configure the target. Make sure to
mention:
* the hostname for the exporter
* the port of the exporter (varies according to the exporter, 9100
for the node exporter)
* how often to scrape the target, if non-default (default: 15s)
TPA then needs to add the target to a new node `job` in the
`scrape_configs` section of `prometheus.yml`, which is managed by
Puppet in `profile::prometheus::server`.
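
As a rough illustration of what ends up in `scrape_configs`, the Python sketch below builds a job entry from the three pieces of information requested in the ticket. The keys (`job_name`, `scrape_interval`, `static_configs`, `targets`) are standard Prometheus configuration fields, but the `scrape_job` helper and the `exporter.example.org` hostname are hypothetical, and the real entry is generated by Puppet and may well look different.

```python
import json


def scrape_job(job_name, hostname, port=9100, scrape_interval=None):
    """Build one scrape_configs entry for a single static target."""
    job = {
        "job_name": job_name,
        "static_configs": [{"targets": [f"{hostname}:{port}"]}],
    }
    # Only override the interval when the ticket asks for a non-default one.
    if scrape_interval is not None:
        job["scrape_interval"] = scrape_interval
    return job


# Hypothetical target; JSON output is also valid YAML.
config = {"scrape_configs": [scrape_job("node", "exporter.example.org")]}
print(json.dumps(config, indent=2))
```

Leaving `scrape_interval` unset lets the target inherit the server-wide default, which is why the ticket only needs to mention it when it differs.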
## SLA
Prometheus is currently not doing alerting so it doesn't have any sort
@@ -172,10 +227,10 @@ and the Alertmanager can be configured with High availability.
## Issues
-There is no issue tracker specifically for this project, [File][] or
+There is no issue tracker specifically for this project, [File][new-ticket] or
 [search][] for issues in the [generic internal services][search] component.
- [File]: https://trac.torproject.org/projects/tor/newticket?component=Internal+Services%2FTor+Sysadmin+Team
+ [new-ticket]: https://trac.torproject.org/projects/tor/newticket?component=Internal+Services%2FTor+Sysadmin+Team
[search]: https://trac.torproject.org/projects/tor/query?status=!closed&component=Internal+Services%2FTor+Sysadmin+Team
## Monitoring and testing