Commit 817986e9 in Cecylia Bocovich / Wiki Replica, authored 3 years ago by anarcat
document the external git repo
Parent: 5ad5cac0
howto/prometheus.md (59 additions, 13 deletions)
@@ -72,8 +72,11 @@ Once you have an exporter endpoint (say at
curl http://example.com:9090/metrics
This should return a number of metrics that change (or not) at each
-call. From there on, provide that endpoint to the sysadmins, which
-will follow the next procedure to add the metric to Prometheus.
+call.
+
+From there on, provide that endpoint to the sysadmins (or someone with
+access to the external monitoring server), who will follow the
+procedure below to add the metric to Prometheus.
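The check described above (call the endpoint and see whether the metrics change between calls) can be sketched in Python. This is only an illustration: `parse_metrics` and `changed_samples` are hypothetical helper names, the sample text is made up, and the parser ignores optional timestamps and label values containing spaces.

```python
def parse_metrics(text):
    """Parse Prometheus text exposition format into {sample: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: the value is the last space-separated field
        # (no trailing timestamp, no spaces inside label values).
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def changed_samples(before, after):
    """Return the sample names whose value differs between two scrapes."""
    common = before.keys() & after.keys()
    return sorted(name for name in common if before[name] != after[name])


# Made-up output of two successive calls to the /metrics endpoint.
FIRST = """# HELP demo_requests_total Requests served.
# TYPE demo_requests_total counter
demo_requests_total{code="200"} 10
demo_build_info 1
"""
SECOND = FIRST.replace('{code="200"} 10', '{code="200"} 12')

print(changed_samples(parse_metrics(FIRST), parse_metrics(SECOND)))
```

In practice the two texts would come from two `curl` calls against the exporter endpoint; the strings are inlined here so the sketch runs without network access.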
Once the exporter is hooked into Prometheus, you can browse the
metrics directly at: <https://prometheus.torproject.org>. Graphs
...
...
@@ -82,7 +85,41 @@ those need to be created and committed into git by sysadmins to
persist, see the [anarcat dashboard directory](https://gitlab.com/anarcat/grafana-dashboards) for more
information.
-## Adding metrics for admins
+## Adding targets on the external server
+
+Alerts and scrape targets on the external server are managed through a
+Git repository called
+[prometheus-alerts](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts).
+To add a scrape target:
+
+ 1. clone the repository:
+
+        git clone https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/
+        cd prometheus-alerts
+
+ 2. assuming you're adding a node exporter, create the targets file:
+
+        cat > targets.d/node_myproject.yaml <<EOF
+        # scrape the external node exporters for project Foo
+        ---
+        - targets:
+          - targetone.example.com
+          - targettwo.example.com
+        EOF
+
+ 3. add, commit, and push:
+
+        git checkout -b myproject
+        git add targets.d
+        git commit -m"add node exporter targets for my project"
+        git push origin -u myproject
+
+The last push command should show you the URL where you can submit
+your merge request.
+
+After being merged, the changes should propagate within
+[4 to 6 hours](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/puppet/#cron-and-scheduling).
+
+See also the
+[targets.d documentation in the git repository](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/tree/main/targets.d).
+
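For projects with many hosts, the targets file from step 2 could be generated rather than typed by hand. The following is a minimal sketch, assuming the file only needs the comment, the `---` marker and a single `targets` list as in the example above; `render_targets` is a hypothetical helper, not part of the prometheus-alerts repository.

```python
def render_targets(hosts, comment):
    """Render a targets.d file with one target group, as in step 2."""
    if not hosts:
        raise ValueError("at least one target is required")
    lines = [f"# {comment}", "---", "- targets:"]
    # One list entry per host, indented under the "targets" key.
    lines += [f"  - {host}" for host in hosts]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        render_targets(
            ["targetone.example.com", "targettwo.example.com"],
            "scrape the external node exporters for project Foo",
        ),
        end="",
    )
```

The output can be redirected to a file such as `targets.d/node_myproject.yaml` before the add, commit, and push step.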
+## Adding targets on the internal server

TODO: talk about `scrape_jobs` for in-puppet configurations.
...
...
@@ -118,11 +155,12 @@ Alerting Overview](https://prometheus.io/docs/alerting/latest/overview/) but I h
have instead been following [this tutorial](https://ashish.one/blogs/setup-alertmanager/) which was quite
helpful.
-### Adding alerts
+### Adding alerts in Puppet

-The Alertmanager is currently managed through Puppet, in
-`profile::prometheus::server::external`. An alerting rule is defined
-like:
+The Alertmanager can be (but currently isn't, on the external server)
+managed through Puppet, in `profile::prometheus::server::external`.
+
+An alerting rule, in Puppet, is defined like:
{
'name' => 'bridgestrap',
...
...
@@ -146,13 +184,21 @@ like:
],
},
The key part of the alert is the `expr` setting, which is a PromQL
expression that, when evaluated to "true" for more than `5m` (the
`for` setting), will fire an alert at the Alertmanager. Also note
the `team` label, which will route the message to the right team. Those
routes are defined later, in the `routes` and `receivers` settings.
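To make the role of `for` concrete, here is a small Python simulation of the behaviour described above: the alert only fires once the expression has been continuously true for the whole hold period, and any evaluation where it is false resets the clock. This is an illustrative model, not Prometheus code; `first_firing_time` and the one-minute evaluation interval are assumptions made for the example.

```python
from datetime import datetime, timedelta


def first_firing_time(evaluations, hold=timedelta(minutes=5)):
    """Return the time at which the alert would start firing, or None.

    `evaluations` is a time-ordered list of (timestamp, expr_is_true) pairs.
    """
    active_since = None
    for when, is_true in evaluations:
        if not is_true:
            active_since = None  # condition cleared: the pending state resets
            continue
        if active_since is None:
            active_since = when  # condition just became true: alert is pending
        if when - active_since >= hold:
            return when  # true for the whole hold period: alert fires
    return None


if __name__ == "__main__":
    start = datetime(2024, 1, 1)
    # Hypothetical evaluations, one per minute; the expression is false at minute 3.
    flapping = [(start + timedelta(minutes=m), m != 3) for m in range(11)]
    print(first_firing_time(flapping))
```

In the flapping example the expression is true again from minute 4 onwards, so the alert only fires at minute 9 rather than minute 5.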
+Note that we might want to move those to Hiera so that we could use
+YAML code directly, which would better match the syntax of the actual
+alerting rules.
+
+### Adding alerts through Git, on the external server
+
+The external server regularly pulls a
+[git repository](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/)
+for alerting and targets. Alerts can be added through that repository by
+adding a file in the `rules.d` directory; see the
+[rules.d](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/tree/main/rules.d)
+directory for more documentation on that.
+
+Note that alerts (probably?) do not take effect until a sysadmin
+reloads Prometheus.

-Note that those might move to separate files and/or Hiera later on.
+TODO: confirm how rules are deployed.
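Before submitting a new rule, it may help to check that it carries the fields this page relies on. The sketch below assumes that files in `rules.d` use the standard Prometheus alerting rule fields (`alert`, `expr`, `for`, `labels`), which this page does not confirm; `rule_problems`, the alert name and the team name are all hypothetical.

```python
def rule_problems(rule):
    """Return a list of human-readable problems found in an alerting rule."""
    problems = []
    for key in ("alert", "expr"):
        if not rule.get(key):
            problems.append(f"missing required field: {key}")
    # The team label is what routes the alert to the right team.
    labels = rule.get("labels") or {}
    if "team" not in labels:
        problems.append("missing labels.team, so the alert cannot be routed")
    return problems


if __name__ == "__main__":
    # Hypothetical rule, written as a Python dict for the example.
    rule = {
        "alert": "ExampleTargetDown",
        "expr": "up == 0",
        "for": "5m",
        "labels": {"team": "example-team"},
    }
    print(rule_problems(rule) or "no problems found")
```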
### Adding alert recipients
...
...