convert Puppet's cron resources into systemd timers
Puppet's built-in cron resource is kind of crap:
- if you first set a parameter like `hour` and then remove it, it doesn't turn it into `*`, e.g. if you have `cron { 'foo': hour => 5 }` and turn that into `cron { 'foo': }`, that's a noop but should logically turn it into a job that runs every minute
- if you add a resource to a manifest, then remove it, it doesn't get removed from the host, e.g. if you remove the `Cron['foo']` resource above, it will stay in the crontab
- it uses the `/var/spool/crontabs/root` file instead of the more readable and intelligible `/etc/cron.d`
- it was removed from core puppet and moved to a contrib module (see the deprecation notice and #41285 (closed))
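
For reference, here is a rough sketch of how the first two problems are usually worked around with the stock `cron` type. The job names and command path are hypothetical, and the behavior of `absent` and of purging should be checked against the Puppet version in use (ideally with `--noop` first), since purging removes every crontab entry Puppet doesn't manage:

```puppet
# Explicitly reset a field that used to be managed, instead of just
# deleting the parameter (which Puppet treats as "don't care").
cron { 'foo':
  command => '/usr/local/bin/foo',  # hypothetical command
  hour    => absent,
}

# Explicitly remove a job, instead of deleting the resource from the manifest.
cron { 'bar':
  ensure => absent,
}

# Alternatively, purge crontab entries that are no longer in the catalog.
resources { 'cron':
  purge => true,
}
```

None of this addresses the other two points: the jobs still live in the user crontab, and the type still comes from a contrib module.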
There are two options here.
- voxpupuli maintains what looks like an excellent cron module
- just ditch cron and turn everything into a systemd timer (that is what Wikimedia did)
The former is easier: the `cron::job` resource looks backwards compatible with the old `cron` type, except that it creates the resource in `/etc/cron.d` instead of `/var/...`
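
As a sketch of what that migration might look like, the `cron { 'foo': hour => 5 }` example could become something like the following. The parameter names are as I remember them from the voxpupuli module's README and should be verified against the installed version; the command path is hypothetical:

```puppet
# Assumes the voxpupuli cron module is installed; writes /etc/cron.d/foo.
cron::job { 'foo':
  minute      => '0',
  hour        => '5',
  date        => '*',
  month       => '*',
  weekday     => '*',
  user        => 'root',
  command     => '/usr/local/bin/foo',  # hypothetical command
  description => 'example job converted from the cron type',
}
```

Since each job is a separate file in `/etc/cron.d`, removing a job becomes a matter of ordinary file management rather than crontab editing.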
But I would very much like to use systemd timers instead: they provide built-in monitoring, as a failing timer shows up as a failed unit in systemd's internal status, which then triggers monitoring (as opposed to sending us email). It could also drastically reduce the amount of noise we're going through each morning, although that might be a problem if we actually rely on that output. We would probably need to go through each resource by hand to evaluate that anyway.
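
To give an idea of the shape of such a conversion, here is a minimal sketch using only core Puppet resources (a dedicated systemd module might be nicer, but that is not assumed here). The job name `foo`, the script path and the 05:00 schedule are hypothetical stand-ins for the `hour => 5` example above:

```puppet
# Hypothetical job name and script path, used only for illustration.
$job = 'foo'

# Reload systemd only when a unit file actually changes.
exec { 'systemd-daemon-reload':
  command     => 'systemctl daemon-reload',
  path        => ['/usr/bin', '/bin'],
  refreshonly => true,
}

# The service unit holds the command that used to be in the crontab.
file { "/etc/systemd/system/${job}.service":
  ensure  => file,
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
  content => @("UNIT"),
    [Unit]
    Description=${job} (converted from a cron job)

    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/${job}
    | UNIT
  notify  => Exec['systemd-daemon-reload'],
}

# The timer unit replaces the cron schedule (here: daily at 05:00).
file { "/etc/systemd/system/${job}.timer":
  ensure  => file,
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
  content => @("UNIT"),
    [Unit]
    Description=Run ${job} daily at 05:00

    [Timer]
    OnCalendar=*-*-* 05:00:00

    [Install]
    WantedBy=timers.target
    | UNIT
  notify  => Exec['systemd-daemon-reload'],
}

# Enable and start the timer, not the service itself.
service { "${job}.timer":
  ensure  => running,
  enable  => true,
  require => [
    File["/etc/systemd/system/${job}.service"],
    File["/etc/systemd/system/${job}.timer"],
    Exec['systemd-daemon-reload'],
  ],
}
```

With something like this, a failed run would show up in `systemctl --failed` and its output would land in the journal (`journalctl -u foo.service`) instead of in an email.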
Wikimedia has this trick to list hosts with the given resource:
cumin R:cron
Obviously, all hosts currently have a cron resource. But it's not as much work as I'd imagined:
puppetdb=# SELECT count(*),title FROM catalog_resources WHERE type = 'Cron' GROUP BY title ORDER by count(*) DESC;
count | title
-------+---------------------------------
87 | puppet-cleanup-clientbucket
81 | prometheus-lvm-prom-collector-
9 | prometheus-postfix-queues
6 | docker-clear-old-images
5 | docker-clear-nightly-images
5 | docker-clear-cache
5 | docker-clear-dangling-images
2 | collector-service
2 | onionoo-bin
2 | onionoo-network
2 | onionoo-service
2 | onionoo-web
2 | podman-clear-cache
2 | podman-clear-dangling-images
2 | podman-clear-nightly-images
2 | podman-clear-old-images
1 | update rt-spam-blocklist hourly
1 | update torexits for apache
1 | metrics-web-service
1 | metrics-web-data
1 | metrics-web-start
1 | metrics-web-start-rserve
1 | metrics-network-data
1 | rt-externalize-attachments
1 | tordnsel-data
1 | tpo-gitlab-backup
1 | tpo-gitlab-registry-gc
1 | update KAM ruleset
(28 rows)
That's 28 distinct resources to update, and many of them are basically the same (e.g. all the `podman` stuff is similar). Some must already be moved out of cron to be run as normal services (e.g. the metrics stuff).
I doubt we need the output of any of those, and it would be logged in journald anyway. In fact, it might even allow us to log more things, as we wouldn't have to deal with the resulting email...