Cumin
Cumin is a tool to run arbitrary shell commands on Puppet hosts that match certain criteria. It can match classes, facts and other things stored in PuppetDB.
It is useful for ad hoc or emergency changes on a bunch of machines at once. It is especially useful to run Puppet itself on multiple machines at once to do progressive deployments.
It should not be used as a replacement for Puppet itself: most configuration on servers should not be done manually and should instead be done in Puppet manifests so that it can be reproduced and documented.
Installation
Debian package
Cumin has been available in the Debian archive since bookworm, so you can simply run:
sudo apt install cumin
If your distribution does not have a package available, you can also install Cumin in a Python virtualenv. See the Virtualenv / pip section below for how to do this.
Initial configuration
cumin is relatively useless for us if it doesn't query PuppetDB to resolve
which hosts to run commands on, so we want to get it to talk to PuppetDB. Also,
it gets pretty annoying to have to manually set up the ssh tunnel after getting
an error printed out by cumin, so we can get the tunnel set up automatically.
Once cumin is installed, drop the following configuration in
~/.config/cumin/config.yaml:
transport: clustershell
puppetdb:
    host: localhost
    scheme: http
    port: 6785
    api_version: 4  # Supported versions are v3 and v4. If not specified, v4 will be used.
clustershell:
    ssh_options:
        - '-o User=root'
log_file: cumin.log
default_backend: puppetdb
Now you can simply use an alias like the following:
alias cumin="cumin --config ~/.config/cumin/config.yaml"
while making sure that you set up an ssh tunnel manually before calling cumin, like the following:
ssh -L6785:localhost:8080 puppetdb-01.torproject.org
Or, instead of the alias and the ssh command, you can try setting up an
automatic tunnel upon calling cumin. See the following section to set that
up.
Automatic tunneling to puppetdb with bash + systemd unit
This trick makes sure that you never forget to set up the ssh tunnel to PuppetDB
before running cumin. This section will replace cumin by a bash function,
so if you created a simple alias like mentioned in the previous section, you
should start by getting rid of that alias. Lastly, this trick requires nc in
order to verify that the tunnel port is open, so install it with:
sudo apt install netcat-openbsd
To get the automatic tunnel, we'll create a systemd unit that can bring the
tunnel up for us. Create the file
~/.config/systemd/user/puppetdb-tunnel@.service, making sure to create the
missing directories in the path:
[Unit]
Description=Setup port forward to puppetdb
After=network.target
[Service]
ExecStart=-/usr/bin/ssh -W localhost:8080 puppetdb-01.torproject.org
StandardInput=socket
StandardError=journal
Environment=SSH_AUTH_SOCK=%t/gnupg/S.gpg-agent.ssh
The Environment setting is necessary for the ssh command to be able
to request the key from your YubiKey. This may vary according to your
authentication system. It's only there because systemd might not have
the right variables from your environment, depending on how it's started.
And you'll need the following for socket activation, in
~/.config/systemd/user/puppetdb-tunnel.socket:
[Unit]
Description=Socket activation for PuppetDB tunnel
After=network.target
[Socket]
ListenStream=127.0.0.1:6785
Accept=yes
[Install]
WantedBy=graphical-session.target
With this in place, make sure that systemd has loaded these unit files, then enable and start the socket:
systemctl --user daemon-reload
systemctl --user enable --now puppetdb-tunnel.socket
Note: if you already have a line like LocalForward 8080 127.0.0.1:8080
under a block for host puppetdb-01.torproject.org in
your ssh configuration, it will cause problems as ssh will try to
bind to the same socket as systemd. That configuration should be
removed.
The above can be tested by hand without creating any systemd configuration with:
systemd-socket-activate -a --inetd -E SSH_AUTH_SOCK=/run/user/1000/gnupg/S.gpg-agent.ssh -l 127.0.0.1:6785 \
ssh -o BatchMode=yes -W localhost:8080 puppetdb-01.torproject.org
The tunnel will be shut down as soon as it's done, and fired up as needed. You will need to tap your YubiKey, as normal, to get it to work of course.
This is different from a -N "daemon" configuration where the daemon
stays around for a long-lived connection. This is the only way we've
found to make it work with socket activation. The alternative to that
is to use a "normal" service that is not socket activated and start
it by hand:
[Unit]
Description=Setup port forward to puppetdb
After=network.target
[Service]
ExecStart=/usr/bin/ssh -nNT -o ExitOnForwardFailure=yes -o BatchMode=yes -L 6785:localhost:8080 puppetdb-01.torproject.org
Environment=SSH_AUTH_SOCK=/run/user/1003/gnupg/S.gpg-agent.ssh
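The bash function that replaces cumin, as described at the start of this section, could look like the following. This is a minimal sketch under assumptions, not necessarily the exact function used here: CUMIN_BIN and PUPPETDB_TUNNEL_PORT are hypothetical variable names introduced for illustration, and the sketch assumes the puppetdb-tunnel.socket unit from above is installed. It checks the tunnel port with nc and, if nothing is listening, starts the socket unit before calling cumin.

```shell
# Sketch of a cumin wrapper. CUMIN_BIN and PUPPETDB_TUNNEL_PORT are
# hypothetical variable names, cumin itself knows nothing about them.
cumin() {
    local port="${PUPPETDB_TUNNEL_PORT:-6785}"
    # nc -z only checks that something accepts connections on the port
    if ! nc -z localhost "$port" 2>/dev/null; then
        echo "no listener on localhost:$port, starting puppetdb-tunnel.socket" >&2
        systemctl --user start puppetdb-tunnel.socket || return 1
    fi
    # run the real binary with the per-user configuration file
    "${CUMIN_BIN:-/usr/bin/cumin}" --config "$HOME/.config/cumin/config.yaml" "$@"
}
```

With a virtualenv install, setting CUMIN_BIN to ~/.virtualenvs/cumin/bin/cumin in your shell startup file points the wrapper at that copy instead.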
Virtualenv / pip
If Cumin is not available from your normal packages (see bug 924685 for Debian), you must install it in a Python virtualenv.
First, install dependencies, Cumin and some patches:
sudo apt install python3-clustershell python3-pyparsing python3-requests python3-tqdm python3-yaml
python3 -m venv --system-site-packages ~/.virtualenvs/cumin
~/.virtualenvs/cumin/bin/pip3 install cumin
~/.virtualenvs/cumin/bin/pip3 uninstall tqdm pyparsing clustershell # force using trusted system packages
Now, if you follow the Initial configuration section above, you can either create an alias in the following way:
alias cumin="~/.virtualenvs/cumin/bin/cumin --config ~/.config/cumin/config.yaml"
Or you can instead use the automatic ssh tunnel trick above, making sure to change the path to cumin in the bash function.
Avoiding spurious connection errors by limiting batch size
If you use cumin to run ad-hoc commands on many hosts at once, you'll most probably want to look into setting yourself up for direct connection to the hosts, instead of passing through a jump host.
Without the above-mentioned setup, you'll quickly hit a problem where hosts give
you seemingly random ssh connection errors for a variable percentage of the host
list. This is because you are hitting ssh server limitations imposed on you on
the jump host. The ssh server uses the default value for its MaxStartups
option (10:30:100), which means that once 10 connections are open and still
waiting to authenticate, new connections are dropped with a 30% chance, and that
chance keeps rising as more connections pile up.
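As a worked example of that arithmetic, here is a small sketch that computes the drop chance for a given number of pending connections. It assumes the jump host runs OpenSSH with MaxStartups in its start:rate:full form and the default of 10:30:100. The function name maxstartups_drop_chance is made up for illustration.

```shell
# Chance (in percent) that sshd refuses a new connection, given how many
# unauthenticated connections are already pending. Defaults assume the
# OpenSSH MaxStartups default of 10:30:100.
maxstartups_drop_chance() {
    local pending="$1"       # unauthenticated connections already open
    local start="${2:-10}"   # below this, nothing is dropped
    local rate="${3:-30}"    # drop percentage once "start" is reached
    local full="${4:-100}"   # at or above this, everything is dropped
    if [ "$pending" -lt "$start" ]; then
        echo 0
    elif [ "$pending" -ge "$full" ]; then
        echo 100
    else
        # linear ramp from "rate" percent at "start" to 100 percent at "full"
        echo $(( rate + (100 - rate) * (pending - start) / (full - start) ))
    fi
}
```

For instance, maxstartups_drop_chance 9 gives 0 and maxstartups_drop_chance 10 gives 30, which is why a batch size of 10 or lower stays out of trouble.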
Again, it's recommended in this case to set yourself up for direct ssh connection to all of the hosts. But if you are not in a position where this is possible and you still need to go through the jump host, you can avoid weird issues by limiting your batch size to 10 or lower, e.g.:
cumin -b 10 'F:os.distro.codename=bookworm' 'apt update'
Note however that doing this will have the following effects:
- execution of the command on all hosts will be much slower
- if some hosts see command failures, cumin will stop processing your requested commands after reaching the batch size, so your command will possibly only run on 10 of the hosts.
Example commands
This will run the uptime command on all hosts:
cumin '*' uptime
To run against only a subset, you need to use the Cumin grammar, which is briefly described in the Wikimedia docs. For example, this will run the same command only on physical hosts:
cumin 'F:virtual=physical' uptime
You can invert a condition by placing 'not ' in front of it. For facts, you can also retrieve structured facts using Puppet's dot notation (e.g. 'networking.fqdn' to check the fqdn fact). Using these two techniques, the following example will run a command on all hosts that have not yet been upgraded to bookworm:
cumin 'not F:os.distro.codename=bookworm' uptime
To run against all hosts that have an ssl::service resource in their latest
built catalog:
cumin 'R:ssl::service' uptime
To run against only the dal ganeti cluster nodes:
cumin 'C:role::ganeti::dal' uptime
Or, the same command using the O: shortcut:
cumin 'O:ganeti::dal' uptime
To query any host that applies a certain profile:
cumin 'P:opendkim' uptime
And to query hosts that apply a certain profile with specific parameters:
cumin 'P:opendkim%mode = sv' uptime
Any Puppet fact or class can be queried that way. This also serves as
an ad hoc interface to query PuppetDB for certain facts, as you don't
have to provide a command. In that case, cumin runs in "dry mode"
and will simply show which hosts match the request:
$ cumin 'F:virtual=physical'
16 hosts will be targeted:
[...]
Mangling host lists for Cumin consumption
Say you have a list of hosts, separated by newlines. You want to run a command on all those hosts. You need to pass the list as comma-separated words instead.
Use the paste command:
cumin "$(paste -sd, < host-list.txt)" "uptime"
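If the host list may contain blank lines or comments, those need to be filtered out before joining. The following is a small sketch of that idea. The function name hosts_to_csv is hypothetical, and it assumes that comment lines start with "#".

```shell
# Turn a newline-separated host list into the comma-separated form,
# skipping blank lines and lines starting with "#".
hosts_to_csv() {
    grep -Ev '^[[:space:]]*(#|$)' "$1" | paste -sd, -
}
```

It can then be used as cumin "$(hosts_to_csv host-list.txt)" "uptime".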
Disabling touch confirmation
If running a command that takes longer than a few seconds, the cryptographic token will eventually block future connections and prompt for physical confirmation. This typically is not too much of a problem for short commands, but for long-running jobs, this can lead to timeouts if the operator is distracted.
The best way to work around this problem is to temporarily disable touch confirmation, for example with:
ykman openpgp keys set-touch aut off
cumin '*' ': some long running command'
ykman openpgp keys set-touch aut on
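To avoid forgetting the last step when the long-running command fails, the three commands can be wrapped in a function. This is only a sketch: the name with_touch_disabled is made up, it reuses the ykman invocations shown above as-is, and it does not handle the shell being interrupted (for example with Ctrl-C) before touch confirmation is turned back on.

```shell
# Run a command with YubiKey touch confirmation disabled, then re-enable it.
with_touch_disabled() {
    local rc=0
    ykman openpgp keys set-touch aut off || return 1
    # remember the command's exit status instead of stopping on failure
    "$@" || rc=$?
    # turn touch confirmation back on even if the command failed
    ykman openpgp keys set-touch aut on
    return "$rc"
}
```

For example: with_touch_disabled cumin '*' ': some long running command'. Note that a shell alias named cumin is not expanded here, while a shell function or a binary on the PATH is.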
Discussion
Alternatives considered
- Choria - to be evaluated
- Ansible?
- Puppet mcollective?
- simple SSH loop from LDAP output
- parallel-ssh
- see also the list in the Puppet docs
See also fabric.