do not write logs on caching servers
In #32239 (closed), a caching system was deployed with nginx. To get hit rate ratios, log files are written to disk, with IP address and user agents anonymized. That's okay-ish: it's not as well anonymized as our apache log files because it's not possible to have a per-day granularity in timestamps.
From there, mtail wakes up once in a while, parses those log files and counts things, which are exposed as metrics picked up by Prometheus. That in turn gives us pretty Prometheus graphs and makes us feel better about ourselves.
Ideally, though, we wouldn't have log files at all and would pipe things directly into mtail. But we don't want to hang the webserver while waiting for mtail (which can be a little flaky), so the typical way to deal with this is to pipe logs into syslog first.
I couldn't immediately figure out how to do this during deployment, so I'm opening this ticket to make sure we eventually make that conversion.
One problem I had is that the syslog-ng config sends all logs to the central logging server. If we start pushing web hits into syslog, this could become unwieldy, to say the least, mostly in terms of performance, but also privacy.
It's also not clear to me how to send logs from syslog into mtail without hitting the disk in the first place.
So the checklist is:
- how to send logs from nginx to syslog (access_log syslog:server=unix:/dev/log,facility=local3,tag=nginx_access extended; seems to be the magic config in nginx)
- how to avoid sending those logs to the central server
- how to send those logs (and only those) into mtail
All of this should be automatically configured in Puppet as well.
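For the second and third checklist items, here is an untested sketch of what the syslog-ng side might look like. It assumes the nginx access_log line from the checklist (facility local3, tag nginx_access), that the local log source is named s_src (the Debian default, which may differ here), and that mtail can be pointed at a named pipe. The fifo path is hypothetical and may need to be created beforehand with mkfifo.

```
# match only the nginx access log entries (facility and tag taken from
# the access_log line in the checklist)
filter f_nginx_access { facility(local3) and program("nginx_access"); };

# hypothetical fifo that mtail would read from instead of a file on disk
destination d_mtail { pipe("/run/mtail/nginx_access.fifo"); };

# flags(final) stops further processing of matching messages, so this
# statement must come before the one that forwards to the central server
log {
    source(s_src);
    filter(f_nginx_access);
    destination(d_mtail);
    flags(final);
};
```

Because flags(final) only affects log statements that come later in the configuration, the ordering of this snippet relative to the central forwarding rule is what would keep web hits off the central server.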