Investigate stunnel outage on crm-ext-01
On 6/12, we started getting a lot of error messages from donate.torproject.org. Antoine looked into it, determined that stunnel was the problem, and restarted it, which seems to have fixed things.
When stunnel is down, donations don't get queued into Redis, so they don't get recorded in the CRM and receipts don't get sent. Mailing list signups don't get recorded either. It's very time-consuming to get that data into the CRM from the error emails.
We'd like to see if we can figure out what happened to cause the stunnel to stop working and then make it more reliable.
I have seen very brief outages in the past.
Here's what I'm thinking we should try:
- Could someone turn up the logging level for stunnel on crm-ext-01 to info (6)? If its output isn't already going to its own syslog file, please separate it out so it's easy to debug.
- Then the next time I get an error message about it, we can look through the logs, and maybe that will give us some idea of what the problem is.
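
For the first item, here is a rough sketch of what the stunnel side of the change might look like. I haven't seen the actual config on crm-ext-01, so the `local5` facility is just a placeholder I picked, and the syslog daemon would still need its own rule to send that facility to a separate file (not shown here).

```
; stunnel.conf, global options section (must come before any [service] section)
; Sketch only: the facility name below is an assumption, not the current setup.

; Raise verbosity from the default notice (5) to info (6).
; The optional FACILITY. prefix tags the messages so the syslog daemon
; can route them to their own file.
debug = local5.info
```

stunnel would need a restart (or reload) to pick this up. If routing through syslog turns out to be awkward, stunnel's `output` option, which appends log messages to a file directly, might be a simpler alternative.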