service/crm.md +52 −0

@@ -156,6 +156,54 @@

to the underlying storage from the attacker. Then API key secrets
should probably be rotated; follow the [Rotating API tokens
procedure](#rotating-api-tokens).

### Jobs not running

If you get an alert about a "CiviCRM job failure", for example:

    The CiviCRM job send_scheduled_mailings on crm-int-01.torproject.org
    has been marked as failed for more than 4h. This could be that
    it has not run fast enough, or that it failed.

... it means a CiviCRM job (in this case `send_scheduled_mailings`)
has either failed or has not run in its configured time frame. (Note
that we currently can't distinguish those states, but hopefully [will
have metrics to do so
soon](https://gitlab.torproject.org/tpo/web/civicrm/-/issues/148).)

The "scheduled job failures" section will also show more information
about the error:

[screenshot]

To debug this, first find the "Scheduled Job Logs":

1. Go to Administer > System Settings > Scheduled Jobs
2. Find the affected job (`send_scheduled_mailings` in the example above)
3. Click "view log"

Here's a screenshot of such a log:

[screenshot]

This will show the error that triggered the alert:

- If it's an exception, it should be investigated in the source code.
- If the job just hasn't run in a timely manner, the systemd timer
  should be investigated with `systemctl status civicron@prod.timer`

There's also the global CiviCRM on-disk log. It's not perfect, because
on this server there are sometimes two different logs. It can also be
rather noisy, with deprecation alerts, civirules chatter, etc. The logs
are also available in "Administer > Administration Console > View Log"
in the web interface and are stored on disk, in:

    ls -altr /srv/crm.torproject.org/htdocs-prod/sites/default/files/civicrm/ConfigAndLog/CiviCRM.1.*.log

Note that it's also possible to run the jobs by hand, but we don't
have specific examples of how to do this for all jobs. See the Resque
process job, below, for a more specific example.

### Kill switch enabled

If the [Resque Processor Job](#queues) gets stuck because it failed to

@@ -167,6 +215,10 @@

switch:

[screenshot]

Note that this is a special case of the more general job failure
above. It's documented explicitly and separately here because it's
important enough to warrant its own documentation.

The "scheduled job failures" section will also show more information
about the error:
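As a rough sketch of the command-line checks described under "Jobs not running", the helpers below wrap the timer inspection and the on-disk log lookup. The function names (`latest_civicrm_log`, `check_timer`) and the `LOG_DIR` variable are hypothetical, and the `civicron@prod.service` unit name is an assumption based on the systemd default of a timer triggering the same-named service; only the timer name and log path come from the text above.

```shell
#!/bin/sh
# Log directory from the documentation above; LOG_DIR is a hypothetical
# override for testing or for other instances.
LOG_DIR="${LOG_DIR:-/srv/crm.torproject.org/htdocs-prod/sites/default/files/civicrm/ConfigAndLog}"

# Print the most recently modified CiviCRM log file in the given directory.
# Prints nothing if no log matches. Assumes log names contain no newlines.
latest_civicrm_log() {
    ls -t "$1"/CiviCRM.1.*.log 2>/dev/null | head -n 1
}

# Show the state of the timer, when it last and next fires, and recent
# output of the job it triggers.
check_timer() {
    systemctl status civicron@prod.timer
    systemctl list-timers civicron@prod.timer
    # Assumption: the timer triggers the same-named service unit.
    journalctl -u civicron@prod.service --since "4 hours ago"
}
```

With these defined, running `check_timer` followed by `tail -n 50 "$(latest_civicrm_log "$LOG_DIR")"` would show the timer state and the tail of the newest log. Since there are sometimes two different logs on this server, it may still be worth listing the directory with the `ls -altr` command above.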