[quote="qlddev, post:3, topic:10626, full:true"]
Thanks @JohnGill. I don’t know what happens between line 608 and 741, but why set the run_check to 0 if it already is.[/quote]
The process works like this:
- Matrix cron run #1 starts. It executes line 608 at the beginning of the run, setting run_check to 0 and running to 1. It then starts processing cron jobs one by one.
- If cron run #1 is still going after 15 minutes, cron run #2 will start. It will check the value of running to see whether a cron run is already in progress. Since there is one, it will increment run_check to 1 and then exit.
- If cron run #1 is still going after 30 minutes, cron run #3 will start. It will check running in the same way, discover that a run is in progress, and increment run_check to 2 before exiting.
- If cron run #1 is still going after 45 minutes, cron run #4 will start. It will check running in the same way, discover that a run is in progress, and increment run_check to 3 before exiting.
At this point, if the Matrix cron manager is using the default settings, run_check will have reached the Warn After Blocked Runs threshold of 3 (described in https://matrix.squiz.net/manuals/system-management/chapters/scheduled-jobs-manager#Options-Screen). This causes a cron deadlock warning e-mail to be sent, alerting administrator users to the fact that there may be a problem with the Matrix crons.
- If cron run #1 then ends up completing successfully (i.e. it didn’t die or hang, it was just taking a very long time), it will execute line 741 and reset run_check and running to 0. At this point the cron deadlock e-mails will stop and the deadlock will appear to have resolved itself.
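The sequence above can be modelled in a few lines. This is only an illustrative Python sketch of the logic as described in this post, not the actual Matrix code (which is PHP): the names CronState, start_cron_run and finish_cron_run are made up for the example, and it assumes the warning goes out whenever run_check is at or above the threshold.

```python
from dataclasses import dataclass

# Default "Warn After Blocked Runs" threshold mentioned in this post.
WARN_AFTER_BLOCKED_RUNS = 3


@dataclass
class CronState:
    """Hypothetical stand-in for the two values discussed above."""
    running: int = 0
    run_check: int = 0


def start_cron_run(state: CronState) -> str:
    """Model what a newly started cron run does and report the outcome."""
    if state.running:
        # Another run is in progress: count this blocked run, then exit.
        state.run_check += 1
        if state.run_check >= WARN_AFTER_BLOCKED_RUNS:
            # Assumption: the warning is sent once the threshold is reached.
            return "blocked, deadlock warning sent"
        return "blocked"
    # No run in progress (the "line 608" step): claim the lock, reset the counter.
    state.running = 1
    state.run_check = 0
    return "started"


def finish_cron_run(state: CronState) -> None:
    """Model the "line 741" step: release the lock and reset the counter."""
    state.running = 0
    state.run_check = 0


state = CronState()
# Run #1 starts and is still going when runs #2, #3 and #4 fire.
events = [start_cron_run(state) for _ in range(4)]
print(events)
# ['started', 'blocked', 'blocked', 'blocked, deadlock warning sent']

# Run #1 finally completes, so both values go back to 0.
finish_cron_run(state)
print(state)
```

In this model, nothing resets run_check while run #1 is still going, which is why the warnings only stop once run #1 finishes.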
That said, it is unusual for a cron run to take 21 hours. You can try to identify the long-running job by going through your Matrix logs and looking for lines that say “the value of attribute current_job for asset Scheduled Jobs Manager has been changed from X to Y”, matching the times when the cron deadlock began and ended.
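If the logs are large, a small script can pull out just those lines. The following is a rough Python sketch, assuming the log message is worded as quoted above (it tolerates optional quote marks around the attribute and asset names). The function name find_job_changes is made up for the example, and you would need to supply the path to your own log file.

```python
import re

# Pattern based on the message wording quoted in this post. The exact
# wording in your logs may differ, so adjust the pattern if nothing matches.
CURRENT_JOB_CHANGE = re.compile(
    r"attribute\s+[\"']?current_job[\"']?\s+for asset\s+"
    r"[\"']?Scheduled Jobs Manager[\"']?\s+has been changed from\s+"
    r"(?P<old>.+?)\s+to\s+(?P<new>.+)",
    re.IGNORECASE,
)


def find_job_changes(lines):
    """Return (line, old_value, new_value) for each current_job change line."""
    matches = []
    for line in lines:
        found = CURRENT_JOB_CHANGE.search(line)
        if found:
            matches.append((line.rstrip("\n"), found.group("old"), found.group("new")))
    return matches


# Example usage (replace the path with the location of your own log file):
# with open("/path/to/your/matrix/log") as log_file:
#     for line, old, new in find_job_changes(log_file):
#         print(line)
```

Each matching line should carry whatever timestamp your log format includes, so you can compare it against the times the deadlock e-mails started and stopped.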
Alternatively, if you have a service agreement with Squiz you can send in a ticket and ask our support team to investigate.