If you receive the following message:
Cluster host connection failure for Local server: Connection refused (111)
...then the cc-control process has exited for some reason.
UPDATE - December 2013 - IMPORTANT!
We're finding that a lot of clients are skipping the initial steps in this article and jumping straight into debugging with gdb or enabling core dumps. This is counterproductive.
The very first steps you should take when diagnosing this problem are as follows:
- Ensure that cc-control hasn't simply been shut down. Try running:
If cc-control starts up and the problem does not recur, then there is nothing to diagnose and you can stop here.
If cc-control gives an error at this point, then send us the error. No need to troubleshoot beyond this point.
- Normally, Centova Cast's cron job will automatically restart cc-control within 60 seconds if it exits for ANY reason. If it remains down for more than a minute, it's likely that there is a problem with your cron job or that your firewall is blocking access to localhost on port 2198. Check your cron logs (/var/log/cron or /var/log/messages) to determine why the "/etc/init.d/centovacast check" cron job is not running correctly. This cron job should be configured in /etc/cron.d/centovacast.
- If you believe cc-control is crashing and wish to troubleshoot cc-control to determine why, continue reading, but please read carefully. As explained below, after enabling core dumps, you MUST wait until the next time cc-control crashes.
Debugging a crash
If cc-control is actually crashing and you wish to determine why, you can diagnose it as follows.
Update Centova Cast to the latest build
If you believe cc-control is in fact crashing with a segfault, run the update command to ensure that you are running the very latest build.
The data collected by the procedure below relies on values built into the executable file which change every time we rebuild cc-control. Accordingly, the data is only useful if we test it against the exact same build of cc-control that you are using. If you are using an outdated build of cc-control -- even if it's just a couple of days old -- the data will be totally useless to us.
So even if you think that you are running the latest build, update anyway:
/usr/local/centovacast/sbin/update
Enable core dumps
Enable core dumps on your server by running the following command as root:
/usr/local/centovacast/sbin/enable_coredumps start
Wait for a core dump
Periodically check /var/spool/coredumps and look for files named core.cc-control_*. As soon as at least one such file exists, zip (or tar/gzip) it up and send it to us in a support ticket.
Disable core dumps
Once you've sent us a core dump, disable core dumps by running:
Also note that recent builds of Centova Cast include crash detection and recovery code. In the event of a control daemon crash, Centova Cast will automatically restart cc-control within a minute or two, and you may not even notice it was down, so be sure to periodically check the debug log file to see if a crash has been recorded.
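If you would rather script the routine checks than run them by hand, the cron job check and the "Wait for a core dump" step can be sketched as a few shell functions. This is only a rough sketch, not part of Centova Cast: the function names are hypothetical, the default archive path is an arbitrary choice, and it assumes that the cron file contains the literal text "centovacast check". The paths and the core.cc-control_* pattern are the ones given in this article.

```shell
#!/bin/sh
# Hypothetical helper functions; they are not shipped with Centova Cast.

# Succeeds if the cron file exists and mentions the "centovacast check" job.
# Assumption: the job appears in the file as the literal text "centovacast check".
cron_job_configured() {
    cron_file="${1:-/etc/cron.d/centovacast}"
    [ -r "$cron_file" ] && grep -q 'centovacast check' "$cron_file"
}

# Prints the path of every cc-control core dump found, one per line.
list_core_dumps() {
    dump_dir="${1:-/var/spool/coredumps}"
    for f in "$dump_dir"/core.cc-control_*; do
        # An unmatched glob stays literal, so confirm the file really exists.
        [ -f "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Packs all cc-control core dumps into one gzip-compressed tarball.
# The archive path must be absolute because tar runs inside the dump directory.
archive_core_dumps() {
    dump_dir="${1:-/var/spool/coredumps}"
    archive="${2:-/tmp/cc-control-cores.tar.gz}"
    # Nothing to send yet if no core dump has been written.
    [ -n "$(list_core_dumps "$dump_dir")" ] || return 1
    # Change directory first so the archive holds bare file names.
    (cd "$dump_dir" && tar -czf "$archive" core.cc-control_*)
}
```

For example, after sourcing the file as root, `cron_job_configured || echo "cron job missing"` reports a missing cron entry, and `archive_core_dumps /var/spool/coredumps /root/cc-control-cores.tar.gz` returns a non-zero status until at least one core dump exists, then writes an archive you can attach to a support ticket.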