Centova Technologies Forum

Centova Cast v3 => Bugs and issues => Topic started by: Centova - Steve B. on January 20, 2012, 03:55:00 pm

Title: Diagnosing cc-control crashes (aka "Cluster host connection failure")
Post by: Centova - Steve B. on January 20, 2012, 03:55:00 pm
If you receive the following message:
Code:
Cluster host connection failure for Local server: Connection refused (111)
...then the cc-control process has exited for some reason.
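
To confirm that the process really is gone before going any further, a quick check along these lines may help. This is only a sketch: it assumes the standard pgrep utility is installed, and is_running is a made-up helper name rather than anything shipped with Centova Cast.

```shell
#!/bin/sh
# Sketch: report whether a process with exactly the given name is running.
# "is_running" is a hypothetical helper; pgrep -x matches the exact process name.
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

if is_running cc-control; then
    echo "cc-control is running"
else
    echo "cc-control is not running"
fi
```

If this reports that cc-control is running but you still see the error, the process has most likely been restarted since the message appeared.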

UPDATE - December 2013 - IMPORTANT!

We're finding that a lot of clients are skipping the initial steps in this article and jumping straight into debugging with gdb or enabling core dumps.  This is counterproductive.

The very first step you should take when diagnosing this problem is to try starting Centova Cast again:

Code:
/etc/init.d/centovacast start


Debugging a crash

If cc-control is actually crashing and you wish to determine why, you can diagnose it as follows.

Update Centova Cast to the latest build

If you believe cc-control is in fact crashing with a segfault, run the update command to ensure that you are running the very latest build.

The data collected by the procedure below relies on values built into the executable file which change every time we rebuild cc-control. Accordingly, the data is only useful if we test it against the exact same build of cc-control that you are using. If you are using an outdated build of cc-control -- even if it's just a couple of days old -- the data will be totally useless to us.

So even if you think that you are running the latest build, update anyway:

Code:
/usr/local/centovacast/sbin/update

Enable core dumps

Enable core dumps on your server by running the following command as root:

Code:
/usr/local/centovacast/sbin/enable_coredumps start
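
If you are curious what your system's generic core dump settings look like, the sketch below prints them. Note the assumptions: this article does not say how enable_coredumps works internally, so these values may or may not change when you run it, and show_core_settings is just an illustrative name. The size limit shown is the one for your current shell, not necessarily the one cc-control runs with.

```shell
#!/bin/sh
# Sketch: print the generic Linux core dump settings visible to this shell.
# Whether enable_coredumps changes these values is an assumption, not something
# stated in the article. "show_core_settings" is a hypothetical helper name.
show_core_settings() {
    if [ -r /proc/sys/kernel/core_pattern ]; then
        # Kernel template that decides where core files are written and how they are named.
        echo "core_pattern: $(cat /proc/sys/kernel/core_pattern)"
    else
        echo "core_pattern: unavailable"
    fi
    # Core file size limit for this shell; 0 means core files are disabled here.
    echo "core size limit: $(ulimit -c)"
}

show_core_settings
```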

Wait for a core dump

Periodically check /var/spool/coredumps and look for files named core.cc-control_*. As soon as at least one such file exists, zip (or tar/gzip) it up and send it to us in a support ticket.
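
The check-and-package step above could be scripted roughly as follows. Treat it as a sketch only: the directory and file pattern come from this article, but collect_cores and the archive name are made up for illustration.

```shell
#!/bin/sh
# Sketch: bundle any cc-control core dumps into one archive for a support ticket.
# The directory and file pattern come from the article; the function name
# "collect_cores" and the archive name are illustrative assumptions.
collect_cores() {
    core_dir="$1"   # directory to search, e.g. /var/spool/coredumps
    archive="$2"    # tar.gz file to create

    # ls fails when the pattern matches nothing, i.e. no core dump exists yet.
    if ! ls "$core_dir"/core.cc-control_* >/dev/null 2>&1; then
        echo "no cc-control core dumps in $core_dir yet"
        return 1
    fi

    # tar may warn that it strips the leading "/" from member names; that is harmless.
    tar -czf "$archive" "$core_dir"/core.cc-control_* && echo "created $archive"
}

# Example:
#   collect_cores /var/spool/coredumps /root/cc-control-cores.tar.gz
```

The function returns non-zero while no matching file exists, so it can be run repeatedly (from cron, for example) until a core dump shows up.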

Disable core dumps

Once you've sent us a core dump, disable core dumps by running:

Code:
/usr/local/centovacast/sbin/enable_coredumps stop

Also note that recent builds of Centova Cast include crash detection and recovery code. In the event of a control daemon crash, Centova Cast will automatically restart cc-control within a minute or two, and you may not even notice it was down. Be sure to periodically check the debug log file to see whether a crash has been recorded.

Happy debugging. :)