Thursday 6 April 2017

Fix Expired Certificate Issue on Ganeti cluster

When the certificate expires on a Ganeti cluster, commands that talk to the master daemon fail, and starting the ganeti service produces error messages like these:
(0) root@server ~
# gnt-instance list
Cannot communicate with the master daemon.
Is it running and listening for connections?
(1) root@server ~
# gnt-cluster getmaster
sever.gnt6.fog.priv
(0) root@server ~
# ls /var/lib/ganeti/ssconf_master_node
/var/lib/ganeti/ssconf_master_node
(0) root@server ~
# cat /var/lib/ganeti/ssconf_master_node
sever.gnt6.fog.priv
(0) root@server ~
# /etc/init.d/ganeti start
Starting Ganeti cluster:ganeti-noded...done.
ganeti-masterd...ERROR:root:RPC error in master_info from node X: Error 60: server certificate verification failed. CAfile: /var/lib/ganeti/server.pem CRLfile: none
ERROR:root:RPC error in master_info from node X: Error 60: server certificate verification failed. CAfile: /var/lib/ganeti/server.pem CRLfile: none
ERROR:root:RPC error in master_info from node Y: Error 60: server certificate verification failed. CAfile: /var/lib/ganeti/server.pem CRLfile: none
ERROR:root:RPC error in master_info from node Z: Error 60: server certificate verification failed. CAfile: /var/lib/ganeti/server.pem CRLfile: none
CRITICAL:root:Cluster inconsistent, most of the nodes didn't answer after multiple retries. Aborting startup
CRITICAL:root:Use the --no-voting option if you understand what effects it has on the cluster state
failed (exit code 1).
ganeti-rapi...done.
ganeti-confd...done.
(0) root@server ~
# gnt-instance list
Cannot communicate with the master daemon.
Is it running and listening for connections?
(1) root@server ~
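The errors above name /var/lib/ganeti/server.pem as the CAfile, so that is the file to inspect. The following is a minimal sketch of one way to check its expiry date, assuming the openssl command-line tool is installed. The function name cert_status is a hypothetical helper, not part of Ganeti.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ganeti): report whether a PEM
# certificate has expired. Defaults to the path named in the errors.
cert_status() {
    cert="${1:-/var/lib/ganeti/server.pem}"
    if [ ! -r "$cert" ]; then
        echo "cannot read $cert" >&2
        return 2
    fi
    # Print the notAfter date of the certificate.
    openssl x509 -in "$cert" -noout -enddate
    # -checkend 0 exits non-zero if the certificate has already expired.
    if openssl x509 -in "$cert" -noout -checkend 0 >/dev/null; then
        echo "certificate is still valid"
    else
        echo "certificate has expired"
        return 1
    fi
}
```

After sourcing this in a root shell on the master, run cert_status with no arguments to check /var/lib/ganeti/server.pem, or pass another path to check a different file.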
Checking the certificate (/var/lib/ganeti/server.pem, the CAfile named in the errors) shows that it has expired. To fix this, back up the expired certificate on the current (but broken) master and create a new one:
# cp /var/lib/ganeti/server.pem ~/expired.server.pem
# openssl req -new -newkey rsa:2048 -days 1825 -nodes -x509 -keyout /var/lib/ganeti/server.pem -out /var/lib/ganeti/server.pem -batch
# chmod 0400 /var/lib/ganeti/server.pem
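Then run this script on the master to copy the new certificate to all other nodes in the cluster:
#!/bin/sh
# Replace X Y Z with the hostnames of the non-master nodes.
for i in X Y Z
do
       # Back up the expired certificate on the node.
       ssh "$i" "cp /var/lib/ganeti/server.pem ~/expired.server.pem"
       # Copy the new certificate from the master.
       scp /var/lib/ganeti/server.pem "$i":/var/lib/ganeti/server.pem
       ssh "$i" "chmod 0400 /var/lib/ganeti/server.pem"
       ssh "$i" "/etc/init.d/ganeti restart"
done
# Restart the master last.
/etc/init.d/ganeti restart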
Note that the script restarts the ganeti service on all the non-master nodes before restarting it on the master node.
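
To confirm that every node ended up with the same certificate as the master, the fingerprints can be compared. This is a sketch only, assuming openssl is installed on every node and that root can ssh to each of them as in the script above. The function names cert_fingerprint and check_nodes are hypothetical helpers, not Ganeti commands.

```shell
#!/bin/sh
# Hypothetical helper: print the SHA-256 fingerprint of the
# certificate stored in a PEM file.
cert_fingerprint() {
    # cut drops the "... Fingerprint=" label, whose spelling can
    # differ between OpenSSL versions.
    openssl x509 -in "$1" -noout -fingerprint -sha256 | cut -d= -f2
}

# Hypothetical helper: compare each node's certificate with the
# local (master) one. Arguments are the node names.
check_nodes() {
    pem=/var/lib/ganeti/server.pem
    master_fp=$(cert_fingerprint "$pem")
    [ -n "$master_fp" ] || return 1
    for i in "$@"; do
        node_fp=$(ssh "$i" "openssl x509 -in $pem -noout -fingerprint -sha256" | cut -d= -f2)
        if [ "$node_fp" = "$master_fp" ]; then
            echo "$i: certificate matches the master"
        else
            echo "$i: certificate DIFFERS from the master"
        fi
    done
}
```

On the master, after the restarts, this would be called as check_nodes X Y Z, with X Y Z replaced by the real node names. Once the master daemon is running again, gnt-cluster verify is a reasonable follow-up check.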