After upgrading to 13.0 74.14+ from an older release, config sync has sometimes been observed to fail continuously in HA/Cluster deployments. The failure can have several causes, such as:
- Internal user login is disabled but ns_comm_key is not configured
- The ssh_host_rsa_key private and public keys do not match
- The RPC node password does not match across nodes
- A disk space issue, or some unknown reason that has not yet been identified
Go through the article https://support.citrix.com/article/CTX214822 to determine whether the issue is caused by case 1 (the first cause listed above).
If the issue is not caused by case 1, check for case 2 (the second cause listed above). This article discusses case 2.
- To make config sync more secure, “StrictHostKeyChecking” is now enabled when the ssh connection to the Primary/CCO node is made.
- The content of the CCO/Primary node’s ssh_host_rsa_key.pub file is fetched over a secure channel.
- The NON-CCO/Secondary node uses this public key to authenticate the identity of the CCO/Primary node through StrictHostKeyChecking.
- This ensures that the NON-CCO/Secondary node is establishing a connection with the right CCO/Primary node.
- In the case of a man-in-the-middle attack, config sync fails because the public key sent by the attacker does not match the fetched public key.
- However, if the CCO/Primary node itself does not have the correct entry in ssh_host_rsa_key.pub, the StrictHostKeyChecking check fails, which leads to continuous config sync failure.
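The pinning idea in the points above can be illustrated with standard OpenSSH options. This is only a sketch of the general technique, not the appliance's internal implementation: the helper name make_known_hosts_entry, the address 192.0.2.10, the user name, and the /tmp paths are all hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical helper: build a known_hosts line ("host keytype key-string")
# from a host address and a public key file fetched over a trusted channel.
make_known_hosts_entry() {
    host="$1"
    pubfile="$2"
    # Keep only the key type and key string; drop any trailing comment.
    awk -v h="$host" 'NF >= 2 { print h, $1, $2; exit }' "$pubfile"
}

# Example use (commented out; address, user, and paths are placeholders):
# make_known_hosts_entry 192.0.2.10 /tmp/primary_host_key.pub > /tmp/pinned_known_hosts
# ssh -o StrictHostKeyChecking=yes \
#     -o UserKnownHostsFile=/tmp/pinned_known_hosts \
#     user@192.0.2.10 true
```

With StrictHostKeyChecking=yes, ssh refuses to connect when the host key presented by the server differs from the pinned entry. That is the same outcome described above when ssh_host_rsa_key.pub does not correspond to the private key the node actually uses.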
Identify issue:
Run the command below on the CCO/Primary node to check whether the public and private keys match:
$ ssh-keygen -y -f /nsconfig/ssh/ssh_host_rsa_key
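The above command prints the public key derived from the private key. This is the public key that the CCO/Primary node sends to the NON-CCO/Secondary node during the ssh connection. Compare the output with the content of “/nsconfig/ssh/ssh_host_rsa_key.pub”. The key type and key string should be the same (a trailing comment such as root@ns in the .pub file is not part of the key). For example: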
$ ssh-keygen -y -f /nsconfig/ssh/ssh_host_rsa_key
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC/mUBEWQVdgwEAyF1/RyN6ZS0WVM45brXptqk95pHu6dF41LDLByCT4PH5mqhWhneh6nsdS3XcgPTCgtZP7R4XDh4X3xJ9tje26VCCDcFApB9OGXxdTSqS0X+vVwLLW9uX93xpxfNRCI7pJqflhe3xCgjE6SN6wfgLfnj+7xfikkbxDUF5KSXJglHHz1jkLbSX2fCMEnv6KxhMUBDefGpY9QXqr1Ea62qaigOnMcknC5kbZE3oeaEiUYw25YfdqDE70yUx8rnZmbEMdQT3ldedRprKSiTVpubwFmiHDGJKp/LBOowZvZy/zFCwJdvBZRcePRcpFiMHxc+l0e2cKyVN
$ cat /nsconfig/ssh/ssh_host_rsa_key.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC/mUBEWQVdgwEAyF1/RyN6ZS0WVM45brXptqk95pHu6dF41LDLByCT4PH5mqhWhneh6nsdS3XcgPTCgtZP7R4XDh4X3xJ9tje26VCCDcFApB9OGXxdTSqS0X+vVwLLW9uX93xpxfNRCI7pJqflhe3xCgjE6SN6wfgLfnj+7xfikkbxDUF5KSXJglHHz1jkLbSX2fCMEnv6KxhMUBDefGpY9QXqr1Ea62qaigOnMcknC5kbZE3oeaEiUYw25YfdqDE70yUx8rnZmbEMdQT3ldedRprKSiTVpubwFmiHDGJKp/LBOowZvZy/zFCwJdvBZRcePRcpFiMHxc+l0e2cKyVN root@ns
If they match, config sync is failing for some other reason, and the root cause needs further investigation. If they do not match, the ssh_host_rsa_key key pair needs to be regenerated.
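The manual comparison can also be scripted. The following is a minimal sketch, assuming ssh-keygen and awk are available in the shell on the node; the function names key_fields and check_host_key are hypothetical. It compares only the key type and key string, so a trailing comment such as root@ns does not cause a false mismatch.

```shell
#!/bin/sh
# Print only the key type and key string (fields 1 and 2) of a public key
# line read from stdin, dropping any trailing comment.
key_fields() {
    awk 'NF >= 2 { print $1, $2; exit }'
}

# Compare the public key derived from the private key with the stored .pub file.
# Paths default to the ones used in this article.
# Returns 0 on match, 1 on mismatch, 2 if either key could not be read.
check_host_key() {
    priv="${1:-/nsconfig/ssh/ssh_host_rsa_key}"
    pub="${2:-${priv}.pub}"
    if [ ! -r "$priv" ] || [ ! -r "$pub" ]; then
        echo "ERROR: cannot read $priv or $pub" >&2
        return 2
    fi
    derived=$(ssh-keygen -y -f "$priv" 2>/dev/null | key_fields)
    stored=$(key_fields < "$pub")
    if [ -z "$derived" ] || [ -z "$stored" ]; then
        echo "ERROR: could not extract a public key" >&2
        return 2
    fi
    if [ "$derived" = "$stored" ]; then
        echo "MATCH"
    else
        echo "MISMATCH"
        return 1
    fi
}
```

Calling check_host_key with no arguments on the CCO/Primary node checks the default paths. MATCH corresponds to the "some other reason" branch above, and MISMATCH corresponds to the case where the key pair needs to be regenerated.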