Cluster
This guide explains how to troubleshoot cluster-related aspects of the Anapaya appliances.
Current configuration and state
Retrieve the current cluster configuration from the appliance:
appliance-cli get config -f body.config.cluster
Get the current cluster sync state of the appliance:
appliance-cli get debug/cluster/status
Cluster is configured with static topology synchronization:
appliance-cli get debug/cluster/status
{
"mode": "static"
}
Cluster is configured with dynamic topology synchronization:
appliance-cli get debug/cluster/status
{
"mode": "dynamic",
"peers": [
{
"address": "10.1.0.1:42003",
"last_sync_attempt": "2024-02-13T12:49:16.903700998Z",
"name": "peer-name",
"status": "success"
}
]
}
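When a cluster has many peers, scanning the status output by eye gets tedious. The following is a minimal sketch, assuming the output of appliance-cli get debug/cluster/status has been captured as a JSON string with the fields shown above. The function name failing_peers is hypothetical, and so is the status value "failed" in the sample: the only value confirmed by the output above is "success", so anything else is treated as a failure.

```python
import json


def failing_peers(status_json: str) -> list[dict]:
    """Return the peers whose last sync did not report "success"."""
    status = json.loads(status_json)
    # In static mode the status output carries no peer list to inspect.
    if status.get("mode") != "dynamic":
        return []
    return [p for p in status.get("peers", []) if p.get("status") != "success"]


if __name__ == "__main__":
    # Hypothetical sample; "failed" is an assumed status value.
    sample = """
    {
      "mode": "dynamic",
      "peers": [
        {"address": "10.1.0.1:42003",
         "last_sync_attempt": "2024-02-13T12:49:16.903700998Z",
         "status": "success"},
        {"address": "10.1.0.2:42003",
         "last_sync_attempt": "2024-02-13T12:49:17.103700998Z",
         "status": "failed"}
      ]
    }
    """
    for peer in failing_peers(sample):
        print(peer["address"], peer["last_sync_attempt"])
```

An empty result in dynamic mode means every listed peer reported a successful last sync attempt.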
When to enable dynamic topology synchronization?
We recommend enabling dynamic topology synchronization for CORE appliances. This way, the configuration does not need to be updated when the topology changes, for instance when a new SCION link is added. We recommend not enabling dynamic topology synchronization for EDGE appliances.
Common problems
Static cluster config does not match the other appliances
Compare the scion and scion_tunneling configuration sections with the cluster configuration section of all other appliances in the cluster and make sure that the values match.
Example
Get the SCION section on EDGE:
appliance-cli get config -f body.config.scion
Get the SCION section of the cluster config on a peer EDGE. This assumes there is only one peer and one AS configured.
appliance-cli get config -f body.config.cluster.peers[0].scion.ases[0]
In the two outputs, the following values should match:
isd_as
shard_id
control.address
neighbors[].neighbor_isd_as
neighbors[].relationship
- Neighbor interfaces
interface_id
scion_mtu
next_hop in the cluster config should match the router.internal_interface of the SCION config.
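The field-by-field comparison can be partly automated. The following is a minimal sketch, assuming both command outputs have been saved and parsed into Python dictionaries that each represent a single AS entry. The helper names lookup and mismatches are hypothetical, and the sample values are made up. Only the top-level fields and control.address are compared here, because the exact nesting of the neighbor and interface entries is not shown in this guide.

```python
from typing import Any


def lookup(doc: dict, dotted_path: str) -> Any:
    """Follow a dotted path such as "control.address"; None if a key is absent."""
    node: Any = doc
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def mismatches(scion_cfg: dict, cluster_as_cfg: dict, paths: list[str]) -> dict[str, tuple]:
    """Map each differing path to its (SCION config, cluster config) value pair."""
    result = {}
    for path in paths:
        local, remote = lookup(scion_cfg, path), lookup(cluster_as_cfg, path)
        if local != remote:
            result[path] = (local, remote)
    return result


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    scion_cfg = {"isd_as": "1-ff00:0:110", "shard_id": 1,
                 "control": {"address": "10.0.0.1:30252"}}
    cluster_as_cfg = {"isd_as": "1-ff00:0:110", "shard_id": 2,
                      "control": {"address": "10.0.0.1:30252"}}
    fields = ["isd_as", "shard_id", "control.address"]
    for path, (local, remote) in mismatches(scion_cfg, cluster_as_cfg, fields).items():
        print(f"{path}: SCION config has {local!r}, cluster config has {remote!r}")
```

A path that is missing on both sides compares as equal, so an empty result only confirms agreement for fields that are actually present.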
Get the IP-in-SCION tunneling endpoint on the EDGE:
appliance-cli get config -f body.config.scion_tunneling.endpoint
Get the IP-in-SCION tunneling endpoint of the cluster config on a peer EDGE. This assumes there is only one peer and one AS configured.
appliance-cli get config -f body.config.cluster.peers[0].scion_tunneling.endpoint
In the two outputs, all the values in the cluster endpoint configuration should match the values in the IP-in-SCION tunneling endpoint configuration.
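Because the rule here is one-directional (every value in the cluster endpoint must also appear in the tunneling endpoint), a subset check fits better than a strict equality test. The following is a minimal sketch, assuming both endpoint outputs have been parsed into dictionaries. The function name unmatched_keys and the field names ip, port and enabled in the sample are hypothetical and do not reflect the real endpoint schema.

```python
def unmatched_keys(cluster_ep: dict, local_ep: dict, prefix: str = "") -> list[str]:
    """List dotted paths whose cluster value is missing from or differs in local_ep."""
    diffs = []
    for key, expected in cluster_ep.items():
        path = f"{prefix}{key}"
        actual = local_ep.get(key)
        if isinstance(expected, dict) and isinstance(actual, dict):
            # Descend into nested sections and keep the full path for reporting.
            diffs.extend(unmatched_keys(expected, actual, path + "."))
        elif expected != actual:
            diffs.append(path)
    return diffs


if __name__ == "__main__":
    # Hypothetical field names and values for illustration only.
    cluster_ep = {"ip": "10.0.0.1", "port": 30056}
    local_ep = {"ip": "10.0.0.1", "port": 30057, "enabled": True}
    print(unmatched_keys(cluster_ep, local_ep))
```

Keys that exist only in the tunneling endpoint configuration are ignored, which mirrors the wording of the rule above.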
Cluster synchronization fails
- Inspect the docker service logs of the control service and look for the entry Fetch from network failed..., which helps to determine the underlying issue.
- If the issue is network related, make sure that network connectivity works as expected by pinging the underlay network for the destination mentioned in the logs, and check that the MTU is consistent across the path.
- In case of a parsing error, investigate why the other party is sending malformed objects.
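To narrow down a long log, the relevant entries can be filtered out first. The following is a minimal sketch, assuming the control service logs have already been saved as plain text. The function name fetch_failures and the sample log lines are hypothetical, and the real log format may differ. The match is a plain substring search on the entry text quoted above.

```python
def fetch_failures(log_text: str, marker: str = "Fetch from network failed") -> list[str]:
    """Return the log lines that contain the failure marker."""
    return [line for line in log_text.splitlines() if marker in line]


if __name__ == "__main__":
    # Hypothetical log excerpt; the real line layout is not shown in this guide.
    sample_log = "\n".join([
        "2024-02-13T12:49:16Z info sync started",
        "2024-02-13T12:49:17Z error Fetch from network failed peer=10.1.0.1:42003",
        "2024-02-13T12:49:18Z info sync finished",
    ])
    for line in fetch_failures(sample_log):
        print(line)
```

The matching lines typically need to be read in full, since the remainder of each entry is what points to either a network or a parsing problem.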