Cluster

This guide explains how to troubleshoot cluster-related aspects of the Anapaya appliances.

Current configuration and state

Retrieve the current cluster configuration from the appliance:

appliance-cli get config -f body.config.cluster

Get the current cluster sync state of the appliance:

appliance-cli get debug/cluster/status

Cluster is configured with static topology synchronization:

appliance-cli get debug/cluster/status
{
"mode": "static"
}

Cluster is configured with dynamic topology synchronization:

appliance-cli get debug/cluster/status
{
"mode": "dynamic",
"peers": [
{
"address": "10.1.0.1:42003",
"last_sync_attempt": "2024-02-13T12:49:16.903700998Z",
"name": "peer-name",
"status": "success"
}
]
}
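When a cluster has many peers, scanning the status output by eye gets tedious. The following is a minimal Python sketch that lists peers whose last sync did not succeed. It assumes the output is valid JSON with the fields shown in the example above (mode, peers, status), and that any status other than "success" indicates a problem; the helper name failing_peers is hypothetical.

```python
import json


def failing_peers(status_json: str) -> list[dict]:
    """Return the peers whose last sync did not report "success".

    Assumes the JSON layout shown in the example output; a cluster in
    static mode reports no peers, so the result is empty.
    """
    status = json.loads(status_json)
    if status.get("mode") != "dynamic":
        return []
    return [p for p in status.get("peers", []) if p.get("status") != "success"]
```

One way to use it is to save the output of appliance-cli get debug/cluster/status to a file, read the file into a string, and pass that string to failing_peers.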

When to enable dynamic topology synchronization?

We recommend enabling dynamic topology synchronization for CORE appliances. This way, the configuration does not need to be updated when the topology changes, for instance when a new SCION link is added. For EDGE appliances, on the other hand, we recommend not enabling dynamic topology synchronization.

Common problems

Static cluster config does not match the other appliances

Compare the scion and scion_tunneling configuration sections with the cluster configuration section of all other appliances in the cluster and make sure that the values match.

Example

Get the SCION section on EDGE:

appliance-cli get config -f body.config.scion

Get the SCION section of the cluster config on a peer EDGE. This assumes there is only one peer and one AS configured.

appliance-cli get config -f body.config.cluster.peers[0].scion.ases[0]

In the two outputs, the following values should match:

  • isd_as
  • shard_id
  • control.address
  • neighbors[].neighbor_isd_as
  • neighbors[].relationship
  • Neighbor interfaces
    • interface_id
    • scion_mtu
    • next_hop in the cluster config should match the router.internal_interface of the SCION config.
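The field-by-field comparison above can be partly automated. The sketch below compares a set of dotted paths between two already-parsed JSON objects. It is only an illustration: the helper names lookup and mismatched_fields are hypothetical, it handles plain nested objects only (not the neighbors[] lists), and it assumes you have already extracted the matching AS object from each output.

```python
from typing import Any

_MISSING = object()  # sentinel for "path not present"


def lookup(obj: Any, path: str) -> Any:
    """Follow a dotted path such as "control.address" through nested dicts."""
    for key in path.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return _MISSING
        obj = obj[key]
    return obj


def mismatched_fields(local: dict, peer_view: dict, paths: list[str]) -> dict[str, tuple]:
    """Map each path whose values differ to the (local, peer_view) pair.

    A path missing on one side is reported with None for that side;
    a path missing on both sides is not reported.
    """
    diffs = {}
    for path in paths:
        a, b = lookup(local, path), lookup(peer_view, path)
        if a != b:
            diffs[path] = (None if a is _MISSING else a, None if b is _MISSING else b)
    return diffs
```

Calling mismatched_fields(local, peer_view, ["isd_as", "shard_id", "control.address"]) returns an empty dictionary when the three values agree. Neighbor and interface entries still need to be compared per element, since their order may differ between the two outputs.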

Get the IP-in-SCION tunneling endpoint on the EDGE:

appliance-cli get config -f body.config.scion_tunneling.endpoint

Get the IP-in-SCION tunneling endpoint of the cluster config on a peer EDGE. This assumes there is only one peer and one AS configured.

appliance-cli get config -f body.config.cluster.peers[0].scion_tunneling.endpoint

In the two outputs, all the values in the cluster endpoint configuration should match the values in the IP-in-SCION tunneling endpoint configuration.
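Because the rule here is that every value in the cluster endpoint configuration must also appear in the tunneling endpoint configuration, a recursive subset check fits well. The following is a minimal sketch under that reading; the function name subset_mismatches is hypothetical, and lists are compared as a whole rather than element by element.

```python
def subset_mismatches(expected, actual, prefix=""):
    """List dotted paths where a value in `expected` is absent from,
    or differs in, `actual`. Extra keys in `actual` are ignored."""
    if isinstance(expected, dict):
        if not isinstance(actual, dict):
            return [prefix or "<root>"]
        out = []
        for key, value in expected.items():
            path = f"{prefix}.{key}" if prefix else key
            if key not in actual:
                out.append(path)
            else:
                out.extend(subset_mismatches(value, actual[key], path))
        return out
    # Scalars and lists: require plain equality.
    return [] if expected == actual else [prefix or "<root>"]
```

Pass the parsed cluster endpoint as expected and the parsed IP-in-SCION tunneling endpoint as actual; an empty list means every cluster value was found with the same value on the tunneling side.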

Cluster synchronization fails

  1. Inspect the Docker service logs of the control service and look for the entry Fetch from network failed..., which helps to determine the underlying issue.
  2. If the issue is network related, verify connectivity by pinging, over the underlay network, the destination mentioned in the logs. Also check that the MTU is consistent along the path.
  3. In case of a parsing error, investigate why the other party is sending malformed objects.
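To narrow down a long log, the first step above can be sketched as a small filter that keeps only the entries containing the marker text. This assumes the logs have already been exported to a file or piped in as lines of text; how the control service logs are retrieved is not covered here, and the function name fetch_failures is hypothetical.

```python
from typing import Iterable, Iterator

# Marker text taken from the log entry named in step 1.
MARKER = "Fetch from network failed"


def fetch_failures(lines: Iterable[str]) -> Iterator[str]:
    """Yield the log lines that contain the fetch-failure marker."""
    for line in lines:
        if MARKER in line:
            yield line.rstrip("\n")
```

For example, iterating over an open log file with fetch_failures(f) yields only the failing entries, which can then be checked for a destination address (step 2) or a parsing error (step 3).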