SCION/CP-PKI

This guide explains how to troubleshoot SCION and CP-PKI related aspects of the Anapaya appliances.

Current configuration and state

The current SCION configuration can be retrieved from the appliance using the following command:

appliance-cli get config -f body.config.scion

To get the current SCION state of the appliance, use the following command:

appliance-cli info scion

This lists all the SCION ASes that are configured on the appliance and shows the state of crypto material and the state of the SCION interfaces.

Common problems

TRC for local ISD missing

appliance-cli info scion
SCION ASes
- 1-ff00:1:1
Crypto:
- TRC for local ISD ❌
...

Without a TRC for the local ISD, the appliance cannot receive and validate topology information and therefore there will be no SCION connectivity.

Refer to TRC handling on how to provision the TRC.

AS certificate missing or expired

appliance-cli info scion
SCION ASes
- 1-ff00:1:1
Crypto:
...
- AS certificate ❌

Without a valid AS certificate, the appliance cannot receive and validate topology information and therefore there will be no SCION connectivity.

Refer to Certificate handling on how to create a CSR and request a certificate. Refer to Request AS certificate via sibling appliance if the appliance is part of a cluster and a sibling appliance already has a valid AS certificate.

SCION interface is down

The appliance cannot send or receive SCION traffic on a SCION interface which is down.

Refer to SCIONInterfaceStateDown to find out how to investigate the issue.

Uploading AS certificate fails

If the AS certificate is in PEM format, make sure that the certificate chain has exactly two certificates: the AS certificate and the issuer certificate. Also, make sure that there is no trailing line in the certificate chain.
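The two checks above can be run locally before uploading. The following is a minimal sketch, not part of the appliance tooling: `check_pem_chain` is a hypothetical helper name, and it assumes that "trailing line" means any content after the final END marker other than a single line ending.

```python
import re

# Matches one PEM-encoded certificate block, including its BEGIN/END markers.
PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)
END_MARKER = "-----END CERTIFICATE-----"


def check_pem_chain(pem_text: str) -> list:
    """Return a list of problems found in a PEM chain (empty if none)."""
    problems = []

    # The chain should hold exactly two certificates: AS and issuer.
    certs = PEM_CERT.findall(pem_text)
    if len(certs) != 2:
        problems.append(f"expected 2 certificates, found {len(certs)}")

    # Assumption: a "trailing line" is anything after the last END marker
    # other than a single line ending.
    if END_MARKER in pem_text:
        tail = pem_text[pem_text.rfind(END_MARKER) + len(END_MARKER):]
        if tail not in ("", "\n", "\r\n"):
            problems.append("unexpected content after the last certificate")

    return problems
```

Pass the contents of the chain file to `check_pem_chain`; an empty list means neither problem was detected. The sketch does not verify signatures or the order of the certificates.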

SCION connectivity issues

This section provides basic guidelines for troubleshooting some common network issues caused by misconfigured SCION services.

Issue: Assume that you are operating the SCION AS 1-ff00:1:1 and you are notified that the connectivity from the host EDGE-1 in your AS to the neighboring AS 1-ff00:1:2 is lost. This can be a loss of SCION connectivity or IP connectivity over IP-in-SCION tunneling.

Alerting-system-independent guide

In practice, your alerting system, which sits on top of the monitoring system, should inform you about such an incident. You might be able to extract information from the alerts that helps you find the source of the issue. This guide does not rely on such information because it depends on your monitoring and alerting systems.

Recommendation only

The steps taken here for troubleshooting should be perceived solely as recommendations. Furthermore, they are meant to assist you with resolving only a small subset of issues you might encounter in practice.

A reasonable first step is to log into EDGE-1 and check the set of SCION paths to the AS 1-ff00:1:2.

Not all expected paths alive considers the case where you do not see the full set of paths you expect, and explains three potential causes and how to resolve them.

All expected paths alive covers the case where all the expected paths are alive, and explains two possible causes and how to resolve them.

Not all expected paths alive

A basic sanity check for SCION connectivity-related issues is to log into EDGE-1 in the AS 1-ff00:1:1 and run the showpaths command. This command shows the set of available paths to a particular destination. The --refresh flag forces the scion tool to fetch fresh paths from the local SCION control service.

showpaths towards the AS 1-ff00:1:2
scion showpaths 1-ff00:1:2 --refresh

If there is no path, the output looks like:

Available paths to 1-ff00:1:2
Error: no path found

It is also possible that you do not see the complete set of paths you expect, or that some of them are in the timeout state instead of alive. For example, you expect to see the path [1-ff00:1:1 2>3 1-ff00:1:2], which corresponds to the link from interface 2 in 1-ff00:1:1 to interface 3 in 1-ff00:1:2, but it is not present.

Run an IP ping between EDGE-1 and the corresponding router in AS 1-ff00:1:2. If this works, there is connectivity on the IP underlay connecting EDGE-1 and the router in 1-ff00:1:2, and the connectivity issue is probably at the SCION layer. If, on the other hand, the IP ping does not work, the root cause is probably in the lower layers, e.g., a misconfiguration of the underlay network or an issue with networking hardware.

This document assumes that the root cause of the issue is at the SCION layer and explains the three most likely scenarios.
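Checking for an expected path in captured showpaths output can be scripted. The sketch below is illustrative only: `path_status` is a hypothetical helper, and it assumes that each path is printed on a single line that contains both the hop sequence and the word alive or timeout. Adjust the matching if the output of your scion tool version differs.

```python
def path_status(showpaths_output: str, hops: str) -> str:
    """Classify one expected path as alive, timeout, unknown or missing.

    hops is the hop sequence as written in this guide,
    e.g. "1-ff00:1:1 2>3 1-ff00:1:2".
    """
    for line in showpaths_output.splitlines():
        if hops not in line:
            continue
        lowered = line.lower()
        # Assumption: the status word appears on the same line as the hops.
        if "timeout" in lowered:
            return "timeout"
        if "alive" in lowered:
            return "alive"
        return "unknown"
    # No line mentions this hop sequence at all.
    return "missing"
```

For the example in this section, `path_status(output, "1-ff00:1:1 2>3 1-ff00:1:2")` returns "missing" when the path is not listed, which includes the "no path found" case.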

Scenario 1: endpoint misconfiguration

One potential cause is an error in the configuration of EDGE-1. This is especially likely if you have just configured EDGE-1. Furthermore, if a non-empty subset of the paths is available, the AS certificate issue discussed in Scenario 2 can be ruled out on our side.

The issue could simply be caused by a typo in an IP address or a missing entry. In the example above, you need to check the configuration of interface 2 in your AS. If this is the problem, fix the misconfiguration, apply the new configuration to the appliance, and then check that you see the set of expected paths.

Scenario 2: AS certificate issue

If there is no valid AS certificate configured on EDGE-1, the appliance cannot create valid path segments from the beacons because it cannot sign them. As a result, showpaths does not display any paths, so a missing or invalid AS certificate might be the source of the problem.

Get the list of AS certificates that are configured on the appliance:

appliance-cli get cppki/certificates

If there is no AS certificate configured on EDGE-1, the output is:

{
  "certificate_chains": []
}
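The empty-list check above can be automated when many appliances need to be inspected. This is a minimal sketch that only relies on the JSON shape shown in this guide; `count_entries` is a hypothetical helper name, and the structure of the individual list entries is not inspected.

```python
import json


def count_entries(cli_output: str, key: str) -> int:
    """Return how many entries the JSON list under key contains."""
    data = json.loads(cli_output)
    # Treat a missing or null key like an empty list.
    return len(data.get(key) or [])
```

A result of 0 for the key "certificate_chains" means that no AS certificate is configured. The same helper works for the "trcs" key in the output of appliance-cli get cppki/trcs.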

A missing AS certificate can be due to forgetting to configure an AS certificate, deleting the certificate accidentally, or a failure of automatic certificate renewal, e.g., after a prolonged connectivity issue on the order of days.

To resolve the issue, you need to add a valid AS certificate to EDGE-1. In general, an AS certificate needs to be requested from one of the CAs of the local ISD. The initial certificate is requested with an out-of-band mechanism. See Certificate handling for more details on listing, generating, and installing AS certificates.

Scenario 3: time synchronization issue

If the appliance has a valid AS certificate but does not have any paths to the SCION network, its clock might be out of sync, which leaves the appliance unable to verify beacons and create path segments.

Check the current date, time zone, and NTP status:

timedatectl status

For example, the following output shows that the time zone is UTC and that NTP synchronization is not working:

               Local time: Mon 2023-10-30 12:39:43 UTC
           Universal time: Mon 2023-10-30 12:39:43 UTC
                 RTC time: Mon 2023-10-30 12:39:43
                Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: no
              NTP service: active
          RTC in local TZ: no
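The relevant field in this output is "System clock synchronized". As a rough sketch, the status text can be parsed into key/value pairs and checked programmatically; `parse_timedatectl` and `clock_synchronized` are hypothetical helper names, and the parsing assumes the "key: value" layout shown above.

```python
def parse_timedatectl(status_output: str) -> dict:
    """Split each "key: value" line of the status output into a dict."""
    fields = {}
    for line in status_output.splitlines():
        # Split on the first colon only, since timestamps contain colons too.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def clock_synchronized(status_output: str) -> bool:
    """Return True only if the system clock is reported as synchronized."""
    fields = parse_timedatectl(status_output)
    return fields.get("System clock synchronized") == "yes"
```

For the example output above, `clock_synchronized` returns False even though the NTP service is active, which matches the situation described in this scenario.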

A time synchronization failure can be due to misconfigured or unreachable NTP servers. NTP servers must be reachable via underlay IP connectivity and should be configured in the appliance configuration. See System for detailed information on how to configure time servers.

Temporary solution

As a temporary solution, set the time manually:

timedatectl set-time '2015-11-20 16:14:50'

However, to avoid future time synchronization problems, configure NTP servers and make sure they are reachable.

Time zone not needed

It is not necessary to configure a time zone for the SCION network to be operational. If you prefer, you can set the time zone using the following command:

timedatectl set-timezone UTC

We recommend using UTC everywhere since it makes it easier to correlate events across time zones.

All expected paths alive

Scenario 1: Domain misconfiguration

Assume that a ping from the end host Endhost-1 in the AS 1-ff00:1:1 to the end host Endhost-2 in the AS 1-ff00:1:2, which should be reachable over IP-in-SCION tunneling, does not work. Meanwhile, running the showpaths command towards AS 1-ff00:1:2 displays all the expected paths between the two ASes.

Inspect the prefixes advertised by the local SCION AS (i.e., 1-ff00:1:1) and the prefixes learnt from the remote SCION ASes (in particular, 1-ff00:1:2).

These prefixes are exposed by the appliance on a debug endpoint:

appliance-cli get debug/scion-tunneling/sgrp/domains

Below is an example of what the output could look like:

{
  "domains": {
    "your-domain-name": {
      "announced": ["10.0.10.0/24"],
      "received": [""]
    }
  }
}

In this case, no prefix from remote ASes has been learned.
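Comparing the received prefixes against the ones you expect can be sketched as follows. `missing_prefixes` is a hypothetical helper, and the expected prefix used in the usage note is an illustrative value, not one taken from a real deployment. Prefixes are compared as plain strings, so they must be written exactly as the appliance reports them.

```python
import json


def missing_prefixes(cli_output: str, domain: str, expected: set) -> set:
    """Return the expected prefixes that were not received for a domain."""
    entry = json.loads(cli_output)["domains"][domain]
    # Empty strings, as in the example output above, carry no prefix.
    received = {prefix for prefix in entry.get("received", []) if prefix}
    return set(expected) - received
```

With the example output above, `missing_prefixes(output, "your-domain-name", {"10.0.20.0/24"})` returns the full expected set, because nothing has been learned from remote ASes.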

If there is a discrepancy between the set of expected and learnt prefixes, the domain is probably misconfigured.

  1. Fix the configuration and apply the modified configuration to the appliance.
  2. Check the HTTP status page to confirm that the changes appear there too.
  3. Run the ping command from Endhost-1 to Endhost-2 again.

Scenario 2: TRC issue

For the appliance to join the SCION network and communicate with other nodes, it has to be configured with a set of TRCs. These TRCs form the trust anchors for verifying all control plane data exchanged in the SCION protocol. Therefore, the lack of a trusted TRC on the appliance results in loss of connectivity.

Get the list of configured TRCs on the appliance:

appliance-cli get cppki/trcs

If there is no TRC configured on the appliance, the output is as follows:

{
  "trcs": []
}

To fix the issue, install a valid TRC on the appliance. See TRC handling for more details on generating and installing a TRC.

A missing TRC can be due to forgetting to configure a TRC or to deleting it accidentally. In the latter case, the showpaths command may continue to function correctly for some time.