Common operations

This page describes common operations that are helpful when troubleshooting an appliance.

Collect appliance debug dump

A debug dump is a compressed journald log covering the last hour. Among other things, it contains snapshots of metrics and of appliance API endpoints taken at regular intervals. Always include a debug dump when filing a bug report.

The following command takes a debug dump and stores the result in debugdump.zst:

appliance-cli debug dump -o debugdump.zst
Unreachable API

If the HTTP API of the appliance is not properly configured or not reachable, specify the -use-journalctl option. This option bypasses the appliance API and uses journalctl directly.
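
The fallback described above can be scripted. The following is a minimal sketch that first tries a regular dump and retries with -use-journalctl if that fails. It assumes that appliance-cli exits with a non-zero status when the API is unreachable and that the option can be placed before -o; neither is confirmed by this page. The function name take_debug_dump is hypothetical.

```shell
# Sketch: take a debug dump, falling back to journalctl if the API call fails.
# Assumptions (not confirmed by this page): appliance-cli exits non-zero when
# the API is unreachable, and -use-journalctl may be combined with -o.
take_debug_dump() (
    out="${1:-debugdump.zst}"
    if appliance-cli debug dump -o "$out"; then
        echo "debug dump written to $out via the appliance API"
    else
        echo "API dump failed, retrying with -use-journalctl" >&2
        appliance-cli debug dump -use-journalctl -o "$out" &&
            echo "debug dump written to $out via journalctl"
    fi
)
```

Calling take_debug_dump without arguments writes to debugdump.zst, matching the command shown above.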

Gather appliance information

Collect appliance-related information to provide to Anapaya customer support:

  1. SSH to the given machine.

  2. Collect general information by running:

    appliance-cli info > appliance.info
  3. Fetch the appliance configuration by running:

    appliance-cli get config > config.json
Secrets in the config (prior to v0.39.0)

In versions prior to v0.39.0, the appliance configuration contains secrets. Remove them before sending the configuration to anyone.

Gather general host information

Collect host-related information to provide to Anapaya customer support:

  1. SSH to the given machine.

  2. Run:

    sudo lshw
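
The appliance and host collection steps above can be combined so that all files end up in one directory. The sketch below is one way to do this; the function name collect_support_info, the default directory name, and the file name lshw.txt are illustrative choices, not part of the appliance tooling.

```shell
# Sketch: gather appliance and host information into one directory.
# The function name and the file name lshw.txt are illustrative; the
# other file names follow the commands shown on this page.
collect_support_info() (
    dir="${1:-support-info}"
    mkdir -p "$dir" || exit 1
    appliance-cli info > "$dir/appliance.info"
    appliance-cli get config > "$dir/config.json"
    sudo lshw > "$dir/lshw.txt"
    echo "Review $dir/config.json for secrets before sharing (versions prior to v0.39.0)."
)
```

For example, collect_support_info /tmp/support writes the three files to /tmp/support.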

Check docker services

Check whether the services (run as docker containers) are running:

  1. SSH to the given machine.

  2. Use docker ps -a:

    $ docker ps -a
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    c718397beaf9 scion-all:v0.32.2 "/app/scion-all netw…" 7 days ago Up 7 days dataplane-control
    5beecfb5d081 vpp-dataplane:v0.32.2 "/usr/bin/vpp -c /sh…" 7 days ago Up 7 days dataplane
    ...

The output shows whether each service is up and for how long it has been running. A service that has been up for only a very short time may be crashlooping.

For further information please refer to the official Docker documentation.
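
Spotting a crashlooping service by eye can be tedious when many containers are running. The following sketch filters the status column for containers that are restarting or have been up for only a few seconds. The function name flag_suspicious_containers is hypothetical, and the matched phrases assume Docker's usual human-readable status wording.

```shell
# Sketch: flag containers that are restarting or have only just come up.
# Reads "name|status" lines, as printed by
#   docker ps -a --format '{{.Names}}|{{.Status}}'
# The matched phrases assume Docker's usual status wording.
flag_suspicious_containers() {
    while IFS='|' read -r name status; do
        case "$status" in
            Restarting*|"Up Less than a second"*|"Up "[0-9]*" second"*)
                echo "$name: $status" ;;
        esac
    done
}
```

Usage: docker ps -a --format '{{.Names}}|{{.Status}}' | flag_suspicious_containers. To see how often Docker has restarted a single container, docker inspect --format '{{.RestartCount}}' <service-name> can also help.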

Change log level

Change the log level to debug to collect more information:

  1. SSH to the given machine.

  2. Change the log level of a specific service to debug:

    appliance-cli services log level <service-name> debug
warning

Revert your changes after troubleshooting.
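
To make sure the log level is reverted, the change can be wrapped around the troubleshooting command. The sketch below is one possible approach. The function name with_debug_log_level is hypothetical, and the level to restore must be supplied by the caller because this page does not state the default level.

```shell
# Sketch: run a command with a service at debug log level, then restore.
# Assumption: the caller supplies the level to restore (for example the
# level that was configured before), since the default is not stated here.
with_debug_log_level() (
    service="$1"; restore_level="$2"; shift 2
    appliance-cli services log level "$service" debug || exit 1
    # Restore the level when this subshell exits, even if the command fails.
    trap 'appliance-cli services log level "$service" "$restore_level"' EXIT
    "$@"
)
```

For example, with_debug_log_level <service-name> <previous-level> sleep 300 keeps debug logging enabled for five minutes and then restores the previous level.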

Inspect docker service logs

Inspect the logs of services running as docker containers:

  1. SSH to the given machine.

  2. If needed, see the list of services:

    docker ps -a
  3. Inspect the logs by running the following command:

    docker logs <service-name>
Recent logs

To see only the recent logs use:

docker logs <service-name> --since=<time-duration>

For example, to check the logs of the last minute, run:

docker logs <service-name> --since=1m
Save logs to file

To save the logs to a file, including both the standard output and standard error streams, use:

docker logs <service-name> > <filename> 2>&1
Search in logs

To grep through the logs use:

docker logs <service-name> 2>&1 | grep <query>

For further information please refer to the official Docker documentation.
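
The options above can be combined. The following sketch searches only the recent logs of a container and merges both output streams before filtering. The function name search_recent_logs and its argument order are illustrative.

```shell
# Sketch: search the recent logs of a container, merging stdout and stderr.
# Arguments: <service-name> <time-duration> <query>
search_recent_logs() (
    service="$1"; since="$2"; query="$3"
    docker logs --since="$since" "$service" 2>&1 | grep -- "$query"
)
```

For example, search_recent_logs <service-name> 10m error > errors.log saves the matching lines of the last ten minutes to a file.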

Restart a service

Automatic restart

The Anapaya appliance restarts failed services automatically, so manual restarting is likely to be useful only when the service is stuck and/or unresponsive.

Restart a service using the appliance-cli:

appliance-cli post debug/services/${service_name}/restart

where ${service_name} is the name of the service you want to restart. To get the possible values for the ${service_name}, use the following command:

appliance-cli get debug/services
Using docker restart

Alternatively, you can restart a service by running the following commands:

  1. SSH to the given machine.
  2. Run docker restart <service-name>

For further information, please refer to the official Docker documentation.

Clean up docker images

Remove docker images that are no longer used:

  1. SSH to the given machine.

  2. List all docker images by running:

    docker image ls
  3. Remove old unused images by running:

    docker image prune

For further information please refer to the official Docker documentation.

Connect to the BGP daemon's interactive console

Connect to the BGP daemon's shell:

  1. SSH to the given machine.

  2. Open the interactive console by running:

    docker exec -it frr vtysh

For further information on the console please refer to the official FRR documentation.

Check systemd services

Check if the systemd services are running:

  1. SSH to the given machine.

  2. Run systemctl list-units '<service-name>':

    $ systemctl list-units 'appliance*'
    UNIT LOAD ACTIVE SUB DESCRIPTION
    appliance-host.service loaded active running Anapaya Appliance Host Service
    appliance-installer.service loaded active running Anapaya Appliance Installer
    ...
  3. To get a more detailed overview of a specific service, use systemctl status <service-name>:

    $ systemctl status appliance-installer.service
    ● appliance-installer.service - Anapaya Appliance Installer
    Loaded: loaded (/etc/systemd/system/appliance-installer.service; enabled; vendor preset: enabled)
    Active: active (running) since Mon 2023-02-13 08:26:29 UTC; 7min ago
    Main PID: 166 (appliance-insta)
    Tasks: 13 (limit: 38262)
    CGroup: /system.slice/appliance-installer.service
    └─166 /usr/bin/appliance-installer --config /etc/anapaya/installer/appliance-installer.toml
    ...

With these commands you can see whether the service is active and running and for how long it has been running.

note

A systemd service can be restarted using systemctl restart <service-name>.
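
When several units need to be checked at once, systemctl is-active is convenient for scripting. The sketch below reports every unit that is not active. The function name report_inactive_units is hypothetical.

```shell
# Sketch: print every given unit that is not currently active.
# Returns non-zero if at least one unit is not active.
report_inactive_units() (
    status=0
    for unit in "$@"; do
        if ! systemctl is-active --quiet "$unit"; then
            echo "$unit is not active"
            status=1
        fi
    done
    exit "$status"
)
```

For example: report_inactive_units appliance-host.service appliance-installer.service.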

Inspect systemd service logs

View the systemd service logs:

  1. SSH to the given machine.

  2. If needed, you can list the appliance-related services by running:

    systemctl list-units 'appliance*'
  3. Inspect the logs by running:

    journalctl -eu <service-name>
Recent logs

To see only the recent logs use the --since flag. For example, to see only the logs from today use journalctl -eu <service-name> --since today.

To show the most recent 20 entries, use the -n 20 option.

Save logs to file

To save the logs in a file use:

journalctl -u <service-name> > <filename>
Search in logs

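To grep through the logs use the following command. The -e option is omitted here because it limits the output to the most recent 1000 entries:

journalctl -u <service-name> | grep <query>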
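
As an alternative to grep, journalctl can filter by time and priority itself. The following sketch shows entries of priority err and above for one unit. It assumes that the service logs with meaningful syslog priorities, which this page does not confirm; the function name recent_unit_errors is hypothetical.

```shell
# Sketch: show entries of priority "err" and more severe for a unit.
# Arguments: <service-name> [since], where since defaults to "1 hour ago".
# Assumption: the service sets syslog priorities on its log entries.
recent_unit_errors() (
    unit="$1"; since="${2:-1 hour ago}"
    journalctl -u "$unit" --since "$since" -p err --no-pager
)
```

For example, recent_unit_errors appliance-installer.service today lists the errors logged since midnight.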

Check the systemd-timesyncd service

The Anapaya appliance uses the systemd-timesyncd service, which acts as an NTP client and connects to a pool of NTP servers for time synchronization. The following steps provide starting points for troubleshooting systemd-timesyncd.service. For further information please refer to the official documentation.

  1. SSH to the given machine.

  2. Check if the system clock is synchronized and if the NTP service is active using:

    timedatectl status
  3. Check if the service is running (see Check systemd services).

  4. Check the log of the service (see Inspect systemd service logs).

  5. Restart the service:

    systemctl restart systemd-timesyncd.service
  6. Find the configured NTP servers:

    grep NTP /etc/systemd/timesyncd.conf
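
For scripted checks, the synchronization state can be read in machine-readable form. The sketch below relies on the timedatectl show verb, which is assumed to be available on the appliance (it exists in recent systemd versions). The function name clock_is_synchronized is hypothetical.

```shell
# Sketch: succeed if systemd reports the system clock as synchronized.
# Assumption: the installed systemd provides `timedatectl show`.
clock_is_synchronized() {
    [ "$(timedatectl show -p NTPSynchronized --value)" = "yes" ]
}
```

For example: clock_is_synchronized || journalctl -u systemd-timesyncd.service -n 20 --no-pager prints the latest service log entries only when the clock is not synchronized.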

Disk usage analysis

This section contains commands that are helpful when investigating why a machine is running out of disk space.

  1. SSH to the given machine.

  2. Check the available disk space:

    df -h <path>
  3. List the files in a directory:

    ls -l <path>
  4. Use the du command to get a more detailed overview of how much space each directory consumes. You can vary the --max-depth option or the starting directory. The grep filter keeps only entries whose size is reported in megabytes or gigabytes:

    du -cha --max-depth=1 / | grep -E "^[0-9.,]+[MG]"

For further information about the du command please refer to the official documentation.
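
Sorting the du output by size often makes the largest consumers easier to spot than filtering by unit. The following sketch assumes GNU du and sort, as found on typical Linux hosts. The function name largest_entries is hypothetical.

```shell
# Sketch: list the N largest entries directly below a directory, largest
# first. The first line is normally the total for the directory itself.
# Uses GNU du and sort; -x keeps du on a single file system.
largest_entries() (
    path="${1:-/}"; count="${2:-10}"
    du -xh --max-depth=1 "$path" 2>/dev/null | sort -rh | head -n "$count"
)
```

For example, sudo is usually needed for system directories, so the pipeline inside the function can also be run directly as sudo du -xh --max-depth=1 /var | sort -rh | head -n 10.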

Clean up disk space

There are several ways to free up disk space. The options below are grouped by context.

APT packages

  1. Remove outdated packages:

    sudo apt autoclean
  2. Remove orphaned packages which are no longer needed:

    sudo apt autoremove
  3. Clean the entire APT cache:

    sudo apt clean

Systemd journal logs

  1. Check how much disk space the systemd journal logs use:

    journalctl --disk-usage
  2. Clear the logs that are older than 3 days:

    sudo journalctl --vacuum-time=3d

Docker images

Remove docker images that are no longer used as described in Clean up docker images.

Fix topology synchronization error

Appliances in a cluster share their topology information with each other. This either happens statically through configuration or dynamically through an exchange protocol. For further information on how to configure topology synchronization in the appliance configuration, refer to Topology Synchronization. The instructions below should help to identify a misconfiguration.

  1. Check the logs of the appliance-controller service. The logs should contain an error describing the misconfiguration.
  2. Fix the misconfigured appliances and update them.

Inspect SCION paths used for IP-in-SCION tunneling

While troubleshooting SCION connectivity, it is often useful to check the available paths for each domain. This section provides an overview of how to achieve this.

  1. SSH to the given machine.

  2. Show the currently available paths for all domains and traffic matchers by running the following command. This also shows whether the path is alive, dead (no probes are passing through), expired or similar.

    appliance-cli inspect scion-tunneling summary --all-paths
  3. Show the currently used paths for a specific domain:

    appliance-cli inspect scion-tunneling summary --all-paths \
    --domain <domain>

    For the used paths for a specific traffic matcher within the given domain, run:

    appliance-cli inspect scion-tunneling summary --all-paths \
    --domain <domain> --traffic-matcher <traffic matcher>

Ping the underlay network

When investigating an issue, it is often helpful to determine whether the underlying IP connectivity is the problem.

For further information, please refer to the official ping documentation.

tip

The ping command runs indefinitely unless you limit the number of packets with the -c option:

ping -c <number> <destination>

To change the source, pass either a source address or an interface name to the -I option:

ping <destination> -I <interface/address>

The default time interval between successive packet transmissions is one second. You can specify a custom interval in seconds:

ping -i <interval> <destination>

It may be useful to test how the underlying network handles large packets. Use the -s option to set the payload size in bytes (the packet on the wire is larger because of the ICMP and IP headers) and -M do to prohibit fragmentation:

ping -s <size> -M do <destination>
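
Because -s sets the payload size rather than the packet size, the header overhead has to be subtracted when probing a specific MTU. The sketch below computes the payload for a given MTU, assuming an IPv4 header without options (20 bytes) or an IPv6 header without extension headers (40 bytes), plus the 8-byte ICMP header. The function name icmp_payload_for_mtu is hypothetical.

```shell
# Sketch: compute the largest ICMP payload that fits a given MTU unfragmented.
# Arguments: <mtu> [ip-version], where ip-version is 4 (default) or 6.
# Assumes no IPv4 options and no IPv6 extension headers.
icmp_payload_for_mtu() (
    mtu="$1"; ip_version="${2:-4}"
    if [ "$ip_version" = "6" ]; then overhead=48; else overhead=28; fi
    echo $((mtu - overhead))
)
```

For example, ping -c 3 -M do -s "$(icmp_payload_for_mtu 1500)" <destination> sends 1472-byte payloads, which exactly fill a 1500-byte IPv4 packet.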