How do I restore the Kubernetes etcd data in an NSP cluster?

Purpose
CAUTION

System Data Corruption

Attempting to restore the etcd data from one NSP cluster to a different NSP cluster causes the restore to fail, and renders the NSP cluster unrecoverable.

You must restore only an etcd data backup from the same NSP cluster; you cannot move an NSP cluster configuration to a different cluster, or restore a cluster configuration in a new cluster.

An etcd data backup is a snapshot of all Kubernetes objects and associated critical information. A scheduled backup of the etcd data is performed daily. The following procedure describes how to recover a failed NSP cluster by restoring the etcd data using a backup copy.

Steps
Obtain and distribute snapshot
 

1

Log in as the root user on an NSP cluster member in the primary data center.


2

Enter the following to copy the snapshot from the backup pod to an empty directory on the local file system:

kubectl cp namespace/nsp-backup-storage-0:/tmp/backups/nsp-etcd/nsp-etcd_backup_timestamp.tar.gz local_path ↵

where

namespace is the Kubernetes namespace

timestamp is the snapshot creation time

local_path is the empty local directory
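The copy in this step can be prepared with two small helpers, shown here as a minimal sketch rather than as part of the NSP tooling. The function names are hypothetical; only the pod name, the backup path, and the kubectl cp command come from the procedure.

```shell
# Hypothetical helpers; names are illustrative and not part of NSP.

# Build the kubectl cp source argument from a namespace and a snapshot timestamp.
snapshot_source() {
    local namespace="$1" timestamp="$2"
    printf '%s/nsp-backup-storage-0:/tmp/backups/nsp-etcd/nsp-etcd_backup_%s.tar.gz\n' \
        "$namespace" "$timestamp"
}

# Succeed only if the argument is an existing directory with no entries.
is_empty_dir() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Example use (namespace, timestamp, and local_path are placeholders):
#   is_empty_dir "$local_path" || { echo "directory is not empty" >&2; exit 1; }
#   kubectl cp "$(snapshot_source "$namespace" "$timestamp")" "$local_path"
```

Checking that the directory is empty before copying avoids mixing the snapshot with older files when it is later transferred to the etcd members.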


3

Determine which nodes are members of the etcd cluster.

Note: The /etc/etcd.env file lists either one member or three, depending on the deployment type.

  1. Open the /etc/etcd.env file for viewing.

  2. Locate the following parameters:

    • ETCD_INITIAL_CLUSTER

    • ETCD_INITIAL_CLUSTER_TOKEN

    • ETCD_INITIAL_ADVERTISE_PEER_URLS

  3. Record the parameter values.

  4. Close the file.
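As an alternative to opening the file in a viewer, the three parameters can be printed directly. The following is a sketch that assumes the file holds plain NAME=value lines; the function name is hypothetical.

```shell
# Print the three etcd parameters to record in this step.
# Assumes plain NAME=value lines; the function name is illustrative.
show_etcd_cluster_params() {
    local env_file="${1:-/etc/etcd.env}"
    grep -E '^ETCD_INITIAL_(CLUSTER|CLUSTER_TOKEN|ADVERTISE_PEER_URLS)=' "$env_file"
}

# Example use:
#   show_etcd_cluster_params /etc/etcd.env
```

The pattern requires the equals sign immediately after the parameter name, so similarly named parameters are not printed.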


4

Enter the following as the root user on each etcd cluster member identified in Step 3:

Note: After you perform this step, the etcd cluster is unreachable.

systemctl stop etcd ↵


5

Transfer the snapshot file obtained in Step 2 to each etcd cluster member.
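When there are three members, the host list for stopping etcd and transferring the snapshot can be derived from the recorded ETCD_INITIAL_CLUSTER value. The sketch below assumes the standard etcd format of comma-separated name=URL entries with IPv4 addresses or host names. The function name, the use of ssh and scp as root, and the remote directory are assumptions, not NSP requirements.

```shell
# Print one host per line from an ETCD_INITIAL_CLUSTER value such as
# "etcd1=https://10.0.0.1:2380,etcd2=https://10.0.0.2:2380".
# Assumes IPv4 addresses or host names (no bracketed IPv6 literals).
etcd_member_hosts() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -E 's#^[^=]+=[a-z]+://([^:/]+).*#\1#'
}

# Example use (cluster, snapshot, and remote_dir are placeholders):
#   for host in $(etcd_member_hosts "$cluster"); do
#       ssh "root@$host" systemctl stop etcd
#   done
#   for host in $(etcd_member_hosts "$cluster"); do
#       scp "$snapshot" "root@$host:$remote_dir/"
#   done
```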


Restore database on members
 

6

Perform Step 8 to Step 16 on each etcd cluster member station.


7

Go to Step 17.


8

Log in as the root user.


9

Navigate to the directory that contains the transferred snapshot file.


10 

Enter the following:

tar xzf path/nsp-etcd_backup_timestamp.tar.gz ↵

where

path is the absolute path of the directory that contains the snapshot file

timestamp is the snapshot creation time

The snapshot archive is extracted to the current directory.
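Before extracting, it can be useful to confirm that the archive lists the etcd.db file that the restore command in the next step reads. This is a sketch only; the archive layout is an assumption and the function name is hypothetical.

```shell
# Succeed if the archive lists a file named etcd.db at any depth.
# The archive layout is an assumption; adjust the pattern if it differs.
archive_has_etcd_db() {
    tar tzf "$1" | grep -Eq '(^|/)etcd\.db$'
}

# Example use (archive is a placeholder for the snapshot file path):
#   archive_has_etcd_db "$archive" || echo "etcd.db not found in archive" >&2
```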


11 

Enter the following:

ETCDCTL_API=3 etcdctl snapshot restore etcd.db --name etcdN --initial-cluster cluster --initial-cluster-token token --initial-advertise-peer-urls URL ↵

where

N is the member number in the recorded ETCD_INITIAL_CLUSTER value, for example, 1, 2, or 3

cluster is the recorded ETCD_INITIAL_CLUSTER value

token is the recorded ETCD_INITIAL_CLUSTER_TOKEN value

URL is the recorded ETCD_INITIAL_ADVERTISE_PEER_URLS value

The etcd database is restored.
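The member name for the --name option can be found by matching the recorded ETCD_INITIAL_ADVERTISE_PEER_URLS value against the entries in ETCD_INITIAL_CLUSTER. The following sketch assumes the standard etcd name=URL entry format and a single advertised peer URL; the function name and shell variables are hypothetical.

```shell
# Print the member name whose peer URL equals the given advertise URL.
# Returns non-zero if no entry matches.
etcd_member_name() {
    local cluster="$1" url="$2" entry
    local IFS=','
    for entry in $cluster; do
        if [ "${entry#*=}" = "$url" ]; then
            printf '%s\n' "${entry%%=*}"
            return 0
        fi
    done
    return 1
}

# Example use (cluster, token, and url hold the values recorded in Step 3):
#   name="$(etcd_member_name "$cluster" "$url")" || exit 1
#   ETCDCTL_API=3 etcdctl snapshot restore etcd.db --name "$name" \
#       --initial-cluster "$cluster" --initial-cluster-token "$token" \
#       --initial-advertise-peer-urls "$url"
```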


12 

Enter the following to create a directory in which to store the previous database:

mkdir path/old_etcd_db ↵

where path is the absolute path of the location in which to create the old_etcd_db directory


13 

Enter the following to move the previous database files to the created directory:

mv /var/lib/etcd/* path/old_etcd_db ↵

where path is the absolute path specified in Step 12


14 

Enter the following to move the restored database files to the etcd data directory:

mv ./etcdN.etcd/* /var/lib/etcd/ ↵

where N is the member number specified in Step 11

The restored database files are moved to the /var/lib/etcd directory.
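Step 12 to Step 14 can be combined into one guarded sequence that stops at the first failure, so that the restored files are never moved before the previous database has been saved. This is a minimal sketch; the function name is hypothetical, and it fails if the data directory is already empty.

```shell
# Move the current contents of data_dir into old_dir, then move the restored
# files from restored_dir into data_dir. Stops at the first failed command.
swap_etcd_data() {
    local restored_dir="$1" data_dir="$2" old_dir="$3"
    [ -d "$restored_dir" ] || { echo "missing $restored_dir" >&2; return 1; }
    mkdir -p "$old_dir" || return 1
    mv "$data_dir"/* "$old_dir"/ || return 1
    mv "$restored_dir"/* "$data_dir"/
}

# Example use (path and N as described in Step 11 and Step 12):
#   swap_etcd_data ./etcdN.etcd /var/lib/etcd path/old_etcd_db
```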


15 

Enter the following:

systemctl start etcd ↵

The etcd service starts.


16 

Enter the following:

systemctl status etcd ↵

The etcd service status is displayed.

The service is up if the following is displayed:

Active: active (running)
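The status check can also be repeated automatically until the service reports as active. The sketch below uses a generic retry helper with a hypothetical name; the attempt count and delay in the example are arbitrary values.

```shell
# Retry a command until it succeeds, up to max_tries attempts,
# sleeping delay seconds between attempts. Returns non-zero if all attempts fail.
retry_until_ok() {
    local max_tries="$1" delay="$2" try=1
    shift 2
    while [ "$try" -le "$max_tries" ]; do
        "$@" && return 0
        [ "$try" -lt "$max_tries" ] && sleep "$delay"
        try=$((try + 1))
    done
    return 1
}

# Example use: check up to 10 times, 3 seconds apart.
#   retry_until_ok 10 3 systemctl is-active --quiet etcd && echo "etcd is up"
```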


17 

When the etcd service is up, close the open console windows.

End of steps