How do I restore the Kubernetes etcd data in an NSP cluster?
Purpose
CAUTION: System Data Corruption
Attempting to restore the etcd data from one NSP cluster to a different NSP cluster causes the restore to fail, and renders the NSP cluster unrecoverable.
You must restore only an etcd data backup from the same NSP cluster; you cannot move an NSP cluster configuration to a different cluster, or restore a cluster configuration in a new cluster.
An etcd data backup, called a snapshot, captures all Kubernetes objects and associated critical information. A scheduled etcd data snapshot is performed daily. The following procedure describes how to recover a failed NSP cluster by restoring the etcd data from a snapshot.
Steps
Obtain and distribute snapshot
1. Log in as the root or NSP admin user on the NSP cluster host.
2. Enter the following to identify the namespace of the nsp-backup-storage pod:
# kubectl get pods -A | grep nsp-backup ↵
The leftmost entry in the output line is the namespace, which in the following example is nsp-psa-restricted:
nsp-psa-restricted   nsp-backup-storage-0   1/1   Running   0   5h16m
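If you prefer to capture the namespace programmatically for reuse in later steps, a minimal sketch (the NS variable name is illustrative, not part of the procedure):
# NS=$(kubectl get pods -A | grep nsp-backup-storage | awk '{print $1}') ↵
# echo $NS ↵
nsp-psa-restricted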
3. Record the namespace value.
4. Enter the following to identify the etcd snapshot to restore:
# kubectl exec -n namespace nsp-backup-storage-0 -- ls -la /tmp/backups/nsp-etcd/ ↵
where namespace is the namespace value recorded in Step 3
The directory contents are listed; the filename format of an etcd snapshot is:
nsp-etcd_backup_timestamp.tar.gz
where timestamp is the snapshot creation time
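For example, using the nsp-psa-restricted namespace shown in Step 2 (the timestamp and file details below are hypothetical):
# kubectl exec -n nsp-psa-restricted nsp-backup-storage-0 -- ls -la /tmp/backups/nsp-etcd/ ↵
-rw-r--r-- 1 root root 52428800 May 14 02:00 nsp-etcd_backup_20240514-0200.tar.gz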
5. Record the name of the snapshot file that you need to restore.
6. Enter the following to copy the snapshot file from the backup pod to an empty directory on the local file system:
# kubectl cp namespace/nsp-backup-storage-0:/tmp/backups/nsp-etcd/snapshot_file path/snapshot_file ↵
where
namespace is the namespace value recorded in Step 3
path is an empty local directory
snapshot_file is the snapshot file name recorded in Step 5
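For example, continuing with the hypothetical namespace, file name, and a hypothetical local directory /tmp/etcd-restore:
# mkdir /tmp/etcd-restore ↵
# kubectl cp nsp-psa-restricted/nsp-backup-storage-0:/tmp/backups/nsp-etcd/nsp-etcd_backup_20240514-0200.tar.gz /tmp/etcd-restore/nsp-etcd_backup_20240514-0200.tar.gz ↵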
7. Enter the following:
Note: The file lists either one member or three, depending on the deployment type.
# grep ETCD_INITIAL /etc/etcd.env ↵
Output like the following is displayed:
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://local_address:port
ETCD_INITIAL_CLUSTER_STATE=existing
ETCD_INITIAL_CLUSTER_TOKEN=k8s_etcd
ETCD_INITIAL_CLUSTER=etcd1=https://address_1:port,etcd2=https://address_2:port,etcd3=https://address_3:port
where
local_address is the IP address of the etcd cluster member you are operating from
address_1, address_2, and address_3 are the addresses of all etcd cluster members
port is a port number
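Because /etc/etcd.env contains plain KEY=value assignments, you can optionally load the recorded values into your shell for reuse in Step 14; a minimal sketch, assuming the file format shown above:
# eval "$(grep ETCD_INITIAL /etc/etcd.env)" ↵
# echo $ETCD_INITIAL_CLUSTER ↵
etcd1=https://address_1:port,etcd2=https://address_2:port,etcd3=https://address_3:port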
8. Transfer the snapshot file to an empty directory on each etcd cluster member, and then enter the following on each member to stop the etcd service:
Note: After this step, the etcd cluster is unreachable until the restore is complete.
# systemctl stop etcd ↵
The etcd service stops.
Restore database on etcd cluster members
9. Perform Step 11 to Step 19 on each etcd cluster member.
10. Go to Step 20.
11. Log in as the root or NSP admin user.
12. Navigate to the directory that contains the transferred snapshot file.
13. Enter the following:
# tar xzf path/nsp-etcd_backup_timestamp.tar.gz ↵
where
path is the absolute path of the snapshot file
timestamp is the snapshot creation time
The snapshot file is uncompressed.
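For example, assuming the hypothetical snapshot file was transferred to /tmp/etcd-restore on the member:
# cd /tmp/etcd-restore ↵
# tar xzf /tmp/etcd-restore/nsp-etcd_backup_20240514-0200.tar.gz ↵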
14. Enter the following:
# ETCDCTL_API=3 etcdctl snapshot restore etcd.db --name member --initial-cluster initial_cluster --initial-cluster-token token --initial-advertise-peer-urls URL ↵
where
member is the name of the cluster member you are working on, for example, etcd2
initial_cluster is the ETCD_INITIAL_CLUSTER list of cluster members recorded in Step 7
token is the ETCD_INITIAL_CLUSTER_TOKEN value recorded in Step 7
URL is the URL of the cluster member you are working on; for example, the etcd2 cluster member URL shown in Step 7 is https://address_2:port
The etcd database is restored.
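As a concrete illustration, assume hypothetical member addresses 10.1.2.11 through 10.1.2.13, the standard etcd peer port 2380, the k8s_etcd token shown in Step 7, and that you are working on member etcd2:
# ETCDCTL_API=3 etcdctl snapshot restore etcd.db --name etcd2 --initial-cluster etcd1=https://10.1.2.11:2380,etcd2=https://10.1.2.12:2380,etcd3=https://10.1.2.13:2380 --initial-cluster-token k8s_etcd --initial-advertise-peer-urls https://10.1.2.12:2380 ↵
The command writes the restored database to an ./etcd2.etcd directory in the working directory, which Step 17 moves into place.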
15. Enter the following to create a directory in which to store the previous database:
# mkdir path/old_etcd_db ↵
where path is the absolute path of the directory to create
16. Enter the following to move the previous database files to the created directory:
# mv /var/lib/etcd/* path/old_etcd_db ↵
where path is the absolute path of the directory created in Step 15
17. Enter the following to move the restored database files into place:
# mv ./member.etcd/* /var/lib/etcd/ ↵
where member is the member name specified in Step 14
The restored database files are moved to the /var/lib/etcd directory.
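Continuing the hypothetical etcd2 example, with /tmp/etcd-restore/old_etcd_db as the directory created in Step 15:
# mkdir /tmp/etcd-restore/old_etcd_db ↵
# mv /var/lib/etcd/* /tmp/etcd-restore/old_etcd_db ↵
# mv ./etcd2.etcd/* /var/lib/etcd/ ↵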
18. Enter the following to start the etcd service:
# systemctl start etcd ↵
The etcd service starts.
19. Enter the following to check the etcd service status:
# systemctl status etcd ↵
The etcd service status is displayed. The service is up if the following is displayed:
Active: active (running)
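Beyond the service status, you can optionally query etcd directly after all members are restarted; a minimal sketch (the endpoint uses the etcd client port, typically 2379 rather than the peer port shown in Step 7, and the certificate file paths are assumptions that vary by deployment):
# ETCDCTL_API=3 etcdctl --endpoints=https://local_address:2379 --cacert=/etc/ssl/etcd/ssl/ca.pem --cert=/etc/ssl/etcd/ssl/node.pem --key=/etc/ssl/etcd/ssl/node-key.pem endpoint health ↵
https://local_address:2379 is healthy: successfully committed proposal: took = ...ms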
20. When the etcd service is up on each cluster member, close the open console windows.

End of steps