How do I restore the Kubernetes etcd data in an NSP cluster?
Purpose
CAUTION: System Data Corruption

Attempting to restore the etcd data from one NSP cluster to a different NSP cluster causes the restore to fail and renders the NSP cluster unrecoverable.

You must restore an etcd data backup only to the same NSP cluster from which the backup was taken; you cannot move an NSP cluster configuration to a different cluster, or restore a cluster configuration in a new cluster.
An etcd data backup is a snapshot of all Kubernetes objects and associated critical information. A scheduled backup of the etcd data is performed daily. The following procedure describes how to recover a failed NSP cluster by restoring the etcd data using a backup copy.
Steps
Obtain and distribute snapshot
1. Log in as the root user on an NSP cluster member in the primary data center.
2. Enter the following to copy the snapshot from the backup pod to an empty directory on the local file system:

# kubectl cp namespace/nsp-backup-storage-0:/tmp/backups/nsp-etcd/nsp-etcd_backup_timestamp.tar.gz local_path ↵

where
namespace is the Kubernetes namespace
timestamp is the snapshot creation time
local_path is the empty local directory
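As a sketch, the command in Step 2 can be assembled from shell variables before it is run; the namespace, timestamp, and destination directory below are hypothetical values for illustration only.

```shell
#!/bin/sh
# Hypothetical values -- substitute your own namespace, snapshot
# timestamp, and an empty local destination directory.
NAMESPACE="nsp"                         # assumed namespace name
TIMESTAMP="20230901-0300"               # assumed snapshot creation time
LOCAL_PATH="/opt/nsp/etcd-restore"      # assumed empty local directory

# Assemble the kubectl cp command from Step 2 for review before running it.
CMD="kubectl cp ${NAMESPACE}/nsp-backup-storage-0:/tmp/backups/nsp-etcd/nsp-etcd_backup_${TIMESTAMP}.tar.gz ${LOCAL_PATH}"
echo "$CMD"
```

Echoing the assembled command first lets you verify the snapshot path and destination before copying.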
3. Determine which nodes are members of the etcd cluster.
Note: The file lists either one member or three, depending on the deployment type.
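One cross-check on the member list, assuming you have the recorded ETCD_INITIAL_CLUSTER value used later in Step 11, is to split that comma-separated value; the member names and peer addresses below are hypothetical.

```shell
#!/bin/sh
# Hypothetical recorded value; a one-member deployment would list only etcd1.
ETCD_INITIAL_CLUSTER="etcd1=https://192.0.2.11:2380,etcd2=https://192.0.2.12:2380,etcd3=https://192.0.2.13:2380"

# Count the members by splitting the value on commas.
MEMBER_COUNT=$(echo "$ETCD_INITIAL_CLUSTER" | tr ',' '\n' | wc -l)
echo "etcd cluster members: $MEMBER_COUNT"

# List the member names (the part before each '=').
echo "$ETCD_INITIAL_CLUSTER" | tr ',' '\n' | cut -d= -f1
```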
4. Enter the following as the root user on each etcd cluster member identified in Step 3:
Note: After you perform this step, the etcd cluster is unreachable.

# systemctl stop etcd ↵
5. Transfer the snapshot file obtained in Step 2 to each etcd cluster member.
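The transfer in Step 5 can be scripted; the host names and snapshot path below are hypothetical, and scp is only one transfer option. The loop echoes the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical etcd member hosts from Step 3 and snapshot path from Step 2.
MEMBERS="etcd-host-1 etcd-host-2 etcd-host-3"
SNAPSHOT="/opt/nsp/etcd-restore/nsp-etcd_backup_20230901-0300.tar.gz"

for HOST in $MEMBERS; do
    # Echo rather than execute, so the commands can be checked first.
    echo "scp $SNAPSHOT root@${HOST}:/tmp/"
done
```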
Restore database on members
6. Perform Step 8 to Step 16 on each etcd cluster member station.
7. Go to Step 17.
8. Log in as the root user.
9. Navigate to the directory that contains the transferred snapshot file.
10. Enter the following:

# tar xzf path/nsp-etcd_backup_timestamp.tar.gz ↵

where
path is the absolute path of the snapshot file
timestamp is the snapshot creation time

The snapshot file is uncompressed.
11. Enter the following:

# ETCDCTL_API=3 etcdctl snapshot restore etcd.db --name etcdN --initial-cluster cluster --initial-cluster-token token --initial-advertise-peer-urls URL ↵

where
N is the member number in the recorded ETCD_INITIAL_CLUSTER value, for example, 1, 2, or 3
cluster is the recorded ETCD_INITIAL_CLUSTER value
token is the recorded ETCD_INITIAL_CLUSTER_TOKEN value
URL is the recorded ETCD_INITIAL_ADVERTISE_PEER_URLS value

The etcd database is restored.
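Because the restore command in Step 11 takes four recorded values, it can help to assemble it from variables and review it before running it on each member. The recorded values below are hypothetical examples for member 1 of a three-member cluster.

```shell
#!/bin/sh
# Hypothetical recorded values for member 1 of a three-member cluster.
N=1
ETCD_INITIAL_CLUSTER="etcd1=https://192.0.2.11:2380,etcd2=https://192.0.2.12:2380,etcd3=https://192.0.2.13:2380"
ETCD_INITIAL_CLUSTER_TOKEN="nsp-etcd-cluster"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.0.2.11:2380"

# Assemble the restore command from Step 11; on the real member, running
# it writes the restored database to ./etcdN.etcd (used in Step 14).
CMD="ETCDCTL_API=3 etcdctl snapshot restore etcd.db \
 --name etcd${N} \
 --initial-cluster ${ETCD_INITIAL_CLUSTER} \
 --initial-cluster-token ${ETCD_INITIAL_CLUSTER_TOKEN} \
 --initial-advertise-peer-urls ${ETCD_INITIAL_ADVERTISE_PEER_URLS}"
echo "$CMD"
```

The same variables, with N and the peer URL changed per member, yield the command for each of the other members.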
12. Enter the following to create a directory in which to store the previous database:

# mkdir path/old_etcd_db ↵

where path is the absolute path of the parent directory of the created directory
13. Enter the following to move the previous database files to the created directory:

# mv /var/lib/etcd/* path/old_etcd_db ↵

where path is the absolute path specified in Step 12
14. Enter the following:

# mv ./etcdN.etcd/* /var/lib/etcd/ ↵

where N is the member number specified in Step 11

The restored database files are moved to the /var/lib/etcd directory.
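Steps 12 to 14 amount to swapping the live data-directory contents for the restored ones. As a sandbox sketch, with temporary directories standing in for /var/lib/etcd and ./etcdN.etcd (all paths hypothetical):

```shell
#!/bin/sh
set -e
# Sandbox stand-ins for the real paths; on a member, ETCD_DATA would be
# /var/lib/etcd and RESTORED would be ./etcdN.etcd from Step 11.
WORK=$(mktemp -d)
ETCD_DATA="$WORK/var-lib-etcd"
RESTORED="$WORK/etcd1.etcd"
OLD_DB="$WORK/old_etcd_db"

mkdir -p "$ETCD_DATA" "$RESTORED"
touch "$ETCD_DATA/member.old" "$RESTORED/member.new"

# Step 12: create a directory for the previous database.
mkdir "$OLD_DB"
# Step 13: move the previous database files aside.
mv "$ETCD_DATA"/* "$OLD_DB"
# Step 14: move the restored files into the data directory.
mv "$RESTORED"/* "$ETCD_DATA"/

ls "$ETCD_DATA"    # now contains only the restored files
```

Keeping the previous database in old_etcd_db, rather than deleting it, preserves a fallback if the restore has to be repeated.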
15. Enter the following:

# systemctl start etcd ↵

The etcd service starts.
16. Enter the following:

# systemctl status etcd ↵

The etcd service status is displayed. The service is up if the following is displayed:
Active: active (running)
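The check in Step 16 can be scripted by matching the Active: line in the status output; the captured output below is a hypothetical stand-in for real systemctl status etcd output.

```shell
#!/bin/sh
# Hypothetical captured output; on a member this would come from:
#   STATUS=$(systemctl status etcd)
STATUS="* etcd.service - etcd key-value store
   Loaded: loaded (/usr/lib/systemd/system/etcd.service; enabled)
   Active: active (running) since Mon 2023-09-04 10:12:01 UTC"

# The service is up only if the Active: line reports active (running).
if echo "$STATUS" | grep -q "Active: active (running)"; then
    echo "etcd service is up"
else
    echo "etcd service is not running" >&2
fi
```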
17. When the etcd service is up, close the open console windows.

End of steps