
Adding a static node to a cluster

You can add a static node manually or using the Cluster API Provider Static (CAPS).

Adding a static node manually

To add a bare-metal server to a cluster as a static node, follow these steps:

  1. Use an existing NodeGroup custom resource or create a new one, setting the nodeType parameter to Static or CloudStatic.

    Example of a NodeGroup resource named worker:

    apiVersion: deckhouse.io/v1
    kind: NodeGroup
    metadata:
      name: worker
    spec:
      nodeType: Static
    
  2. Get the Base64-encoded bootstrap script for adding and configuring the node.

    Example command to get the Base64-encoded bootstrap script for adding a node to the worker NodeGroup:

    NODE_GROUP=worker
    kubectl -n d8-cloud-instance-manager get secret manual-bootstrap-for-${NODE_GROUP} -o json | jq '.data."bootstrap.sh"' -r
    
  3. Pre-configure the new node according to your environment specifics:
    • Add all the necessary mount points to the /etc/fstab file: NFS, Ceph, etc.
    • Install the necessary packages (for example, ceph-common).
    • Set up network connectivity between the new node and the other nodes of the cluster.
  4. Connect to the new node over SSH and run the following command, inserting the Base64 string you got in step 2:

    echo <Base64-CODE> | base64 -d | bash
    
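If the machine you run kubectl on has SSH access to the new node, steps 2 and 4 can be combined. The following is a minimal sketch; the <user> and <node-ip> placeholders are assumptions, and the user is assumed to be able to run commands as root:

# A sketch: fetch the Base64-encoded bootstrap script and run it on the
# new node in one pass. <user> and <node-ip> are placeholders.
NODE_GROUP=worker
BOOTSTRAP="$(kubectl -n d8-cloud-instance-manager get secret manual-bootstrap-for-${NODE_GROUP} -o json | jq '.data."bootstrap.sh"' -r)"
ssh <user>@<node-ip> "echo '${BOOTSTRAP}' | base64 -d | sudo bash"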

Adding a static node using CAPS

To learn more about Cluster API Provider Static (CAPS), refer to Configuring a node via CAPS.

To add a static node (a bare-metal server or a virtual machine) to a cluster using CAPS, follow these steps:

  1. Allocate a server with an installed operating system (OS) and set up network connectivity. If necessary, install additional OS-specific packages and add mount points to use on the node.

    • Create a user (named caps in the following example) with sudo privileges by running the following commands on the server:

      useradd -m -s /bin/bash caps 
      usermod -aG sudo caps
      
    • Allow the user to run sudo commands without having to enter a password. To do that, add the following line to the sudo configuration on the server, for example, by running the sudo visudo command or by editing the /etc/sudoers file directly:

      caps ALL=(ALL) NOPASSWD: ALL
      
    • Generate a pair of SSH keys with an empty passphrase on the server using the following command:

      ssh-keygen -t rsa -f caps-id -C "" -N ""
      

      The public and private keys of the caps user will be stored in the caps-id.pub and caps-id files in the current directory on the server.

    • Add the generated public key to the /home/caps/.ssh/authorized_keys file of the caps user by running the following commands in the directory storing the keys on the server:

      mkdir -p /home/caps/.ssh 
      cat caps-id.pub >> /home/caps/.ssh/authorized_keys 
      chmod 700 /home/caps/.ssh 
      chmod 600 /home/caps/.ssh/authorized_keys
      chown -R caps:caps /home/caps/
      
  2. Create an SSHCredentials resource in the cluster.

    • To access the added server, CAPS requires the private key of the service user caps. The Base64-encoded key is added to the SSHCredentials resource.

      To encode the private key to Base64, run the following command in the user key directory on the server:

      base64 -w0 caps-id
      
    • On any computer configured to manage the cluster, create an environment variable with the value of the Base64-encoded private key you generated earlier. To prevent the key from being saved in the shell history, add a space at the beginning of the command:

      CAPS_PRIVATE_KEY_BASE64=<PRIVATE_KEY_IN_BASE64>
      
    • Create an SSHCredentials resource with the service user name and the associated private key:

      d8 k create -f - <<EOF
      apiVersion: deckhouse.io/v1alpha1
      kind: SSHCredentials
      metadata:
        name: static-0-access
      spec:
        user: caps
        privateSSHKey: "${CAPS_PRIVATE_KEY_BASE64}"
      EOF
      
  3. Create a StaticInstance resource in the cluster.

    The StaticInstance resource defines the IP address of the static node server and the data required to access the server:

    d8 k create -f - <<EOF
    apiVersion: deckhouse.io/v1alpha1
    kind: StaticInstance
    metadata:
      name: static-0
    spec:
      # Specify the static node server's IP address.
      address: "<SERVER-IP>"
      credentialsRef:
        kind: SSHCredentials
        name: static-0-access
    EOF
    
  4. Create a NodeGroup resource in the cluster:

    d8 k create -f - <<EOF
    apiVersion: deckhouse.io/v1
    kind: NodeGroup
    metadata:
      name: worker
    spec:
      nodeType: Static
      staticInstances:
        count: 1
    EOF
    
  5. Wait until the NodeGroup resource is in the Ready state. To check the resource state, run the following command:

    d8 k get ng worker
    

    In the NodeGroup status, the READY column should show 1 node:

    NAME     TYPE     READY   NODES   UPTODATE   INSTANCES   DESIRED   MIN   MAX   STANDBY   STATUS   AGE    SYNCED
    worker   Static   1       1       1                                                                 15m   True
    
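While waiting, you can also check the state of the StaticInstance and of the node itself. A small sketch using the names from the example above:

# The StaticInstance should move from Pending through Bootstrapping to Running.
d8 k get staticinstances
# The new node should appear in the cluster with the worker group label.
d8 k get nodes -l node.deckhouse.io/group=worker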

Adding a static node using CAPS and label selector filters

To connect different StaticInstance resources to different NodeGroup resources, you can use the label selector specified in the NodeGroup and StaticInstance metadata.

In the following example, three static nodes are distributed between two NodeGroup resources: one goes to the worker group, and the other two go to the front group.

  1. Prepare the required resources (three servers) and create the SSHCredentials resources for them, following steps 1 and 2 from the previous scenario.
  2. Create two NodeGroup resources in the cluster.

    Specify labelSelector so that only the matching servers can join the corresponding NodeGroup resources:

    d8 k create -f - <<EOF
    apiVersion: deckhouse.io/v1
    kind: NodeGroup
    metadata:
      name: front
    spec:
      nodeType: Static
      staticInstances:
        count: 2
        labelSelector:
          matchLabels:
            role: front
    ---
    apiVersion: deckhouse.io/v1
    kind: NodeGroup
    metadata:
      name: worker
    spec:
      nodeType: Static
      staticInstances:
        count: 1
        labelSelector:
          matchLabels:
            role: worker
    EOF
    
  3. Create the StaticInstance resources in the cluster.

    Specify the actual IP addresses of the servers and set the role label in metadata:

    d8 k create -f - <<EOF
    apiVersion: deckhouse.io/v1alpha1
    kind: StaticInstance
    metadata:
      name: static-front-1
      labels:
        role: front
    spec:
      address: "<SERVER-FRONT-IP1>"
      credentialsRef:
        kind: SSHCredentials
        name: front-1-credentials
    ---
    apiVersion: deckhouse.io/v1alpha1
    kind: StaticInstance
    metadata:
      name: static-front-2
      labels:
        role: front
    spec:
      address: "<SERVER-FRONT-IP2>"
      credentialsRef:
        kind: SSHCredentials
        name: front-2-credentials
    ---
    apiVersion: deckhouse.io/v1alpha1
    kind: StaticInstance
    metadata:
      name: static-worker-1
      labels:
        role: worker
    spec:
      address: "<SERVER-WORKER-IP>"
      credentialsRef:
        kind: SSHCredentials
        name: worker-1-credentials
    EOF
    
  4. To check the result, run the following command:

    d8 k get ng
    

    In the output, you should see a list of created NodeGroup resources, with static nodes distributed between them:

    NAME     TYPE     READY   NODES   UPTODATE   INSTANCES   DESIRED   MIN   MAX   STANDBY   STATUS   AGE    SYNCED
    master   Static   1       1       1                                                               1h     True
    front    Static   2       2       2                                                               1h     True
    worker   Static   1       1       1                                                               1h     True
    
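To see which servers were matched to which group, you can also list the StaticInstance resources together with their labels, for example:

# Show StaticInstance resources with their labels to verify how the
# label selectors matched them to the NodeGroup resources.
d8 k get staticinstances --show-labels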

How do I know if something went wrong?

If a node in a NodeGroup isn’t updated (the UPTODATE value is less than the NODES value in the kubectl get nodegroup output) or you suspect other problems that may be related to the node-manager module, check the logs of the bashible service. The bashible service runs on each node managed by the node-manager module.

To view the logs of the bashible service, run the following command on the node:

journalctl -fu bashible

Example of output when the bashible service has performed all necessary actions:

May 25 04:39:16 kube-master-0 systemd[1]: Started Bashible service.
May 25 04:39:16 kube-master-0 bashible.sh[1976339]: Configuration is in sync, nothing to do.
May 25 04:39:16 kube-master-0 systemd[1]: bashible.service: Succeeded.
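
If the output does not end with a message like the one above, it may help to filter the bashible log for errors, for example:

# Show recent bashible messages that look like errors.
journalctl -u bashible --since "1 hour ago" --no-pager | grep -iE "error|fail"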

Removing a node from a cluster

The procedure is valid for both a manually configured node (using the bootstrap script) and a node configured using CAPS.

To disconnect a node from a cluster and clean up the server (VM), run the following command on the node:

bash /var/lib/bashible/cleanup_static_node.sh --yes-i-am-sane-and-i-understand-what-i-am-doing
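
You can then check the state of the cluster nodes from any machine configured to manage the cluster:

# Check the state of the cluster nodes after the clean-up.
d8 k get nodes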

How do I clean up a node for adding to another cluster?

This is only necessary if you need to move a static node from one cluster to another. Note that these operations delete the data in the node's local storage. If you only need to move the node to another NodeGroup, follow the procedure for changing the NodeGroup instead.

If the node you are cleaning up has LINSTOR/DRBD storage pools, use the corresponding procedure of the sds-replicated-volume module to evict resources from the node and remove the LINSTOR/DRBD node.

To clean up a node for adding to another cluster, follow these steps:

  1. Remove the node from the Kubernetes cluster:

    kubectl drain <node> --ignore-daemonsets --delete-emptydir-data
    kubectl delete node <node>
    
  2. Run the clean-up script on the node:

    bash /var/lib/bashible/cleanup_static_node.sh --yes-i-am-sane-and-i-understand-what-i-am-doing
    
  3. After the node is restarted, it can be added to another cluster.

FAQ

Can I delete a StaticInstance?

A StaticInstance in the Pending state can be deleted safely.

To delete a StaticInstance in any state other than Pending, such as Running, Cleaning, or Bootstrapping, do the following:

  1. Add the label "node.deckhouse.io/allow-bootstrap": "false" to the StaticInstance.
  2. Wait until the StaticInstance status is changed to Pending.
  3. Delete the StaticInstance.
  4. Decrease the NodeGroup.spec.staticInstances.count parameter’s value by 1.
  5. Wait until the NodeGroup is in the Ready state.
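
As an illustration, for a StaticInstance named static-0 attached to the worker NodeGroup with count: 1 (the names and the target count come from the examples above and are assumptions), the steps might look like this:

# 1. Forbid bootstrapping via this StaticInstance.
d8 k label staticinstance static-0 node.deckhouse.io/allow-bootstrap=false
# 2. Wait until the StaticInstance switches to the Pending state.
d8 k get staticinstance static-0
# 3. Delete the StaticInstance.
d8 k delete staticinstance static-0
# 4. Decrease spec.staticInstances.count by 1 (here, from 1 to 0).
d8 k patch ng worker --type merge -p '{"spec":{"staticInstances":{"count":0}}}'
# 5. Wait until the NodeGroup is Ready.
d8 k get ng worker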

How do I change the IP address of a StaticInstance?

You cannot change the IP address in the StaticInstance resource. If an incorrect address is specified in the StaticInstance, you have to delete the StaticInstance and create a new one.
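
For example, assuming the static-0 resource from the example above is still in the Pending state (otherwise, follow the deletion procedure described above first):

# Delete the StaticInstance with the wrong address...
d8 k delete staticinstance static-0
# ...and create it again with the corrected address.
d8 k create -f - <<EOF
apiVersion: deckhouse.io/v1alpha1
kind: StaticInstance
metadata:
  name: static-0
spec:
  address: "<CORRECT-SERVER-IP>"
  credentialsRef:
    kind: SSHCredentials
    name: static-0-access
EOF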

How do I migrate a manually configured static node under CAPS control?

You need to clean up the node and then hand it over to CAPS control by creating the SSHCredentials and StaticInstance resources for it, as described in the CAPS scenario above.

How do I change the NodeGroup of a static node?

If a node is under CAPS control, you can’t change its NodeGroup membership. The only way is to delete the StaticInstance and create a new one.

To switch an existing manually added static node to another NodeGroup, change its group label and delete its role label using the following commands:

kubectl label node --overwrite <node_name> node.deckhouse.io/group=<new_node_group_name>
kubectl label node <node_name> node-role.kubernetes.io/<old_node_group_name>-

Applying the changes will take some time.
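
You can verify that the change has been applied by checking the node's labels, for example:

# The node should carry the new group label once it has been reconfigured.
kubectl get node <node_name> --show-labels | grep node.deckhouse.io/group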

How do I know what is running on a node while it is being created?

To find out what’s happening on a node (for example, if its creation is taking too long or it seems stuck in the Pending state), you can check the cloud-init logs. To do that, follow these steps:

  1. Find the node that is currently bootstrapping:

    kubectl get instances | grep Pending
    

    An output example:

    dev-worker-2a6158ff-6764d-nrtbj   Pending   46s
    
  2. Get information about connection parameters for viewing logs:

    kubectl get instances dev-worker-2a6158ff-6764d-nrtbj -o yaml | grep 'bootstrapStatus' -B0 -A2
    

    An output example:

    bootstrapStatus:
      description: Use 'nc 192.168.199.178 8000' to get bootstrap logs.
      logsEndpoint: 192.168.199.178:8000
    
  3. To view the cloud-init logs for diagnostics, run the command you got (nc 192.168.199.178 8000 according to the example above).

    The logs of the initial node configuration are located at /var/log/cloud-init-output.log.
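
If you have SSH access to the node, you can also follow that log file directly on the node, for example:

# Follow the initial node configuration log on the node itself.
tail -f /var/log/cloud-init-output.log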