Adding nodes to a bare-metal cluster
Manual method
- Enable the node-manager module.
- Create a NodeGroup object with the type Static:

      apiVersion: deckhouse.io/v1
      kind: NodeGroup
      metadata:
        name: worker
      spec:
        nodeType: Static

  In this resource, specify the Static node type. For all NodeGroup objects in the cluster, Deckhouse automatically generates a bootstrap.sh script used to add nodes to the group. When adding nodes manually, you need to copy this script to the server and run it.

  You can obtain the script from the Deckhouse web interface under the “Node Groups → Scripts” tab or via the following d8 k command:

      d8 k -n d8-cloud-instance-manager get secrets manual-bootstrap-for-worker -ojsonpath="{.data.bootstrap\.sh}"

  The script needs to be decoded from Base64 and then executed as root.
- Once the script finishes, the server will be added to the cluster as a node in the specified group.
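The fetch-and-decode step above can be sketched as a small shell helper. This is a minimal illustration, not an official tool: the function name fetch_bootstrap and the output path are hypothetical, and the secret name is assumed to follow the manual-bootstrap-for-<group> pattern seen in the command above.

```shell
# Sketch: fetch the Base64-encoded bootstrap script for a NodeGroup and decode it.
# Assumptions: `d8` is configured for the cluster, and the secret is named
# manual-bootstrap-for-<group>, as in the `worker` example above.
# $1: NodeGroup name, $2: file to write the decoded script to.
fetch_bootstrap() {
  d8 k -n d8-cloud-instance-manager get secrets "manual-bootstrap-for-$1" \
    -ojsonpath="{.data.bootstrap\.sh}" | base64 -d > "$2"
}

# Usage (on a machine with cluster access):
#   fetch_bootstrap worker ./bootstrap.sh
# Then copy bootstrap.sh to the server and run it as root:
#   sudo bash ./bootstrap.sh
```

Writing the decoded script to a file first makes it possible to inspect it before running it as root.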
Automatic method
If you have previously increased the number of master nodes in the cluster through the master NodeGroup (the spec.staticInstances.count parameter), make sure before adding nodes with the automatic method that the new servers will not be “captured” by the master NodeGroup.
DKP supports automatic addition of physical (bare-metal) servers to the cluster without the need to manually run an installation script on each node. To enable this:
- Prepare the server (OS, networking):
  - Install a supported operating system.
  - Configure networking and ensure the server is accessible via SSH.
  - Create a system user (e.g., ubuntu) for SSH access.
  - Ensure the user can execute commands using sudo.
- Create an SSHCredentials object to define access to the server. DKP uses this object to connect to the server over SSH. It specifies:
  - A private SSH key.
  - The OS user.
  - The SSH port.
  - (Optional) A sudo password, if required.

  Example:

      apiVersion: deckhouse.io/v1alpha1
      kind: SSHCredentials
      metadata:
        name: static-nodes
      spec:
        privateSSHKey: |
          -----BEGIN OPENSSH PRIVATE KEY-----
          LS0tLS1CRUdJlhrdG...................VZLS0tLS0K
          -----END OPENSSH PRIVATE KEY-----
        sshPort: 22
        sudoPassword: password
        user: ubuntu

  Important. The private key must match the corresponding public key added to the ~/.ssh/authorized_keys file on the server.
- Create a StaticInstance object for each server:

      apiVersion: deckhouse.io/v1alpha1
      kind: StaticInstance
      metadata:
        name: static-0
        labels:
          static-node: auto
      spec:
        address: 192.168.1.10
        credentialsRef:
          apiVersion: deckhouse.io/v1alpha1
          kind: SSHCredentials
          name: static-nodes

  A separate StaticInstance resource must be created for each server, but the same SSHCredentials can be reused to access multiple servers.

  Possible StaticInstance states:
  - Pending: The server has not yet been configured; the corresponding node is not present in the cluster.
  - Bootstrapping: The server is being configured and the node is being added to the cluster.
  - Running: The server is successfully configured and the node has joined the cluster.
  - Cleaning: The server is being cleaned up and the node is being removed from the cluster.

  These states reflect the current stage of node management. CAPS automatically transitions a StaticInstance between these states depending on whether a node needs to be added or removed from a group.
- Create a NodeGroup resource describing how DKP should use these servers:

      apiVersion: deckhouse.io/v1
      kind: NodeGroup
      metadata:
        name: worker
      spec:
        nodeType: Static
        staticInstances:
          count: 3
          labelSelector:
            matchLabels:
              static-node: auto
        nodeTemplate:
          labels:
            node-role.deckhouse.io/worker: ""

  This section defines parameters for using StaticInstance resources:
  - count specifies how many nodes will be added to the group.
  - labelSelector defines the rules for selecting nodes.

  When using the Cluster API Provider Static (CAPS), it is important to correctly set the nodeType to Static and provide the staticInstances section in the NodeGroup resource:
  - If the labelSelector is not specified, CAPS will use any available StaticInstance resources in the cluster.
  - The same StaticInstance can be used in multiple NodeGroups if it matches the filters.
  - CAPS automatically maintains the number of nodes in the group according to the count parameter.
  - When a node is removed, CAPS performs cleanup and disconnection, and the corresponding StaticInstance transitions to the Pending status, allowing it to be reused.
After the node group is created, a script for adding servers to the group will become available. DKP will wait for the required number of StaticInstance objects that match the specified labels. As soon as such an object appears, DKP will use the provided IP address and SSH connection parameters to run the bootstrap.sh script and add the server to the group.
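Because each server needs its own StaticInstance while the SSHCredentials object can be shared, the manifests are easy to generate in a loop. The sketch below assumes the static-nodes credentials and the static-node: auto label from the examples above; the function names and the static-N naming scheme are hypothetical.

```shell
# Print a StaticInstance manifest for one server.
# $1: object name (hypothetical naming), $2: server IP address.
static_instance_manifest() {
  cat <<EOF
---
apiVersion: deckhouse.io/v1alpha1
kind: StaticInstance
metadata:
  name: $1
  labels:
    static-node: auto
spec:
  address: $2
  credentialsRef:
    apiVersion: deckhouse.io/v1alpha1
    kind: SSHCredentials
    name: static-nodes
EOF
}

# Print manifests named static-0, static-1, ... for every address given.
static_instance_manifests() {
  i=0
  for addr in "$@"; do
    static_instance_manifest "static-$i" "$addr"
    i=$((i + 1))
  done
}

# Usage (addresses are placeholders):
#   static_instance_manifests 192.168.1.10 192.168.1.11 192.168.1.12 | d8 k create -f -
```

Printing the manifests to standard output keeps the helper free of side effects, so the result can be reviewed before it is piped to d8 k create.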
Modifying a static cluster configuration
The static cluster settings are stored in the StaticClusterConfiguration structure.
To modify the static cluster parameters, run the following command:
d8 platform edit static-cluster-configuration
Moving a static node between NodeGroups
During the migration of static nodes between NodeGroups, the node is cleaned up and bootstrapped again, and the Node object is recreated.
- Create a new NodeGroup resource, for example named front, which will manage the static node labeled role: front:

      d8 k create -f - <<EOF
      apiVersion: deckhouse.io/v1
      kind: NodeGroup
      metadata:
        name: front
      spec:
        nodeType: Static
        staticInstances:
          count: 1
          labelSelector:
            matchLabels:
              role: front
      EOF

- Change the role label of the existing StaticInstance from worker to front. This will allow the new NodeGroup named front to manage this node:

      d8 k label staticinstance static-worker-1 role=front --overwrite

- Update the worker NodeGroup resource by decreasing the count parameter from 1 to 0:

      d8 k patch nodegroup worker -p '{"spec": {"staticInstances": {"count": 0}}}' --type=merge
Manual cleanup of a static node
To remove a node from the cluster and clean the server, run the /var/lib/bashible/cleanup_static_node.sh script, which is already present on every static node.
Example of disconnecting a node from the cluster and cleaning the server:
bash /var/lib/bashible/cleanup_static_node.sh --yes-i-am-sane-and-i-understand-what-i-am-doing
This instruction applies both to nodes manually configured using the bootstrap script and to nodes configured via CAPS.
NodeGroup example
Example NodeGroup definition for static nodes
For virtual machines on hypervisors or physical servers, use static nodes by setting nodeType: Static in the NodeGroup.
Example:
apiVersion: deckhouse.io/v1
kind: NodeGroup
metadata:
  name: worker
spec:
  nodeType: Static
Nodes are added to such a group manually using preconfigured scripts or automatically via CAPS.
Changing the CRI for a NodeGroup
CRI (Container Runtime Interface) is a standard interface between the kubelet and the container runtime.
The CRI of a NodeGroup can only be switched between Containerd and NotManaged.
To change it, set the cri.type parameter to one of these two values.
Example YAML manifest:
apiVersion: deckhouse.io/v1
kind: NodeGroup
metadata:
  name: worker
spec:
  nodeType: Static
  cri:
    type: Containerd
You can also perform this operation using a patch:
- To set Containerd:

      d8 k patch nodegroup <NodeGroup name> --type merge -p '{"spec":{"cri":{"type":"Containerd"}}}'

- To set NotManaged:

      d8 k patch nodegroup <NodeGroup name> --type merge -p '{"spec":{"cri":{"type":"NotManaged"}}}'
When changing cri.type for a NodeGroup created using dhctl, you must update the value in two places: in the provider cluster configuration (dhctl config edit provider-cluster-configuration) and in the NodeGroup object settings.
After changing the CRI for a NodeGroup, the node-manager module will sequentially reboot the nodes, applying the new CRI.
Node updates involve disruption. Depending on the disruption settings for the NodeGroup, the node-manager module will either automatically update the nodes or require manual approval.
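Since only two values are accepted for cri.type and the change triggers node reboots, it can help to validate the value before patching. The wrapper below is a sketch around the patch command shown above; the function name set_nodegroup_cri is hypothetical.

```shell
# Sketch: patch cri.type of a NodeGroup after checking the requested value.
# $1: NodeGroup name, $2: CRI type (Containerd or NotManaged).
set_nodegroup_cri() {
  case "$2" in
    Containerd|NotManaged) ;;
    *)
      echo "unsupported cri.type: $2 (expected Containerd or NotManaged)" >&2
      return 1
      ;;
  esac
  d8 k patch nodegroup "$1" --type merge -p "{\"spec\":{\"cri\":{\"type\":\"$2\"}}}"
}

# Usage:
#   set_nodegroup_cri worker Containerd
```

Rejecting an unexpected value locally avoids sending a patch that would not result in a valid CRI setting.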
Changing the NodeGroup of a static node
If a node is managed by CAPS, you can’t change its associated NodeGroup.
The only option is to delete the StaticInstance and create a new one.
To move an existing manually added static node from one NodeGroup to another, you need to update the group label on the node:
d8 k label node --overwrite <node_name> node.deckhouse.io/group=<new_node_group_name>
d8 k label node <node_name> node-role.kubernetes.io/<old_node_group_name>-
It will take some time for the changes to take effect.
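The two label commands above can be wrapped so that they are always run together. This is only a convenience sketch for manually added nodes; the function name move_static_node is hypothetical.

```shell
# Sketch: move a manually added static node to another NodeGroup.
# $1: node name, $2: old NodeGroup name, $3: new NodeGroup name.
move_static_node() {
  # Point the node at the new group.
  d8 k label node --overwrite "$1" "node.deckhouse.io/group=$3" || return 1
  # Remove the role label left over from the old group (trailing "-" deletes a label).
  d8 k label node "$1" "node-role.kubernetes.io/$2-"
}

# Usage (names are placeholders):
#   move_static_node node-1 worker front
```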
Changing the IP address in a StaticInstance
You cannot change the IP address of a StaticInstance resource.
If an incorrect address is specified in a StaticInstance, you need to delete the StaticInstance and create a new one.
Deleting a StaticInstance
A StaticInstance in the Pending state can be deleted safely.
To delete a StaticInstance that is in any state other than Pending (Running, Cleaning, Bootstrapping), follow these steps:
- Add the label "node.deckhouse.io/allow-bootstrap": "false" to the StaticInstance.

  Example command for adding a label:

      d8 k label staticinstance d8cluster-worker node.deckhouse.io/allow-bootstrap=false

- Wait until the StaticInstance transitions to the Pending state.

  To check the status of StaticInstance, use the command:

      d8 k get staticinstances

- Delete the StaticInstance.

  Example command for deleting StaticInstance:

      d8 k delete staticinstance d8cluster-worker

- Decrease the NodeGroup.spec.staticInstances.count parameter by 1.
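The last step can be scripted by reading the current value of spec.staticInstances.count and patching it with the value reduced by one. The sketch below is an illustration under stated assumptions: the function name decrement_static_count is hypothetical, and d8 is assumed to be configured for the cluster.

```shell
# Sketch: decrease spec.staticInstances.count of a NodeGroup by 1.
# $1: NodeGroup name.
decrement_static_count() {
  current="$(d8 k get nodegroup "$1" -o jsonpath='{.spec.staticInstances.count}')"
  # Refuse to continue if the value is missing, not a number, or already zero.
  case "$current" in
    ''|*[!0-9]*|0)
      echo "count for $1 is '$current', nothing to decrease" >&2
      return 1
      ;;
  esac
  next=$((current - 1))
  d8 k patch nodegroup "$1" --type merge \
    -p "{\"spec\":{\"staticInstances\":{\"count\":$next}}}"
}

# Usage:
#   decrement_static_count worker
```

Reading the value first, rather than hard-coding the new count, keeps the helper usable for groups of any size.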