How do I add a master node?

Static or hybrid cluster

Adding a master node to a static or hybrid cluster is no different from adding a regular node to a cluster; use the corresponding instruction. All the actions needed to configure the control plane components on the new master nodes are performed automatically. Wait until the master nodes switch to Ready status.

Cloud cluster

Before adding nodes, make sure you have all the necessary quota limits.

To add one or more master nodes to a cloud cluster, follow these steps:

  1. Determine the Deckhouse version and edition used in the cluster by running the following command on the master node or a host with configured kubectl access to the cluster:

    kubectl -n d8-system get deployment deckhouse \
    -o jsonpath='version-{.metadata.annotations.core\.deckhouse\.io\/version}, edition-{.metadata.annotations.core\.deckhouse\.io\/edition}' \
    | tr '[:upper:]' '[:lower:]'
  2. Run the corresponding version and edition of the Deckhouse installer:

    docker run --pull=always -it -v "$HOME/.ssh/:/tmp/.ssh/" \
    registry.deckhouse.io/deckhouse/<DECKHOUSE_EDITION>/install:<DECKHOUSE_VERSION> bash

    For example, if the Deckhouse version in the cluster is v1.28.0 and the Deckhouse edition is ee, the command to run the installer will be:

    docker run --pull=always -it -v "$HOME/.ssh/:/tmp/.ssh/" registry.deckhouse.io/deckhouse/ee/install:v1.28.0 bash

    Change the container registry address if necessary (e.g., if you use an internal container registry).

  3. Run the following command inside the installer container (use the --ssh-bastion-* parameters if using a bastion host):

    dhctl config edit provider-cluster-configuration --ssh-agent-private-keys=/tmp/.ssh/<SSH_KEY_FILENAME> --ssh-user=<USERNAME> \
    --ssh-host <SSH_HOST>
  4. Specify the required number of master node replicas in the masterNodeGroup.replicas field and save changes.
  5. Start the scaling process by running the following command (specify the appropriate cluster access parameters, as in the previous step):

    dhctl converge --ssh-agent-private-keys=/tmp/.ssh/<SSH_KEY_FILENAME> --ssh-user=<USERNAME> --ssh-host <SSH_HOST>
  6. Answer Yes to the question Do you want to CHANGE objects state in the cloud?.

All the other actions are performed automatically. Wait until the master nodes appear in Ready status.
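The query in step 1 prints a string like version-v1.28.0, edition-ee. As a sketch of how it maps to the installer image in step 2 (assuming the public registry.deckhouse.io registry; substitute your own address if you use an internal one):

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn the step 1 output into an installer image reference.
info="version-v1.28.0, edition-ee"   # example output of the kubectl command above
version="${info%%,*}"                # -> "version-v1.28.0"
version="${version#version-}"        # -> "v1.28.0"
edition="${info##*edition-}"         # -> "ee"
image="registry.deckhouse.io/deckhouse/${edition}/install:${version}"
echo "$image"
# → registry.deckhouse.io/deckhouse/ee/install:v1.28.0
```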

How do I delete the master node?

  1. Check whether deleting the node will cause the etcd cluster to lose its quorum:
    • If the deletion does not lead to the etcd cluster losing its quorum:
      • If a virtual machine with a master node can be deleted (there are no other necessary services on it), then you can delete the virtual machine in the usual way.
      • If you can’t delete the master right away (for example, it is used for backups or is involved in the deployment process), stop the container runtime on the node.

        In the case of Docker:

        systemctl stop docker
        systemctl disable docker

        In the case of Containerd:

        systemctl stop containerd
        systemctl disable containerd
        kill $(ps ax | grep containerd-shim | grep -v grep |awk '{print $1}')
    • If the deletion may result in etcd losing its quorum (the 2 -> 1 migration), stop kubelet on the node (without stopping the etcd container):

      systemctl stop kubelet
      systemctl stop bashible.timer
      systemctl stop bashible
      systemctl disable kubelet
      systemctl disable bashible.timer
      systemctl disable bashible
  2. Delete the Node object from Kubernetes.
  3. Wait until the etcd member is automatically deleted.
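The quorum check in step 1 comes down to arithmetic: a cluster of n members needs floor(n/2)+1 of them alive. A sketch (the member count is an assumed example):

```shell
#!/usr/bin/env bash
# Would removing one member drop the etcd cluster below its current quorum?
members=3                      # current number of etcd members (assumed)
quorum=$(( members / 2 + 1 ))  # minimum members needed for consensus
remaining=$(( members - 1 ))
if (( remaining >= quorum )); then
  verdict="safe"               # e.g. 3 -> 2 keeps quorum (2 of 3 still alive)
else
  verdict="unsafe"             # e.g. 2 -> 1 loses quorum (needs 2 of 2)
fi
echo "$verdict: $remaining members remain, quorum is $quorum"
```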

How do I dismiss the master role while keeping the node?

  1. Remove the node-role.kubernetes.io/master: "" and node-role.kubernetes.io/control-plane: "" labels, then wait for the etcd member to be automatically deleted.
  2. Exec to the node and run the following commands:

    rm -f /etc/kubernetes/manifests/{etcd,kube-apiserver,kube-scheduler,kube-controller-manager}.yaml
    rm -f /etc/kubernetes/{scheduler,controller-manager}.conf
    rm -f /etc/kubernetes/authorization-webhook-config.yaml
    rm -f /etc/kubernetes/admin.conf /root/.kube/config
    rm -rf /etc/kubernetes/deckhouse
    rm -rf /etc/kubernetes/pki/{ca.key,apiserver*,etcd/,front-proxy*,sa.*}
    rm -rf /var/lib/etcd/member/

How do I view the list of etcd members?

Option 1

  1. Exec to the etcd Pod:

    kubectl -n kube-system exec -ti $(kubectl -n kube-system get pod -l component=etcd,tier=control-plane -o name | head -n1) -- sh
  2. Execute the command:

    ETCDCTL_API=3 etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/ca.crt \
    --key /etc/kubernetes/pki/etcd/ca.key --endpoints https://127.0.0.1:2379/ member list
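etcdctl prints member list rows as comma-separated fields: ID, status, name, peer URLs, client URLs, and the learner flag. A sketch of pulling member names out of captured output (the sample rows are illustrative — node names and addresses are placeholders):

```shell
#!/usr/bin/env bash
# Extract member names (third field) from sample `etcdctl member list` output.
output='ade526d28b1f92f7, started, master-0, https://10.2.1.101:2380, https://10.2.1.101:2379, false
d282ac2ce600c1ce, started, master-1, https://10.2.1.102:2380, https://10.2.1.102:2379, false'
names=$(printf '%s\n' "$output" | awk -F', ' '{print $3}')
printf '%s\n' "$names"
```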

Option 2

Use the etcdctl endpoint status command. The fifth parameter in the output table will be true for the leader.


$ ETCDCTL_API=3 etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/ca.crt \
  --key /etc/kubernetes/pki/etcd/ca.key --endpoints https://127.0.0.1:2379/ endpoint status

…, ade526d28b1f92f7, 3.5.3, 177 MB, false, false, 42007, 406566258, 406566258,
…, d282ac2ce600c1ce, 3.5.3, 182 MB, true, false, 42007, 406566258, 406566258,

What if something went wrong?

The control-plane-manager saves backups to /etc/kubernetes/deckhouse/backup. They can be useful in diagnosing the issue.

What if the etcd cluster fails?

  1. Stop etcd on all nodes except one by deleting the /etc/kubernetes/manifests/etcd.yaml file. This last node will serve as the starting point for the new multi-master cluster.
  2. On the last node, edit etcd manifest /etc/kubernetes/manifests/etcd.yaml and add the parameter --force-new-cluster to spec.containers.command.
  3. After the new cluster is ready, remove the --force-new-cluster parameter.

Caution! This operation is unsafe and breaks the guarantees given by the consensus protocol. Note that it brings the cluster to the state that was saved on the node. Any pending entries will be lost.
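Step 2 above is a one-line manifest edit; here is a sketch run against a stub file (the real manifest lives at /etc/kubernetes/manifests/etcd.yaml and its indentation may differ):

```shell
#!/usr/bin/env bash
# Demonstrate inserting --force-new-cluster after the "- etcd" entry
# of a (stub) static Pod manifest.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
    command:
    - etcd
    - --advertise-client-urls=https://127.0.0.1:2379
EOF
# Append the flag right after the "- etcd" line, keeping the YAML indentation.
sed -i '/- etcd$/a\    - --force-new-cluster' "$manifest"
grep -- '--force-new-cluster' "$manifest"
```

Remember to remove the flag again once the new cluster is ready, as step 3 says.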

How do I configure additional audit policies?

  1. Enable the following flag in the d8-system/deckhouse ConfigMap:

    controlPlaneManager: |
        auditPolicyEnabled: true
  2. Create the kube-system/audit-policy Secret containing a base64-encoded yaml file:

    apiVersion: v1
    kind: Secret
    metadata:
      name: audit-policy
      namespace: kube-system
    data:
      audit-policy.yaml: <base64>

    The minimum viable example of the audit-policy.yaml file looks as follows:

    apiVersion: audit.k8s.io/v1
    kind: Policy
    rules:
    - level: Metadata
      omitStages:
      - RequestReceived

    You can find detailed information about configuring the audit-policy.yaml file in the official Kubernetes auditing documentation.

    Create a Secret from the file:

    kubectl -n kube-system create secret generic audit-policy --from-file=./audit-policy.yaml
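If you prefer to build the Secret manifest by hand instead, the <base64> value is just the encoded policy file; kubectl create secret does the same encoding under the hood. A sketch:

```shell
#!/usr/bin/env bash
# Produce the base64 value for the Secret's audit-policy.yaml key by hand.
cat > ./audit-policy.yaml <<'EOF'
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
  omitStages:
  - RequestReceived
EOF
encoded=$(base64 -w0 ./audit-policy.yaml)
echo "$encoded"
```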

How to omit Deckhouse built-in policy rules?

Set apiserver.basicAuditPolicyEnabled to false.

An example:

controlPlaneManager: |
    auditPolicyEnabled: true
    basicAuditPolicyEnabled: false

How do I stream audit logs to stdout instead of files?

Set apiserver.auditLog.output to stdout.

An example:

controlPlaneManager: |
    auditPolicyEnabled: true
    auditLog:
      output: Stdout

How to deal with the audit log?

There must be some log scraper on the master nodes (log-shipper, promtail, filebeat) that will monitor the log file.

The following fixed parameters of log rotation are in use:

  • The maximum disk space is limited to 1000 MB.
  • Logs older than 7 days will be deleted.

Depending on the Policy settings and the number of requests to the apiserver, the amount of logs collected may be high. Thus, in some cases, logs can only be kept for less than 30 minutes.
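The practical consequence: retention is the 1000 MB cap divided by the write rate. A back-of-the-envelope sketch (the rate is an assumed example, not a measurement):

```shell
#!/usr/bin/env bash
# Estimate how long audit logs survive under the fixed 1000 MB cap.
cap_mb=1000
rate_mb_per_min=40   # hypothetical audit write rate on a busy apiserver
retention_min=$(( cap_mb / rate_mb_per_min ))
echo "retention: ~${retention_min} minutes"
# At 40 MB/min the cap is reached in ~25 minutes, i.e. well under an hour.
```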

Cautionary note

Note that the current implementation of this feature isn’t safe and may lead to a temporary failure of the control plane.

The apiserver will not be able to start if there are unsupported options or a typo in the Secret.

If apiserver is unable to start, you have to manually disable the --audit-log-* parameters in the /etc/kubernetes/manifests/kube-apiserver.yaml manifest and restart apiserver using the following command:

docker stop $(docker ps | grep kube-apiserver- | awk '{print $1}')
# Or (depending on your CRI).
crictl stopp $(crictl pods --name=kube-apiserver -q)

After the restart, you will be able to fix the Secret or delete it:

kubectl -n kube-system delete secret audit-policy

How do I speed up the restart of Pods if the connection to the node has been lost?

By default, a node is marked as unavailable if it does not report its state for 40 seconds. After another 5 minutes, its Pods will be rescheduled to other nodes. Thus, the overall application unavailability lasts approximately 6 minutes.

In specific cases, if an application cannot run in multiple instances, there is a way to lower its unavailability time:

  1. Reduce the period required for the node to become Unreachable if the connection to it is lost by setting the nodeMonitorGracePeriodSeconds parameter.
  2. Set a lower timeout for evicting Pods on a failed node using the failedNodePodEvictionTimeoutSeconds parameter.

An example

controlPlaneManager: |
  nodeMonitorGracePeriodSeconds: 10
  failedNodePodEvictionTimeoutSeconds: 50

In this case, if the connection to the node is lost, the applications will be restarted in about 1 minute.
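As a rough model, unavailability is detection time plus eviction timeout:

```shell
#!/usr/bin/env bash
# Approximate Pod unavailability = node failure detection + Pod eviction timeout.
grace=10       # nodeMonitorGracePeriodSeconds from the example above
eviction=50    # failedNodePodEvictionTimeoutSeconds from the example above
echo "~$(( grace + eviction )) seconds of unavailability"
# Matches the "about 1 minute" estimate; the defaults (40 s + 300 s) give ~6 minutes.
```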

Cautionary note

Both these parameters directly impact the CPU and memory resources consumed by the control plane. By lowering timeouts, we force system components to send statuses more frequently and check the resource state more often.

When deciding on the appropriate threshold values, consider resources consumed by the control nodes (graphs can help you with this). Note that the lower parameters are, the more resources you may need to allocate to these nodes.

How do I make an etcd backup?

Log in to any control-plane node as the root user and use the following script:

#!/usr/bin/env bash

for pod in $(kubectl get pod -n kube-system -l component=etcd,tier=control-plane -o name); do
  if kubectl -n kube-system exec "$pod" -- sh -c "ETCDCTL_API=3 /usr/bin/etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/ca.crt --key /etc/kubernetes/pki/etcd/ca.key --endpoints https://127.0.0.1:2379/ snapshot save /tmp/${pod##*/}.snapshot" && \
  kubectl -n kube-system exec "$pod" -- gzip -c /tmp/${pod##*/}.snapshot | zcat > "${pod##*/}.snapshot" && \
  kubectl -n kube-system exec "$pod" -- sh -c "cd /tmp && sha256sum ${pod##*/}.snapshot" | sha256sum -c && \
  kubectl -n kube-system exec "$pod" -- rm "/tmp/${pod##*/}.snapshot"; then
    mv "${pod##*/}.snapshot" etcd-backup.snapshot
    break
  fi
done

An etcd snapshot file, etcd-backup.snapshot, will be created in the current directory from one of the etcd cluster members. You can use this file to restore the etcd cluster state in the future.

Also, we recommend making a backup of the /etc/kubernetes directory, which contains:

  • manifests and configurations of the control-plane components;
  • the Kubernetes cluster PKI.

This directory will help you quickly restore a cluster after a complete loss of the control-plane nodes, without creating a new cluster and without rejoining the remaining nodes into it.

We recommend encrypting the etcd snapshot backups as well as the backup of the /etc/kubernetes/ directory and saving them outside the Deckhouse cluster. You can use one of the third-party file backup tools, for example: Restic, Borg, Duplicity, etc.
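A minimal encryption sketch with openssl (shown as one illustrative option alongside the tools above; the passphrase handling and file names are placeholders — use proper secret management in practice):

```shell
#!/usr/bin/env bash
# Encrypt a snapshot before shipping it off-cluster, then verify the round trip.
echo "dummy snapshot data" > etcd-backup.snapshot   # stand-in for a real snapshot
openssl enc -aes-256-cbc -pbkdf2 -salt -pass pass:changeme \
  -in etcd-backup.snapshot -out etcd-backup.snapshot.enc
openssl enc -d -aes-256-cbc -pbkdf2 -pass pass:changeme \
  -in etcd-backup.snapshot.enc -out restored.snapshot
cmp etcd-backup.snapshot restored.snapshot && echo "round trip ok"
```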

See the etcd documentation to learn about disaster recovery procedures from snapshots.