Address

If the public domain template in the cluster is %s.example.com, the Commander web application can be accessed at https://commander.example.com.

Workspaces

Working with Deckhouse Commander entities is carried out within workspaces. Workspaces can be created and renamed. In the future, it will be possible to control access to workspaces in more detail.

Users manage clusters, cluster templates, and inventory within a workspace. API access tokens are also issued within a workspace; the visibility of objects for such a token is limited to what is located in that workspace.

Cluster Management

We recommend installing Commander in a management cluster. This cluster serves for centralized management and for collecting information from the entire application infrastructure, including application clusters. Clusters managed by Commander are called application clusters. Commander is the source of truth for cluster configuration. Next, we will look at how this is implemented in practice.

Cluster Status

Infrastructure

Cluster management comes down to three types of operations: creation, deletion, and modification. At any given time, a cluster in Commander has one of the following “infrastructure statuses”:

  • New — a cluster configuration has been created in Commander, but the cluster itself is still waiting to be created.
  • Configuration Error — a configuration for the cluster has been created with errors in Commander, so the cluster itself will not be created.
  • In Creation — Commander is deploying the cluster.
  • Ready — the cluster is created, and the state of the infrastructure matches the configuration specified in Commander.
  • Changing — Commander brings the cluster state to the specified configuration.
  • Change Error, Creation Error, Deletion Error — internal or external errors that occurred during cluster management.
  • Archived — the cluster is no longer tracked by Commander; it has been previously deleted or left without Commander management.

Commander performs operations asynchronously: each operation on a cluster is carried out by a task.

Tasks, and therefore operations, can be the installation, removal, or modification of a cluster, or the verification of its configuration against the actual state. Operations are shown inside the cluster in the “cloud” tab (including for static clusters). An execution log is available for each task. The result of a task determines the infrastructure status of the cluster.

Infrastructure operations are performed by the Cluster Manager component. The speed at which the Cluster Manager takes tasks for execution is determined by the number of clusters and the number of replicas of the Cluster Manager. If the total number of tasks significantly exceeds the number of Cluster Manager replicas, then operations on clusters will be delayed.

Kubernetes

In addition to its infrastructure status, a cluster also has a Kubernetes configuration status. It indicates whether the cluster complies with the configuration of manifests for Kubernetes. Resource manifests (simply “resources” hereafter) are part of the cluster configuration.

The state of Kubernetes configuration can have three statuses:

  • Configured: complete compliance
  • Not Configured: discrepancy between configuration and cluster state
  • No Data: configuration state data is outdated

The component installed within the application cluster, known as the Commander agent or commander-agent (hereafter simply “agent”), is responsible for ensuring that the cluster matches the given configuration for resources. The agent always tries to bring the cluster configuration into compliance with the specified one.

The agent connects to the Commander API and downloads resource manifests, then applies them. If a resource created by the agent is deleted in the application cluster, the agent will recreate it within a minute. If a resource is deleted from the cluster configuration, the agent removes the resource in the application cluster. If the agent cannot apply a resource for some reason, the Kubernetes status in Commander will be “not configured”.

In addition to synchronizing the resource configuration in Kubernetes, the agent provides Commander with telemetry data:

  • The current version of the Deckhouse Kubernetes Platform
  • Availability of an update to the latest version of the Deckhouse Kubernetes Platform
  • The Deckhouse Kubernetes Platform update channel
  • Kubernetes version
  • Availability of system components
  • Alerts that require user attention (alerts, manual confirmation of node reboot, etc.)
  • Key cluster metrics: total CPU count, memory size, disk storage size, and total number of nodes.

Creation

Clusters are created from cluster templates. To create a cluster, the user selects a template, fills in its input parameters (defined by the template), and clicks the “install” button. This gives the cluster a configuration and binds it to a specific version of the template. The template or the version can be changed later.

As the user fills in the inputs, the cluster configuration is rendered as YAML. If errors are found in the configuration, the Commander interface will show them. If the user saves a new cluster with errors, its installation will not begin until the errors are corrected. In other words, the cluster will have the status “Configuration error,” and the installation task will not be created until the configuration is changed to be correct. Errors in cluster configuration can be caused by both template code and incorrectly filled input parameters.

Once the configuration becomes valid, an installation task for the cluster is created, after which the cluster manager creates the cluster. If the cluster is being created on pre-created machines, Commander configures the Deckhouse Kubernetes Platform components on them and then creates the specified Kubernetes resources. If the cloud platform or virtualization platform API is used, Commander creates the infrastructure before the steps mentioned above. The exact set of cloud resources depends on the cloud provider.

After successful cluster installation, Commander will periodically check its configuration. If the infrastructure configuration diverges from that declared in Commander, Commander will create a task to change the infrastructure to bring it to its declared state. The configuration discrepancy can occur on either the infrastructure side or the Commander side. In the first case, it means a change in the cloud API, for example, if something was manually changed in the cloud resource configuration. In the second case, it indicates a change in cluster configuration, which we will discuss in the next section.

Update

Changing the cluster configuration means that a new configuration has been saved to the cluster, different from the previous one. This may be due to changes in the input parameters of the current cluster template. It may also be due to moving the cluster to a new version of the template or even to a different template.

When the cluster configuration changes, Commander creates a task to change the cluster infrastructure. The agent brings the Kubernetes configuration to the desired state in parallel with the infrastructure change.

Cluster configuration changes can lead to destructive changes in the infrastructure. For example, this may be a change in virtual machines that require their deletion or recreation. Another example is a change in the composition of cloud availability zones. When Commander detects destructive changes, it does not enact those changes until the user confirms them.

Deletion

Clusters can be deleted in Commander in two ways; both methods are equally available for any cluster.

The first method is clearing the infrastructure of the cluster. In this case, Commander creates the deletion task. Static resources are cleared of Deckhouse Kubernetes Platform components, and cloud resources are removed (e.g., virtual machines). After deletion, the cluster configuration does not disappear, and the cluster moves to the archive. Its configuration can be restored if needed, but the cluster will no longer be listed among active clusters. This distinguishes the archived cluster from the active one.

Another way to delete a cluster is manual deletion. Commander will move the cluster to the archive, but it will not clear the infrastructure. This method can be useful if, for some reason, Commander cannot correctly delete the cluster using the first method; in that case, the cluster will have the “Deletion Error” status. The user will then have to manually clean up the resources occupied by the Deckhouse Kubernetes Platform and manually move the cluster to the archive.

Cluster Configuration

Cluster configuration consists of several sections:

Section                  Type            Purpose
Input Parameters         Schema          Schema of template input parameters
Kubernetes               YAML template   Kubernetes configuration: ClusterConfiguration
Placement                YAML template   Infrastructure configuration: <Provider>ClusterConfiguration or StaticClusterConfiguration
SSH Parameters           YAML template   SSH connection to the master nodes
Resources                YAML template   Cluster resources, including ModuleConfig except system ones (synchronized by Commander)
Primary Resources        YAML template   Cluster resources, including ModuleConfig except system ones (not synchronized after installation)
Start-up Configuration   YAML template   Installation configuration: InitConfiguration and system ModuleConfig

Cluster Parameters

This is the user configuration defined by the template. See Input Parameters.

Kubernetes

Settings for the Kubernetes version and the pod and service subnets. See ClusterConfiguration.
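
For example, a minimal Kubernetes section for a static cluster might look like this (the values are illustrative, not recommendations):

apiVersion: deckhouse.io/v1
kind: ClusterConfiguration
clusterType: Static
podSubnetCIDR: 10.111.0.0/16        # subnet for pods
serviceSubnetCIDR: 10.222.0.0/16    # subnet for services
kubernetesVersion: "Automatic"
clusterDomain: "cluster.local"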

Placement

Describes how the cluster is placed in the infrastructure. For a static cluster, this configuration may remain empty.

For cloud clusters, this section specifies access to the cloud API, the nodes that will be created and tracked automatically (including master nodes), availability zone settings, and so on.
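
As a minimal illustration, the Placement section of a static cluster whose nodes live on an internal network might contain only a StaticClusterConfiguration (cloud configurations such as <Provider>ClusterConfiguration are provider-specific and more elaborate); the subnet below is illustrative:

apiVersion: deckhouse.io/v1
kind: StaticClusterConfiguration
internalNetworkCIDRs:               # networks used for internal node traffic
- 172.16.0.0/24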

SSH configuration

apiVersion: dhctl.deckhouse.io/v1
kind: SSHConfig

sshBastionHost: 10.1.2.3              # Bastion host is optional.
sshBastionPort: 2233
sshBastionUser: debian

sshUser: ubuntu
sshPort: 22
sshAgentPrivateKeys:                  # The list of private keys,
- key: |                            # at least one key is required
    -----BEGIN RSA PRIVATE KEY-----
    .............................
    -----END RSA PRIVATE KEY-----
  passphrase: qwerty123             # Key password, optional

sshExtraArgs: -vvv                    # Extra arguments for SSH command

---

apiVersion: dhctl.deckhouse.io/v1     # Target hosts.
kind: SSHHost                         # Commonly there are 1 or 3 hosts
host: 172.16.0.1                      # to be used as control plane nodes
---
apiVersion: dhctl.deckhouse.io/v1
kind: SSHHost
host: 172.16.0.2
---
apiVersion: dhctl.deckhouse.io/v1
kind: SSHHost
host: 172.16.0.3

Resources

Arbitrary manifests of Kubernetes and Deckhouse resources, except for the settings of built-in Deckhouse Kubernetes Platform modules. Commander synchronizes these resources.
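
As a purely illustrative sketch (the specific resources are assumptions, not a prescribed set), the Resources section could contain manifests such as an Ingress controller and an access rule:

apiVersion: deckhouse.io/v1
kind: IngressNginxController
metadata:
  name: main
spec:
  ingressClass: nginx
  inlet: HostPort                   # accept traffic on node ports
  hostPort:
    httpPort: 80
    httpsPort: 443
---
apiVersion: deckhouse.io/v1
kind: ClusterAuthorizationRule
metadata:
  name: admin
spec:
  subjects:
  - kind: User
    name: admin@example.com         # illustrative user name
  accessLevel: SuperAdmin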

Primary Resources

Arbitrary manifests of Kubernetes and Deckhouse resources, except for the settings of built-in Deckhouse Kubernetes Platform modules. Commander does not synchronize these resources.

Initial Configuration

This section specifies the container registry and access to it (see InitConfiguration). It also specifies the settings of built-in Deckhouse Kubernetes Platform (DKP) modules, for example, the domain template for service web interfaces, TLS certificate settings, or the update channel.
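
For illustration only (the registry address, credentials, and settings below are placeholders), this section might combine an InitConfiguration with system ModuleConfig resources, such as the global domain template and the update channel:

apiVersion: deckhouse.io/v1
kind: InitConfiguration
deckhouse:
  imagesRepo: registry.deckhouse.io/deckhouse/ce      # registry with DKP images
  registryDockerCfg: <base64-encoded docker config>   # registry access credentials
---
apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: global
spec:
  version: 1                                          # settings schema version
  settings:
    modules:
      publicDomainTemplate: "%s.example.com"          # domain template for service web interfaces
---
apiVersion: deckhouse.io/v1alpha1
kind: ModuleConfig
metadata:
  name: deckhouse
spec:
  version: 1
  enabled: true
  settings:
    releaseChannel: Stable                            # DKP update channel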

Templates

Commander is designed to manage uniform clusters. Since all cluster configuration sections are in YAML format, templating a cluster means marking up the desired YAML configuration with parameters and describing a schema for those parameters. YAML is marked up using Go template syntax and the Sprig function set. The schema of input parameters is described with a custom syntax similar to OpenAPI 3, but simpler.

The cluster configuration is produced by substituting the input parameters into the section templates. The input parameters are validated against the schema defined for them. In the Commander web application, the schema of input parameters can be defined either as text configuration or with a visual form designer. Read more about input parameters in the section on working with templates.
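
To illustrate (the parameter names and the exact way values are referenced are assumptions for this sketch), the Kubernetes section shown earlier could be marked up with Go template and Sprig functions like this:

# Hypothetical input parameters: kubernetesVersion, podSubnet
apiVersion: deckhouse.io/v1
kind: ClusterConfiguration
clusterType: Static
kubernetesVersion: {{ .kubernetesVersion | quote }}           # value from the input parameter
podSubnetCIDR: {{ .podSubnet | default "10.111.0.0/16" }}     # Sprig "default" if the parameter is empty
serviceSubnetCIDR: 10.222.0.0/16
clusterDomain: "cluster.local"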

Templates have versions. When a template is updated, a new version of the template is created. The previous version of the template remains available for use in clusters. However, the template author can make the template versions unavailable for use.

Each cluster in Commander has a configuration that was obtained from a template (unless the cluster was imported). The cluster also “remembers” which template and which version it is configured from. Thanks to this binding, the cluster’s input parameters are displayed as a web form generated from that specific template version.

When a cluster is transferred to a new template or a new version of a template, the set of input parameters may change. This may include the appearance of mandatory parameters that were not filled initially and do not have default values. Then, when switching from one template (version) to another, it may be necessary to change or supplement the input parameters so that the new configuration is created correctly.

Inside the template interface, there is a list of the clusters whose configuration is currently based on this template. From this interface, you can switch many clusters to a new (or old) version of the template in just a few clicks. This operation will fail if the resulting cluster configuration contains errors, for example, when the new version has mandatory input parameters that are absent from the current version and therefore have no values.

Creating and maintaining a template can be a laborious engineering task that requires testing the installation and updating of clusters. Versions of templates may accumulate during this work.

To make it easier to navigate between versions, Commander lets you leave a comment on each version. Template versions can also be hidden from template users, which can be useful to protect users from a version that is known not to work.

Inventory, Catalogs

Catalogs and records

In some cases, the same data needs to be used repeatedly across clusters. For example, many clusters may need to choose a release channel for updating the Deckhouse Kubernetes Platform or the address of the container registry from which images will be fetched.

To avoid hard-coding such data in templates, use Inventory. Inventory is a collection of catalogs with data. Each catalog defines a data schema and is then populated with records. The records are validated against the specified data schema.

When creating a catalog, you can choose how to use the records:

  1. A record in the catalog can be used simultaneously in several clusters.
  2. A record in the catalog can only be used in one cluster; deleting or detaching a cluster frees up the record for use in other clusters.

The first option is suitable for reusable configuration. The second option is for using pre-prepared infrastructure. This can include dedicated subnets, pre-created load balancers, virtual machines, domain names, IP addresses, and so on. It is convenient to prepare such data in advance and track whether they are being used and, if so, in which clusters.

During catalog creation, the user specifies the name of the catalog, the schema, and the identifier. The identifier cannot be changed, while the catalog name can be changed at any time. The data schema can only be changed if there are no records in the catalog that are used in any cluster.

The data schema for the catalog is defined by the same syntax and visual constructor as the input parameters for the cluster template. An example of a catalog schema:

- key: hostname
  type: string
  title: Hostname
  unique: true
  pattern: ^[a-z0-9.-]+$
  identifier: true

- key: ip
  type: string
  title: IP Address
  format: ipv4
  unique: true
  identifier: true

How to use a catalog in a cluster

In the cluster template, you need to indicate that a field is a selection from a catalog; to do this, specify the catalog identifier. Example of a parameter:

- key: workerMachines     # parameter name in the template
  title: Workers
  catalog: worker-nodes   # the catalog identifier
  minItems: 1
  maxItems: 10

Even though a specific catalog is defined in the template input parameters, within a cluster it can be switched to any other catalog accessible in the workspace.
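
As an assumption-heavy sketch of how such records could be consumed (the way Commander exposes record values to templates may differ), a section template might iterate over the selected records to produce SSHHost documents:

{{- range .workerMachines }}
---
apiVersion: dhctl.deckhouse.io/v1
kind: SSHHost
host: {{ .ip }}          # "ip" is a field from the catalog record
{{- end }}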

Importing data for catalogs

Catalogs can be imported via the API or through the interface by uploading a JSON file. If the file specifies the identifier of an existing catalog, the imported records will be added to that catalog regardless of compliance with the data schema. The data schema itself is not overwritten if the catalog already exists. An example of a catalog file with records that can be imported:

{
  "name": "Рабочие хосты",
  "slug": "worker-nodes",
  "params": [
    {
      "key": "hostname",
      "type": "string",
      "title": "Имя хоста",
      "unique": true,
      "pattern": "^[a-z0-9.-]+$",
      "identifier": true
    },
    {
      "key": "ip",
      "type": "string",
      "title": "IP-адрес",
      "format": "ipv4",
      "unique": true,
      "identifier": true
    }
  ],
  "resources": [
    { "values": { "ip": "10.128.0.39", "hostname": "worker-1" } },
    { "values": { "ip": "10.128.0.47", "hostname": "worker-2" } },
    { "values": { "ip": "10.128.0.24", "hostname": "worker-3" } },
    { "values": { "ip": "10.128.0.17", "hostname": "worker-4" } },
    { "values": { "ip": "10.128.0.55", "hostname": "worker-5" } },
    { "values": { "ip": "10.128.0.49", "hostname": "worker-6" } }
  ]
}

Cluster migration between workspaces

Clusters are created within a workspace. However, a cluster can be transferred from one workspace to another. During the transfer, the cluster is detached from its template, and the template remains in the original workspace. The inventory used by the cluster is transferred or copied to the cluster’s new workspace, depending on the record usage mode: exclusively used records are transferred, and non-exclusively used ones are copied. Missing catalogs with the correct identifier are created in the new workspace.

Integration API and Tokens

The Commander API provides a limited set of actions:

  1. Create, change, and delete clusters
  2. Create, change and delete resources in catalogs
  3. Read templates
  4. Read resource catalogs

To access the Commander API, you can issue a token. A token can have either rights to all possible API operations or read-only rights.

Details of the API implementation are described in the Integration API section.

Audit

Commander keeps a history of changes for all entities. Clusters, templates, resources, catalogs, API access tokens: for all of them, a history of actions and changes is recorded, which can be used to track who performed which actions in Commander and when.

Currently, this functionality covers only actions performed through the Commander API. In the future, an audit log from application clusters will also be available in Commander.