Instance

Scope: Cluster
Version: v1alpha1

Describes an implementation-independent ephemeral machine resource.

NodeGroup

Scope: Cluster

Describes the runtime parameters of the node group.

Example:

# NodeGroup for cloud nodes in AWS.
apiVersion: deckhouse.io/v1
kind: NodeGroup
metadata:
  name: test
spec:
  nodeType: CloudEphemeral
  cloudInstances:
    zones:
      - eu-west-1a
      - eu-west-1b
    minPerZone: 1
    maxPerZone: 2
    classReference:
      kind: AWSInstanceClass
      name: test
  nodeTemplate:
    labels:
      tier: test
---
# NodeGroup for static nodes on bare metal servers (or VMs).
apiVersion: deckhouse.io/v1
kind: NodeGroup
metadata:
  name: worker
spec:
  nodeType: Static
  • metadata
    object
    • metadata.name
      string

      Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$

      Maximum length: 42

  • spec
    object

    Required value

    • spec.chaos
      object

      Chaos monkey settings.

      Example:

      chaos:
        mode: DrainAndDelete
        period: 24h
      
      • spec.chaos.mode
        string

        The chaos monkey mode:

        • DrainAndDelete — drains and deletes a node when triggered;
        • Disabled — leaves this NodeGroup intact.

        Default: "Disabled"

        Allowed values: Disabled, DrainAndDelete

      • spec.chaos.period
        string

        The time interval to use for the chaos monkey.

        It is specified as a string containing the time unit in hours and minutes: 30m, 1h, 2h30m, 24h.

        Default: "6h"

        Pattern: ^([0-9]+h([0-9]+m)?|[0-9]+m)$

    • spec.cloudInstances
      object

      Parameter for provisioning the cloud-based VMs.

      Caution! Can only be used together with nodeType: CloudEphemeral.

      • spec.cloudInstances.classReference
        object

        Required value

        The reference to the InstanceClass object. It is unique for each cloud-provider-* module.

        • spec.cloudInstances.classReference.kind
          string

          The object type (e.g., OpenStackInstanceClass). The object type is specified in the documentation of the corresponding cloud-provider-* module.

          Allowed values: OpenStackInstanceClass, GCPInstanceClass, VsphereInstanceClass, AWSInstanceClass, YandexInstanceClass, AzureInstanceClass, VCDInstanceClass, ZvirtInstanceClass

        • spec.cloudInstances.classReference.name
          string

          The name of the required InstanceClass object (e.g., finland-medium).

      • spec.cloudInstances.maxPerZone
        integer

        Required value

        The maximum number of instances for the group in each zone.

        This value is used as the upper bound in cluster-autoscaler.

        Allowed values: 0 <= X

      • spec.cloudInstances.maxSurgePerZone
        integer

        The maximum number of instances to roll out simultaneously in the group in each zone.

        Default: 1

        Allowed values: 0 <= X

      • spec.cloudInstances.maxUnavailablePerZone
        integer

        The maximum number of unavailable instances (during rollout) in the group in each zone.

        Default: 0

        Allowed values: 0 <= X
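
        As a rough sketch of how the two rollout limits fit together, the fragment below allows up to two instances per zone to be rolled out at once while at most one instance per zone is unavailable. The InstanceClass name (worker) and all numbers are illustrative assumptions, not recommended values.

        ```yaml
        cloudInstances:
          classReference:
            kind: AWSInstanceClass
            name: worker              # hypothetical InstanceClass name
          minPerZone: 1
          maxPerZone: 5
          maxSurgePerZone: 2          # up to 2 instances rolled out simultaneously per zone
          maxUnavailablePerZone: 1    # at most 1 unavailable instance per zone during rollout
        ```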

      • spec.cloudInstances.minPerZone
        integer

        Required value

        The minimum number of instances for the group in each zone.

        This value is used in the MachineDeployment object and as a lower bound in cluster-autoscaler.

        Allowed values: 0 <= X

      • spec.cloudInstances.priority
        integer

        Priority of the node group.

        When scaling a cluster, the autoscaler selects node groups with a higher priority first. If several node groups have the same priority, the autoscaler picks one of them at random.

        Priorities are convenient for ordering cheaper nodes (for example, spot instances) before more expensive ones.
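
        The following sketch shows one way to prefer a cheaper group: two node groups that differ only in the InstanceClass they reference and in priority. The group names, the InstanceClass names (spot, on-demand), and the priority values are assumptions made for illustration.

        ```yaml
        # Preferred group: higher priority, so the autoscaler tries it first.
        apiVersion: deckhouse.io/v1
        kind: NodeGroup
        metadata:
          name: workers-spot          # hypothetical name
        spec:
          nodeType: CloudEphemeral
          cloudInstances:
            classReference:
              kind: AWSInstanceClass
              name: spot              # hypothetical InstanceClass
            minPerZone: 0
            maxPerZone: 5
            priority: 50
        ---
        # Fallback group: lower priority.
        apiVersion: deckhouse.io/v1
        kind: NodeGroup
        metadata:
          name: workers-on-demand     # hypothetical name
        spec:
          nodeType: CloudEphemeral
          cloudInstances:
            classReference:
              kind: AWSInstanceClass
              name: on-demand         # hypothetical InstanceClass
            minPerZone: 0
            maxPerZone: 5
            priority: 10
        ```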

      • spec.cloudInstances.quickShutdown
        boolean

        Lowers the drain timeout for CloudEphemeral machines to 5 minutes.

      • spec.cloudInstances.standby
        integer or string

        The total number of overprovisioned nodes for this NodeGroup across all zones.

        An overprovisioned node is a cluster node with reserved resources that are available for scaling at any time. With such a node in place, the cluster autoscaler does not have to wait for node initialization (which may take several minutes) and can place the load on it immediately.

        The value can be an absolute number (for example, 2) or a percentage of desired nodes (for example, 10%). If a percentage is specified, the absolute number is calculated based on the percentage of the maximum number of nodes (the maxPerZone parameter) rounded down, but not less than one.

        Pattern: ^[0-9]+%?$

      • spec.cloudInstances.standbyHolder
        object

        Amount of reserved resources.

        Used to determine whether to order overprovisioned nodes.

        • spec.cloudInstances.standbyHolder.notHeldResources
          Deprecated
          object

          Deprecated: the parameter is no longer used. Use the overprovisioningRate parameter.

          Describes the resources that will not be held (consumed) by the standby holder.

          • spec.cloudInstances.standbyHolder.notHeldResources.cpu
            integer or string

            Describes the amount of CPU that will not be held by the standby holder on nodes from this NodeGroup.

            The value can be an absolute number of CPUs (for example, 2) or a milli-CPU value (for example, 1500m).

            Pattern: ^[0-9]+m?$

          • spec.cloudInstances.standbyHolder.notHeldResources.memory
            integer or string

            Describes the amount of memory that will not be held by the standby holder on nodes from this NodeGroup.

            The value can be an absolute number of bytes (for example, 128974848) or a fixed-point number with one of the memory suffixes: G, Gi, M, Mi.

            Pattern: ^[0-9]+(\.[0-9]+)?(E|P|T|G|M|K|Ei|Pi|Ti|Gi|Mi|Ki)?$

        • spec.cloudInstances.standbyHolder.overprovisioningRate
          integer

          Percentage of reserved resources calculated from the capacity of a node of a NodeGroup.

          Default: 50

          Allowed values: 1 <= X <= 80
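
        A minimal sketch of an overprovisioning setup, assuming a group that may grow to 10 nodes per zone: standby is given as a percentage and the standby holder reserves 30% of a node's capacity. The InstanceClass name and the numbers are illustrative only.

        ```yaml
        cloudInstances:
          classReference:
            kind: AWSInstanceClass
            name: worker                # hypothetical InstanceClass name
          minPerZone: 1
          maxPerZone: 10
          standby: "10%"                # an absolute number such as 2 is also accepted
          standbyHolder:
            overprovisioningRate: 30    # must be within 1..80; the default is 50
        ```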

      • spec.cloudInstances.zones
        array of strings

        List of availability zones to create instances in.

        The default value depends on the cloud provider selected and usually corresponds to all zones of the region being used.

        Example:

        zones:
        - Helsinki
        - Espoo
        - Tampere
        
    • spec.cri
      object

      Container runtime parameters.

      • spec.cri.containerd
        object

        Containerd runtime parameters.

        If used, cri.type must be set to Containerd.

        • spec.cri.containerd.maxConcurrentDownloads
          integer

          Set the max concurrent downloads for each pull.

          Default: 3

      • spec.cri.docker
        Deprecated
        object

        Docker settings for nodes.

        • spec.cri.docker.manage
          boolean

          Enable Docker maintenance from bashible.

          Default: true

        • spec.cri.docker.maxConcurrentDownloads
          integer

          Set the max concurrent downloads for each pull.

          Default: 3

      • spec.cri.notManaged
        object

        Settings for an unmanaged CRI on nodes.

        • spec.cri.notManaged.criSocketPath
          string

          Path to CRI socket.

      • spec.cri.type
        string

        Container runtime type.

        If not specified, the defaultCRI value from the initial cluster configuration (the cluster-configuration.yaml parameter of the d8-cluster-configuration secret in the kube-system namespace) is used.

        Note! Docker is deprecated.

        Allowed values: Docker, Containerd, NotManaged
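
        The two fragments below sketch how the cri section might look for a managed and an unmanaged runtime. The download limit and the socket path are assumed example values, and pairing notManaged with type: NotManaged is inferred from the parameter names rather than stated above.

        ```yaml
        # Containerd with a raised limit on concurrent downloads per pull.
        cri:
          type: Containerd
          containerd:
            maxConcurrentDownloads: 5
        ---
        # Unmanaged CRI; the socket path is a hypothetical example.
        cri:
          type: NotManaged
          notManaged:
            criSocketPath: /run/containerd/containerd.sock
        ```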

    • spec.disruptions
      object

      Disruptions settings for nodes.

      Example:

      disruptions:
        approvalMode: Automatic
        automatic:
          drainBeforeApproval: false
          windows:
          - from: '06:00'
            to: '08:00'
            days:
            - Tue
            - Sun
      
      • spec.disruptions.approvalMode
        string

        The approval mode for disruptive updates:

        • Manual — disable automatic disruption approval; the alert will be displayed if disruption is needed. Caution! The master node group update mode must be Manual to avoid issues with draining.
        • Automatic — automatically approve disruption-involving updates.
        • RollingUpdate — in this mode, a new node with new settings will be created; then, the old node will be deleted. Available only for cloud nodes.

        If the RollingUpdate mode is not used, when updating, the node is first drained and then updated (rebooted) and put back into operation (uncordoned). Note that in this case, the cluster must have sufficient resources to accommodate the load while the node being updated is unavailable. In the RollingUpdate mode, the node is replaced by the updated node, i.e., an extra node appears in the cluster for the duration of the update. In cloud infrastructures, the RollingUpdate mode is convenient, for example, if there are no resources in the cluster to temporarily host the load from the node being updated.

        Default: "Automatic"

        Allowed values: Manual, Automatic, RollingUpdate

      • spec.disruptions.automatic
        object

        Additional parameters for the Automatic mode.

        • spec.disruptions.automatic.drainBeforeApproval
          boolean

          Drain Pods from the nodes before approving disruption.

          Caution! This setting is ignored (nodes will be approved without draining Pods):

          • for the master NodeGroup with a single node;
          • for a single ready node in a NodeGroup picked out for Deckhouse placement.

          Default: true

        • spec.disruptions.automatic.windows
          array of objects

          Time windows for node disruptive updates.

          • spec.disruptions.automatic.windows.days
            array of strings

            Days of the week when nodes can be updated.

            Examples:

            days: Mon
            
            days: Wed
            
            • Element of the array
              string

              Day of the week.

              Allowed values: Mon, Tue, Wed, Thu, Fri, Sat, Sun

          • spec.disruptions.automatic.windows.from
            string

            Required value

            Start time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            from: '13:00'
            
          • spec.disruptions.automatic.windows.to
            string

            Required value

            End time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            to: '18:30'
            
      • spec.disruptions.rollingUpdate
        object

        Additional parameters for the RollingUpdate mode.

        • spec.disruptions.rollingUpdate.windows
          array of objects

          Time windows for node disruptive updates.

          • spec.disruptions.rollingUpdate.windows.days
            array of strings

            Days of the week when nodes can be updated.

            Examples:

            days: Mon
            
            days: Wed
            
            • Element of the array
              string

              Day of the week.

              Allowed values: Mon, Tue, Wed, Thu, Fri, Sat, Sun

          • spec.disruptions.rollingUpdate.windows.from
            string

            Required value

            Start time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            from: '13:00'
            
          • spec.disruptions.rollingUpdate.windows.to
            string

            Required value

            End time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            to: '18:30'
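
        For comparison with the Automatic example at the start of this section, here is a sketch of the RollingUpdate mode restricted to a weekend window. The times and days are arbitrary illustrative choices; times are in UTC, as noted above.

        ```yaml
        disruptions:
          approvalMode: RollingUpdate   # available only for cloud nodes
          rollingUpdate:
            windows:
            - from: '02:00'
              to: '04:30'
              days:
              - Sat
              - Sun
        ```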
            
    • spec.fencing
      object

      Enable fencing controller for this group.

      • spec.fencing.mode
        string

        Required value

        Supported fencing modes:

        • Watchdog - uses the watchdog kernel module for fencing.

        The Watchdog implementation includes two main components:

        1. Fencing-agent - a DaemonSet deployed on a specific group of nodes (NodeGroup). Once started, the agent activates the watchdog and sets the node-manager.deckhouse.io/fencing-enabled label on the node it runs on. The agent regularly checks whether the Kubernetes API is available. If it is, the agent sends a signal to the watchdog, which resets the watchdog timer. The agent also monitors special service labels on the node and, depending on their presence, enables or disables the watchdog. The softdog kernel module with the soft_margin=60 and soft_panic=1 parameters is used as the watchdog, so the watchdog timeout is 60 seconds. When the timeout expires, a kernel panic occurs, and the node stays in this state until the user reboots it.

        2. Fencing-controller - a controller that monitors all nodes with the node-manager.deckhouse.io/fencing-enabled label. If a node is unavailable for more than 60 seconds, the controller removes all Pods from that node and then removes the node itself.

        Allowed values: Watchdog
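
        Enabling fencing for a group comes down to a single required field. A minimal sketch, shown on a static group purely as an assumed example:

        ```yaml
        spec:
          nodeType: Static
          fencing:
            mode: Watchdog    # the only allowed value
        ```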

    • spec.kubelet
      object

      Kubelet settings for nodes.

      • spec.kubelet.containerLogMaxFiles
        integer

        How many rotated log files to store before deleting them.

        Default: 4

        Allowed values: 1 <= X <= 20

      • spec.kubelet.containerLogMaxSize
        string

        Maximum log file size before it is rotated.

        Default: "50Mi"

        Pattern: \d+[Ei|Pi|Ti|Gi|Mi|Ki|E|P|T|G|M|k|m]

      • spec.kubelet.maxPods
        integer

        The maximum number of Pods per node.

        Default: 110

      • spec.kubelet.resourceReservation
        object

        Management of resource reservation for system daemons on a node.

        More info in the Kubernetes documentation.

        • spec.kubelet.resourceReservation.mode
          string

          Specify whether to:

          • Off — Disable resource reservation.
          • Auto — Reserve resources based on the Node capacity.
          • Static — Provide your own resource reservation values via the static parameter.

          Note that currently a dedicated cgroup is not used for resource reservation (the --system-reserved-cgroup flag is not used).

          Default: "Auto"

        • spec.kubelet.resourceReservation.static
          object

          Resource reservation parameters for the ‘Static’ mode.

          • spec.kubelet.resourceReservation.static.cpu
            integer or string

            Pattern: \d+[m]

          • spec.kubelet.resourceReservation.static.ephemeralStorage
            integer or string

            Pattern: \d+[Ei|Pi|Ti|Gi|Mi|Ki|E|P|T|G|M|k|m]

          • spec.kubelet.resourceReservation.static.memory
            integer or string

            Pattern: \d+[Ei|Pi|Ti|Gi|Mi|Ki|E|P|T|G|M|k|m]
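
          A sketch of the Static mode with explicit reservation values. The amounts are assumptions chosen only to show the value formats and should be sized for the actual nodes.

          ```yaml
          kubelet:
            resourceReservation:
              mode: Static
              static:
                cpu: 500m               # milli-CPU form
                memory: 1Gi
                ephemeralStorage: 2Gi
          ```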

      • spec.kubelet.rootDir
        string

        Directory path for managing kubelet files (volume mounts, etc.).

        Default: "/var/lib/kubelet"
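
        The remaining kubelet settings can be combined as in the sketch below. The values are illustrative assumptions that stay within the documented ranges; rootDir is shown with its default.

        ```yaml
        kubelet:
          containerLogMaxFiles: 6       # allowed range: 1..20
          containerLogMaxSize: 100Mi    # rotate a log file once it reaches this size
          maxPods: 150
          rootDir: /var/lib/kubelet
        ```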

    • spec.nodeTemplate
      object

      Specification of some of the fields that will be maintained in all nodes of the group.

      Example:

      nodeTemplate:
        labels:
          environment: production
          app: warp-drive-ai
        annotations:
          ai.fleet.com/discombobulate: 'true'
        taints:
        - effect: NoExecute
          key: ship-class
          value: frigate
      
      • spec.nodeTemplate.annotations
        object

        Similar to the standard metadata.annotations field.

        Example:

        annotations:
          ai.fleet.com/discombobulate: 'true'
        
      • spec.nodeTemplate.labels
        object

        Similar to the standard metadata.labels field.

        Example:

        labels:
          environment: production
          app: warp-drive-ai
        
      • spec.nodeTemplate.taints
        array of objects

        Similar to the .spec.taints field of the Node object.

        Caution! Only the effect, key, and value fields are available.

        Example:

        taints:
        - effect: NoExecute
          key: ship-class
          value: frigate
        
        • spec.nodeTemplate.taints.effect
          string

          Allowed values: NoSchedule, PreferNoSchedule, NoExecute

        • spec.nodeTemplate.taints.key
          string
        • spec.nodeTemplate.taints.value
          string
    • spec.nodeType
      string

      Required value

      The type of nodes this group provides:

      • CloudEphemeral — nodes for this group will be automatically created (and deleted) in the cloud of the specified cloud provider;
      • CloudPermanent — nodes from ProviderClusterConfiguration will be created via dhctl;
      • CloudStatic — a static node (created manually or using any external tools) hosted in the cloud integrated with one of the cloud providers. This node has the CSI running, and it is managed by the cloud-controller-manager: the Node object automatically gets the information about the zone and region based on the cloud data; if a node gets deleted from the cloud, its corresponding Node object will be deleted in Kubernetes;
      • Static — a static node hosted on a bare metal or virtual machine. The cloud-controller-manager does not manage the node even if one of the cloud providers is enabled.

      Allowed values: CloudEphemeral, CloudPermanent, CloudStatic, Static

    • spec.operatingSystem
      object

      Operating System settings for nodes.

      • spec.operatingSystem.manageKernel
        Deprecated
        boolean

        This parameter has no effect. Earlier, it enabled kernel maintenance on behalf of bashible.

        Default: true

    • spec.staticInstances
      object

      Parameter for provisioning static machines to the cluster.

      • spec.staticInstances.count
        integer

        The number of instances to create.

        Default: 0

        Allowed values: 0 <= X

      • spec.staticInstances.labelSelector
        object

        A label selector is a label query over a set of resources. The result of matchLabels and matchExpressions are ANDed. An empty label selector matches all objects. A null label selector matches no objects.

        • spec.staticInstances.labelSelector.matchExpressions
          array of objects

          matchExpressions is a list of label selector requirements. The requirements are ANDed.

          A label selector requirement is a selector that contains values, a key, and an operator that relates the key and values.

          • spec.staticInstances.labelSelector.matchExpressions.key
            string

            key is the label key that the selector applies to.

          • spec.staticInstances.labelSelector.matchExpressions.operator
            string

            operator represents a key’s relationship to a set of values. Valid operators are In, NotIn, Exists and DoesNotExist.

          • spec.staticInstances.labelSelector.matchExpressions.values
            array of strings

            values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty. This array is replaced during a strategic merge patch.

            • Element of the array
              string

              Pattern: [a-z0-9]([-a-z0-9]*[a-z0-9])?

              Length: 1..63

        • spec.staticInstances.labelSelector.matchLabels
          object

          matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels map is equivalent to an element of matchExpressions, whose key field is “key”, the operator is “In”, and the values array contains only “value”. The requirements are ANDed.
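
        A sketch of a static group that requests three machines selected by labels. The label keys and values (role, rack) are hypothetical, and using staticInstances together with nodeType: Static is an assumption based on the parameter description. Both selector parts must match, since their results are ANDed.

        ```yaml
        apiVersion: deckhouse.io/v1
        kind: NodeGroup
        metadata:
          name: worker
        spec:
          nodeType: Static
          staticInstances:
            count: 3
            labelSelector:
              matchLabels:
                role: worker        # hypothetical label
              matchExpressions:
              - key: rack           # hypothetical label key
                operator: In
                values:
                - a1
                - a2
        ```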

    • spec.update
      object
      • spec.update.maxConcurrent
        integer or string

        Maximum number of concurrently updating nodes.

        Can be set as an absolute count or as a percentage of the total number of nodes.

        Default: 1

        Pattern: ^[1-9][0-9]*%?$
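
        Both accepted forms, as a short sketch (the values themselves are arbitrary):

        ```yaml
        # Percentage of the nodes in the group.
        update:
          maxConcurrent: "25%"
        ---
        # Absolute count.
        update:
          maxConcurrent: 2
        ```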

Describes the runtime parameters of the node group.

  • metadata
    object
    • metadata.name
      string

      Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$

      Maximum length: 42

  • spec
    object

    Required value

    • spec.chaos
      object

      Chaos monkey settings.

      Example:

      chaos:
        mode: DrainAndDelete
        period: 24h
      
      • spec.chaos.mode
        string

        The chaos monkey mode:

        • DrainAndDelete — drains and deletes a node when triggered;
        • Disabled — leaves this NodeGroup intact.

        Default: "Disabled"

        Allowed values: Disabled, DrainAndDelete

      • spec.chaos.period
        string

        The time interval to use for the chaos monkey (can be specified in the Go format).

        Default: "6h"

        Pattern: ^[0-9]+[mh]{1}$

    • spec.cloudInstances
      object

      Parameter for provisioning the cloud-based VMs.

      Caution! Can only be used together with nodeType: CloudEphemeral.

      • spec.cloudInstances.classReference
        object

        Required value

        The reference to the InstanceClass object. It is unique for each cloud-provider-* module.

        • spec.cloudInstances.classReference.kind
          string

          The object type (e.g., OpenStackInstanceClass). The object type is specified in the documentation of the corresponding cloud-provider-* module.

          Allowed values: OpenStackInstanceClass, GCPInstanceClass, VsphereInstanceClass, AWSInstanceClass, YandexInstanceClass, AzureInstanceClass, VCDInstanceClass, ZvirtInstanceClass

        • spec.cloudInstances.classReference.name
          string

          The name of the required InstanceClass object (e.g., finland-medium).

      • spec.cloudInstances.maxPerZone
        integer

        Required value

        The maximum number of instances for the group in each zone.

        This value is used as the upper bound in cluster-autoscaler.

        Allowed values: 0 <= X

      • spec.cloudInstances.maxSurgePerZone
        integer

        The maximum number of instances to roll out simultaneously in the group in each zone.

        Default: 1

        Allowed values: 0 <= X

      • spec.cloudInstances.maxUnavailablePerZone
        integer

        The maximum number of unavailable instances (during rollout) in the group in each zone.

        Default: 0

        Allowed values: 0 <= X

      • spec.cloudInstances.minPerZone
        integer

        Required value

        The minimum number of instances for the group in each zone.

        This value is used in the MachineDeployment object and as a lower bound in cluster-autoscaler.

        Allowed values: 0 <= X

      • spec.cloudInstances.standby
        integer or string

        The total number of overprovisioned nodes for this NodeGroup across all zones.

        An overprovisioned node is a cluster node with reserved resources that are available for scaling at any time. With such a node in place, the cluster autoscaler does not have to wait for node initialization (which may take several minutes) and can place the load on it immediately.

        The value can be an absolute number (for example, 2) or a percentage of desired nodes (for example, 10%). If a percentage is specified, the absolute number is calculated based on the percentage of the maximum number of nodes (the maxPerZone parameter) rounded down, but not less than one.

        Pattern: ^[0-9]+%?$

      • spec.cloudInstances.standbyHolder
        object

        Amount of reserved resources.

        Used to determine whether to order overprovisioned nodes.

        • spec.cloudInstances.standbyHolder.notHeldResources
          object

          Describes the resources that will not be held (consumed) by the standby holder.

          • spec.cloudInstances.standbyHolder.notHeldResources.cpu
            integer or string

            Describes the amount of CPU that will not be held by the standby holder on nodes from this NodeGroup.

            The value can be an absolute number of CPUs (for example, 2) or a milli-CPU value (for example, 1500m).

            Pattern: ^[0-9]+m?$

          • spec.cloudInstances.standbyHolder.notHeldResources.memory
            integer or string

            Describes the amount of memory that will not be held by the standby holder on nodes from this NodeGroup.

            The value can be an absolute number of bytes (for example, 128974848) or a fixed-point number with one of the memory suffixes: G, Gi, M, Mi.

            Pattern: ^[0-9]+(\.[0-9]+)?(E|P|T|G|M|K|Ei|Pi|Ti|Gi|Mi|Ki)?$

      • spec.cloudInstances.zones
        array of strings

        List of availability zones to create instances in.

        The default value depends on the cloud provider selected and usually corresponds to all zones of the region being used.

        Example:

        zones:
        - Helsinki
        - Espoo
        - Tampere
        
    • spec.cri
      object

      Container runtime parameters.

      • spec.cri.containerd
        object

        Containerd runtime parameters.

        If used, cri.type must be set to Containerd.

        • spec.cri.containerd.maxConcurrentDownloads
          integer

          Set the max concurrent downloads for each pull.

          Default: 3

      • spec.cri.docker
        object

        Docker settings for nodes.

        Note! Docker is deprecated.

        • spec.cri.docker.manage
          boolean

          Enable Docker maintenance from bashible.

          Default: true

        • spec.cri.docker.maxConcurrentDownloads
          integer

          Set the max concurrent downloads for each pull.

          Default: 3

      • spec.cri.notManaged
        object

        Settings for an unmanaged CRI on nodes.

        • spec.cri.notManaged.criSocketPath
          string

          Path to CRI socket.

      • spec.cri.type
        string

        Container runtime type.

        If not specified, the defaultCRI value from the initial cluster configuration (the cluster-configuration.yaml parameter of the d8-cluster-configuration secret in the kube-system namespace) is used.

        Note! Docker is deprecated.

        Allowed values: Docker, Containerd, NotManaged

    • spec.disruptions
      object

      Disruptions settings for nodes.

      Example:

      disruptions:
        approvalMode: Automatic
        automatic:
          drainBeforeApproval: false
          windows:
          - from: '06:00'
            to: '08:00'
            days:
            - Tue
            - Sun
      
      • spec.disruptions.approvalMode
        string

        The approval mode for disruptive updates:

        • Manual — disable automatic disruption approval; the alert will be displayed if disruption is needed. Caution! The master node group update mode must be Manual to avoid issues with draining.
        • Automatic — automatically approve disruption-involving updates.
        • RollingUpdate — in this mode, a new node with new settings will be created; then, the old node will be deleted. Available only for cloud nodes.

        If the RollingUpdate mode is not used, when updating, the node is first drained and then updated (rebooted) and put back into operation (uncordoned). Note that in this case, the cluster must have sufficient resources to accommodate the load while the node being updated is unavailable. In the RollingUpdate mode, the node is replaced by the updated node, i.e., an extra node appears in the cluster for the duration of the update. In cloud infrastructures, the RollingUpdate mode is convenient, for example, if there are no resources in the cluster to temporarily host the load from the node being updated.

        Default: "Automatic"

        Allowed values: Manual, Automatic, RollingUpdate

      • spec.disruptions.automatic
        object

        Additional parameters for the Automatic mode.

        • spec.disruptions.automatic.drainBeforeApproval
          boolean

          Drain Pods from the nodes before approving disruption.

          Caution! This setting is ignored (nodes will be approved without draining Pods):

          • for the master NodeGroup with a single node;
          • for a single ready node in a NodeGroup picked out for Deckhouse placement.

          Default: true

        • spec.disruptions.automatic.windows
          array of objects

          Time windows for node disruptive updates.

          • spec.disruptions.automatic.windows.days
            array of strings

            Days of the week when nodes can be updated.

            Examples:

            days: Mon
            
            days: Wed
            
            • Element of the array
              string

              Day of the week.

              Allowed values: Mon, Tue, Wed, Thu, Fri, Sat, Sun

          • spec.disruptions.automatic.windows.from
            string

            Required value

            Start time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            from: '13:00'
            
          • spec.disruptions.automatic.windows.to
            string

            Required value

            End time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            to: '18:30'
            
      • spec.disruptions.rollingUpdate
        object

        Additional parameters for the RollingUpdate mode.

        • spec.disruptions.rollingUpdate.windows
          array of objects

          Time windows for node disruptive updates.

          • spec.disruptions.rollingUpdate.windows.days
            array of strings

            Days of the week when nodes can be updated.

            Examples:

            days: Mon
            
            days: Wed
            
            • Element of the array
              string

              Day of the week.

              Allowed values: Mon, Tue, Wed, Thu, Fri, Sat, Sun

          • spec.disruptions.rollingUpdate.windows.from
            string

            Required value

            Start time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            from: '13:00'
            
          • spec.disruptions.rollingUpdate.windows.to
            string

            Required value

            End time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            to: '18:30'
            
    • spec.kubelet
      object

      Kubelet settings for nodes.

      • spec.kubelet.containerLogMaxFiles
        integer

        How many rotated log files to store before deleting them.

        WARNING! This parameter has no effect if the CRI type is Docker.

        Default: 4

        Allowed values: 1 <= X <= 20

      • spec.kubelet.containerLogMaxSize
        string

        Maximum log file size before it is rotated.

        WARNING! This parameter has no effect if the CRI type is Docker.

        Default: "50Mi"

        Pattern: \d+[Ei|Pi|Ti|Gi|Mi|Ki|E|P|T|G|M|k|m]

      • spec.kubelet.maxPods
        integer

        Set the max count of pods per node.

        Default: 110

      • spec.kubelet.rootDir
        string

        Directory path for managing kubelet files (volume mounts, etc.).

        Default: "/var/lib/kubelet"
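
      As an illustration, the kubelet parameters above could be combined in a NodeGroup spec as follows. This is a minimal sketch: the values are arbitrary examples chosen to fit the documented ranges, not recommendations.

      ```yaml
      # Fragment of a NodeGroup spec; all values are illustrative.
      kubelet:
        containerLogMaxFiles: 6        # allowed range is 1..20, default is 4
        containerLogMaxSize: 100Mi     # default is 50Mi
        maxPods: 150                   # default is 110
        rootDir: /var/lib/kubelet      # the default path, shown explicitly
      ```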

    • spec.nodeTemplate
      object

      Specification of some of the fields that will be maintained in all nodes of the group.

      Example:

      nodeTemplate:
        labels:
          environment: production
          app: warp-drive-ai
        annotations:
          ai.fleet.com/discombobulate: 'true'
        taints:
        - effect: NoExecute
          key: ship-class
          value: frigate
      
      • spec.nodeTemplate.annotations
        object

        Similar to the standard metadata.annotations field.

        Example:

        annotations:
          ai.fleet.com/discombobulate: 'true'
        
      • spec.nodeTemplate.labels
        object

        Similar to the standard metadata.labels field.

        Example:

        labels:
          environment: production
          app: warp-drive-ai
        
      • spec.nodeTemplate.taints
        array of objects

        Similar to the .spec.taints field of the Node object.

        Caution! Only effect, key, value fields are available.

        Example:

        taints:
        - effect: NoExecute
          key: ship-class
          value: frigate
        
        • spec.nodeTemplate.taints.effect
          string

          Allowed values: NoSchedule, PreferNoSchedule, NoExecute

        • spec.nodeTemplate.taints.key
          string
        • spec.nodeTemplate.taints.value
          string
    • spec.nodeType
      string

      Required value

      The type of nodes this group provides.

      • Cloud: nodes for this group are automatically created (and deleted) in the cloud of the specified cloud provider;
      • Static: a static node hosted on a bare metal or virtual machine. The cloud-controller-manager does not manage the node even if one of the cloud providers is enabled;
      • Hybrid: a static node (created manually or using any external tools) hosted in the cloud integrated with one of the cloud providers. This node has the CSI running and is managed by the cloud-controller-manager. The Node object automatically gets the zone and region information from the cloud data, and if a node is deleted from the cloud, its corresponding Node object is deleted in Kubernetes.

      Allowed values: Cloud, Static, Hybrid

    • spec.operatingSystem
      object

      Operating System settings for nodes.

      • spec.operatingSystem.manageKernel
        boolean

        Enable kernel maintenance from bashible.

        Default: true

Defines the runtime parameters of a node group.

  • metadata
    object
    • metadata.name
      string

      Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$

      Maximum length: 42

  • spec
    object

    Required value

    • spec.chaos
      object

      Chaos monkey settings.

      Example:

      chaos:
        mode: DrainAndDelete
        period: 24h
      
      • spec.chaos.mode
        string

        The chaos monkey mode:

        • DrainAndDelete — drains and deletes a node when triggered;
        • Disabled — leaves this NodeGroup intact.

        Default: "Disabled"

        Allowed values: Disabled, DrainAndDelete

      • spec.chaos.period
        string

        The time interval to use for the chaos monkey (can be specified in the Go format).

        Default: "6h"

        Pattern: ^[0-9]+[mh]{1}$
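
        The pattern above accepts a number followed by a single unit (m or h), so a combined value such as 1h30m would not match it. A sketch with an arbitrary 90-minute period:

        ```yaml
        # Fragment of a NodeGroup spec; the period value is illustrative.
        chaos:
          mode: DrainAndDelete
          period: 90m   # one unit only; 1h30m does not match ^[0-9]+[mh]{1}$
        ```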

    • spec.cloudInstances
      object

      Parameters for provisioning cloud-based VMs.

      Caution! Can only be used together with nodeType: CloudEphemeral.

      • spec.cloudInstances.classReference
        object

        Required value

        The reference to the InstanceClass object. It is unique for each cloud-provider-* module.

        • spec.cloudInstances.classReference.kind
          string

          The object type (e.g., OpenStackInstanceClass). The object type is specified in the documentation of the corresponding cloud-provider-* module.

          Allowed values: OpenStackInstanceClass, GCPInstanceClass, VsphereInstanceClass, AWSInstanceClass, YandexInstanceClass, AzureInstanceClass, VCDInstanceClass, ZvirtInstanceClass

        • spec.cloudInstances.classReference.name
          string

          The name of the required InstanceClass object (e.g., finland-medium).

      • spec.cloudInstances.maxPerZone
        integer

        Required value

        The maximum number of instances for the group in each zone.

        This value is used as the upper bound in cluster-autoscaler.

        With a value of 0, the capacity parameter must be set for some InstanceClass kinds. See the description of the relevant InstanceClass for details.

        Allowed values: 0 <= X

      • spec.cloudInstances.maxSurgePerZone
        integer

        The maximum number of instances to roll out simultaneously in the group in each zone.

        Default: 1

        Allowed values: 0 <= X

      • spec.cloudInstances.maxUnavailablePerZone
        integer

        The maximum number of unavailable instances (during rollout) in the group in each zone.

        Default: 0

        Allowed values: 0 <= X

      • spec.cloudInstances.minPerZone
        integer

        Required value

        The minimum number of instances for the group in each zone.

        This value is used in the MachineDeployment object and as a lower bound in cluster-autoscaler.

        Allowed values: 0 <= X

      • spec.cloudInstances.standby
        integer or string

        The total number of overprovisioned nodes for this NodeGroup across all zones.

        An overprovisioned node is a cluster node whose resources are kept in reserve and are available for scaling at any time. With such a node present, the cluster autoscaler can place a load on it immediately instead of waiting for node initialization (which may take several minutes).

        The value can be an absolute number (for example, 2) or a percentage of desired nodes (for example, 10%). If a percentage is specified, the absolute number is calculated as that percentage of the maximum number of nodes (the maxPerZone parameter), rounded down but not less than one.

        Pattern: ^[0-9]+%?$
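
        A worked example of the percentage rule, as a sketch. The InstanceClass name and the zone are hypothetical, and a single zone is used so that the per-zone maximum is also the maximum for the whole group.

        ```yaml
        # Fragment of a NodeGroup spec; names and numbers are illustrative.
        cloudInstances:
          classReference:
            kind: AWSInstanceClass
            name: example          # hypothetical InstanceClass name
          zones:
          - eu-west-1a             # hypothetical zone
          minPerZone: 1
          maxPerZone: 10
          standby: 25%             # 25% of maxPerZone (10) is 2.5, rounded down to 2
        ```

        Under the same rule, maxPerZone: 5 with standby: 10% gives 0.5, which rounds down to 0 and is then raised to the minimum of one.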

      • spec.cloudInstances.standbyHolder
        object

        Amount of reserved resources.

        Used to determine whether to order overprovisioned nodes.

        • spec.cloudInstances.standbyHolder.notHeldResources
          object

          Describes the resources that will not be held (consumed) by the standby holder.

          • spec.cloudInstances.standbyHolder.notHeldResources.cpu
            integer or string

            Describes the amount of CPU that will not be held by the standby holder on Nodes from this NodeGroup.

            The value can be an absolute number of CPUs (for example, 2) or a milli representation (for example, 1500m).

            Pattern: ^[0-9]+m?$

          • spec.cloudInstances.standbyHolder.notHeldResources.memory
            integer or string

            Describes the amount of memory that will not be held by the standby holder on Nodes from this NodeGroup.

            The value can be an absolute number of bytes (for example, 128974848) or a fixed-point number with one of the memory suffixes: G, Gi, M, Mi.

            Pattern: ^[0-9]+(\.[0-9]+)?(E|P|T|G|M|K|Ei|Pi|Ti|Gi|Mi|Ki)?$
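
        For illustration, a fragment that keeps one standby node and excludes part of the node resources from the standby holder. The amounts are arbitrary examples that fit the documented patterns.

        ```yaml
        # Fragment of a NodeGroup spec; all values are illustrative.
        cloudInstances:
          standby: 1
          standbyHolder:
            notHeldResources:
              cpu: 1500m       # 1.5 CPU in milli representation
              memory: 512Mi    # a number with the Mi suffix
        ```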

      • spec.cloudInstances.zones
        array of strings

        List of availability zones to create instances in.

        The default value depends on the cloud provider selected and usually corresponds to all zones of the region being used.

        Example:

        zones:
        - Helsinki
        - Espoo
        - Tampere
        
    • spec.cri
      object

      Container runtime parameters.

      • spec.cri.containerd
        object

        Containerd runtime parameters.

        If used, cri.type must be set to Containerd.

        • spec.cri.containerd.maxConcurrentDownloads
          integer

          Set the max concurrent downloads for each pull.

          Default: 3

      • spec.cri.type
        string

        Container runtime type.

        If not specified, the defaultCRI value from the initial cluster configuration (the cluster-configuration.yaml parameter of the d8-cluster-configuration Secret in the kube-system namespace) is used.

        Note! Docker is deprecated.

        Allowed values: Docker, Containerd, NotManaged
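
        A sketch of a containerd configuration. It assumes the default of 3 concurrent downloads is too low for the workload; the value 5 is arbitrary.

        ```yaml
        # Fragment of a NodeGroup spec.
        cri:
          type: Containerd               # required when the containerd section is used
          containerd:
            maxConcurrentDownloads: 5    # default is 3
        ```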

    • spec.disruptions
      object

      Disruptions settings for nodes.

      Example:

      disruptions:
        approvalMode: Automatic
        automatic:
          drainBeforeApproval: false
          windows:
          - from: '06:00'
            to: '08:00'
            days:
            - Tue
            - Sun
      
      • spec.disruptions.approvalMode
        string

        The approval mode for disruptive updates:

        • Manual: disables automatic disruption approval; an alert is displayed if a disruption is needed. Caution! The update mode of the master node group must be Manual to avoid issues with draining.
        • Automatic: automatically approves updates that involve disruption.
        • RollingUpdate: a new node with the new settings is created, and then the old node is deleted. Available only for cloud nodes.

        If the RollingUpdate mode is not used, when updating, the node is first drained and then updated (rebooted) and put back into operation (uncordoned). Note that in this case, the cluster must have sufficient resources to accommodate the load while the node being updated is unavailable. In the RollingUpdate mode, the node is replaced by the updated node, i.e., an extra node appears in the cluster for the duration of the update. In cloud infrastructures, the RollingUpdate mode is convenient, for example, if there are no resources in the cluster to temporarily host the load from the node being updated.

        Default: "Automatic"

        Allowed values: Manual, Automatic, RollingUpdate

      • spec.disruptions.automatic
        object

        Additional parameters for the Automatic mode.

        • spec.disruptions.automatic.drainBeforeApproval
          boolean

          Drain Pods from the nodes before approving disruption.

          Caution! This setting is ignored (nodes are approved without draining Pods):

          • for the nodeGroup master with a single node;
          • for a single ready node in a nodeGroup picked out for Deckhouse placement.

          Default: true

        • spec.disruptions.automatic.windows
          array of objects

          Time windows for node disruptive updates.

          • spec.disruptions.automatic.windows.days
            array of strings

            Days of the week on which nodes can be updated.

            Example:

            days:
            - Mon
            - Wed
            
            • Element of the array
              string

              Day of the week.

              Allowed values: Mon, Tue, Wed, Thu, Fri, Sat, Sun

          • spec.disruptions.automatic.windows.from
            string

            Required value

            Start time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            from: '13:00'
            
          • spec.disruptions.automatic.windows.to
            string

            Required value

            End time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            to: '18:30'
            
      • spec.disruptions.rollingUpdate
        object

        Additional parameters for the RollingUpdate mode.

        • spec.disruptions.rollingUpdate.windows
          array of objects

          Time windows for node disruptive updates.

          • spec.disruptions.rollingUpdate.windows.days
            array of strings

            Days of the week on which nodes can be updated.

            Example:

            days:
            - Mon
            - Wed
            
            • Element of the array
              string

              Day of the week.

              Allowed values: Mon, Tue, Wed, Thu, Fri, Sat, Sun

          • spec.disruptions.rollingUpdate.windows.from
            string

            Required value

            Start time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            from: '13:00'
            
          • spec.disruptions.rollingUpdate.windows.to
            string

            Required value

            End time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            to: '18:30'
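
      Putting the RollingUpdate parameters together, here is a sketch of a disruptions section that allows node replacement only on weekend evenings. The window is an arbitrary choice, and the times are in UTC.

      ```yaml
      # Fragment of a NodeGroup spec; the window is illustrative.
      disruptions:
        approvalMode: RollingUpdate
        rollingUpdate:
          windows:
          - from: '22:00'
            to: '23:30'
            days:
            - Sat
            - Sun
      ```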
            
    • spec.docker
      object

      Docker settings for nodes.

      If used, cri.type must be set to Docker.

      Note! Docker is deprecated.

      • spec.docker.manage
        boolean

        Enable Docker maintenance from bashible.

        Default: true

      • spec.docker.maxConcurrentDownloads
        integer

        Set the max concurrent downloads for each pull.

        Default: 3

    • spec.kubelet
      object

      Kubelet settings for nodes.

      • spec.kubelet.containerLogMaxFiles
        integer

        How many rotated log files to store before deleting them.

        WARNING! This parameter does nothing if CRI type is Docker.

        Default: 4

        Allowed values: 1 <= X <= 20

      • spec.kubelet.containerLogMaxSize
        string

        Maximum log file size before it is rotated.

        WARNING! This parameter does nothing if CRI type is Docker.

        Default: "50Mi"

        Pattern: \d+[Ei|Pi|Ti|Gi|Mi|Ki|E|P|T|G|M|k|m]

      • spec.kubelet.maxPods
        integer

        Set the max count of pods per node.

        Default: 110

      • spec.kubelet.rootDir
        string

        Directory path for managing kubelet files (volume mounts, etc.).

        Default: "/var/lib/kubelet"

    • spec.kubernetesVersion
      string

      The desired minor version of Kubernetes.

      By default, it corresponds to the version selected for the cluster globally (see installation documentation) or to the current version of the control plane (if the global version is not defined).

      Allowed values: 1.26, 1.27, 1.28, 1.29, 1.30

      Example:

      kubernetesVersion: '1.27'
      
    • spec.nodeTemplate
      object

      Specification of some of the fields that will be maintained in all nodes of the group.

      Example:

      nodeTemplate:
        labels:
          environment: production
          app: warp-drive-ai
        annotations:
          ai.fleet.com/discombobulate: 'true'
        taints:
        - effect: NoExecute
          key: ship-class
          value: frigate
      
      • spec.nodeTemplate.annotations
        object

        Similar to the standard metadata.annotations field.

        Example:

        annotations:
          ai.fleet.com/discombobulate: 'true'
        
      • spec.nodeTemplate.labels
        object

        Similar to the standard metadata.labels field.

        Example:

        labels:
          environment: production
          app: warp-drive-ai
        
      • spec.nodeTemplate.taints
        array of objects

        Similar to the .spec.taints field of the Node object.

        Caution! Only effect, key, value fields are available.

        Example:

        taints:
        - effect: NoExecute
          key: ship-class
          value: frigate
        
        • spec.nodeTemplate.taints.effect
          string

          Allowed values: NoSchedule, PreferNoSchedule, NoExecute

        • spec.nodeTemplate.taints.key
          string
        • spec.nodeTemplate.taints.value
          string
    • spec.nodeType
      string

      Required value

      The type of nodes this group provides.

      • Cloud: nodes for this group are automatically created (and deleted) in the cloud of the specified cloud provider;
      • Static: a static node hosted on a bare metal or virtual machine. The cloud-controller-manager does not manage the node even if one of the cloud providers is enabled;
      • Hybrid: a static node (created manually or using any external tools) hosted in the cloud integrated with one of the cloud providers. This node has the CSI running and is managed by the cloud-controller-manager. The Node object automatically gets the zone and region information from the cloud data, and if a node is deleted from the cloud, its corresponding Node object is deleted in Kubernetes.

      Allowed values: Cloud, Static, Hybrid

    • spec.operatingSystem
      Deprecated
      object

      Operating System settings for nodes.

      • spec.operatingSystem.manageKernel
        Deprecated
        boolean

        Enable kernel maintenance from bashible.

        Default: true

    • spec.static
      object

      Static node parameters

      • spec.static.internalNetworkCIDRs
        array of strings

        Subnet CIDR
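
      A minimal sketch for a static group; the subnet is a made-up example.

      ```yaml
      # Fragment of a NodeGroup spec.
      nodeType: Static
      static:
        internalNetworkCIDRs:
        - 192.168.42.0/24   # hypothetical internal subnet
      ```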

NodeGroupConfiguration

Scope: Cluster
Version: v1alpha1

Executes bash scripts on nodes.

Read more in the module documentation.

  • spec
    object

    Required value

    • spec.bundles
      array of strings

      Required value

      Bundles for step execution. You can set '*' for selecting all bundles.

      See the list of possible bundles in the allowedBundles module parameter.

      Examples:

      bundles:
      - ubuntu-lts
      - centos-7
      
      bundles:
      - ubuntu-lts
      
      bundles:
      - "*"
      
    • spec.content
      string

      Required value

      A bash script that does the same things you would do in a configuration step.

      You can use Go Template to generate a script.

      The list of parameters available for use in templates can be retrieved from the bashible-apiserver-context Secret as follows:

      kubectl -n d8-cloud-instance-manager get secrets bashible-apiserver-context -o jsonpath='{.data.input\.yaml}' | base64 -d
      

      For example:

      {{- range .nodeUsers }}
      echo 'Tuning environment for user {{ .name }}'
      # Some code for tuning user environment
      {{- end }}
      

      You can also use the pre-defined bashbooster commands in the script. For example:

      bb-event-on 'bb-package-installed' 'post-install'
      post-install() {
        bb-log-info "Setting reboot flag because the kernel was updated"
        bb-flag-set reboot
      }
      
    • spec.nodeGroups
      array of strings

      Required value

      List of NodeGroups to apply the step for. You can set '*' for selecting all NodeGroups.

      Examples:

      nodeGroups:
      - master
      - worker
      
      nodeGroups:
      - worker
      
      nodeGroups:
      - "*"
      
    • spec.weight
      integer

      Order of the step execution.

      Default: 100
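
The parameters above could be combined into a complete resource roughly as follows. This is a sketch under assumptions: the resource name and the script body are hypothetical, and the apiVersion is assembled from the deckhouse.io group used in the NodeGroup example and the v1alpha1 version stated above.

```yaml
apiVersion: deckhouse.io/v1alpha1   # assumed: group from the NodeGroup example, version as stated above
kind: NodeGroupConfiguration
metadata:
  name: log-kernel-version.sh       # hypothetical name
spec:
  weight: 100                       # the default, shown explicitly
  bundles:
  - "*"                             # all bundles
  nodeGroups:
  - worker
  content: |
    # Illustrative step: log the running kernel version.
    bb-log-info "Kernel version: $(uname -r)"
```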

NodeUser

Scope: Cluster

Defines the Linux users to create on all nodes.

The user’s home directory is created in the /home/deckhouse/ directory.

  • spec
    object

    Required value

    • spec.extraGroups
      array of strings

      Node user additional system groups.

      Examples:

      extraGroups:
      - docker
      
      extraGroups:
      - docker
      - ftp
      
    • spec.isSudoer
      boolean

      Whether the node user is a member of the sudo group.

      Default: false

      Example:

      isSudoer: true
      
    • spec.nodeGroups
      array of strings

      List of NodeGroups to apply the user to.

      Default: ["*"]

      Examples:

      nodeGroups:
      - master
      - worker
      
      nodeGroups:
      - worker
      
      nodeGroups:
      - "*"
      
    • spec.passwordHash
      string

      Required value

      Hashed user password.

      The format corresponds to the password hashes in /etc/shadow. You can get it using the following command: openssl passwd -6.

      Example:

      passwordHash: "$2a$10$F9ey7zW.sVliT224RFxpWeMsgzO.D9YRG54a8T36/K2MCiT41nzmC"
      
    • spec.sshPublicKey
      Deprecated
      string

      Node user SSH public key.

      Either sshPublicKey or sshPublicKeys must be specified.

      Example:

      sshPublicKey: ssh-rsa AAABBB
      
    • spec.sshPublicKeys
      array of strings

      Node user SSH public keys.

      Either sshPublicKey or sshPublicKeys must be specified.

      Example:

      sshPublicKeys:
      - ssh-rsa AAABBB
      - cert-authority,principals="name" ssh-rsa BBBCCC
      
    • spec.uid
      number

      Required value

      Node user ID.

      We recommend using values >= 1100 to avoid conflicts with manually created users.

      This parameter cannot be changed during the lifetime of the resource.

      Allowed values: 1001 <= X

      Example:

      uid: 1100
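
As a sketch, a complete NodeUser resource built from the parameters above might look like this. The apiVersion, the resource name, and the target group are assumptions, and the password hash is a placeholder that must be replaced with a real hash.

```yaml
apiVersion: deckhouse.io/v1       # assumed, by analogy with the NodeGroup example
kind: NodeUser
metadata:
  name: testuser                  # hypothetical name
spec:
  uid: 1100
  passwordHash: <PASSWORD_HASH>   # placeholder: substitute a real hash
  sshPublicKeys:
  - ssh-rsa AAABBB                # placeholder key, as in the example above
  isSudoer: true
  extraGroups:
  - docker
  nodeGroups:
  - worker
```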
      

SSHCredentials

Scope: Cluster
Version: v1alpha1

Contains credentials required by Cluster API Provider Static (CAPS) to connect over SSH. CAPS connects to the server (virtual machine) defined in the StaticInstance custom resource to manage its state.

A reference to this resource is specified in the credentialsRef parameter of the StaticInstance resource.

  • apiVersion
    string

    APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info…

  • kind
    string

    Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info…

  • metadata
    object
  • spec
    object

    SSHCredentialsSpec defines the desired state of SSHCredentials.

    • spec.privateSSHKey
      string

      Required value

      Private SSH key in PEM format, encoded as a base64 string.

    • spec.sshExtraArgs
      string

      A list of additional arguments to pass to the openssh command.

      Examples:

      sshExtraArgs: "-vvv"
      
      sshExtraArgs: "-c chacha20-poly1305@openssh.com"
      
      sshExtraArgs: "-c aes256-gcm@openssh.com"
      
      sshExtraArgs: "-m umac-64-etm@openssh.com"
      
      sshExtraArgs: "-m hmac-sha2-512-etm@openssh.com"
      
    • spec.sshPort
      integer

      A port to connect to the host via SSH.

      Default: 22

      Allowed values: 1 <= X <= 65535

    • spec.sudoPassword
      string

      A sudo password for the user.

    • spec.user
      string

      Required value

      A username to connect to the host via SSH.
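
For illustration, a complete SSHCredentials resource might look like the sketch below. The resource name and the user are hypothetical, the key is a placeholder, and the apiVersion is assembled from the deckhouse.io group used in the NodeGroup example and the v1alpha1 version stated above.

```yaml
apiVersion: deckhouse.io/v1alpha1   # assumed: group from the NodeGroup example, version as stated above
kind: SSHCredentials
metadata:
  name: static-nodes                # hypothetical name
spec:
  user: caps                        # hypothetical SSH user
  privateSSHKey: <BASE64_ENCODED_PEM_KEY>     # placeholder, not a real key
  sshPort: 22                       # the default, shown explicitly
  sshExtraArgs: "-c aes256-gcm@openssh.com"   # one of the documented example values
```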

StaticInstance

Scope: Cluster
Version: v1alpha1

StaticInstance describes a machine for the Cluster API Provider Static.

  • apiVersion
    string

    APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info…

  • kind
    string

    Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info…

  • metadata
    object
  • spec
    object

    StaticInstanceSpec defines the desired state of StaticInstance.

    • spec.address
      string

      Required value

      The IP address of the host.

      Pattern: ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$

    • spec.credentialsRef
      object

      Required value

      The reference to the SSHCredentials object.

      • spec.credentialsRef.apiVersion
        string

        API version of the referent.

      • spec.credentialsRef.kind
        string

        Kind of the referent. More info…

      • spec.credentialsRef.name
        string

        Name of the referent. More info…
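
A sketch of a StaticInstance that ties these fields together. The instance name and IP address are hypothetical, an SSHCredentials object named static-nodes is assumed to exist, and the apiVersion is assembled from the deckhouse.io group used in the NodeGroup example and the v1alpha1 version stated above.

```yaml
apiVersion: deckhouse.io/v1alpha1   # assumed: group from the NodeGroup example, version as stated above
kind: StaticInstance
metadata:
  name: static-worker-0             # hypothetical name
spec:
  address: 192.168.42.10            # hypothetical host IP; must match the IPv4 pattern
  credentialsRef:
    apiVersion: deckhouse.io/v1alpha1
    kind: SSHCredentials
    name: static-nodes              # hypothetical SSHCredentials object
```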