Instance

Scope: Cluster
Version: v1alpha1

Defines the cloud instance.

NodeGroup

Scope: Cluster

Describes the runtime parameters of the node group.

  • metadata (object)
    • metadata.name (string)

      Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$

      Maximum length: 42

  • spec (object)

    Required value

    • spec.chaos (object)

      Chaos monkey settings.

      Example:

      mode: DrainAndDelete
      period: 24h
      
      • spec.chaos.mode (string)

        The chaos monkey mode:

        • DrainAndDelete — drains and deletes a node when triggered;
        • Disabled — leaves this NodeGroup intact.

        Default: "Disabled"

        Allowed values: Disabled, DrainAndDelete

      • spec.chaos.period (string)

        The time interval to use for the chaos monkey.

        Specified as a string with the time given in hours and/or minutes, for example: 30m, 1h, 2h30m, 24h.

        Default: "6h"

        Pattern: ^([0-9]+h([0-9]+m)?|[0-9]+m)$

    • spec.cloudInstances (object)

      Parameters for provisioning cloud-based VMs.

      Caution! Can only be used together with nodeType: CloudEphemeral.

      • spec.cloudInstances.classReference (object)

        Required value

        The reference to the InstanceClass object. It is unique for each cloud-provider-* module.

        • spec.cloudInstances.classReference.kind (string)

          The object type (e.g., OpenStackInstanceClass). The object type is specified in the documentation of the corresponding cloud-provider-* module.

          Allowed values: OpenStackInstanceClass, GCPInstanceClass, VsphereInstanceClass, AWSInstanceClass, YandexInstanceClass, AzureInstanceClass

        • spec.cloudInstances.classReference.name (string)

          The name of the required InstanceClass object (e.g., finland-medium).

      • spec.cloudInstances.maxPerZone (integer)

        Required value

        The maximum number of instances for the group in each zone.

        This value is used as the upper bound in cluster-autoscaler.

        Allowed values: 0 <= X

      • spec.cloudInstances.maxSurgePerZone (integer)

        The maximum number of instances to roll out simultaneously in the group in each zone.

        Default: 1

        Allowed values: 0 <= X

      • spec.cloudInstances.maxUnavailablePerZone (integer)

        The maximum number of unavailable instances (during rollout) in the group in each zone.

        Default: 0

        Allowed values: 0 <= X

      • spec.cloudInstances.minPerZone (integer)

        Required value

        The minimum number of instances for the group in each zone.

        This value is used in the MachineDeployment object and as a lower bound in cluster-autoscaler.

        Allowed values: 0 <= X

      • spec.cloudInstances.priority (integer)

        Priority of the node group.

        When scaling a cluster, the autoscaler first selects the node groups with a higher priority. If several node groups have the same priority, the autoscaler picks one of them at random.

        Priorities are useful, for example, for ordering cheaper nodes (such as spot instances) before more expensive ones.
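
        As a sketch of how priorities might be used, the two fragments below give a spot-based group a higher priority than an on-demand group, so the autoscaler would select the spot group first. The group names and priority values are hypothetical; each fragment is the spec.cloudInstances part of a separate NodeGroup.

        ```yaml
        # NodeGroup "spot-workers" (hypothetical name): higher priority, selected first
        cloudInstances:
          priority: 50
        ---
        # NodeGroup "ondemand-workers" (hypothetical name): lower priority
        cloudInstances:
          priority: 30
        ```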

      • spec.cloudInstances.quickShutdown (boolean)

        Lowers the drain timeout of CloudEphemeral machines to 5 minutes.

      • spec.cloudInstances.standby (integer or string)

        The total number of overprovisioned nodes for this NodeGroup across all zones.

        An overprovisioned node is a cluster node on which resources are reserved that are available at any time for scaling. The presence of such a node allows the cluster autoscaler not to wait for node initialization (which may take several minutes), but to immediately place a load on it.

        The value can be an absolute number (for example, 2) or a percentage of desired nodes (for example, 10%). If a percentage is specified, the absolute number is calculated based on the percentage of the maximum number of nodes (the maxPerZone parameter) rounded down, but not less than one.

        Pattern: ^[0-9]+%?$
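
        A worked sketch of the percentage rule, assuming the percentage is applied to maxPerZone exactly as described above (the values are illustrative):

        ```yaml
        cloudInstances:
          maxPerZone: 25
          standby: "10%"   # 10% of 25 = 2.5, rounded down to 2 standby nodes
        ```

        With maxPerZone: 5, the same 10% gives 0.5, which rounds down to 0 and is then raised to the minimum of one node.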

      • spec.cloudInstances.standbyHolder (object)

        Amount of reserved resources.

        Used to determine whether to order overprovisioned nodes.

        • spec.cloudInstances.standbyHolder.notHeldResources (object, deprecated)

          Deprecated: the parameter is no longer used. Use the overprovisioningRate parameter instead.

          Describes the resources that will not be held (consumed) by the standby holder.

          • spec.cloudInstances.standbyHolder.notHeldResources.cpu (integer or string)

            The amount of CPU that will not be held by the standby holder on nodes from this NodeGroup.

            The value can be an absolute number of CPUs (for example, 2) or a millicore value (for example, 1500m).

            Pattern: ^[0-9]+m?$

          • spec.cloudInstances.standbyHolder.notHeldResources.memory (integer or string)

            The amount of memory that will not be held by the standby holder on nodes from this NodeGroup.

            The value can be an absolute number of bytes (for example, 128974848) or a fixed-point number with one of the memory suffixes: G, Gi, M, Mi.

            Pattern: ^[0-9]+(\.[0-9]+)?(E|P|T|G|M|K|Ei|Pi|Ti|Gi|Mi|Ki)?$

        • spec.cloudInstances.standbyHolder.overprovisioningRate (integer)

          Percentage of reserved resources, calculated from the capacity of a node in the NodeGroup.

          Default: 50

          Allowed values: 1 <= X <= 80
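
          For illustration, the fragment below sets the reserved share to 30% of a node's capacity. The value is arbitrary and only needs to stay within the allowed range of 1 to 80.

          ```yaml
          cloudInstances:
            standby: 1
            standbyHolder:
              overprovisioningRate: 30   # illustrative value; the default is 50
          ```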

      • spec.cloudInstances.zones (array of strings)

        List of availability zones to create instances in.

        The default value depends on the cloud provider selected and usually corresponds to all zones of the region being used.

        Example:

        zones:
        - Helsinki
        - Espoo
        - Tampere
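
        Putting the cloudInstances parameters above together, a minimal spec fragment might look as follows. This is only a sketch: the class kind, class name, and zone names are taken from the examples in this reference, and the per-zone counts are illustrative.

        ```yaml
        spec:
          nodeType: CloudEphemeral        # cloudInstances requires this node type
          cloudInstances:
            classReference:
              kind: OpenStackInstanceClass
              name: finland-medium
            minPerZone: 1                 # lower bound for cluster-autoscaler
            maxPerZone: 3                 # upper bound for cluster-autoscaler
            maxSurgePerZone: 1
            maxUnavailablePerZone: 0
            zones:
            - Helsinki
            - Espoo
        ```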
        
    • spec.cri (object)

      Container runtime parameters.

      • spec.cri.containerd (object)

        Containerd runtime parameters.

        If used, cri.type must be set to Containerd.

        • spec.cri.containerd.maxConcurrentDownloads (integer)

          The maximum number of concurrent downloads for each pull.

          Default: 3

      • spec.cri.docker (object, deprecated)

        Docker settings for nodes.

        • spec.cri.docker.manage (boolean)

          Enable Docker maintenance from bashible.

          Default: true

        • spec.cri.docker.maxConcurrentDownloads (integer)

          The maximum number of concurrent downloads for each pull.

          Default: 3

      • spec.cri.notManaged (object)

        Settings for an unmanaged CRI on nodes.

        • spec.cri.notManaged.criSocketPath (string)

          Path to the CRI socket.

      • spec.cri.type (string)

        Container runtime type.

        If not specified, the defaultCRI value from the initial cluster configuration (the cluster-configuration.yaml parameter of the d8-cluster-configuration secret in the kube-system namespace) is used.

        Note! Docker is deprecated.

        Allowed values: Docker, Containerd, NotManaged
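
        As a brief sketch, the fragment below selects containerd and raises the per-pull download concurrency. The value 5 is illustrative; note that cri.type is set to Containerd, as required when the containerd section is used.

        ```yaml
        cri:
          type: Containerd
          containerd:
            maxConcurrentDownloads: 5   # illustrative value; the default is 3
        ```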

    • spec.disruptions (object)

      Disruption settings for nodes.

      Example:

      disruptions:
        approvalMode: Automatic
        automatic:
          drainBeforeApproval: false
          windows:
          - from: '06:00'
            to: '08:00'
            days:
            - Tue
            - Sun
      
      • spec.disruptions.approvalMode (string)

        The approval mode for disruptive updates:

        • Manual — disables automatic disruption approval; an alert is displayed if a disruption is needed. Caution! The master node group update mode must be Manual to avoid issues with draining.
        • Automatic — automatically approve disruption-involving updates.
        • RollingUpdate — in this mode, a new node with new settings will be created; then, the old node will be deleted. Available only for cloud nodes.

        If the RollingUpdate mode is not used, when updating, the node is first drained and then updated (rebooted) and put back into operation (uncordoned). Note that in this case, the cluster must have sufficient resources to accommodate the load while the node being updated is unavailable. In the RollingUpdate mode, the node is replaced by the updated node, i.e., an extra node appears in the cluster for the duration of the update. In cloud infrastructures, the RollingUpdate mode is convenient, for example, if there are no resources in the cluster to temporarily host the load from the node being updated.

        Default: "Automatic"

        Allowed values: Manual, Automatic, RollingUpdate

      • spec.disruptions.automatic (object)

        Additional parameters for the Automatic mode.

        • spec.disruptions.automatic.drainBeforeApproval (boolean)

          Drain Pods from the nodes before approving disruption.

          Caution! This setting is ignored (nodes are approved without draining Pods):

          • for the nodeGroup master with a single node;
          • for a single ready node in a nodeGroup picked out for Deckhouse placement.

          Default: true

        • spec.disruptions.automatic.windows (array of objects)

          Time windows for node disruptive updates.

          • spec.disruptions.automatic.windows.days (array of strings)

            Days of the week on which nodes can be updated.

            Example:

            days:
            - Mon
            - Wed
            
            • Element of the array (string)

              Day of the week.

              Allowed values: Mon, Tue, Wed, Thu, Fri, Sat, Sun

          • spec.disruptions.automatic.windows.from (string)

            Required value

            Start time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            from: '13:00'
            
          • spec.disruptions.automatic.windows.to (string)

            Required value

            End time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            to: '18:30'
            
      • spec.disruptions.rollingUpdate (object)

        Additional parameters for the RollingUpdate mode.

        • spec.disruptions.rollingUpdate.windows (array of objects)

          Time windows for node disruptive updates.

          • spec.disruptions.rollingUpdate.windows.days (array of strings)

            Days of the week on which nodes can be updated.

            Example:

            days:
            - Mon
            - Wed
            
            • Element of the array (string)

              Day of the week.

              Allowed values: Mon, Tue, Wed, Thu, Fri, Sat, Sun

          • spec.disruptions.rollingUpdate.windows.from (string)

            Required value

            Start time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            from: '13:00'
            
          • spec.disruptions.rollingUpdate.windows.to (string)

            Required value

            End time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            to: '18:30'
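
        For comparison with the Automatic example at the start of the disruptions section, a RollingUpdate configuration could be sketched as follows. The window times and days reuse the sample values above and are illustrative only; times are in UTC.

        ```yaml
        disruptions:
          approvalMode: RollingUpdate   # available only for cloud nodes
          rollingUpdate:
            windows:
            - from: '13:00'
              to: '18:30'
              days:
              - Mon
              - Wed
        ```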
            
    • spec.kubelet (object)

      Kubelet settings for nodes.

      • spec.kubelet.containerLogMaxFiles (integer)

        How many rotated log files to store before deleting them.

        Default: 4

        Allowed values: 1 <= X <= 20

      • spec.kubelet.containerLogMaxSize (string)

        Maximum log file size before it is rotated.

        Default: "50Mi"

        Pattern: \d+[Ei|Pi|Ti|Gi|Mi|Ki|E|P|T|G|M|k|m]

      • spec.kubelet.maxPods (integer)

        The maximum number of Pods per node.

        Default: 110

      • spec.kubelet.resourceReservation (object)

        Management of resource reservation for system daemons on a node.

        For more information, see the Kubernetes documentation.

        • spec.kubelet.resourceReservation.mode (string)

          The resource reservation mode:

          • Off — disables resource reservation.
          • Auto — reserves resources based on the node capacity.
          • Static — uses the resource reservation values provided in the static parameter.

          Note that a dedicated cgroup is currently not used for resource reservation (the --system-reserved-cgroup flag is not used).

          Default: "Auto"

        • spec.kubelet.resourceReservation.static (object)

          Resource reservation parameters for the Static mode.

          • spec.kubelet.resourceReservation.static.cpu (integer or string)

            Pattern: \d+[m]

          • spec.kubelet.resourceReservation.static.ephemeralStorage (integer or string)

            Pattern: \d+[Ei|Pi|Ti|Gi|Mi|Ki|E|P|T|G|M|k|m]

          • spec.kubelet.resourceReservation.static.memory (integer or string)

            Pattern: \d+[Ei|Pi|Ti|Gi|Mi|Ki|E|P|T|G|M|k|m]
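
          A minimal sketch of the Static mode follows. The reservation amounts are hypothetical and were chosen only to match the patterns above; suitable values depend on the node size.

          ```yaml
          kubelet:
            resourceReservation:
              mode: Static
              static:
                cpu: 500m              # hypothetical value
                memory: 1Gi            # hypothetical value
                ephemeralStorage: 2Gi  # hypothetical value
          ```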

      • spec.kubelet.rootDir (string)

        Directory path for managing kubelet files (volume mounts, etc.).

        Default: "/var/lib/kubelet"
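
        As an illustration of the log rotation and Pod limit settings, the fragment below overrides their defaults. The values are arbitrary examples within the documented limits; rootDir is shown with its default value.

        ```yaml
        kubelet:
          containerLogMaxFiles: 6        # allowed range is 1 to 20
          containerLogMaxSize: 100Mi
          maxPods: 150
          rootDir: /var/lib/kubelet
        ```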

    • spec.nodeTemplate (object)

      Specification of some of the fields that will be maintained in all nodes of the group.

      Example:

      labels:
        environment: production
        app: warp-drive-ai
      annotations:
        ai.fleet.com/discombobulate: "true"
      taints:
      - effect: NoExecute
        key: ship-class
        value: frigate
      
      • spec.nodeTemplate.annotations (object)

        Similar to the standard metadata.annotations field.

        Example:

        annotations:
          ai.fleet.com/discombobulate: "true"
        
      • spec.nodeTemplate.labels (object)

        Similar to the standard metadata.labels field.

        Example:

        labels:
          environment: production
          app: warp-drive-ai
        
      • spec.nodeTemplate.taints (array of objects)

        Similar to the .spec.taints field of the Node object.

        Caution! Only the effect, key, and value fields are available.

        Example:

        taints:
        - effect: NoExecute
          key: ship-class
          value: frigate
        
        • spec.nodeTemplate.taints.effect (string)

          Allowed values: NoSchedule, PreferNoSchedule, NoExecute

        • spec.nodeTemplate.taints.key (string)
        • spec.nodeTemplate.taints.value (string)
    • spec.nodeType (string)

      Required value

      The type of nodes this group provides:

      • CloudEphemeral — nodes for this group will be automatically created (and deleted) in the cloud of the specified cloud provider;
      • CloudPermanent — nodes from ProviderClusterConfiguration will be created via dhctl;
      • CloudStatic — a static node (created manually or using any external tools) hosted in the cloud integrated with one of the cloud providers. This node has the CSI running, and it is managed by the cloud-controller-manager: the Node object automatically gets the information about the zone and region based on the cloud data; if a node gets deleted from the cloud, its corresponding Node object will be deleted in Kubernetes;
      • Static — a static node hosted on a bare-metal server or a virtual machine. The cloud-controller-manager does not manage the node even if one of the cloud providers is enabled.

      Allowed values: CloudEphemeral, CloudPermanent, CloudStatic, Static
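
      Since nodeType is the only required field under spec, the smallest valid object for a group of static nodes could look roughly like the sketch below. The apiVersion value and the group name are assumptions that do not come from this reference; check the CRD installed in your cluster.

      ```yaml
      apiVersion: deckhouse.io/v1   # assumed API group and version
      kind: NodeGroup
      metadata:
        name: worker-static         # hypothetical name; must match the metadata.name pattern
      spec:
        nodeType: Static
      ```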

    • spec.operatingSystem (object)

      Operating System settings for nodes.

      • spec.operatingSystem.manageKernel (boolean, deprecated)

        This parameter has no effect. Previously, it enabled kernel maintenance by bashible.

        Default: true

    • spec.update (object)
      • spec.update.maxConcurrent (integer or string)

        Maximum number of concurrently updating nodes.

        Can be set as an absolute number or as a percentage of the total number of nodes.

        Default: 1

        Pattern: ^[1-9][0-9]*%?$
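
        For example, the following fragment would allow up to a quarter of the nodes to update at the same time. The percentage is illustrative.

        ```yaml
        update:
          maxConcurrent: "25%"   # an absolute number such as 2 is also accepted
        ```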

Describes the runtime parameters of the node group.

  • metadata (object)
    • metadata.name (string)

      Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$

      Maximum length: 42

  • spec (object)

    Required value

    • spec.chaos (object)

      Chaos monkey settings.

      Example:

      mode: DrainAndDelete
      period: 24h
      
      • spec.chaos.mode (string)

        The chaos monkey mode:

        • DrainAndDelete — drains and deletes a node when triggered;
        • Disabled — leaves this NodeGroup intact.

        Default: "Disabled"

        Allowed values: Disabled, DrainAndDelete

      • spec.chaos.period (string)

        The time interval to use for the chaos monkey (can be specified in the Go format).

        Default: "6h"

        Pattern: ^[0-9]+[mh]{1}$

    • spec.cloudInstances (object)

      Parameters for provisioning cloud-based VMs.

      Caution! Can only be used together with nodeType: Cloud.

      • spec.cloudInstances.classReference (object)

        Required value

        The reference to the InstanceClass object. It is unique for each cloud-provider-* module.

        • spec.cloudInstances.classReference.kind (string)

          The object type (e.g., OpenStackInstanceClass). The object type is specified in the documentation of the corresponding cloud-provider-* module.

          Allowed values: OpenStackInstanceClass, GCPInstanceClass, VsphereInstanceClass, AWSInstanceClass, YandexInstanceClass, AzureInstanceClass

        • spec.cloudInstances.classReference.name (string)

          The name of the required InstanceClass object (e.g., finland-medium).

      • spec.cloudInstances.maxPerZone (integer)

        Required value

        The maximum number of instances for the group in each zone.

        This value is used as the upper bound in cluster-autoscaler.

        Allowed values: 0 <= X

      • spec.cloudInstances.maxSurgePerZone (integer)

        The maximum number of instances to roll out simultaneously in the group in each zone.

        Default: 1

        Allowed values: 0 <= X

      • spec.cloudInstances.maxUnavailablePerZone (integer)

        The maximum number of unavailable instances (during rollout) in the group in each zone.

        Default: 0

        Allowed values: 0 <= X

      • spec.cloudInstances.minPerZone (integer)

        Required value

        The minimum number of instances for the group in each zone.

        This value is used in the MachineDeployment object and as a lower bound in cluster-autoscaler.

        Allowed values: 0 <= X

      • spec.cloudInstances.standby (integer or string)

        The total number of overprovisioned nodes for this NodeGroup across all zones.

        An overprovisioned node is a cluster node on which resources are reserved that are available at any time for scaling. The presence of such a node allows the cluster autoscaler not to wait for node initialization (which may take several minutes), but to immediately place a load on it.

        The value can be an absolute number (for example, 2) or a percentage of desired nodes (for example, 10%). If a percentage is specified, the absolute number is calculated based on the percentage of the maximum number of nodes (the maxPerZone parameter) rounded down, but not less than one.

        Pattern: ^[0-9]+%?$

      • spec.cloudInstances.standbyHolder (object)

        Amount of reserved resources.

        Used to determine whether to order overprovisioned nodes.

        • spec.cloudInstances.standbyHolder.notHeldResources (object)

          Describes the resources that will not be held (consumed) by the standby holder.

          • spec.cloudInstances.standbyHolder.notHeldResources.cpu (integer or string)

            The amount of CPU that will not be held by the standby holder on nodes from this NodeGroup.

            The value can be an absolute number of CPUs (for example, 2) or a millicore value (for example, 1500m).

            Pattern: ^[0-9]+m?$

          • spec.cloudInstances.standbyHolder.notHeldResources.memory (integer or string)

            The amount of memory that will not be held by the standby holder on nodes from this NodeGroup.

            The value can be an absolute number of bytes (for example, 128974848) or a fixed-point number with one of the memory suffixes: G, Gi, M, Mi.

            Pattern: ^[0-9]+(\.[0-9]+)?(E|P|T|G|M|K|Ei|Pi|Ti|Gi|Mi|Ki)?$

      • spec.cloudInstances.zones (array of strings)

        List of availability zones to create instances in.

        The default value depends on the cloud provider selected and usually corresponds to all zones of the region being used.

        Example:

        zones:
        - Helsinki
        - Espoo
        - Tampere
        
    • spec.cri (object)

      Container runtime parameters.

      • spec.cri.containerd (object)

        Containerd runtime parameters.

        If used, cri.type must be set to Containerd.

        • spec.cri.containerd.maxConcurrentDownloads (integer)

          The maximum number of concurrent downloads for each pull.

          Default: 3

      • spec.cri.docker (object)

        Docker settings for nodes.

        Note! Docker is deprecated.

        • spec.cri.docker.manage (boolean)

          Enable Docker maintenance from bashible.

          Default: true

        • spec.cri.docker.maxConcurrentDownloads (integer)

          The maximum number of concurrent downloads for each pull.

          Default: 3

      • spec.cri.notManaged (object)

        Settings for an unmanaged CRI on nodes.

        • spec.cri.notManaged.criSocketPath (string)

          Path to the CRI socket.

      • spec.cri.type (string)

        Container runtime type.

        If not specified, the defaultCRI value from the initial cluster configuration (the cluster-configuration.yaml parameter of the d8-cluster-configuration secret in the kube-system namespace) is used.

        Note! Docker is deprecated.

        Allowed values: Docker, Containerd, NotManaged

    • spec.disruptions (object)

      Disruption settings for nodes.

      Example:

      disruptions:
        approvalMode: Automatic
        automatic:
          drainBeforeApproval: false
          windows:
          - from: '06:00'
            to: '08:00'
            days:
            - Tue
            - Sun
      
      • spec.disruptions.approvalMode (string)

        The approval mode for disruptive updates:

        • Manual — disables automatic disruption approval; an alert is displayed if a disruption is needed. Caution! The master node group update mode must be Manual to avoid issues with draining.
        • Automatic — automatically approve disruption-involving updates.
        • RollingUpdate — in this mode, a new node with new settings will be created; then, the old node will be deleted. Available only for cloud nodes.

        If the RollingUpdate mode is not used, when updating, the node is first drained and then updated (rebooted) and put back into operation (uncordoned). Note that in this case, the cluster must have sufficient resources to accommodate the load while the node being updated is unavailable. In the RollingUpdate mode, the node is replaced by the updated node, i.e., an extra node appears in the cluster for the duration of the update. In cloud infrastructures, the RollingUpdate mode is convenient, for example, if there are no resources in the cluster to temporarily host the load from the node being updated.

        Default: "Automatic"

        Allowed values: Manual, Automatic, RollingUpdate

      • spec.disruptions.automatic (object)

        Additional parameters for the Automatic mode.

        • spec.disruptions.automatic.drainBeforeApproval (boolean)

          Drain Pods from the nodes before approving disruption.

          Caution! This setting is ignored (nodes are approved without draining Pods):

          • for the nodeGroup master with a single node;
          • for a single ready node in a nodeGroup picked out for Deckhouse placement.

          Default: true

        • spec.disruptions.automatic.windows (array of objects)

          Time windows for node disruptive updates.

          • spec.disruptions.automatic.windows.days (array of strings)

            Days of the week on which nodes can be updated.

            Example:

            days:
            - Mon
            - Wed
            
            • Element of the array (string)

              Day of the week.

              Allowed values: Mon, Tue, Wed, Thu, Fri, Sat, Sun

          • spec.disruptions.automatic.windows.from (string)

            Required value

            Start time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            from: '13:00'
            
          • spec.disruptions.automatic.windows.to (string)

            Required value

            End time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            to: '18:30'
            
      • spec.disruptions.rollingUpdate (object)

        Additional parameters for the RollingUpdate mode.

        • spec.disruptions.rollingUpdate.windows (array of objects)

          Time windows for node disruptive updates.

          • spec.disruptions.rollingUpdate.windows.days (array of strings)

            Days of the week on which nodes can be updated.

            Example:

            days:
            - Mon
            - Wed
            
            • Element of the array (string)

              Day of the week.

              Allowed values: Mon, Tue, Wed, Thu, Fri, Sat, Sun

          • spec.disruptions.rollingUpdate.windows.from (string)

            Required value

            Start time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            from: '13:00'
            
          • spec.disruptions.rollingUpdate.windows.to (string)

            Required value

            End time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            to: '18:30'
            
    • spec.kubelet (object)

      Kubelet settings for nodes.

      • spec.kubelet.containerLogMaxFiles (integer)

        How many rotated log files to store before deleting them.

        Caution! This parameter has no effect if the CRI type is Docker.

        Default: 4

        Allowed values: 1 <= X <= 20

      • spec.kubelet.containerLogMaxSize (string)

        Maximum log file size before it is rotated.

        Caution! This parameter has no effect if the CRI type is Docker.

        Default: "50Mi"

        Pattern: \d+[Ei|Pi|Ti|Gi|Mi|Ki|E|P|T|G|M|k|m]

      • spec.kubelet.maxPods (integer)

        The maximum number of Pods per node.

        Default: 110

      • spec.kubelet.rootDir (string)

        Directory path for managing kubelet files (volume mounts, etc.).

        Default: "/var/lib/kubelet"

    • spec.nodeTemplate (object)

      Specification of some of the fields that will be maintained in all nodes of the group.

      Example:

      labels:
        environment: production
        app: warp-drive-ai
      annotations:
        ai.fleet.com/discombobulate: "true"
      taints:
      - effect: NoExecute
        key: ship-class
        value: frigate
      
      • spec.nodeTemplate.annotations (object)

        Similar to the standard metadata.annotations field.

        Example:

        annotations:
          ai.fleet.com/discombobulate: "true"
        
      • spec.nodeTemplate.labels (object)

        Similar to the standard metadata.labels field.

        Example:

        labels:
          environment: production
          app: warp-drive-ai
        
      • spec.nodeTemplate.taints (array of objects)

        Similar to the .spec.taints field of the Node object.

        Caution! Only the effect, key, and value fields are available.

        Example:

        taints:
        - effect: NoExecute
          key: ship-class
          value: frigate
        
        • spec.nodeTemplate.taints.effect (string)

          Allowed values: NoSchedule, PreferNoSchedule, NoExecute

        • spec.nodeTemplate.taints.key (string)
        • spec.nodeTemplate.taints.value (string)
    • spec.nodeType (string)

      Required value

      The type of nodes this group provides:

      • Cloud — nodes for this group will be automatically created (and deleted) in the cloud of the specified cloud provider;
      • Static — a static node hosted on a bare-metal server or a virtual machine. The cloud-controller-manager does not manage the node even if one of the cloud providers is enabled;
      • Hybrid — a static node (created manually or using any external tools) hosted in the cloud integrated with one of the cloud providers. This node has the CSI running, and it is managed by the cloud-controller-manager: the Node object automatically gets the information about the zone and region based on the cloud data; if a node gets deleted from the cloud, its corresponding Node object will be deleted in Kubernetes.

      Allowed values: Cloud, Static, Hybrid

    • spec.operatingSystem (object)

      Operating System settings for nodes.

      • spec.operatingSystem.manageKernel (boolean)

        Enable kernel maintenance from bashible.

        Default: true

Defines the runtime parameters of a node group.

  • metadata (object)
    • metadata.name (string)

      Pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?$

      Maximum length: 42

  • spec (object)

    Required value

    • spec.chaos (object)

      Chaos monkey settings.

      Example:

      mode: DrainAndDelete
      period: 24h
      
      • spec.chaos.mode (string)

        The chaos monkey mode:

        • DrainAndDelete — drains and deletes a node when triggered;
        • Disabled — leaves this NodeGroup intact.

        Default: "Disabled"

        Allowed values: Disabled, DrainAndDelete

      • spec.chaos.period (string)

        The time interval to use for the chaos monkey (can be specified in the Go format).

        Default: "6h"

        Pattern: ^[0-9]+[mh]{1}$

    • spec.cloudInstances (object)

      Parameters for provisioning cloud-based VMs.

      Caution! Can only be used together with nodeType: CloudEphemeral.

      • spec.cloudInstances.classReference (object)

        Required value

        The reference to the InstanceClass object. It is unique for each cloud-provider-* module.

        • spec.cloudInstances.classReference.kind (string)

          The object type (e.g., OpenStackInstanceClass). The object type is specified in the documentation of the corresponding cloud-provider-* module.

          Allowed values: OpenStackInstanceClass, GCPInstanceClass, VsphereInstanceClass, AWSInstanceClass, YandexInstanceClass, AzureInstanceClass

        • spec.cloudInstances.classReference.name (string)

          The name of the required InstanceClass object (e.g., finland-medium).

      • spec.cloudInstances.maxPerZone (integer)

        Required value

        The maximum number of instances for the group in each zone.

        This value is used as the upper bound in cluster-autoscaler.

        If the value is 0, capacity must be set for some InstanceClass types. See the description of the relevant InstanceClass for details.

        Allowed values: 0 <= X

      • spec.cloudInstances.maxSurgePerZone (integer)

        The maximum number of instances to roll out simultaneously in the group in each zone.

        Default: 1

        Allowed values: 0 <= X

      • spec.cloudInstances.maxUnavailablePerZone (integer)

        The maximum number of unavailable instances (during rollout) in the group in each zone.

        Default: 0

        Allowed values: 0 <= X

      • spec.cloudInstances.minPerZone (integer)

        Required value

        The minimum number of instances for the group in each zone.

        This value is used in the MachineDeployment object and as a lower bound in cluster-autoscaler.

        Allowed values: 0 <= X

      • spec.cloudInstances.standby (integer or string)

        The total number of overprovisioned nodes for this NodeGroup across all zones.

        An overprovisioned node is a cluster node on which resources are reserved that are available at any time for scaling. The presence of such a node allows the cluster autoscaler not to wait for node initialization (which may take several minutes), but to immediately place a load on it.

        The value can be an absolute number (for example, 2) or a percentage of desired nodes (for example, 10%). If a percentage is specified, the absolute number is calculated based on the percentage of the maximum number of nodes (the maxPerZone parameter) rounded down, but not less than one.

        Pattern: ^[0-9]+%?$

      • spec.cloudInstances.standbyHolderobject

        Amount of reserved resources.

        Used to determine whether to order overprovisioned nodes.

        • spec.cloudInstances.standbyHolder.notHeldResourcesobject

          Describes the resources that will not be held (consumed) by the standby holder.

          • spec.cloudInstances.standbyHolder.notHeldResources.cpuinteger or string

            Describes the amount of CPU that will not be held by the standby holder on nodes from this NodeGroup.

            The value can be an absolute number of CPUs (for example, 2) or a milli-CPU value (for example, 1500m).

            Pattern: ^[0-9]+m?$

          • spec.cloudInstances.standbyHolder.notHeldResources.memoryinteger or string

            Describes the amount of memory that will not be held by the standby holder on nodes from this NodeGroup.

            The value can be an absolute number of bytes (for example, 128974848) or a fixed-point number with one of the memory suffixes: G, Gi, M, Mi.

            Pattern: ^[0-9]+(\.[0-9]+)?(E|P|T|G|M|K|Ei|Pi|Ti|Gi|Mi|Ki)?$

      • spec.cloudInstances.zonesarray of strings

        List of availability zones to create instances in.

        The default value depends on the cloud provider selected and usually corresponds to all zones of the region being used.

        Example:

        zones:
        - Helsinki
        - Espoo
        - Tampere
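
      A minimal sketch of how the cloudInstances parameters above might be combined. The class name finland-medium and the zone names are reused from the examples in this document; every numeric value is an illustrative assumption, not a recommendation.

      ```yaml
      cloudInstances:
        classReference:
          kind: OpenStackInstanceClass
          name: finland-medium      # must name an existing InstanceClass (illustrative)
        minPerZone: 1               # lower bound in cluster-autoscaler
        maxPerZone: 5               # upper bound in cluster-autoscaler
        maxSurgePerZone: 1          # same as the default
        maxUnavailablePerZone: 0    # same as the default
        standby: 10%                # or an absolute number such as 2
        standbyHolder:
          notHeldResources:
            cpu: 500m               # matches ^[0-9]+m?$
            memory: 512Mi           # matches the memory pattern above
        zones:
        - Helsinki
        - Espoo
      ```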
        
    • spec.criobject

      Container runtime parameters.

      • spec.cri.containerdobject

        Containerd runtime parameters.

        If used, cri.type must be set to Containerd.

        • spec.cri.containerd.maxConcurrentDownloadsinteger

          Set the max concurrent downloads for each pull.

          Default: 3

      • spec.cri.typestring

        Container runtime type.

        If not specified, the defaultCRI value from the initial cluster configuration (the cluster-configuration.yaml parameter of the d8-cluster-configuration secret in the kube-system namespace) is used.

        Note! Docker is deprecated.

        Allowed values: Docker, Containerd, NotManaged
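
      As a sketch, the container runtime settings above could be written as follows. The value 5 is an illustrative assumption; the default is 3.

      ```yaml
      cri:
        type: Containerd              # required when cri.containerd is used
        containerd:
          maxConcurrentDownloads: 5   # illustrative value
      ```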

    • spec.disruptionsobject

      Disruptions settings for nodes.

      Example:

      disruptions:
        approvalMode: Automatic
        automatic:
          drainBeforeApproval: false
          windows:
          - from: '06:00'
            to: '08:00'
            days:
            - Tue
            - Sun
      
      • spec.disruptions.approvalModestring

        The approval mode for disruptive updates:

        • Manual — disables automatic disruption approval; an alert is displayed if a disruption is needed. Caution! The master node group update mode must be Manual to avoid issues with draining.
        • Automatic — automatically approve disruption-involving updates.
        • RollingUpdate — in this mode, a new node with new settings will be created; then, the old node will be deleted. Available only for cloud nodes.

        If the RollingUpdate mode is not used, when updating, the node is first drained and then updated (rebooted) and put back into operation (uncordoned). Note that in this case, the cluster must have sufficient resources to accommodate the load while the node being updated is unavailable. In the RollingUpdate mode, the node is replaced by the updated node, i.e., an extra node appears in the cluster for the duration of the update. In cloud infrastructures, the RollingUpdate mode is convenient, for example, if there are no resources in the cluster to temporarily host the load from the node being updated.

        Default: "Automatic"

        Allowed values: Manual, Automatic, RollingUpdate

      • spec.disruptions.automaticobject

        Additional parameters for the Automatic mode.

        • spec.disruptions.automatic.drainBeforeApprovalboolean

          Drain Pods from the nodes before approving disruption.

          Caution! This setting is ignored (nodes will be approved without draining Pods):

          • for the nodeGroup master with a single node;
          • for a single ready node in a nodeGroup picked out for Deckhouse placement.

          Default: true

        • spec.disruptions.automatic.windowsarray of objects

          Time windows for node disruptive updates.

          • spec.disruptions.automatic.windows.daysarray of strings

            Days of the week when nodes can be updated.

            Example:

            days:
            - Mon
            - Wed
            
            • Element of the arraystring

              Day of the week.

              Allowed values: Mon, Tue, Wed, Thu, Fri, Sat, Sun

          • spec.disruptions.automatic.windows.fromstring

            Required value

            Start time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            from: '13:00'
            
          • spec.disruptions.automatic.windows.tostring

            Required value

            End time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            to: '18:30'
            
      • spec.disruptions.rollingUpdateobject

        Additional parameters for the RollingUpdate mode.

        • spec.disruptions.rollingUpdate.windowsarray of objects

          Time windows for node disruptive updates.

          • spec.disruptions.rollingUpdate.windows.daysarray of strings

            Days of the week when nodes can be updated.

            Example:

            days:
            - Mon
            - Wed
            
            • Element of the arraystring

              Day of the week.

              Allowed values: Mon, Tue, Wed, Thu, Fri, Sat, Sun

          • spec.disruptions.rollingUpdate.windows.fromstring

            Required value

            Start time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            from: '13:00'
            
          • spec.disruptions.rollingUpdate.windows.tostring

            Required value

            End time of disruptive update window (UTC timezone).

            Pattern: ^(?:\d|[01]\d|2[0-3]):[0-5]\d$

            Example:

            to: '18:30'
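
      For comparison with the Automatic example at the start of this section, a RollingUpdate configuration might look like the sketch below. It only combines the field examples shown above; the chosen days and times are illustrative.

      ```yaml
      disruptions:
        approvalMode: RollingUpdate   # available only for cloud nodes
        rollingUpdate:
          windows:
          - from: '13:00'             # UTC
            to: '18:30'               # UTC
            days:
            - Mon
            - Wed
      ```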
            
    • spec.dockerobject

      Docker settings for nodes.

      If used, cri.type must be set to Docker.

      Note! Docker is deprecated.

      • spec.docker.manageboolean

        Enable Docker maintenance from bashible.

        Default: true

      • spec.docker.maxConcurrentDownloadsinteger

        Set the max concurrent downloads for each pull.

        Default: 3

    • spec.kubeletobject

      Kubelet settings for nodes.

      • spec.kubelet.containerLogMaxFilesinteger

        How many rotated log files to store before deleting them.

        WARNING! This parameter does nothing if CRI type is Docker.

        Default: 4

        Allowed values: 1 <= X <= 20

      • spec.kubelet.containerLogMaxSizestring

        Maximum log file size before it is rotated.

        WARNING! This parameter does nothing if CRI type is Docker.

        Default: "50Mi"

        Pattern: \d+[Ei|Pi|Ti|Gi|Mi|Ki|E|P|T|G|M|k|m]

      • spec.kubelet.maxPodsinteger

        Set the max count of pods per node.

        Default: 110

      • spec.kubelet.rootDirstring

        Directory path for managing kubelet files (volume mounts, etc.).

        Default: "/var/lib/kubelet"
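
      A sketch of a kubelet block that spells out the documented defaults explicitly. This can serve as a starting point for overriding individual values.

      ```yaml
      kubelet:
        containerLogMaxFiles: 4       # allowed range: 1..20; no effect with Docker
        containerLogMaxSize: 50Mi     # no effect with Docker
        maxPods: 110
        rootDir: /var/lib/kubelet
      ```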

    • spec.kubernetesVersionstring

      The desired minor version of Kubernetes.

      By default, it corresponds to the version selected for the cluster globally (see installation documentation) or to the current version of the control plane (if the global version is not defined).

      Allowed values: 1.23, 1.24, 1.25, 1.26, 1.27

      Example:

      kubernetesVersion: '1.23'
      
    • spec.nodeTemplateobject

      Specification of some of the fields that will be maintained in all nodes of the group.

      Example:

      labels:
        environment: production
        app: warp-drive-ai
      annotations:
        ai.fleet.com/discombobulate: "true"
      taints:
      - effect: NoExecute
        key: ship-class
        value: frigate
      
      • spec.nodeTemplate.annotationsobject

        Similar to the standard metadata.annotations field.

        Example:

        annotations:
          ai.fleet.com/discombobulate: "true"
        
      • spec.nodeTemplate.labelsobject

        Similar to the standard metadata.labels field.

        Example:

        labels:
          environment: production
          app: warp-drive-ai
        
      • spec.nodeTemplate.taintsarray of objects

        Similar to the .spec.taints field of the Node object.

        Caution! Only effect, key, value fields are available.

        Example:

        taints:
        - effect: NoExecute
          key: ship-class
          value: frigate
        
        • spec.nodeTemplate.taints.effectstring

          Allowed values: NoSchedule, PreferNoSchedule, NoExecute

        • spec.nodeTemplate.taints.keystring
        • spec.nodeTemplate.taints.valuestring
    • spec.nodeTypestring

      Required value

      The type of nodes this group provides.

      • Cloud — nodes for this group will be automatically created (and deleted) in the cloud of the specified cloud provider;
      • Static — a static node hosted on a bare-metal server or a virtual machine. The cloud-controller-manager does not manage the node even if one of the cloud providers is enabled;
      • Hybrid — a static node (created manually or using any external tools) hosted in the cloud integrated with one of the cloud providers. This node has the CSI running, and it is managed by the cloud-controller-manager: the Node object automatically gets the information about the zone and region based on the cloud data; if a node gets deleted from the cloud, its corresponding Node object will be deleted in Kubernetes.

      Allowed values: Cloud, Static, Hybrid

    • spec.operatingSystemDeprecatedobject

      Operating System settings for nodes.

      • spec.operatingSystem.manageKernelDeprecatedboolean

        Enable kernel maintenance from bashible.

        Default: true

    • spec.staticobject

      Static node parameters.

      • spec.static.internalNetworkCIDRsarray of strings

        Subnet CIDR
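
      A minimal sketch of a spec for a group of static nodes. The subnet 192.168.1.0/24 is a hypothetical value chosen for illustration only.

      ```yaml
      spec:
        nodeType: Static
        static:
          internalNetworkCIDRs:
          - 192.168.1.0/24    # hypothetical subnet
      ```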

NodeGroupConfiguration

Scope: Cluster
Version: v1alpha1

  • specobject

    Required value

    • spec.bundlesarray of strings

      Required value

      Bundles for step execution. Set '*' to select all bundles.

      See the list of possible bundles in the allowedBundles module parameter.

      Examples:

      bundles:
      - ubuntu-lts
      - centos-7
      
      bundles:
      - ubuntu-lts
      
      bundles:
      - "*"
      
    • spec.contentstring

      Required value

      The content of the step. Can be either Go Template or plain bash.

    • spec.nodeGroupsarray of strings

      Required value

      List of NodeGroups to apply the step to. Set '*' to select all NodeGroups.

      Examples:

      nodeGroups:
      - master
      - worker
      
      nodeGroups:
      - worker
      
      nodeGroups:
      - "*"
      
    • spec.weightinteger

      Order of the step execution.

      Default: 100
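
The fields above could be assembled into a complete manifest roughly as follows. This is a sketch under assumptions: the deckhouse.io API group, the resource name example-step.sh, and the script body are not taken from this document and are shown for illustration only.

```yaml
apiVersion: deckhouse.io/v1alpha1   # assumption: API group deckhouse.io
kind: NodeGroupConfiguration
metadata:
  name: example-step.sh             # hypothetical name
spec:
  weight: 100                       # same as the default
  bundles:
  - "*"                             # all bundles
  nodeGroups:
  - worker
  content: |
    # Plain bash step; the message is illustrative.
    echo "Running an example step on a worker node"
```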

NodeUser

Scope: Cluster

Defines the Linux users to create on all nodes.

The user’s home directory is created in the /home/deckhouse/ directory.

  • specobject

    Required value

    • spec.extraGroupsarray of strings

      Node user additional system groups.

      Examples:

      extraGroups:
      - docker
      
      extraGroups:
      - docker
      - ftp
      
    • spec.isSudoerboolean

      Whether the node user belongs to the sudo group.

      Default: false

      Example:

      isSudoer: true
      
    • spec.nodeGroupsarray of strings

      List of NodeGroups to apply the user to.

      Default: ["*"]

      Examples:

      nodeGroups:
      - master
      - worker
      
      nodeGroups:
      - worker
      
      nodeGroups:
      - "*"
      
    • spec.passwordHashstring

      Hashed user password.

      The format corresponds to the password hashes in /etc/shadow. You can generate it with the following command: openssl passwd -6.

      Example:

      passwordHash: "$2a$10$F9ey7zW.sVliT224RFxpWeMsgzO.D9YRG54a8T36/K2MCiT41nzmC"
      
    • spec.sshPublicKeystring

      Node user SSH public key.

      Either sshPublicKey or sshPublicKeys must be specified.

      Example:

      sshPublicKey: ssh-rsa AAABBB
      
    • spec.sshPublicKeysarray of strings

      Node user SSH public keys.

      Either sshPublicKey or sshPublicKeys must be specified.

      Example:

      sshPublicKeys:
      - ssh-rsa AAABBB
      - cert-authority,principals="name" ssh-rsa BBBCCC
      
    • spec.uidnumber

      Required value

      Node user ID.

      This parameter cannot be changed during the lifetime of the resource.

      Allowed values: 1001 <= X

      Example:

      uid: 1001
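
A sketch of a NodeUser spec that combines the fields above. The key is the placeholder from the examples in this section, and the password hash is deliberately left as a placeholder to be replaced with real command output.

```yaml
spec:
  uid: 1001                     # must be 1001 or greater
  sshPublicKeys:
  - ssh-rsa AAABBB              # placeholder key, replace with a real one
  passwordHash: "<hash>"        # placeholder: paste the output of openssl passwd -6
  isSudoer: true
  extraGroups:
  - docker
  nodeGroups:
  - worker
```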
      
