Descheduler

Scope: Cluster

Descheduler is a description of a single descheduler instance.
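
Example of a Descheduler resource (a minimal sketch; the apiVersion value and the resource name are assumptions, check the CRD installed in your cluster for the exact version):

apiVersion: deckhouse.io/v1alpha2 # assumed API version
kind: Descheduler
metadata:
  name: example
spec:
  strategies:
    lowNodeUtilization:
      enabled: true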

  • apiVersion
    string
  • kind
    string
  • metadata
    object
    • metadata.name
      string
  • spec
    object

    Required value

    Defines the behavior of a descheduler instance.

    • spec.namespaceLabelSelector
      object

      Limits the pods that are processed, based on the labels of their namespaces.

      • spec.namespaceLabelSelector.matchExpressions
        array of objects

        List of label expressions that a namespace must match to qualify for the filter condition.

        Example:

        matchExpressions:
        - key: tier
          operator: In
          values:
          - production
          - staging
        - key: tier
          operator: NotIn
          values:
          - production
        
        • spec.namespaceLabelSelector.matchExpressions.key
          string

          A label name.

        • spec.namespaceLabelSelector.matchExpressions.operator
          string

          A comparison operator.

          Allowed values: In, NotIn, Exists, DoesNotExist

        • spec.namespaceLabelSelector.matchExpressions.values
          array of strings

          Label values.

          • Element of the array
            string

            Pattern: [a-z0-9]([-a-z0-9]*[a-z0-9])?

            Length: 1..63

      • spec.namespaceLabelSelector.matchLabels
        object
    • spec.nodeLabelSelector
      object

      Limits the nodes that are processed, by labels in set representation. If set, nodeSelector must not be set.

      • spec.nodeLabelSelector.matchExpressions
        array of objects

        List of label expressions that a node must match to qualify for the filter condition.

        Example:

        matchExpressions:
        - key: tier
          operator: In
          values:
          - production
          - staging
        - key: tier
          operator: NotIn
          values:
          - production
        
        • spec.nodeLabelSelector.matchExpressions.key
          string

          A label name.

        • spec.nodeLabelSelector.matchExpressions.operator
          string

          A comparison operator.

          Allowed values: In, NotIn, Exists, DoesNotExist

        • spec.nodeLabelSelector.matchExpressions.values
          array of strings

          Label values.

          • Element of the array
            string

            Pattern: [a-z0-9]([-a-z0-9]*[a-z0-9])?

            Length: 1..63

      • spec.nodeLabelSelector.matchLabels
        object
    • spec.podLabelSelector
      object

      Limits the pods that are processed, based on their labels.

      • spec.podLabelSelector.matchExpressions
        array of objects

        List of label expressions that a pod must match to qualify for the filter condition.

        Example:

        matchExpressions:
        - key: tier
          operator: In
          values:
          - production
          - staging
        - key: tier
          operator: NotIn
          values:
          - production
        
        • spec.podLabelSelector.matchExpressions.key
          string

          A label name.

        • spec.podLabelSelector.matchExpressions.operator
          string

          A comparison operator.

          Allowed values: In, NotIn, Exists, DoesNotExist

        • spec.podLabelSelector.matchExpressions.values
          array of strings

          Label values.

          • Element of the array
            string

            Pattern: [a-z0-9]([-a-z0-9]*[a-z0-9])?

            Length: 1..63

      • spec.podLabelSelector.matchLabels
        object
    • spec.priorityClassThreshold
      object

      Limits the pods that are processed by priority class: only pods below the threshold can be evicted.

      You can specify either the name of the priority class (priorityClassThreshold.name) or its numeric value (priorityClassThreshold.value).

      By default, the threshold is set to the value of the system-cluster-critical priority class.
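
      Example of limiting eviction to pods below a given priority class (the class name shown is illustrative, use a priority class that exists in your cluster):

      priorityClassThreshold:
        name: high-priority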

      • spec.priorityClassThreshold.name
        string

        Name of the priority class.

      • spec.priorityClassThreshold.value
        integer

        Value of the priority class.

    • spec.strategies
      object

      Required value

      Strategy settings for the Descheduler instance.

      • spec.strategies.highNodeUtilization
        object

        This strategy finds underutilized nodes and evicts pods from them, in the hope that these pods will be scheduled compactly onto fewer nodes. When combined with node auto-scaling, the strategy helps reduce the number of underutilized nodes. It is intended to be used with the MostAllocated scheduler strategy.

        In GKE, you cannot configure the default scheduler, but you can use the optimize-utilization strategy or deploy a second custom scheduler.

        Node resource usage takes into account extended resources and is based on pod requests and limits, not actual consumption.
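
        Example of enabling the strategy with custom thresholds (the threshold values shown are illustrative):

        highNodeUtilization:
          enabled: true
          thresholds:
            cpu: 30
            memory: 30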

        • spec.strategies.highNodeUtilization.enabled
          boolean

          Makes the strategy active.

          Default: false

        • spec.strategies.highNodeUtilization.thresholds
          object

          Sets threshold values used to identify underutilized nodes.

          If the resource usage of a node is below all threshold values, the node is considered underutilized.

          • spec.strategies.highNodeUtilization.thresholds.cpu
            integer

            CPU fraction, in percent.

            Default: 20

          • spec.strategies.highNodeUtilization.thresholds.memory
            integer

            Memory fraction, in percent.

            Default: 20

          • spec.strategies.highNodeUtilization.thresholds.pods
            integer

            Pod count, in percent.

            Default: 20

      • spec.strategies.lowNodeUtilization
        object

        This strategy identifies underutilized nodes and evicts pods from other, overutilized nodes. The strategy assumes that the evicted pods will be recreated on the underutilized nodes (following normal scheduler behavior).

        Underutilized node: a node whose resource usage is below all the threshold values specified in the thresholds section.

        Overutilized node: a node whose resource usage exceeds at least one of the threshold values specified in the targetThresholds section.

        Node resource usage takes into account extended resources and is based on pod requests and limits, not actual consumption.
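
        Example of enabling the strategy (the threshold values shown are illustrative):

        lowNodeUtilization:
          enabled: true
          thresholds:
            cpu: 20
            memory: 20
          targetThresholds:
            cpu: 70
            memory: 70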

        • spec.strategies.lowNodeUtilization.enabled
          boolean

          Makes the strategy active.

          Default: false

        • spec.strategies.lowNodeUtilization.targetThresholds
          object

          Sets threshold values used to identify overutilized nodes.

          If the resource usage of a node exceeds at least one of the threshold values, the node is considered overutilized.

          • spec.strategies.lowNodeUtilization.targetThresholds.cpu
            integer

            CPU fraction, in percent.

            Default: 70

          • spec.strategies.lowNodeUtilization.targetThresholds.memory
            integer

            Memory fraction, in percent.

            Default: 70

          • spec.strategies.lowNodeUtilization.targetThresholds.pods
            integer

            Pod count, in percent.

            Default: 70

        • spec.strategies.lowNodeUtilization.thresholds
          object

          Sets threshold values used to identify underutilized nodes.

          If the resource usage of a node is below all threshold values, the node is considered underutilized.

          • spec.strategies.lowNodeUtilization.thresholds.cpu
            integer

            CPU fraction, in percent.

            Default: 20

          • spec.strategies.lowNodeUtilization.thresholds.memory
            integer

            Memory fraction, in percent.

            Default: 20

          • spec.strategies.lowNodeUtilization.thresholds.pods
            integer

            Pod count, in percent.

            Default: 20

      • spec.strategies.removeDuplicates
        object

        The strategy ensures that no more than one pod belonging to the same ReplicaSet, ReplicationController, StatefulSet, or Job runs on a single node. If there are two or more such pods, the module evicts the excess pods so that they are better distributed across the cluster.
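
        Example of enabling the strategy:

        removeDuplicates:
          enabled: true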

        • spec.strategies.removeDuplicates.enabled
          boolean

          Makes the strategy active.

          Default: false

      • spec.strategies.removePodsViolatingInterPodAntiAffinity
        object

        The strategy ensures that pods violating inter-pod anti-affinity rules are evicted from nodes.

        • spec.strategies.removePodsViolatingInterPodAntiAffinity.enabled
          boolean

          Makes the strategy active.

          Default: false

      • spec.strategies.removePodsViolatingNodeAffinity
        object

        The strategy makes sure all pods violating node affinity are eventually removed from nodes.

        Essentially, depending on the nodeAffinityType parameter, the strategy temporarily treats the pod's requiredDuringSchedulingIgnoredDuringExecution node affinity rule as requiredDuringSchedulingRequiredDuringExecution, and the preferredDuringSchedulingIgnoredDuringExecution rule as preferredDuringSchedulingPreferredDuringExecution.
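
        Example of enabling the strategy for both supported rule types:

        removePodsViolatingNodeAffinity:
          enabled: true
          nodeAffinityType:
          - requiredDuringSchedulingIgnoredDuringExecution
          - preferredDuringSchedulingIgnoredDuringExecution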

        • spec.strategies.removePodsViolatingNodeAffinity.enabled
          boolean

          Makes the strategy active.

          Default: false

        • spec.strategies.removePodsViolatingNodeAffinity.nodeAffinityType
          array of strings

          Defines the list of node affinity rules used.

          Default: ["requiredDuringSchedulingIgnoredDuringExecution"]

          • Element of the array
            string

            Allowed values: requiredDuringSchedulingIgnoredDuringExecution, preferredDuringSchedulingIgnoredDuringExecution

Deprecated resource. Support for the resource might be removed in a later release.

Descheduler is a description of a single descheduler instance.

  • apiVersion
    string
  • kind
    string
  • metadata
    object
    • metadata.name
      string
  • spec
    object

    Required value

    Defines the behavior of a descheduler instance.

    • spec.deploymentTemplate
      object

      Defines the template of the descheduler Deployment.
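
      Example of pinning the descheduler Deployment to dedicated nodes (a sketch; the node label, taint key, and values are illustrative assumptions):

      deploymentTemplate:
        nodeSelector:
          node-role/system: ""
        tolerations:
        - key: dedicated        # illustrative taint key
          operator: Equal
          value: system
          effect: NoSchedule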

      • spec.deploymentTemplate.nodeSelector
        object
      • spec.deploymentTemplate.tolerations
        array of objects
        • spec.deploymentTemplate.tolerations.effect
          string
        • spec.deploymentTemplate.tolerations.key
          string
        • spec.deploymentTemplate.tolerations.operator
          string
        • spec.deploymentTemplate.tolerations.tolerationSeconds
          integer
        • spec.deploymentTemplate.tolerations.value
          string
    • spec.deschedulerPolicy
      object

      globalParameters and strategies follow the descheduler’s documentation.
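
      Example combining global parameters with a single strategy (the parameter values shown are illustrative):

      deschedulerPolicy:
        globalParameters:
          evictLocalStoragePods: true
          maxNoOfPodsToEvictPerNode: 10
        strategies:
          removePodsHavingTooManyRestarts:
            enabled: true
            nodeFit: true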

      • spec.deschedulerPolicy.globalParameters
        object

        Parameters that apply to all policies.

        • spec.deschedulerPolicy.globalParameters.evictFailedBarePods
          boolean

          Allows eviction of Pods that have no ownerReferences and are in the Failed phase.

        • spec.deschedulerPolicy.globalParameters.evictLocalStoragePods
          boolean

          Allows Pods using local storage to be evicted.

        • spec.deschedulerPolicy.globalParameters.evictSystemCriticalPods
          boolean

          Allows eviction of Pods of any priority (including Kubernetes system Pods).

        • spec.deschedulerPolicy.globalParameters.ignorePvcPods
          boolean

          Prevents Pods with PVCs from being evicted.

        • spec.deschedulerPolicy.globalParameters.maxNoOfPodsToEvictPerNamespace
          integer

          Restricts the maximum number of Pods to be evicted per namespace.

        • spec.deschedulerPolicy.globalParameters.maxNoOfPodsToEvictPerNode
          integer

          Restricts the maximum number of Pods to be evicted per node.

        • spec.deschedulerPolicy.globalParameters.nodeSelector
          string
      • spec.deschedulerPolicy.strategies
        object

        List of strategies, with their corresponding parameters, for the given Descheduler instance.

        • spec.deschedulerPolicy.strategies.highNodeUtilization
          object

          This strategy finds underutilized nodes and evicts Pods from them in the hope that these Pods will be scheduled compactly onto fewer nodes.

          • spec.deschedulerPolicy.strategies.highNodeUtilization.enabled
            boolean

            Required value

          • spec.deschedulerPolicy.strategies.highNodeUtilization.namespaceFilter
            object

            Restricts Namespaces to which this strategy applies.

          • spec.deschedulerPolicy.strategies.highNodeUtilization.nodeFilter
            object

            Filters Nodes to which the strategy applies.

          • spec.deschedulerPolicy.strategies.highNodeUtilization.nodeFit
            boolean

            If set to true, the descheduler will consider whether or not the Pods that meet eviction criteria will fit on other nodes before evicting them.

          • spec.deschedulerPolicy.strategies.highNodeUtilization.priorityFilter
            object

            Only Pods with priority lower than this will be descheduled.

        • spec.deschedulerPolicy.strategies.lowNodeUtilization
          object

          This strategy finds underutilized nodes and, if possible, evicts Pods from other nodes in the hope that the evicted Pods will be recreated on these underutilized nodes.

          • spec.deschedulerPolicy.strategies.lowNodeUtilization.enabled
            boolean

            Required value

          • spec.deschedulerPolicy.strategies.lowNodeUtilization.namespaceFilter
            object

            Restricts Namespaces to which this strategy applies.

          • spec.deschedulerPolicy.strategies.lowNodeUtilization.nodeFilter
            object

            Filters Nodes to which the strategy applies.

          • spec.deschedulerPolicy.strategies.lowNodeUtilization.nodeFit
            boolean

            If set to true, the descheduler will consider whether or not the Pods that meet eviction criteria will fit on other nodes before evicting them.

          • spec.deschedulerPolicy.strategies.lowNodeUtilization.priorityFilter
            object

            Only Pods with priority lower than this will be descheduled.

        • spec.deschedulerPolicy.strategies.removeDuplicates
          object

          This strategy makes sure that there is only one Pod associated with a ReplicaSet (RS), ReplicationController (RC), StatefulSet, or Job running on the same node.

          • spec.deschedulerPolicy.strategies.removeDuplicates.enabled
            boolean

            Required value

          • spec.deschedulerPolicy.strategies.removeDuplicates.namespaceFilter
            object

            Restricts Namespaces to which this strategy applies.

          • spec.deschedulerPolicy.strategies.removeDuplicates.nodeFilter
            object

            Filters Nodes to which the strategy applies.

          • spec.deschedulerPolicy.strategies.removeDuplicates.nodeFit
            boolean

            If set to true, the descheduler will consider whether or not the Pods that meet eviction criteria will fit on other nodes before evicting them.

          • spec.deschedulerPolicy.strategies.removeDuplicates.priorityFilter
            object

            Only Pods with priority lower than this will be descheduled.

        • spec.deschedulerPolicy.strategies.removeFailedPods
          object

          This strategy evicts Pods that are in the Failed status phase.

          • spec.deschedulerPolicy.strategies.removeFailedPods.enabled
            boolean

            Required value

          • spec.deschedulerPolicy.strategies.removeFailedPods.namespaceFilter
            object

            Restricts Namespaces to which this strategy applies.

          • spec.deschedulerPolicy.strategies.removeFailedPods.nodeFilter
            object

            Filters Nodes to which the strategy applies.

          • spec.deschedulerPolicy.strategies.removeFailedPods.nodeFit
            boolean

            If set to true, the descheduler will consider whether or not the Pods that meet eviction criteria will fit on other nodes before evicting them.

          • spec.deschedulerPolicy.strategies.removeFailedPods.priorityFilter
            object

            Only Pods with priority lower than this will be descheduled.

        • spec.deschedulerPolicy.strategies.removePodsHavingTooManyRestarts
          object

          This strategy makes sure that Pods having too many restarts are removed from nodes.

          • spec.deschedulerPolicy.strategies.removePodsHavingTooManyRestarts.enabled
            boolean

            Required value

          • spec.deschedulerPolicy.strategies.removePodsHavingTooManyRestarts.namespaceFilter
            object

            Restricts Namespaces to which this strategy applies.

          • spec.deschedulerPolicy.strategies.removePodsHavingTooManyRestarts.nodeFilter
            object

            Filters Nodes to which the strategy applies.

          • spec.deschedulerPolicy.strategies.removePodsHavingTooManyRestarts.nodeFit
            boolean

            If set to true, the descheduler will consider whether or not the Pods that meet eviction criteria will fit on other nodes before evicting them.

          • spec.deschedulerPolicy.strategies.removePodsHavingTooManyRestarts.priorityFilter
            object

            Only Pods with priority lower than this will be descheduled.

        • spec.deschedulerPolicy.strategies.removePodsViolatingInterPodAntiAffinity
          object

          This strategy makes sure that Pods violating interpod anti-affinity are removed from nodes.

          • spec.deschedulerPolicy.strategies.removePodsViolatingInterPodAntiAffinity.enabled
            boolean

            Required value

          • spec.deschedulerPolicy.strategies.removePodsViolatingInterPodAntiAffinity.namespaceFilter
            object

            Restricts Namespaces to which this strategy applies.

          • spec.deschedulerPolicy.strategies.removePodsViolatingInterPodAntiAffinity.nodeFilter
            object

            Filters Nodes to which the strategy applies.

          • spec.deschedulerPolicy.strategies.removePodsViolatingInterPodAntiAffinity.nodeFit
            boolean

            If set to true, the descheduler will consider whether or not the Pods that meet eviction criteria will fit on other nodes before evicting them.

          • spec.deschedulerPolicy.strategies.removePodsViolatingInterPodAntiAffinity.priorityFilter
            object

            Only Pods with priority lower than this will be descheduled.

        • spec.deschedulerPolicy.strategies.removePodsViolatingNodeAffinity
          object

          This strategy makes sure all Pods violating node affinity are eventually removed from nodes.

          • spec.deschedulerPolicy.strategies.removePodsViolatingNodeAffinity.enabled
            boolean

            Required value

          • spec.deschedulerPolicy.strategies.removePodsViolatingNodeAffinity.namespaceFilter
            object

            Restricts Namespaces to which this strategy applies.

          • spec.deschedulerPolicy.strategies.removePodsViolatingNodeAffinity.nodeFilter
            object

            Filters Nodes to which the strategy applies.

          • spec.deschedulerPolicy.strategies.removePodsViolatingNodeAffinity.nodeFit
            boolean

            If set to true, the descheduler will consider whether or not the Pods that meet eviction criteria will fit on other nodes before evicting them.

          • spec.deschedulerPolicy.strategies.removePodsViolatingNodeAffinity.priorityFilter
            object

            Only Pods with priority lower than this will be descheduled.

        • spec.deschedulerPolicy.strategies.removePodsViolatingNodeTaints
          object

          This strategy makes sure that Pods violating NoSchedule taints on nodes are removed.

          • spec.deschedulerPolicy.strategies.removePodsViolatingNodeTaints.enabled
            boolean

            Required value

          • spec.deschedulerPolicy.strategies.removePodsViolatingNodeTaints.namespaceFilter
            object

            Restricts Namespaces to which this strategy applies.

          • spec.deschedulerPolicy.strategies.removePodsViolatingNodeTaints.nodeFilter
            object

            Filters Nodes to which the strategy applies.

          • spec.deschedulerPolicy.strategies.removePodsViolatingNodeTaints.nodeFit
            boolean

            If set to true, the descheduler will consider whether or not the Pods that meet eviction criteria will fit on other nodes before evicting them.

          • spec.deschedulerPolicy.strategies.removePodsViolatingNodeTaints.priorityFilter
            object

            Only Pods with priority lower than this will be descheduled.

        • spec.deschedulerPolicy.strategies.removePodsViolatingTopologySpreadConstraint
          object

          This strategy makes sure that Pods violating topology spread constraints are evicted from nodes.

          • spec.deschedulerPolicy.strategies.removePodsViolatingTopologySpreadConstraint.enabled
            boolean

            Required value

          • spec.deschedulerPolicy.strategies.removePodsViolatingTopologySpreadConstraint.namespaceFilter
            object

            Restricts Namespaces to which this strategy applies.

          • spec.deschedulerPolicy.strategies.removePodsViolatingTopologySpreadConstraint.nodeFilter
            object

            Filters Nodes to which the strategy applies.

          • spec.deschedulerPolicy.strategies.removePodsViolatingTopologySpreadConstraint.nodeFit
            boolean

            If set to true, the descheduler will consider whether or not the Pods that meet eviction criteria will fit on other nodes before evicting them.

          • spec.deschedulerPolicy.strategies.removePodsViolatingTopologySpreadConstraint.priorityFilter
            object

            Only Pods with priority lower than this will be descheduled.