Version 1.70
Important
-
The
ceph-csi
module has been removed. Use thecsi-ceph
module instead. Deckhouse will not be updated as long asceph-csi
is enabled in the cluster. Forcsi-ceph
migration instructions, refer to the module documentation. -
Version 1.12 of the NGINX Ingress Controller has been added. The default controller version has been changed to 1.10. All Ingress controllers that do not have an explicitly specified version (via the
controllerVersion
parameter in the IngressNginxController resource or thedefaultControllerVersion
parameter in theingress-nginx
module) will be restarted. -
The
falco_events
metric (from theruntime-audit-engine
module) has been removed. Thefalco_events
metric was considered deprecated since DKP 1.68. Use thefalcosecurity_falcosidekick_falco_events_total
metric instead. Dashboards and alerts based on thefalco_events
metric may stop working. -
All DKP components will be restarted during the update.
Major changes
- In the
Auto
update mode, patch version updates (for example, fromv1.70.1
tov1.70.2
) are now applied taking into account the update windows, if they are set. Previously, in this update mode, only minor version updates (for example, fromv1.69.x
tov1.70.x
) were applied with consideration to update windows, while patch version updates were applied as they appeared on a release channel. - A node can now be rebooted if the corresponding Node object has the
update.node.deckhouse.io/reboot
annotation set. - When cleaning up a static node, any local users created by Deckhouse Kubernetes Platform are now also removed.
- Added synchronization monitoring for Istio in multi-cluster configurations. A new alert
D8IstioRemoteClusterNotSynced
has been introduced and triggers in the following cases:- The remote cluster is offline.
- The remote API endpoint is not reachable.
- The remote
ServiceAccount
token is invalid or expired. - There is a TLS or certificate issue between the clusters.
- The
deckhouse-controller collect-debug-info
command now also collects debug information forIstio
, including:- Resources in the
d8-istio
namespace. - CRDs from the
istio.io
andgateway.networking.k8s.io
groups. Istio
logs.Sidecar
logs of a single randomly selected user application.
- Resources in the
- A new monitoring dashboard has been added to display OpenVPN certificate status. Upon expiration, server certificates will now be reissued, and client certificates will be removed. The following alerts have been added:
OpenVPNClientCertificateExpired
: Warns about expired client certificates.OpenVPNServerCACertificateExpired
: Warns about an expired OpenVPN CA certificate.OpenVPNServerCACertificateExpiringSoon
andOpenVPNServerCACertificateExpiringInAWeek
: Warn when an OpenVPN CA certificate is expiring in less than 30 or 7 days, respectively.OpenVPNServerCertificateExpired
: Warns about an expired OpenVPN server certificate.OpenVPNServerCertificateExpiringSoon
andOpenVPNServerCertificateExpiringInAWeek
: Warn when an OpenVPN server certificate is expiring in less than 30 or 7 days, respectively.
- Monitoring dashboards have been renamed and updated:
- “L2LoadBalancer” renamed to “MetalLB L2”; pool and column filtering added.
- “Metallb” renamed to “MetalLB BGP”; pool and column filtering added. The ARP request panel has been removed.
- “L2LoadBalancer / Pools” renamed to “MetalLB / Pools”.
-
The
upmeter
module’s PVC size has been increased to accommodate data retention for 13 months. In some cases, the previous PVC size was insufficient. -
The ModuleSource resource status now includes information about module versions in the source.
-
The Module resource status now includes information about the module’s lifecycle stage. A module can move through the following stages in its lifecycle: Experimental, Preview, General Availability, and Deprecated. For details on module lifecycle stages and how to evaluate its stability, refer to the corresponding section in the documentation.
-
It is now possible to use stronger or more modern encryption algorithms (such as
RSA-3072
,RSA-4096
, orECDSA-P256
) for control plane cluster certificates instead of the defaultRSA-2048
. You can use theencryptionAlgorithm
parameter in the ClusterConfiguration resource to configure this. -
The
descheduler
module can now be configured to evict pods that are using local storage. Use theevictLocalStoragePods
parameter in the module configuration to adjust this. - You can now adjust the logging level of the Ingress controller using the
controllerLogLevel
parameter in the IngressNginxController resource. The default log level isInfo
. Controlling the logging level can help prevent log collector overload during Ingress controller restarts.
Security
-
The severity level of alerts indicating security policy violations has been raised from 7 to 3.
-
The configuration for
Yandex Cloud
,Zvirt
, andDynamix
providers now usesOpenTofu
instead ofTerraform
. This enables easier provider updates, such as applying fixes for known vulnerabilities (CVEs). -
CVE vulnerabilities have been fixed in the following modules:
chrony
,descheduler
,dhctl
,node-manager
,registry-packages-proxy
,falco
,cni-cilium
, andvertical-pod-autoscaler
.
Component version updates
The following DKP components have been updated:
containerd
: 1.7.27runc
: 1.2.5go
: 1.24.2, 1.23.8golang.org/x/net
: v0.38.0mcm
: v0.36.0-flant.23ingress-nginx
: 1.12.1terraform-provider-aws
: 5.83.1Deckhouse CLI
: 0.12.1etcd
: v3.5.21
Version 1.69
Important
-
Support for Kubernetes 1.32 has been added, while support for Kubernetes 1.27 has been discontinued. The default Kubernetes version has been changed to 1.30. In future DKP releases, support for Kubernetes 1.28 will be removed.
-
All DKP components will be restarted during the update.
Major changes
-
The
ceph-csi
module is now deprecated. Plan to migrate to thecsi-ceph
module instead. For details, refer to the Ceph documentation. -
You can now grant access to Deckhouse web interfaces using user names via the
auth.allowedUserEmails
field. Access restriction is configured together with theauth.allowedUserGroups
parameter in configuration of the following modules with web interfaces:cilium-hubble
,dashboard
,deckhouse-tools
,documentation
,istio
,openvpn
,prometheus
, andupmeter
(example forprometheus
). -
A new dashboard Cilium Nodes Connectivity Status & Latency has been added to Grafana in the
cni-cilium
module. It helps monitor node network connectivity issues. The dashboard displays a connectivity matrix similar to thecilium-health status
command, using metrics that are already available in Prometheus. -
A new
D8KubernetesStaleTokensDetected
alert has been added in thecontrol-plane-manager
module that is triggered when stale service account tokens are detected in the cluster. -
You can now create a Project from an existing namespace and adopt existing objects into it. To do this, annotate the namespace and its resources with
projects.deckhouse.io/adopt
. This lets you switch to using Projects without recreating cluster resources. -
A
Terminating
status has been added to ModuleSource and ModuleRelease resources. The new status will be displayed when an attempt to delete one of them fails. -
The installer container now automatically configures cluster access after a successful bootstrap. A
kubeconfig
file is generated in~/.kube/config
, and a local TCP proxy is set up through an SSH tunnel. This allows you to use kubectl locally right away without manually connecting to the control-plane node via SSH. -
Changes to Kubernetes resources in multi-cluster and federation setups are now tracked directly via Kubernetes API. This enables faster synchronization between clusters and eliminates the use of outdated certificates. In addition, mounting of ConfigMap and Secret resources into Pods has been removed to eliminate family system compromise risks.
-
A new dynamicforward plugin has been added to CoreDNS, improving DNS query processing in the cluster. It integrates with
node-local-dns
, continuously monitorskube-dns
endpoints, and automatically updates the list of DNS forwarders. If the control-plane node is unavailable, DNS queries are still forwarded to available endpoints, improving cluster stability. -
A new log rotation approach has been introduced in the
loki
module. Now, old logs are automatically removed when disk usage exceeds a threshold: either 95% of PVC size or PVC size minus the size required to store two minutes of log data at the configured ingestion rate (ingestionRateMB
). TheretentionPeriodHours
parameter no longer controls the data retention and is used for monitoring alerts only. Ifloki
begins removing old logs before the set period is reached, aLokiRetentionPerionViolation
alert will be triggered, informing the user that they must reduce the value ofretentionPeriodHours
or increase the PVC size. -
A new
nodeDrainTimeoutSecond
parameter lets you set the maximum timeout when attempting to drain a node (in seconds) for each NodeGroup resource. Previously, you could only use the default value (10 minutes) or reduce it to 5 minutes using thequickShutdown
parameter, which is now deprecated. -
The
openvpn
module now includes adefaultClientCertExpirationDays
parameter, allowing you to define the lifetime of client certificates.
Security
Known vulnerabilities have been addressed in the following modules:
ingress-nginx
, istio
, prometheus
, and local-path-provisioner
.
Component version updates
The following DKP components have been updated:
cert-manager
: 1.17.1dashboard
: 1.6.1dex
: 2.42.0go-vcloud-director
: 2.26.1- Grafana: 10.4.15
- Kubernetes control plane: 1.29.14, 1.30.1, 1.31.6, 1.32.2
kube-state-metrics
(monitoring-kubernetes
): 2.15.0local-path-provisioner
: 0.0.31machine-controller-manager
: v0.36.0-flant.19pod-reloader
: 1.2.1prometheus
: 2.55.1- Terraform providers:
- OpenStack: 1.54.1
- vCD: 3.14.1
Version 1.68
Important
- After the update, the UID will change for all Grafana data sources created using the GrafanaAdditionalDatasource resource. If a data source was referenced by UID, that reference will no longer be valid.
Major changes
-
A new parameter,
iamNodeRole
, has been introduced for the AWS provider. It lets you specify the name of the IAM role to bind to all AWS instances of cluster nodes. This can come in handy if you need to grant additional permissions (for example, access to ECR, etc.). -
Creating nodes of the CloudPermanent type now takes less time. Now, CloudPermanent nodes are created in parallel. Previously, they were created in parallel only within a single group.
- Monitoring changes:
- Support for monitoring certificates in secrets of the
Opaque
type has been added. - Support for monitoring images in Amazon ECR has been added.
- A bug that could cause partial loss of metrics when Prometheus instances were restarted has been fixed.
- Support for monitoring certificates in secrets of the
-
When using a multi-cluster Istio configuration or federation, you can now explicitly specify the list of addresses used for inter-cluster requests. Previously, these addresses were determined automatically; however, in some configurations, they could not be resolved.
-
The DexAuthenticator resource now has a
highAvailability
parameter that controls high availability mode. In high availability mode, multiple replicas of the authenticator are launched. Previously, high availability mode of all authenticators was determined by a global parameter or by theuser-authn
module. All authenticators deployed by DKP now inherit the high availability mode of the corresponding module. -
Node labels can now be added, removed, or modified using files stored on the node in the
/var/lib/node_labels
directory and its subdirectories. The full set of applied labels is stored in thenode.deckhouse.io/last-applied-local-labels
annotation. -
Support for the Huawei Cloud provider has been added.
-
The new
keepDeletedFilesOpenedFor
parameter in thelog-shipper
module allows you to configure the period to keep the deleted log files open. This way, you can continue reading logs from deleted pods for some time if log storage is temporarily unavailable. -
TLS encryption for log collectors (Elasticsearch, Vector, Loki, Splunk, Logstash, Socket, Kafka) can now be configured using secrets, rather than by storing certificates in the ClusterLogDestination resources. The secret must reside in the
d8-log-shipper
namespace and have thelog-shipper.deckhouse.io/watch-secret: true
label. -
In the project status under the
resources
section, you can now see which project resources have been installed. Those resources are marked withinstalled: true
. - A new parameter,
--tf-resource-management-timeout
, has been added to the installer. It controls the resource creation timeout in cloud environments. By default, the timeout is set to 10 minutes. This parameter applies only to the following clouds: AWS, Azure, GCP, OpenStack.
Security
Known vulnerabilities have been addressed in the following modules:
admission-policy-engine
chrony
cloud-provider-azure
cloud-provider-gcp
cloud-provider-openstack
cloud-provider-yandex
cloud-provider-zvirt
cni-cilium
control-plane-manager
extended-monitoring
descheduler
documentation
ingress-nginx
istio
loki
metallb
monitoring-kubernetes
monitoring-ping
node-manager
operator-trivy
pod-reloader
prometheus
prometheus-metrics-adapter
registrypackages
runtime-audit-engine
terraform-manager
user-authn
vertical-pod-autoscaler
static-routing-manager
Component version updates
The following DKP components have been updated:
- Kubernetes Control Plane: 1.29.14, 1.30.10, 1.31.6
aws-node-termination-handler
: 1.22.1capcd-controller-manager
: 1.3.2cert-manager
: 1.16.2chrony
: 4.6.1cni-flannel
: 0.26.2docker_auth
: 1.13.0flannel-cni
: 1.6.0-flannel1gatekeeper
: 3.18.1jq
: 1.7.1kubernetes-cni
: 1.6.2kube-state-metrics
: 2.14.0vector
(log-shipper
): 0.44.0prometheus
: 2.55.1snapshot-controller
: 8.2.0yq4
: 3.45.1
Mandatory component restart
The following components will be restarted after updating DKP to 1.68:
- Kubernetes Control Plane
- Ingress controller
- Prometheus, Grafana
admission-policy-engine
chrony
cloud-provider-azure
cloud-provider-gcp
cloud-provider-openstack
cloud-provider-yandex
cloud-provider-zvirt
cni-cilium
control-plane-manager
descheduler
documentation
extended-monitoring
ingress-nginx
istio
kube-state-metrics
log-shipper
loki
metallb
monitoring-kubernetes
monitoring-ping
node-manager
openvpn
operator-trivy
prometheus
prometheus-metrics-adapter
pod-reloader
registrypackages
runtime-audit-engine
service-with-healthchecks
static-routing-manager
terraform-manager
user-authn
vertical-pod-autoscaler