This section describes the operation of logging system components in Deckhouse Virtualization Platform (DVP).
## Log collection and delivery mechanism

DVP uses the `log-shipper` module for log collection and delivery. A separate `log-shipper` agent runs on each cluster node and is configured based on Deckhouse resources. The `log-shipper` module uses Vector as its logging agent.

The combination of settings for log collection and delivery forms a pipeline:

- Deckhouse monitors ClusterLoggingConfig, ClusterLogDestination, and PodLoggingConfig resources:
  - `ClusterLoggingConfig`: Describes log sources at the cluster level, including collection, filtering, and parsing rules.
  - `PodLoggingConfig`: Describes log sources within a specified namespace, including collection, filtering, and parsing rules.
  - `ClusterLogDestination`: Sets log storage parameters.
- Based on the specified parameters, Deckhouse automatically creates a configuration file and saves it in a Kubernetes Secret.
- The Secret is mounted into all `log-shipper` agent pods. When the configuration changes, it is updated automatically by the `reloader` sidecar container.
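
As an illustration, here is a minimal pipeline sketch, assuming the `deckhouse.io/v1alpha1` schema of the log-shipper module; the namespace name and Loki endpoint are hypothetical, and the exact selector fields may vary between module versions:

```yaml
# Hypothetical sketch of a pipeline: one cluster-level source, one destination.
apiVersion: deckhouse.io/v1alpha1
kind: ClusterLoggingConfig
metadata:
  name: example-pods
spec:
  type: KubernetesPods          # Collect logs from pod containers.
  kubernetesPods:
    namespaceSelector:
      matchNames:
        - example-namespace     # Hypothetical namespace.
  destinationRefs:
    - example-loki              # Links this source to the destination below.
---
apiVersion: deckhouse.io/v1alpha1
kind: ClusterLogDestination
metadata:
  name: example-loki
spec:
  type: Loki
  loki:
    endpoint: http://loki.monitoring.svc:3100   # Hypothetical Loki address.
```

Deckhouse renders resources like these into a Vector configuration and distributes it to the agents via the Secret described above.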
## Log delivery schemes

DVP supports various log delivery topologies depending on reliability requirements and resource consumption.
### Distributed

`log-shipper` agents send logs directly to storage, such as Loki or Elasticsearch.
Advantages:
- Simple configuration.
- Available “out of the box” without additional dependencies, except for storage.
Disadvantages:
- Complex transformations consume more resources on application nodes.
### Centralized

All logs are sent to one of the available aggregators, such as Logstash or Vector. Agents on nodes send logs as quickly as possible, consuming minimal resources. Complex transformations are performed on the aggregator side.
Advantages:
- Reduces resource consumption on application nodes.
- Users can configure any transformations in the aggregator and send logs to a much larger number of storage systems.
Disadvantages:
- Requires dedicated nodes for aggregators. Their number may increase depending on the load.
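
In this scheme, the destination points at an aggregator rather than at the final storage. A minimal sketch, assuming a `Vector`-type `ClusterLogDestination`; the aggregator address and port are hypothetical:

```yaml
apiVersion: deckhouse.io/v1alpha1
kind: ClusterLogDestination
metadata:
  name: vector-aggregator
spec:
  type: Vector
  vector:
    # Hypothetical address of a dedicated Vector aggregator;
    # complex transformations are configured on the aggregator itself.
    endpoint: vector-aggregator.logging.svc:9000
```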
### Streaming

The main task of this architecture is to send logs to a message queue (e.g., Kafka) as quickly as possible; from the queue, they are then transferred to long-term storage for further analysis.
Advantages:
- Reduces resource consumption on application nodes.
- Users can configure any transformations in the aggregator and send logs to a much larger number of storage systems.
- High reliability. Suitable for infrastructure where log delivery is a priority task.
Disadvantages:
- Adds an intermediate link (message queue).
- Requires dedicated nodes for aggregators. Their number may increase depending on the load.
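
A minimal sketch of a queue-backed destination, assuming a `Kafka`-type `ClusterLogDestination`; the broker address and topic name are hypothetical:

```yaml
apiVersion: deckhouse.io/v1alpha1
kind: ClusterLogDestination
metadata:
  name: kafka-queue
spec:
  type: Kafka
  kafka:
    bootstrapServers:
      - kafka-0.kafka.svc:9092   # Hypothetical broker address.
    topic: cluster-logs          # Hypothetical topic for raw logs.
```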
## Log processing

### Message filters

Before sending logs, DVP can filter out unnecessary records to reduce the number of messages sent to storage. For this, the `labelFilter` and `logFilter` filters of the `log-shipper` module are used.

Filters run immediately after strings are combined using multiline parsing.
`labelFilter`:

- Rules are applied to message metadata.
- Fields for metadata (or labels) are populated based on the log source, so different sources will have different sets of fields.
- Rules are used, for example, to exclude messages from a specific container or from pods matching a given label.

`logFilter`:

- Rules are applied to the original message.
- Allows excluding a message based on the value of a JSON field.
- If the message is not in JSON format, you can use a regular expression to search by string.
Both filters have a unified configuration structure:

- `field`: The data source for filtering. Most often this is a label value or a field of a JSON document.
- `operator`: The comparison action. Available options: `In`, `NotIn`, `Regex`, `NotRegex`, `Exists`, `DoesNotExist`.
- `values`: This option has different meanings for different operators:
  - `In`, `NotIn`: The field value must equal (or not equal) one of the values in the `values` list.
  - `Regex`, `NotRegex`: The value must match at least one (or not match any) regular expression from the `values` list.
  - `Exists`, `DoesNotExist`: Not supported.
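
For illustration, here is a sketch of both filter types inside a `ClusterLoggingConfig`, following the structure above; the container name, JSON field, and destination reference are hypothetical:

```yaml
apiVersion: deckhouse.io/v1alpha1
kind: ClusterLoggingConfig
metadata:
  name: filtered-logs
spec:
  type: KubernetesPods
  labelFilter:
    # Exclude messages whose metadata label "container" equals "istio-proxy".
    - field: container
      operator: NotIn
      values:
        - istio-proxy
  logFilter:
    # Exclude messages whose JSON field "level" equals "debug".
    - field: level
      operator: NotIn
      values:
        - debug
  destinationRefs:
    - example-loki
```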
Additional labels (`extraLabels`) are added at the Destination stage, so filtering logs by them is not possible.
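
A sketch of where `extraLabels` live, assuming they are set in the `ClusterLogDestination` spec; the label key and value are hypothetical:

```yaml
apiVersion: deckhouse.io/v1alpha1
kind: ClusterLogDestination
metadata:
  name: loki-with-labels
spec:
  type: Loki
  loki:
    endpoint: http://loki.monitoring.svc:3100
  extraLabels:
    # Added at the Destination stage, after filtering has already run,
    # so labelFilter rules cannot match on this label.
    environment: production
```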
## Metadata

When processing logs, `log-shipper` automatically enriches messages with metadata depending on their source. Enrichment occurs at the Source stage.
### Kubernetes

When collecting logs from Kubernetes pods and nodes, the following fields are automatically exported:
| Label | Pod spec path |
|---|---|
| `pod` | `metadata.name` |
| `namespace` | `metadata.namespace` |
| `pod_labels` | `metadata.labels` |
| `pod_ip` | `status.podIP` |
| `image` | `spec.containers[].image` |
| `container` | `spec.containers[].name` |
| `node` | `spec.nodeName` |
| `pod_owner` | `metadata.ownerRef[0]` |

| Label | Node spec path |
|---|---|
| `node_group` | `metadata.labels[].node.deckhouse.io/group` |
For Splunk, the `pod_labels` field is not exported because it is a nested object that Splunk does not support.
### File

When collecting logs from file sources, only the `host` label is available; it contains the hostname of the server from which the log was received.
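
A sketch of a file source, assuming a `File`-type `ClusterLoggingConfig`; the log path and destination reference are hypothetical:

```yaml
apiVersion: deckhouse.io/v1alpha1
kind: ClusterLoggingConfig
metadata:
  name: node-kernel-logs
spec:
  type: File
  file:
    include:
      - /var/log/kern.log   # Hypothetical path on the node.
  destinationRefs:
    - example-loki
```

Messages from such a source carry only the `host` label described above.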