The log-shipper module

Available in editions: CE, BE, SE, SE+, EE

The module deploys log collector agents on the cluster nodes. These agents apply minimal transformations and forward the logs. Each agent is a Vector instance running with a configuration file generated by Deckhouse.

log-shipper architecture

Deckhouse watches the ClusterLoggingConfig, ClusterLogDestination, and PodLoggingConfig custom resources. The combination of a logging source and a log destination is called a pipeline.
Deckhouse generates a configuration file and stores it in a Kubernetes Secret.
The Secret is mounted into all log-shipper agent Pods, and the reloader sidecar container reloads the configuration when it changes.
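The pipeline notion above can be sketched in a few lines of Python. This is only an illustration, not Deckhouse's implementation: the `Source` class, its `destination_refs` attribute, and the resource names are hypothetical, and the rule that unknown destinations are skipped is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Source:
    """Hypothetical stand-in for a logging source resource."""
    name: str
    destination_refs: Tuple[str, ...]  # assumed: names of destinations to send to


def build_pipelines(sources: List[Source], destinations: List[str]) -> List[Tuple[str, str]]:
    """Return one (source, destination) pair per pipeline.

    Assumption: references to destinations that do not exist are skipped.
    """
    known = set(destinations)
    return [
        (src.name, dest)
        for src in sources
        for dest in src.destination_refs
        if dest in known
    ]


sources = [
    Source("all-pods", ("loki", "elastic")),
    Source("audit-file", ("elastic", "missing")),
]
pipelines = build_pipelines(sources, ["loki", "elastic"])
print(pipelines)
```

Each resulting pair corresponds to one pipeline in the generated agent configuration.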

Deployment topologies

This module deploys only the agents on nodes. It assumes that logs are sent out of the cluster using one of the following topologies.

Distributed

Agents send logs directly to the storage, e.g., Loki, Elasticsearch.

log-shipper distributed

The simplest scheme to use.
Available out of the box without any external dependency besides storage.
Complicated transformations consume more resources.

Centralized

All logs are aggregated by one of the available aggregation destinations, e.g., Logstash, Vector. Agents on nodes apply minimal transformations and aim to ship logs off the nodes quickly with low resource consumption. Complicated mappings are applied on the aggregator’s side.

log-shipper centralized

Lower resource consumption for applications on nodes.
Users can configure any possible mappings for aggregators and send logs to many more storages.
Dedicated aggregator nodes can be scaled up and down as the load changes.

Stream

The main goal of this architecture is to send messages to a queue system as quickly as possible. Other workers then read them from the queue and deliver them to long-term storage for later analysis.

log-shipper stream

The same pros and cons as the centralized architecture, with one more intermediate storage layer added.
Increased durability. Suits any infrastructure where log delivery is crucial.

Metadata

While collecting logs, every source enriches them with metadata. The enrichment takes place at the Source stage.

Kubernetes

The following metadata fields will be exposed:

Label	Pod spec path
`pod`	metadata.name
`namespace`	metadata.namespace
`pod_labels`	metadata.labels
`pod_ip`	status.podIP
`image`	spec.containers[].image
`container`	spec.containers[].name
`node`	spec.nodeName
`pod_owner`	metadata.ownerRef[0]

Label	Node spec path
`node_group`	metadata.labels[].node.deckhouse.io/group

For Splunk, the pod_labels field is not exported because it is a nested object, and Splunk does not support nested objects.
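The mapping in the tables above can be sketched as a plain Python function over Pod and Node objects represented as dictionaries. This is a minimal illustration rather than the agent's actual code: the function names are hypothetical, the owner is read from `metadata.ownerReferences` (the Kubernetes API field name), and the shape of the `pod_owner` value is an assumption.

```python
def kubernetes_labels(pod: dict, node: dict, container_name: str) -> dict:
    """Build the metadata labels listed in the tables for one container's logs."""
    meta = pod.get("metadata", {})
    spec = pod.get("spec", {})
    # Pick the container whose logs are being collected.
    container = next(c for c in spec.get("containers", []) if c["name"] == container_name)
    owners = meta.get("ownerReferences", [])
    node_labels = node.get("metadata", {}).get("labels", {})
    return {
        "pod": meta.get("name"),
        "namespace": meta.get("namespace"),
        "pod_labels": meta.get("labels", {}),
        "pod_ip": pod.get("status", {}).get("podIP"),
        "image": container.get("image"),
        "container": container["name"],
        "node": spec.get("nodeName"),
        # Assumption: the raw first owner reference; the real format may differ.
        "pod_owner": owners[0] if owners else None,
        "node_group": node_labels.get("node.deckhouse.io/group"),
    }


def for_splunk(labels: dict) -> dict:
    """Drop pod_labels, the nested object that is not exported to Splunk."""
    return {key: value for key, value in labels.items() if key != "pod_labels"}


pod = {
    "metadata": {
        "name": "web-0",
        "namespace": "prod",
        "labels": {"app": "web"},
        "ownerReferences": [{"kind": "StatefulSet", "name": "web"}],
    },
    "spec": {
        "nodeName": "worker-1",
        "containers": [{"name": "nginx", "image": "nginx:1.25"}],
    },
    "status": {"podIP": "10.0.0.5"},
}
node = {"metadata": {"labels": {"node.deckhouse.io/group": "worker"}}}

labels = kubernetes_labels(pod, node, "nginx")
print(labels["pod"], labels["node_group"], sorted(for_splunk(labels)))
```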

File

The only label is host, which contains the server’s hostname.

Log filters

There are two filters for reducing the number of messages sent to storage: the log filter and the label filter.

log-shipper pipeline

Both filters are executed right after the multiline log parser has concatenated lines.

label filter - rules are run against the message metadata. The metadata (or label) fields are populated by the log source, so different sources have different sets of fields. These rules are useful, for example, for discarding messages from a specific container, or from Pods with or without a given label.
log filter - rules are run against the original message. A message can be discarded based on a JSON field or, if the message is not in JSON format, by searching the string with a regular expression.
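The two cases for the log filter, a JSON message versus a plain string, can be illustrated with the following sketch. It is not the module's implementation: the helper names are hypothetical, and exposing a non-JSON line under a `message` field is an assumption made for the example.

```python
import json
import re


def filter_input(message: str) -> dict:
    """Return the record a log filter rule would be evaluated against."""
    try:
        parsed = json.loads(message)
    except json.JSONDecodeError:
        parsed = None
    if isinstance(parsed, dict):
        return parsed  # JSON object: rules can address its fields
    # Assumption: a non-JSON line is exposed as a single "message" field.
    return {"message": message}


def field_matches(message: str, field: str, pattern: str) -> bool:
    """True if the field exists and the regular expression is found in its value."""
    value = filter_input(message).get(field)
    return value is not None and re.search(pattern, str(value)) is not None


json_line = '{"level": "debug", "msg": "cache refreshed"}'
text_line = "GET /status 200"

print(field_matches(json_line, "level", r"^debug$"))
print(field_matches(text_line, "message", r"GET /status"))
print(field_matches(text_line, "level", r"debug"))
```

A rule that discards debug messages would then drop the first line, while the second line can only be matched through a regular expression over the whole string.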

Both filters have the same structured configuration:

field: the source of the data to filter (usually the value of a label or of a JSON parameter).
operator: the action to apply to the value of the field. Possible options are In, NotIn, Regex, NotRegex, Exists, DoesNotExist.
values: the meaning depends on the operator:
- DoesNotExist, Exists: values are not supported;
- In, NotIn: the value of the field must / mustn’t be in the list of provided values;
- Regex, NotRegex: the value of the field must match at least one of the provided regexes / mustn’t match any of them.
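The operator semantics listed above can be sketched as a small Python function. This is a hedged illustration, not the module's code: the function name is hypothetical, regexes are applied as unanchored searches, and the behaviour of NotIn and NotRegex on a missing field (the rule passes) is an assumption.

```python
import re
from typing import Sequence


def rule_matches(record: dict, field: str, operator: str, values: Sequence[str] = ()) -> bool:
    """Evaluate one filter rule against a flat record of labels or JSON fields."""
    present = field in record
    value = str(record[field]) if present else None

    if operator == "Exists":
        return present
    if operator == "DoesNotExist":
        return not present
    if operator == "In":
        return present and value in values
    if operator == "NotIn":
        # Assumption: a missing field satisfies NotIn.
        return not present or value not in values
    if operator == "Regex":
        return present and any(re.search(pattern, value) for pattern in values)
    if operator == "NotRegex":
        # Assumption: a missing field satisfies NotRegex.
        return not present or not any(re.search(pattern, value) for pattern in values)
    raise ValueError(f"unknown operator: {operator}")


record = {"container": "nginx", "namespace": "prod"}

print(rule_matches(record, "container", "In", ["nginx", "envoy"]))
print(rule_matches(record, "container", "NotRegex", ["^ng", "^foo"]))
print(rule_matches(record, "pod", "DoesNotExist"))
```

With this record, the In rule holds, the NotRegex rule fails because `^ng` matches, and the DoesNotExist rule holds because the record has no `pod` field.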

More examples can be found in the Examples section of the documentation.

Extra labels are added at the Destination stage of the pipeline, after the filters have run, so it is impossible to run filter queries against them.
