Kubernetes Logs
Kubernetes version >= 1.15 is required.
Example Configuration
[sources.my_source_id]
type = "kubernetes_logs"
Sample Output
Input log line:
"F1015 11:01:46.499073 1 main.go:39] error getting server version: Get \"https://10.96.0.1:443/version?timeout=32s\": dial tcp 10.96.0.1:443: connect: network is unreachable"
Output event:
{
  "log": {
    "file": "/var/log/pods/kube-system_storage-provisioner_93bde4d0-9731-4785-a80e-cd27ba8ad7c2/storage-provisioner/1.log",
    "kubernetes.container_image": "gcr.io/k8s-minikube/storage-provisioner:v3",
    "kubernetes.container_name": "storage-provisioner",
    "kubernetes.namespace_labels": {
      "kubernetes.io/metadata.name": "kube-system"
    },
    "kubernetes.pod_ip": "192.168.1.1",
    "kubernetes.pod_ips": [
      "192.168.1.1",
      "::1"
    ],
    "kubernetes.pod_labels": {
      "addonmanager.kubernetes.io/mode": "Reconcile",
      "gcp-auth-skip-secret": "true",
      "integration-test": "storage-provisioner"
    },
    "kubernetes.pod_name": "storage-provisioner",
    "kubernetes.pod_namespace": "kube-system",
    "kubernetes.pod_node_name": "minikube",
    "kubernetes.pod_uid": "93bde4d0-9731-4785-a80e-cd27ba8ad7c2",
    "message": "F1015 11:01:46.499073 1 main.go:39] error getting server version: Get \"https://10.96.0.1:443/version?timeout=32s\": dial tcp 10.96.0.1:443: connect: network is unreachable",
    "source_type": "kubernetes_logs",
    "stream": "stderr",
    "timestamp": "2020-10-15T11:01:46.499555308Z"
  }
}
Configuration Options
Required Options
type (required)
The component type. This is a required field for all components and tells Vector which component to use.
Type | Syntax | Default | Example |
---|---|---|---|
string | literal | | ["kubernetes_logs"] |
Advanced Options
pod_annotation_fields (optional)
Configuration for how the events are annotated with Pod metadata.
Type | Syntax | Default | Example |
---|---|---|---|
hash | literal | [] |
namespace_annotation_fields (optional)
Configuration for how the events are annotated with Namespace metadata.
Type | Syntax | Default | Example |
---|---|---|---|
hash | literal | [] |
auto_partial_merge (optional)
Automatically merge partial messages into a single event. Here, "partial" refers to messages that were split by the Kubernetes container runtime log driver.
Type | Syntax | Default | Example |
---|---|---|---|
bool |
ingestion_timestamp_field (optional)
The name of the event field that records the exact time the event was ingested into Vector.
Type | Syntax | Default | Example |
---|---|---|---|
string | literal |
kube_config_file (optional)
Optional path to a kubeconfig file readable by Vector. If not set, Vector will try to connect to Kubernetes using in-cluster configuration.
Type | Syntax | Default | Example |
---|---|---|---|
string | literal |
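As a minimal sketch, a source that connects through a kubeconfig file instead of in-cluster configuration might look like the following. The file path and node name are hypothetical placeholders, and setting `self_node_name` explicitly is only an assumption for setups where the `VECTOR_SELF_NODE_NAME` environment variable is not provided.

```toml
[sources.my_source_id]
type = "kubernetes_logs"

# Hypothetical path; the file must be readable by Vector.
kube_config_file = "/path/to/kubeconfig"

# Assumption: set explicitly because VECTOR_SELF_NODE_NAME is not
# available in this environment. "my-node" is a placeholder.
self_node_name = "my-node"
```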
self_node_name (optional)
The name of the Kubernetes `Node` this Vector instance runs on. By default, this is read from an environment variable whose value Kubernetes provides at `Pod` deploy time.
Type | Syntax | Default | Example |
---|---|---|---|
string | literal | ${VECTOR_SELF_NODE_NAME} | |
exclude_paths_glob_patterns (optional)
A list of glob patterns matching files to exclude from reading.
Type | Syntax | Default | Example |
---|---|---|---|
array | literal | ["**/*.gz", "**/*.tmp"] | ["**/exclude/**"] |
extra_field_selector (optional)
Specifies the field selector to filter `Pod`s with, to be used in addition to the built-in `Node` filter.
Type | Syntax | Default | Example |
---|---|---|---|
string | literal | | ["metadata.name!=pod-name-to-exclude", "metadata.name!=pod-name-to-exclude,metadata.name=mypod"] |
extra_label_selector (optional)
Specifies the label selector to filter `Pod`s with, to be used in addition to the built-in `vector.dev/exclude` filter.
Type | Syntax | Default | Example |
---|---|---|---|
string | literal | | ["my_custom_label!=my_value", "my_custom_label!=my_value,my_other_custom_label=my_value"] |
max_line_bytes (optional)
The maximum number of bytes a line can contain before being discarded. This protects against malformed lines or tailing incorrect files.
Type | Syntax | Default | Example |
---|---|---|---|
uint | | 32768 | |
fingerprint_lines (optional)
The number of lines to read when generating a unique fingerprint of a log file. This is helpful when some containers share common first log lines.
WARNING: If the file has fewer lines than this value, it won't be read at all. This matters because container logs are broken up into several files, so the greater the `lines` value is, the greater the chance that the last file/logs of the container are not read.
Type | Syntax | Default | Example |
---|---|---|---|
uint | | 1 | |
glob_minimum_cooldown_ms (optional)
Delay between file discovery calls, in milliseconds. This controls the interval at which Vector searches for files within a single pod.
Type | Syntax | Default | Example |
---|---|---|---|
uint | | 60000 | |
data_dir (optional)
The directory used to persist file checkpoint positions. By default, the global `data_dir` option is used. Make sure the Vector process has write permissions to this directory.
Type | Syntax | Default | Example |
---|---|---|---|
string | file_system_path | | ["/var/lib/vector"] |
timezone (optional)
The name of the time zone to apply to timestamp conversions that do not contain an explicit time zone. This overrides the global `timezone` option.
The time zone name may be any name in the TZ database, or `local` to indicate system local time.
Type | Syntax | Default | Example |
---|---|---|---|
string | literal | local | ["local", "America/New_York", "EST5EDT"] |
How it Works
Enrichment
Vector will enrich data with Kubernetes context. A comprehensive list of fields can be found in the `kubernetes_logs` source output docs.
Filtering
Vector provides rich filtering options for Kubernetes log collection:
- Built-in `Pod` and `container` exclusion rules.
- The `exclude_paths_glob_patterns` option allows you to exclude Kubernetes log files by file name and path.
- The `extra_field_selector` option specifies the field selector to filter `Pod`s with, to be used in addition to the built-in `Node` filter.
- The `extra_label_selector` option specifies the label selector to filter `Pod`s with, to be used in addition to the built-in `vector.dev/exclude` filter.
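As a rough illustration, the configurable filters above can be combined in a single source definition. The selector and pattern values below are the placeholder examples from the option tables on this page, not recommendations. The default glob patterns are repeated on the assumption that setting the option replaces the defaults rather than extending them.

```toml
[sources.my_source_id]
type = "kubernetes_logs"

# Keep ignoring compressed and temporary files, and additionally skip
# anything under a directory named "exclude" (example pattern).
exclude_paths_glob_patterns = ["**/*.gz", "**/*.tmp", "**/exclude/**"]

# Applied in addition to the built-in Node filter.
extra_field_selector = "metadata.name!=pod-name-to-exclude"

# Applied in addition to the built-in vector.dev/exclude filter.
extra_label_selector = "my_custom_label!=my_value"
```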
Globbing
By default, the `kubernetes_logs` source ignores compressed and temporary files. This behavior can be configured with the `exclude_paths_glob_patterns` option.
Globbing is used to continually discover `Pod` log files at a rate defined by the `glob_minimum_cooldown_ms` option. In environments where files are rotated rapidly, we recommend lowering `glob_minimum_cooldown_ms` to catch files before they are compressed.
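For instance, a cluster with fast log rotation might shorten the discovery interval along these lines. The value of 5000 ms is purely illustrative and should be tuned to how quickly files are rotated in your environment.

```toml
[sources.my_source_id]
type = "kubernetes_logs"

# Search for new log files every 5 seconds instead of the default
# 60000 ms. Illustrative value only.
glob_minimum_cooldown_ms = 5000
```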
Pod exclusion
By default, the `kubernetes_logs` source will skip logs from `Pod`s that have a `vector.dev/exclude: "true"` label. You can configure additional exclusion rules via label or field selectors; see the available options.
Container exclusion
The `kubernetes_logs` source can skip the logs from individual `container`s of a particular `Pod`. Add a `vector.dev/exclude-containers` annotation to the `Pod`, and enumerate the `name`s of all the `container`s to exclude in the value of the annotation like so:
vector.dev/exclude-containers: "container1,container2"
This annotation will make Vector skip logs originating from `container1` and `container2` of the `Pod` marked with the annotation, while logs from other `container`s in the `Pod` will still be collected.
Kubernetes API communication
Vector communicates with the Kubernetes API to enrich the data it collects with Kubernetes context, so it must be able to reach the Kubernetes API server. If Vector is running in a Kubernetes cluster, it connects to that cluster using the access information Kubernetes provides.
In addition to access, Vector implements proper desync handling to ensure communication is safe and reliable. This ensures that Vector will not overwhelm the Kubernetes API or compromise its stability.
Partial message merging
Vector, by default, will merge partial messages that are split due to the Docker size limit. For everything else, it is recommended to use the `reduce` transform, which offers the ability to handle custom merging of things like stack traces.
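A minimal sketch of such a `reduce` transform is shown below, under several assumptions: the transform ID `merge_stacktraces` is hypothetical, the `group_by` field paths are taken from the sample output on this page, and the `starts_when` condition assumes that continuation lines of a stack trace begin with whitespace. Condition syntax may differ between Vector versions, so check the `reduce` transform reference for the version you run.

```toml
[transforms.merge_stacktraces]
type = "reduce"
inputs = ["my_source_id"]

# Only merge lines that come from the same container of the same Pod.
group_by = ["kubernetes.pod_uid", "kubernetes.container_name"]

# Join the merged lines with newlines in the resulting message.
merge_strategies.message = "concat_newline"

# Assumption: a line that does not start with whitespace begins a new event.
starts_when = "match(string!(.message), r'^[^\\s]')"
```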
Pod removal
To ensure all data is collected, Vector will continue to collect logs from the `Pod` for some time after its removal. This ensures that Vector obtains some of the most important data, such as crash details.
Resource limits
Vector recommends the following resource limits.
State management
Testing & reliability
Vector is tested extensively against Kubernetes. In addition to Kubernetes being Vector's most popular installation method, Vector implements a comprehensive end-to-end test suite for all minor Kubernetes versions starting with 1.15.
State
This component is stateless, meaning its behavior is consistent across each input.
Checkpointing
Vector checkpoints the current read position after each successful read. This ensures that Vector resumes where it left off if restarted, preventing data from being read twice. The checkpoint positions are stored in the data directory, which is specified via the global `data_dir` option but can be overridden via the `data_dir` option on the `kubernetes_logs` source directly.
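The relationship between the two `data_dir` settings can be sketched as follows. The per-source path is a hypothetical example and is assumed to already exist and be writable by Vector.

```toml
# Global data directory, used for checkpoints unless a source overrides it.
data_dir = "/var/lib/vector"

[sources.my_source_id]
type = "kubernetes_logs"

# Hypothetical per-source override for checkpoint storage.
data_dir = "/var/lib/vector/kubernetes_logs"
```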
Kubernetes API access control
Vector requires access to the Kubernetes API. Specifically, the `kubernetes_logs` source uses the `/api/v1/pods` endpoint to "watch" the pods from all namespaces.
Modern Kubernetes clusters use the RBAC (role-based access control) scheme. RBAC-enabled clusters require some configuration to grant Vector the authorization to access the Kubernetes API endpoints. As RBAC is currently the standard way of controlling access to the Kubernetes API, we ship the necessary configuration out of the box: see the `ClusterRole`, `ClusterRoleBinding` and `ServiceAccount` in our `kubectl` YAML config, and the `rbac` configuration in the Helm chart.
If your cluster doesn't use any access control scheme and doesn't restrict access to the Kubernetes API, you don't need to do any extra configuration; Vector will just work.
Clusters using the legacy ABAC scheme are not officially supported (although Vector might work if you configure access properly); we encourage switching to RBAC. If you use a custom access control scheme, make sure the Vector `Pod`/`ServiceAccount` is granted access to the `/api/v1/pods` resource.
Context
By default, the `kubernetes_logs` source augments events with helpful context keys.