Kubernetes & Logging

4 April 2018

When you are running services in a Kubernetes cluster, they are probably writing log statements that contain all sorts of useful information you need to look at now and then. Being a good developer, you made sure all log statements are written to the console. This makes it possible to use kubectl logs -f <pod-name> -c <container> to tail the logs of a specific pod and container. That is more than enough while you are developing the services, but not so useful once you get to production. A more robust logging solution is needed.

Elasticsearch, Fluentd, Kibana

The combination of these three open source projects keeps popping up more and more when it comes to centralized log collection and analysis. Elasticsearch is an open source search engine known for its ease of use. Fluentd is an open source data collector that lets you unify data collection and consumption for better use and understanding of data. Kibana is an open source web UI that makes Elasticsearch user friendly for marketers, engineers and data scientists alike. By combining these three tools into the EFK stack (Elasticsearch + Fluentd + Kibana) we get a scalable, flexible and easy to use log collection and analytics pipeline.

Collecting Logs

When gathering the logs of a Kubernetes cluster there are two types of logs you want to collect: on one side the system logs of the cluster itself, and on the other side the container logs. Before Fluentd can start collecting the logs we need to tell it where to find them by updating the fluent.conf file.

System Logs

Kubernetes components that are not running in a container write their log information to files on the host system, which can mostly be found in the /var/log directory. To mark a host system log file for collection you simply add a <source> section to the configuration file for each file you want to collect. The <source> section for the kubelet would look something like this:

<source>
  @id kubelet.log
  @type tail
  format multiline
  multiline_flush_interval 5s
  format_firstline /^\w\d{4}/
  format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
  time_format %m%d %H:%M:%S.%N
  path /var/log/kubelet.log
  pos_file /var/log/es-kubelet.log.pos
  tag kubelet
</source>
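
Other components that run directly on the host can be picked up the same way; only the path, the position file and the tag change. As an illustrative sketch, assuming kube-proxy writes its log to /var/log/kube-proxy.log in the same format as the kubelet, its section could look like this:

<source>
  @id kube-proxy.log
  @type tail
  format multiline
  multiline_flush_interval 5s
  format_firstline /^\w\d{4}/
  format1 /^(?<severity>\w)(?<time>\d{4} [^\s]*)\s+(?<pid>\d+)\s+(?<source>[^ \]]+)\] (?<message>.*)/
  time_format %m%d %H:%M:%S.%N
  # Only the path, position file and tag differ from the kubelet section above.
  path /var/log/kube-proxy.log
  pos_file /var/log/es-kube-proxy.log.pos
  tag kube-proxy
</source>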

Container Logs

By default the stdout and stderr of a container are logged to a file on the host file system. This is taken care of by the Docker daemon. The files are located in /var/lib/docker/containers/. Each container has its own directory, where the directory name is the container id and the log file is named <container-id>-json.log. For example:

/var/lib/docker/containers/82cf69ea77a3dafa8dac3a6efd587664b7e9aadcc6f7db98859566d213d4371f/82cf69ea77a3dafa8dac3a6efd587664b7e9aadcc6f7db98859566d213d4371f-json.log

To have Fluentd collect the container logs we can add a <source> section like this:

<source>
  @id fluentd-containers.log
  @type tail
  path /var/lib/docker/containers/*/*.log
  pos_file /var/log/fluentd-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag docker.*
  format json
  read_from_head true
</source>

This is pretty neat and gives us all containers, but since we are running Kubernetes we can do better. The Kubernetes kubelet creates a symbolic link to this file in the /var/log/containers directory on the host machine, with a name that includes the pod name, the namespace and the Kubernetes container name:

sample-service-6b6f59f5cf-4v47j_default_sample-service-82cf69ea77a3dafa8dac3a6efd587664b7e9aadcc6f7db98859566d213d4371f.log -> /var/lib/docker/containers/82cf69ea77a3dafa8dac3a6efd587664b7e9aadcc6f7db98859566d213d4371f/82cf69ea77a3dafa8dac3a6efd587664b7e9aadcc6f7db98859566d213d4371f-json.log

The filename contains metadata we can use in combination with the Fluentd Kubernetes Metadata plugin to add metadata to our log events. To achieve this we have to change the container <source> section to:

<source>
  @id fluentd-containers.log
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag kubernetes.*
  format json
  read_from_head true
</source>

This source will produce log events tagged like:

kubernetes.var.log.containers.sample-service-6b6f59f5cf-4v47j_default_sample-service-82cf69ea77a3dafa8dac3a6efd587664b7e9aadcc6f7db98859566d213d4371f.log

We then apply the Fluentd Kubernetes Metadata filter by adding the following to the Fluentd config:

<filter kubernetes.**>
    @type kubernetes_metadata
</filter>

This gives us a log event with the Kubernetes metadata embedded:

{
    "log":"2018-04-04 13:21:33.135  INFO [sample-service,5809caff2c4c218c,1da1ed2208d93495,false] 11 --- [ctor-http-nio-2] n.f.demo.sampleservice.SampleController  : Calling sample service.",
    "stream":"stdout",
    "time":"2018-04-04T13:21:33.137746827Z",
    "docker": {
        "container_id": "82cf69ea77a3dafa8dac3a6efd587664b7e9aadcc6f7db98859566d213d4371f"
    },
    "kubernetes": {
        "namespace_name": "default",
        "namespace_id: "724fc802-d4d6-4276-9aca-6f733c85a387",
        "pod_name": "sample-service-6b6f59f5cf-4v47j",
        "pod_id": "498a6ed4-3964-4579-b65d-52d0d9c82772",
        "container_name": "sample-service"
        "namespace_annotations": {
            "sample-service/release": "spring-2018"
        },
        "namespace_labels": {
            "product_version": "0.0.1"
        },
        "labels": {
           "fluentd-log-format": "spring-boot",
            "service": "sample-service"
        }
    }    
}

So far so good. The Kubernetes metadata plugin added metadata for our container. The only thing missing is proper parsing of the log attribute of the event. The sample service is a Spring Boot application using Spring Cloud Sleuth for tracing, and it outputs log information in a very specific format:

<timestamp> <level> [<service>,<trace-id>,<span-id>,<exportable>] <pid> --- [<thread>] <source>: <message>

Before we write our log events to Elasticsearch we would like to parse this line and extract the different fields. Since we are reading a lot of log files, not every event originates from a Spring Boot application, and we only want to parse the log attribute when it does. We can achieve this by using the rewrite_tag_filter plugin to mark events as Spring Boot parsable:

<match kubernetes.**>
  @type rewrite_tag_filter
  <rule>
    key $.kubernetes.labels.fluentd-log-format
    pattern ^(.+)$
    tag $1.${tag}
  </rule>
  <rule>
    key log
    pattern ^(.*)$
    tag no-parse.${tag}
  </rule>
</match>

This effectively prefixes the tag of the event with the value of the fluentd-log-format label, provided you specified that label in the labels section of the pod template metadata in the Kubernetes deployment descriptor, so events from the sample service now carry a tag starting with spring-boot. The second rule makes sure all other events are matched as well, or else they would disappear. The deployment descriptor for the sample service looks like this:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: services
  name: sample-service
spec:
  replicas: 1
  revisionHistoryLimit: 2
  minReadySeconds: 20
  template:
    metadata:
      labels:
        fluentd-log-format: "spring-boot"
        service: sample-service
    spec:
      containers:
      - name: sample-service
        image: sample-service:snapshot
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080

Now we can use a filter that triggers on the spring-boot tag prefix to parse the log attribute and extract all the different fields:

<filter spring-boot.**>
  @type parser
  key_name log
  reserve_data true
  <parse>
    @type multiline
    format_firstline /\d{4}-\d{2}-\d{2}/
    format1 /^(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+(?<level>[^\s]+)\s+(\[(?<service>[^,]*),(?<trace>[^,]*),(?<span>[^,]*),(?<exportable>[^\]]*)\]\s+)?(?<pid>\d+)\s+---\s+\[(?<thread>[^\]]+)\]\s+(?<source>[^\s]+)\s*:\s+(?<message>.*)/
    time_format %Y-%m-%d %H:%M:%S.%N
  </parse>
</filter>

When the events go through the filter we end up with:

{
    "log":"2018-04-04 13:21:33.135  INFO [sample-service,5809caff2c4c218c,1da1ed2208d93495,false] 11 --- [ctor-http-nio-2] n.f.demo.sampleservice.SampleController  : Calling sample service.",
    "stream":"stdout",
    "time":"2018-04-04T13:21:33.135",
    "docker": {
        "container_id": "82cf69ea77a3dafa8dac3a6efd587664b7e9aadcc6f7db98859566d213d4371f"
    },
    "kubernetes": {
        "namespace_name": "default",
        "namespace_id: "724fc802-d4d6-4276-9aca-6f733c85a387",
        "pod_name": "sample-service-6b6f59f5cf-4v47j",
        "pod_id": "498a6ed4-3964-4579-b65d-52d0d9c82772",
        "container_name": "sample-service"
        "namespace_annotations": {
            "sample-service/release": "spring-2018"
        },
        "namespace_labels": {
            "product_version": "0.0.1"
        },
        "labels": {
           "fluentd-log-format": "spring-boot",
            "service": "sample-service"
        }
    },
    "level": "INFO",
    "service": "sample-service",
    "trace": "5809caff2c4c218c",
    "span": "1da1ed2208d93495",
    "exportable": "false",
    "pid": "11",
    "thread": "ctor-http-nio-2",
    "source": "n.f.demo.sampleservice.SampleController",
    "message": "Calling sample service."   
}

When we send these events to Elasticsearch we have a lot of indexed fields we can use to create awesome dashboards in Kibana.
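
To actually get the events into Elasticsearch, Fluentd needs a matching output section. As a minimal sketch, assuming the fluent-plugin-elasticsearch output plugin is installed and Elasticsearch is reachable inside the cluster as a service named elasticsearch-logging on port 9200 (both the service name and the port are assumptions, adjust them to your setup), the output could look like this:

<match **>
  @type elasticsearch
  # Host and port are assumptions; point them at your own Elasticsearch service.
  host elasticsearch-logging
  port 9200
  # Write daily logstash-style indices so they are easy to pick up in Kibana.
  logstash_format true
</match>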

This is a cross-post from my personal blog. Follow me on @acuriouscoder if you like what you’re reading or subscribe to my blog on https://acuriouscoder.net.