
Monitoring Kubernetes and Docker Container Logs

When building containerized applications, logging is one of the most important things to get right from a DevOps standpoint. Log management helps DevOps teams debug and troubleshoot issues faster, making it easier to identify patterns, spot bugs, and resolve them.

In this article, we’ll look at how containers generate logs and how to collect, explore, and view those logs in a central place.

Docker Logging: Why Are Logs Important When Using Docker

Logging matters to an even greater extent for Dockerized applications. When an application in a Docker container emits logs, they are sent to the application’s stdout and stderr output streams.

The container’s logging driver can access these streams and send the logs to a file, a log collector running on the host, or a log management service endpoint.

By default, Docker uses the json-file logging driver, which writes JSON-formatted logs to a container-specific file on the host where the container is running.

The example below shows JSON logs created using the json-file driver:

{"log":"Hello World!\n","stream":"stdout","time":"2021-06-18T22:51:31.549390877Z"}

Before moving on, let’s go over the basics.

What Is a Docker Container

A container is a unit of software that packages an application, making it easy to deploy and manage no matter the host. Say goodbye to the infamous “it works on my machine” statement!

How? Containers are isolated and stateless, which enables them to behave the same regardless of the differences in infrastructure. A Docker container is a runtime instance of an image that’s like a template for creating the environment you want.

What Is a Docker Image

A Docker image is an executable package that includes everything that the application needs to run. This includes code, libraries, configuration files, and environment variables.

Why Do You Need Containers

Containers allow breaking down applications into microservices – multiple small parts of the app that can interact with each other via functional APIs. Each microservice is responsible for a single feature so development teams can work on different parts of the application at the same time. That makes building an application easier and faster.

How Is Docker Logging Different

Most conventional log analysis methods don’t work for containerized applications; troubleshooting becomes more complex than with traditional hardware-centric apps that run on a single node. You have more data to work with, so you must extend your search to get to the root of the problem.

Here’s why:

Containers are Ephemeral

Docker containers emit logs to the stdout and stderr output streams. Because containers are stateless, the logs are stored on the Docker host in JSON files by default. Why?

The default logging driver is json-file. The logs are annotated with their origin, either stdout or stderr, and a timestamp. Each log file contains information about only one container.

You can find these JSON log files in the /var/lib/docker/containers/ directory on a Linux Docker host. Here’s how you can access them:

/var/lib/docker/containers/<container id>/<container id>-json.log

That’s where centralized logging comes into play. You can collect the logs with a log aggregator and store them in a place where they remain available after the container is gone. Keeping logs only on the Docker host is risky because they can build up over time and eat into your disk space. That’s why you should ship them to a central location and enable log rotation for your Docker containers, for example as sketched below.
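
Log rotation for the default json-file driver can be enabled through the Docker daemon configuration. A minimal sketch in /etc/docker/daemon.json (the max-size and max-file values are only illustrative):

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}

Restart the Docker daemon for the change to take effect; newly created containers will then keep at most three 10 MB log files each.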

Get Started with Docker Container Logs

When you’re using Docker, you work with two different types of logs: daemon logs and container logs.

What Are Docker Container Logs?

Docker container logs are generated by the containers themselves and need to be collected directly from them. Any message that a container sends to stdout or stderr is logged and then passed on to a logging driver that forwards it to a remote destination of your choosing.

Here are a few basic Docker commands to help you get started with Docker logs and metrics:

  • Show container logs: docker logs containerName
  • Follow new logs as they are written: docker logs -f containerName
  • Show CPU and memory usage: docker stats
  • Show CPU and memory usage for specific containers: docker stats containerName1 containerName2
  • Show running processes in a container: docker top containerName
  • Show Docker events: docker events
  • Show storage usage: docker system df 

Watching logs in the console is nice for development and debugging; in production, however, you want to store the logs in a central location for search, analysis, troubleshooting and alerting.

Filebeat, shipping into Elasticsearch, provides a simple way to do exactly that.

What is Filebeat

Filebeat is a log shipper belonging to the Beats family — a group of lightweight shippers installed on hosts for shipping different kinds of data into the ELK Stack for analysis. Each beat is dedicated to shipping different types of information — Winlogbeat, for example, ships Windows event logs, Metricbeat ships host metrics, and so forth. Filebeat, as the name implies, ships log files.

In an ELK-based logging pipeline, Filebeat plays the role of the logging agent—installed on the machine generating the log files, tailing them, and forwarding the data either to Logstash for more advanced processing or directly into Elasticsearch for indexing. Filebeat is therefore not a replacement for Logstash, but can, and in most cases should, be used in tandem with it.

Written in Go and based on the Lumberjack protocol, Filebeat was designed to have a low memory footprint, handle large bulks of data, support encryption, and deal efficiently with back pressure. For example, Filebeat records the last successful line indexed in the registry, so in case of network issues or interruptions in transmissions, Filebeat will remember where it left off when re-establishing a connection. If there is an ingestion issue with the output, Logstash or Elasticsearch, Filebeat will slow down the reading of files.

Installing Filebeat

You can download and install Filebeat using various methods and on a variety of platforms. It only requires that you have a running ELK Stack to be able to ship the data that Filebeat collects. I will outline two methods, using Apt and Docker, but you can refer to the official docs for more options.

Install Filebeat using Apt

For an easier way of updating to a newer version, and depending on your Linux distro, you can use Apt or Yum to install Filebeat from Elastic’s repositories:

First, you need to add Elastic’s signing key so that the downloaded package can be verified (skip this step if you’ve already installed packages from Elastic). Elastic signs all of its packages with the PGP key D88E42B4 (Elasticsearch Signing Key), fingerprint 4609 5ACC 8548 582C 1A26 99A9 D27D 666C D88E 42B4, available from https://pgp.mit.edu:

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

The next step is to add the repository definition to your system:

echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list

All that’s left to do is update your repositories and install Filebeat:

sudo apt-get update && sudo apt-get install filebeat

Install Filebeat on Docker

If you’re running Docker, you can install Filebeat as a container on your host and configure it to collect container logs or log files from your host.

Pull Elastic’s Filebeat image with:

docker pull docker.elastic.co/beats/filebeat:7.13.2
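
You can then run the Filebeat container, mounting the host’s Docker logs folder and a local Filebeat configuration. A minimal sketch (the local filebeat.yml path is an assumption):

docker run -d \
  --name=filebeat \
  --user=root \
  -v /var/lib/docker/containers:/var/lib/docker/containers:ro \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v "$(pwd)/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro" \
  docker.elastic.co/beats/filebeat:7.13.2 filebeat -e --strict.perms=false

The -e flag logs to stderr, and --strict.perms=false relaxes the ownership check on the mounted configuration file.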

Logs from Standard Output

Filebeat with Docker

Filebeat fetches and ships logs from Docker containers:

  • Deploy one Filebeat per Docker host.
  • The Docker logs host folder (/var/lib/docker/containers) is the one monitored by Filebeat.
  • Filebeat starts an input for the files and begins harvesting them as soon as they appear in the folder.
  • Everything happens before line filtering, multiline, and JSON decoding, so this input can be used in combination with those settings.

Filebeat Container Input

  • paths: The base path where Docker logs are located. Example: /var/lib/docker/containers/*/*.log
  • stream: Reads from the specified streams only: all (default), stdout or stderr

Docker config example – docker.yml

filebeat.inputs:
- type: container
  paths:
    - '/var/lib/docker/containers/*/*.log'
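
A complete minimal filebeat.yml also needs an output section. A sketch, assuming an Elasticsearch instance reachable at elasticsearch:9200:

filebeat.inputs:
- type: container
  paths:
    - '/var/lib/docker/containers/*/*.log'

output.elasticsearch:
  hosts: ["elasticsearch:9200"]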

Kubernetes

Kubernetes is a production-grade, open-source container orchestrator. It automates the distribution and scheduling of application containers across a cluster, solving the problem of managing containers at scale. It is also self-healing, as it handles container and node failures.

Kubernetes Architecture

A Kubernetes cluster consists of a master and nodes. Each node runs a container runtime (such as Docker or rkt). A node contains pods, the basic scheduling units, each of which can hold one or more containers with shared namespaces and shared volumes. Major components of Kubernetes include:

  • API Server: master component that exposes the Kubernetes API and is the central management entity;
  • kubelet: ensures containers are running on each node.

Filebeat with Kubernetes

Kubernetes config example

filebeat.inputs:
- type: container
  stream: stdout
  paths:
    - '/var/log/containers/*.log'

Kubernetes Logs

Deploy Filebeat as a DaemonSet to ensure there’s a running instance on each node of the cluster. The Docker logs host folder (/var/lib/docker/containers) is mounted on the Filebeat container. Filebeat starts an input for the files and begins harvesting them as soon as they appear in the folder. To download the manifest file, run:

curl -L -O https://raw.githubusercontent.com/elastic/beats/master/deploy/kubernetes/filebeat-kubernetes.yaml
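
Then edit the ELASTICSEARCH_HOST and ELASTICSEARCH_PORT environment variables in the manifest to point at your Elasticsearch instance and apply it. A typical sequence, assuming the manifest’s default kube-system namespace and k8s-app=filebeat label:

kubectl apply -f filebeat-kubernetes.yaml
kubectl get pods -n kube-system -l k8s-app=filebeat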

Metadata Processors

Define processors in your configuration to process events before they are sent to the configured output for:

  • reducing the number of exported fields
  • enhancing events with additional metadata 
  • performing additional processing and decoding  

Filebeat has processors for enhancing your data from the environment, like: add_docker_metadata, add_kubernetes_metadata and add_cloud_metadata 

Docker Metadata Processors

The add_docker_metadata processor annotates each event with relevant metadata from Docker containers.

Example of metadata:

  • docker.container.id
  • docker.container.image
  • docker.container.name
  • docker.container.labels 

Docker config example – docker.yml 

  • You need to provide access to Docker’s unix socket:
    ‒ docker run -v /var/run/docker.sock:/var/run/docker.sock ...
  • You may also need to add --user=root to the docker run flags if Filebeat is running as non-root.

processors:
  - add_docker_metadata:
      host: "unix:///var/run/docker.sock"

Kubernetes Metadata Processors

The add_kubernetes_metadata processor annotates each event based on which Kubernetes pod the event originated from.

Example of metadata:

  • kubernetes.pod.name 
  • kubernetes.namespace
  • kubernetes.labels 
  • kubernetes.annotations
  • kubernetes.container.name
  • kubernetes.container.image 

Kubernetes config example
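
A minimal sketch of the processor configuration, following the pattern in the Filebeat documentation (NODE_NAME is assumed to be injected into the Filebeat pod via the DaemonSet spec):

processors:
  - add_kubernetes_metadata:
      host: ${NODE_NAME}
      matchers:
      - logs_path:
          logs_path: "/var/log/containers/"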

Filebeat Autodiscover

Filebeat Autodiscover watches for events and reacts to changes. It scans existing containers and launches the proper configurations for them, then keeps watching for new start/stop events.

To enable it, define the settings in the filebeat.autodiscover section of the filebeat.yml config file, specifying a list of providers.

You need to provide access to Docker’s unix socket:

‒ docker run -v /var/run/docker.sock:/var/run/docker.sock ...

You may also need to add --user=root to the docker run flags if Filebeat is running as non-root.

Autodiscover Providers

Providers watch for events on the system and translate them into internal autodiscover events with a common format.

Providers are available for:

  • Docker 
  • Kubernetes
  • Jolokia

Fields from the autodiscover event can be used to set conditions using templates.

Autodiscover Providers Templates

Filebeat supports templates for inputs and modules. A template defines a condition to match on autodiscover events and a list of configurations to launch when that condition is met. Supported condition operators are equals, contains, regexp, range, has_fields, or, and, and not.

Templates can contain variables under the data namespace. For example, "${data.port}" resolves to 6379.

Docker Autodiscover Provider

Example of available fields: 

  • host
  • port
  • docker.container.id
  • docker.container.image
  • docker.container.name 
  • docker.container.labels

Docker: Example Autodiscover Config
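
A minimal sketch of a Docker autodiscover configuration; matching on an nginx image is only an illustrative assumption:

filebeat.autodiscover:
  providers:
    - type: docker
      templates:
        - condition:
            contains:
              docker.container.image: nginx
          config:
            - type: container
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log

This launches a container input only for containers whose image name contains nginx.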

Kubernetes Autodiscover Provider

Example of available fields: 

  • host
  • port (if exposed)
  • kubernetes.container.id
  • kubernetes.container.image 
  • kubernetes.container.name
  • kubernetes.labels 
  • kubernetes.namespace
  • kubernetes.node.name 
  • kubernetes.pod.name
  • kubernetes.pod.uid 

Kubernetes: Example Autodiscover Config
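
A minimal sketch of a Kubernetes autodiscover configuration; matching on the default namespace is only an illustrative assumption:

filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        - condition:
            equals:
              kubernetes.namespace: default
          config:
            - type: container
              paths:
                - /var/log/containers/*-${data.kubernetes.container.id}.log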

Configuration (Redis running under K8s)
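
A sketch that launches the Filebeat redis module when a pod labeled app: redis is discovered (the label key and value are assumptions for illustration):

filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        - condition:
            contains:
              kubernetes.labels.app: redis
          config:
            - module: redis
              log:
                input:
                  type: container
                  paths:
                    - /var/log/containers/*-${data.kubernetes.container.id}.log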

Log Inside a Container

Some applications don’t write to the standard output (stdout/stderr); instead, their logs are located in files inside the container. Containers are transient by nature, meaning that any files inside a container will be lost if it shuts down. Reading logs from inside a container is also not recommended because performance is worse.

Reading Logs from Inside a Container

  • Configure the logs to be written to the standard output (stdout and/or stderr)
  • Mount a shared volume, make it available to the container, and configure the logs to be written to it
  • Or configure a workaround:
    • use symbolic links to link the log files to the standard output
    • write the logs to /proc/self/fd/1 (which is stdout) and errors to /proc/self/fd/2 (stderr)
    • more information can be found at: https://docs.docker.com/config/containers/logging/configure/
  • With Kubernetes, stream the logs to sidecar containers

Reading Logs from Volume

Use shared data volumes to log events; the log data then persists and can be shared with other containers.

  • If the volume is local, add it to the filebeat.yml configuration.
  • If using a volume, run a single Filebeat outside your container architecture to avoid duplicate data.

Kubernetes – Streaming Sidecar Container

  1. Logs are read from a file, a socket, or journald.
  2. Each sidecar container tails a particular log file from a shared volume.
  3. The sidecar redirects the logs to its own stdout stream.
  4. This lets you separate several log streams from different parts of your application.

Solution Examples

Symbolic Link Example – Dockerfile
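
A sketch of the symbolic-link workaround in a Dockerfile; this is the same pattern the official nginx image uses for its access and error logs:

FROM nginx:1.21

# Link the application log files to the container's standard streams
RUN ln -sf /dev/stdout /var/log/nginx/access.log \
    && ln -sf /dev/stderr /var/log/nginx/error.log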

Docker: Reading Log from Volume – filebeat.yml
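
A sketch of a filebeat.yml that reads application log files from a shared data volume; the /logs/myapp mount path and the Elasticsearch host are assumptions:

filebeat.inputs:
- type: log
  paths:
    - /logs/myapp/*.log

output.elasticsearch:
  hosts: ["elasticsearch:9200"]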

Configuration File for a Kubernetes Pod – Sidecar

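A sketch of a Pod with a streaming sidecar, adapted from the pattern in the Kubernetes logging documentation: the application writes a counter to a log file on a shared emptyDir volume, and the sidecar tails that file to its own stdout:

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: app
    image: busybox
    args: [/bin/sh, -c, 'i=0; while true; do echo "$i: $(date)" >> /var/log/app.log; i=$((i+1)); sleep 1; done']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: log-streamer
    image: busybox
    args: [/bin/sh, -c, 'tail -n+1 -F /var/log/app.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    emptyDir: {}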

Exploring Logs in Kibana

Once logs start flowing into Elasticsearch, you can start viewing them in the Kibana interface. Let’s have a look at one of them. This is one of the events reported by Filebeat, corresponding to a new log line from an NGINX server running in our Docker scenario.

Thanks to add_docker_metadata, we get not only the log output but also a series of fields enriching it with useful context from Docker, like the container name, ID, image, and labels!

For example, if you want to debug what’s going on in a specific container, you just need to filter your search results by container name.

Conclusion

While Docker containerization allows developers to encapsulate a program and its file system into a single portable package, that certainly doesn’t mean containerization is free of maintenance. Docker logging is a bit more complex than traditional methods, so teams using Docker must familiarize themselves with it to support full-stack visibility, troubleshooting, performance improvements, root cause analysis, and more.

As we have seen in this post, configuring Filebeat to send logs from Docker to Elasticsearch is quite easy. The configuration can also be adapted to the needs of your own applications without much effort. Filebeat is also a small-footprint piece of software that can be deployed painlessly no matter what your production environment looks like.

Author: Suganya Ganesan