Monitor OpenShift events

Prerequisites

Kubernetes events integration requires Dynatrace version 1.188 or higher and ActiveGate version 1.187 or higher.

Enable events integration

You can enable the events integration on the Kubernetes settings page, where you set up Kubernetes API monitoring for your clusters. For details, see Monitor your OpenShift clusters with Dynatrace.
The event field selectors can also be defined via the Dynatrace API.

Which Kubernetes events can be ingested?

Dynatrace provides a flexible way of ingesting Kubernetes events into your environment, enriching the existing monitoring data from OneAgents and ActiveGates with additional context. Ingestion follows the established Kubernetes format of field selectors, which lets you choose events based on event resource fields such as source.component, type, or involvedObject.


You can set up multiple field selectors for each Kubernetes environment, which gives you fine-grained control over the events you want to ingest from Kubernetes.
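To make the field selector format concrete, here is a minimal sketch of how a selector expression such as `type=Warning,involvedObject.kind=Pod` could be evaluated against an event. It is not how Dynatrace or the Kubernetes API server implements filtering; the function names and the sample event are hypothetical and only illustrate the `=`, `==`, and `!=` operators and the comma-separated (logical AND) syntax that Kubernetes field selectors use.

```python
def parse_selector(expression):
    """Split 'a=b,c!=d' into (field, operator, value) triples."""
    requirements = []
    for term in expression.split(","):
        term = term.strip()
        if not term:
            continue
        if "!=" in term:
            field, value = term.split("!=", 1)
            operator = "!="
        elif "==" in term:
            # '==' is a synonym for '=' in Kubernetes field selectors
            field, value = term.split("==", 1)
            operator = "="
        elif "=" in term:
            field, value = term.split("=", 1)
            operator = "="
        else:
            raise ValueError(f"invalid selector term: {term!r}")
        requirements.append((field.strip(), operator, value.strip()))
    return requirements


def lookup(event, dotted_path):
    """Resolve a dotted path such as 'involvedObject.kind' in a nested dict."""
    current = event
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(event, expression):
    """Return True if the event satisfies every term of the selector."""
    for field, operator, value in parse_selector(expression):
        actual = lookup(event, field)
        actual = "" if actual is None else str(actual)
        is_equal = actual == value
        if (operator == "=") != is_equal:
            return False
    return True


# Hypothetical event, shaped like a Kubernetes Event resource
event = {
    "type": "Warning",
    "reason": "BackOff",
    "source": {"component": "kubelet"},
    "involvedObject": {"kind": "Pod", "name": "paymentservice-v1-d8574c957-vgwkh"},
}

print(matches(event, "type=Warning,involvedObject.kind=Pod"))
print(matches(event, "type!=Warning"))
```

In this sketch, each selector you configure acts as an independent filter, so several narrow selectors can together cover exactly the events you care about.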

Find out why your pods fail with CrashLoopBackOffs

CrashLoopBackOff events are created when a container in a pod starts up, runs for a little while, crashes, and is then restarted, over and over. Every pod specification has a restart policy (Always by default). The kubelet uses this policy to restart the crashing container with an increasing back-off delay. There are many reasons why containers crash. Often, it’s because the application inside the container crashes. Crashes can also occur when a container or pod is misconfigured.
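The back-off delay can be illustrated with a short calculation. The sketch below assumes the behavior described in the Kubernetes documentation, where the delay doubles after each restart (10s, 20s, 40s, ...) and is capped at five minutes; the function name and parameters are hypothetical, and actual values may differ depending on your Kubernetes version and configuration.

```python
def backoff_delays(restarts, base_seconds=10, cap_seconds=300):
    """Return the assumed delay (in seconds) before each of the first restarts.

    Assumption: the delay starts at base_seconds, doubles after every
    restart, and never exceeds cap_seconds.
    """
    return [min(base_seconds * 2 ** n, cap_seconds) for n in range(restarts)]


print(backoff_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```

Under these assumptions, a container that keeps crashing reaches the maximum delay after only a handful of restarts, which is why a pod can sit in the CrashLoopBackOff state for minutes between restart attempts.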

[Screenshot: events card]

The screenshot above shows a scenario in which the paymentservice-v1-d8574c957-vgwkh pod has a crashing container. The Node.js application seems to have a memory leak. Fortunately, the paymentservice-v1 deployment runs multiple replicas, so the end users of this service aren’t impacted. With the addition of events, we have deeper insights into a problem that could have gone unnoticed otherwise. To increase the stability of the cluster, protect its resources, and give developers clues to fix problems in their code, the root cause should be addressed.

[Screenshot: memory leak]