Rolling out Dynatrace full-stack monitoring to a Kubernetes or OpenShift cluster using a DaemonSet is straightforward. Managing the full lifecycle of OneAgent deployments can, however, become cumbersome, because Kubernetes DaemonSets offer no out-of-the-box lifecycle management that allows for easy OneAgent updates. In a typical scenario, the team in charge of Day 2 operations has to keep track of new OneAgent versions as they are released, roll out the new versions, and restart all pods so that they pick up the updates.
In alignment with the automation mantra of the Kubernetes community, Dynatrace strives to automate as many routine customer tasks as possible. This is why we’re proud to introduce Dynatrace OneAgent Operator.
What is an operator in Kubernetes?
“An Operator represents human operational knowledge in software, to reliably manage an application.” – CoreOS
Kubernetes version 1.7 introduced the concept of custom resources and controllers, which allow for extending the Kubernetes API. These extension capabilities enable the Kubernetes community to implement domain-specific applications as first-class Kubernetes objects in a cloud-native style. This means you can define the desired state of workloads in a declarative manner and create custom controller logic that takes continuous action to achieve and maintain the desired state.
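To make the idea of a declared desired state and a controller acting on it more concrete, here is a minimal sketch in plain Python. It does not talk to a real Kubernetes API; the `DesiredState` class, the `reconcile` function, and the instance names are hypothetical and only illustrate the pattern.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DesiredState:
    """Hypothetical declarative spec: how many instances should exist."""
    replicas: int


def reconcile(desired: DesiredState, running: list[str]) -> list[str]:
    """Return the actions needed to move the actual state toward the desired state."""
    actions = []
    if len(running) < desired.replicas:
        # Too few instances: create the missing ones.
        for i in range(len(running), desired.replicas):
            actions.append(f"create instance-{i}")
    elif len(running) > desired.replicas:
        # Too many instances: remove the surplus.
        for name in running[desired.replicas:]:
            actions.append(f"delete {name}")
    return actions


if __name__ == "__main__":
    print(reconcile(DesiredState(replicas=3), ["instance-0"]))
```

A real controller runs this comparison continuously, so the declared state is restored whenever the actual state drifts away from it.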
An operator makes use of these capabilities and extends the Kubernetes API by utilizing this concept of custom resources and corresponding resource controllers. The term “operator” was coined by CoreOS and announced as a means to more efficiently and reliably manage the lifecycle of stateful applications.
Red Hat and CoreOS refined this concept and evolved the idea to provide an Operator Framework (including an Operator SDK and Lifecycle Manager) to make the process of implementing operators as easy as possible.
Dynatrace OneAgent Operator for ‘Day 2’ operations
“With the power of Kubernetes Operators, ISVs within the Red Hat ecosystem, like Dynatrace, can automate their services at scale in a Red Hat OpenShift environment” said Chris Morgan, Global Technical Director, Red Hat. “Operators enable OpenShift to not only be a priority deployment target for ISV solutions, but a catalyst to empower those solutions to operate on OpenShift as they would on the public cloud in terms of maintainability, flexibility, and upgradeability.”
Dynatrace is among the first Red Hat and CoreOS partners to pick up and integrate the Operator SDK into their products. Dynatrace makes use of this concept by putting operational knowledge into software and automating the management, updates, and roll-outs of new Dynatrace OneAgent versions. By automating the repetitive tasks involved in keeping Dynatrace OneAgent up-to-date, Dynatrace has taken another step toward providing self-driving IT for both our customers and ourselves.
Value-add of Dynatrace OneAgent Operator at a glance:
- Fine-grained control of OneAgent roll-outs to select nodes based on node labels. This enables you to monitor selected nodes using different Dynatrace environments. OneAgent Operator also supports tolerations so you can deploy OneAgent on tainted nodes.
- Dynatrace OneAgent updates are performed automatically, as soon as they’re available. When pending updates are available, OneAgent Operator takes care of recycling all pods that have not yet picked up the latest version.
- OneAgent Operator ensures that you always monitor your OpenShift cluster with the latest OneAgent version.
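The node selection described in the first point above can be sketched as follows. This is a simplified illustration, not the operator's actual code: the node data, label names, and helper functions are hypothetical, and the matching rules are reduced to the basics of Kubernetes node selectors and tolerations (every taint is treated as blocking unless it is tolerated).

```python
def matches_selector(node_labels, node_selector):
    """A node matches when it carries every label in the selector."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


def tolerates(taint, tolerations):
    """Simplified check of whether any toleration covers the given taint."""
    for tol in tolerations:
        # A toleration without an effect applies to all effects.
        if tol.get("effect") not in (None, taint["effect"]):
            continue
        if tol.get("operator", "Equal") == "Exists":
            # "Exists" without a key tolerates every taint key.
            if tol.get("key") in (None, taint["key"]):
                return True
        elif tol.get("key") == taint["key"] and tol.get("value") == taint.get("value"):
            return True
    return False


def eligible_nodes(nodes, node_selector, tolerations):
    """Names of nodes that match the selector and whose taints are all tolerated."""
    selected = []
    for node in nodes:
        if not matches_selector(node["labels"], node_selector):
            continue
        if all(tolerates(taint, tolerations) for taint in node.get("taints", [])):
            selected.append(node["name"])
    return selected


# Hypothetical cluster: two production nodes (one tainted) and one dev node.
NODES = [
    {"name": "worker-1", "labels": {"env": "prod"}, "taints": []},
    {"name": "worker-2", "labels": {"env": "dev"}, "taints": []},
    {
        "name": "master-1",
        "labels": {"env": "prod"},
        "taints": [{"key": "node-role.kubernetes.io/master", "effect": "NoSchedule"}],
    },
]

MASTER_TOLERATION = [
    {"key": "node-role.kubernetes.io/master", "operator": "Exists", "effect": "NoSchedule"}
]

if __name__ == "__main__":
    print(eligible_nodes(NODES, {"env": "prod"}, []))
    print(eligible_nodes(NODES, {"env": "prod"}, MASTER_TOLERATION))
```

Without the toleration, only the untainted production node is selected; adding it brings the tainted node into scope as well.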
How does the Operator work?
Dynatrace OneAgent Operator registers itself as a controller that watches for resources of type OneAgent as defined by a CustomResourceDefinition. This allows you to define a configuration that describes your OneAgent deployment. By loading the configuration into Kubernetes or OpenShift, the configuration is automatically passed to the custom controller which ensures the rollout of OneAgent based on your specification.
The following diagram outlines the involved components and objects.
When you create the OneAgent CustomResource entity in Kubernetes, the object is automatically passed to Dynatrace OneAgent Operator. First, the operator determines whether a corresponding DaemonSet already exists. If not, Dynatrace OneAgent Operator creates a new one. The DaemonSet is responsible for rolling out OneAgent to selected nodes.
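The create-if-missing step can be sketched like this. The sketch uses an in-memory stand-in for the cluster rather than a real Kubernetes client; `FakeCluster`, `ensure_daemonset`, and the field names in the example resource are assumptions made for illustration, not the operator's actual schema.

```python
class FakeCluster:
    """In-memory stand-in for the cluster; maps DaemonSet names to their specs."""

    def __init__(self):
        self.daemonsets = {}


def ensure_daemonset(cluster, oneagent):
    """Create a DaemonSet for the custom resource unless one already exists.

    Returns True if a DaemonSet was created, False if it was already there.
    """
    name = oneagent["metadata"]["name"]
    if name in cluster.daemonsets:
        return False
    # Carry the placement settings over from the custom resource.
    cluster.daemonsets[name] = {
        "nodeSelector": oneagent["spec"].get("nodeSelector", {}),
        "tolerations": oneagent["spec"].get("tolerations", []),
    }
    return True


# Hypothetical custom resource; the field names are illustrative only.
EXAMPLE_ONEAGENT = {
    "kind": "OneAgent",
    "metadata": {"name": "oneagent"},
    "spec": {"nodeSelector": {"env": "prod"}},
}

if __name__ == "__main__":
    cluster = FakeCluster()
    print(ensure_daemonset(cluster, EXAMPLE_ONEAGENT))  # first call creates it
    print(ensure_daemonset(cluster, EXAMPLE_ONEAGENT))  # second call is a no-op
```

Because the function is idempotent, it is safe to call on every pass of the controller.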
Further, OneAgent Operator constantly queries the Dynatrace API to check if a new version is available for a given deployment. In the event of a pending update, all Pods belonging to a certain custom resource that don’t have the updated version are recycled in order to pick up the latest drivers.
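The update check can be sketched in a few lines. Here `fetch_latest_version` is a hypothetical callable standing in for the query against the Dynatrace API, `delete_pod` stands in for the call that removes a pod, and the version strings are made up; none of these names come from the operator itself.

```python
def recycle_outdated(pods, fetch_latest_version, delete_pod):
    """Delete every pod that does not run the latest version.

    The DaemonSet controller then re-creates the deleted pods, which start
    with the current version. Returns the names of the recycled pods.
    """
    latest = fetch_latest_version()
    outdated = [pod for pod in pods if pod["version"] != latest]
    for pod in outdated:
        delete_pod(pod["name"])
    return [pod["name"] for pod in outdated]


# Hypothetical pod inventory with made-up version strings.
PODS = [
    {"name": "oneagent-abc", "version": "1.2.0"},
    {"name": "oneagent-def", "version": "1.3.0"},
    {"name": "oneagent-ghi", "version": "1.2.0"},
]

if __name__ == "__main__":
    deleted = []
    recycled = recycle_outdated(PODS, lambda: "1.3.0", deleted.append)
    print(recycled)
```

Passing the version lookup and the delete call in as functions keeps the selection logic testable without a live cluster.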
This reconciliation loop constantly compares the actual state with the desired state and takes appropriate action when the two differ.
Our backlog is full of enhancements and refinements to further assist and simplify the handling of Dynatrace OneAgent in Kubernetes and OpenShift environments.
One of the next major features to be implemented is the ability to restart all application pods in the event of a OneAgent update so that the application pods pick up the latest OneAgent features and capabilities. This mandatory feature requested by our DevOps engineers will reduce complexity and dependencies in the Dynatrace CI/CD pipeline. So, be prepared for more functionality to come.
Going Open Source
In advance of providing this value-added tool to our customers in the Dynatrace ecosystem, Dynatrace OneAgent Operator has been released to the open source community. We encourage you to participate by proposing your own ideas for improvement and submitting changes via pull requests.
We’d love to hear from you. Tell us what you think about the Dynatrace OneAgent Operator by providing your feedback at Dynatrace Community.