Troubleshoot issues on Kubernetes/OpenShift
Find out how to troubleshoot issues you might encounter in the following situations.
General troubleshooting
Debug logs
By default, OneAgent logs are located in /var/log/dynatrace/oneagent.
To debug Dynatrace Operator issues, run one of the following commands, depending on your platform (Kubernetes or OpenShift).
kubectl -n dynatrace logs -f deployment/dynatrace-operator
oc -n dynatrace logs -f deployment/dynatrace-operator
You might also want to check the logs from OneAgent pods deployed through Dynatrace Operator.
kubectl get pods -n dynatrace
NAME READY STATUS RESTARTS AGE
dynatrace-operator-64865586d4-nk5ng 1/1 Running 0 1d
dynakube-oneagent-<id> 1/1 Running 0 22h
kubectl logs dynakube-oneagent-<id> -n dynatrace
oc get pods -n dynatrace
NAME READY STATUS RESTARTS AGE
dynatrace-operator-64865586d4-nk5ng 1/1 Running 0 1d
dynakube-classic-8r2kq 1/1 Running 0 22h
oc logs dynakube-classic-8r2kq -n dynatrace
Troubleshoot common Dynatrace Operator setup issues using the troubleshoot subcommand
Dynatrace Operator version 0.9.0+
Run the command below to retrieve a basic output on DynaKube status, such as:
- Namespace: If the dynatrace namespace exists (name can be overwritten via parameter)
- DynaKube:
  - If the CustomResourceDefinition exists
  - If the CustomResource with the given name exists (name can be overwritten via parameter)
  - If the API URL ends with /api
  - If the secret name is the same as the DynaKube name (or .spec.tokens if used)
  - If the secret has apiToken and paasToken set
  - If the secret for customPullSecret is defined
- Environment: If your environment is reachable from the Dynatrace Operator pod using the same parameters as the Dynatrace Operator binary (such as proxy and certificate)
- OneAgent and ActiveGate image: If the registry is accessible; if the image is accessible from the Dynatrace Operator pod using the registry from the environment with the (custom) pull secret
kubectl exec deploy/dynatrace-operator -n dynatrace -- dynatrace-operator troubleshoot
If you use a different DynaKube name, add the --dynakube <your_dynakube_name> argument to the command.
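For example, combining the base command with this argument (using your own DynaKube name in place of the placeholder):
kubectl exec deploy/dynatrace-operator -n dynatrace -- dynatrace-operator troubleshoot --dynakube <your_dynakube_name>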
Example output if there are no errors for the above-mentioned fields:
{"level":"info","ts":"2022-09-12T08:45:21.437Z","logger":"dynatrace-operator-version","msg":"dynatrace-operator","version":"<operator version>","gitCommit":"<commithash>","buildDate":"<release date>","goVersion":"<go version>","platform":"<platform>"}
[namespace ] --- checking if namespace 'dynatrace' exists ...
[namespace ] √ using namespace 'dynatrace'
[dynakube ] --- checking if 'dynatrace:dynakube' Dynakube is configured correctly
[dynakube ] CRD for Dynakube exists
[dynakube ] using 'dynatrace:dynakube' Dynakube
[dynakube ] checking if api url is valid
[dynakube ] api url is valid
[dynakube ] checking if secret is valid
[dynakube ] 'dynatrace:dynakube' secret exists
[dynakube ] secret token 'apiToken' exists
[dynakube ] customPullSecret not used
[dynakube ] pull secret 'dynatrace:dynakube-pull-secret' exists
[dynakube ] secret token '.dockerconfigjson' exists
[dynakube ] proxy secret not used
[dynakube ] √ 'dynatrace:dynakube' Dynakube is valid
[dtcluster ] --- checking if tenant is accessible ...
[dtcluster ] √ tenant is accessible
Generate a support archive using the support-archive subcommand
Dynatrace Operator version 0.11.0+
Use support-archive to generate a support archive containing all the files that can be potentially useful for the RFA analysis:
- operator-version.txt: a file containing the current Operator version information
- logs: logs from all containers of the Dynatrace Operator pods in the Dynatrace Operator namespace (usually dynatrace); this also includes logs of previous containers, if available:
  - dynatrace-operator
  - dynatrace-webhook
  - dynatrace-oneagent-csi-driver
- manifests: the Kubernetes manifests for Dynatrace Operator components and deployed DynaKubes in the Dynatrace Operator namespace
- troubleshoot.txt: output of the troubleshooting command that is automatically executed by the support-archive subcommand
- supportarchive_console.log: complete output of the support-archive subcommand
Usage
To create a support archive, execute the following command.
kubectl exec -n dynatrace deployment/dynatrace-operator -- dynatrace-operator support-archive
The collected files are stored in a zipped tarball and can be downloaded from the pod using the kubectl cp command.
kubectl -n dynatrace cp <operator pod name>:/tmp/dynatrace-operator/operator-support-archive.tgz ./tmp/dynatrace-operator/operator-support-archive.tgz
The recommended approach is to use the --stdout parameter to stream the tarball directly to your disk.
kubectl exec -n dynatrace deployment/dynatrace-operator -- dynatrace-operator support-archive --stdout > operator-support-archive.tgz
If you use the --stdout parameter, all support archive command output is written to stderr so as not to corrupt the support archive tar file.
Sample output
The following is sample output from running support-archive with the --stdout parameter.
kubectl exec -n dynatrace deployment/dynatrace-operator -- dynatrace-operator support-archive --stdout > operator-support-archive.tgz
[support-archive] dynatrace-operator {"version": "v0.11.0", "gitCommit": "...", "buildDate": "...", "goVersion": "...", "platform": "linux/amd64"}
[support-archive] Storing operator version into operator-version.txt
[support-archive] Starting log collection
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-bdnpc/server.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-bdnpc/provisioner.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-bdnpc/registrar.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-bdnpc/liveness-probe.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-cb4pc/server.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-cb4pc/provisioner.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-cb4pc/registrar.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-cb4pc/liveness-probe.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-k8bl5/server.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-k8bl5/provisioner.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-k8bl5/registrar.log
[support-archive] Successfully collected logs logs/dynatrace-oneagent-csi-driver-k8bl5/liveness-probe.log
[support-archive] Successfully collected logs logs/dynatrace-operator-6d9fd9b9fc-sw5ll/dynatrace-operator.log
[support-archive] Successfully collected logs logs/dynatrace-webhook-7d84599455-bfkmp/webhook.log
[support-archive] Successfully collected logs logs/dynatrace-webhook-7d84599455-vhkrh/webhook.log
[support-archive] Starting K8S object collection
[support-archive] Collected manifest for manifests/injected_namespaces/Namespace-default.yaml
[support-archive] Collected manifest for manifests/dynatrace/Namespace-dynatrace.yaml
[support-archive] Collected manifest for manifests/dynatrace/Deployment-dynatrace-operator.yaml
[support-archive] Collected manifest for manifests/dynatrace/Deployment-dynatrace-webhook.yaml
[support-archive] Collected manifest for manifests/dynatrace/StatefulSet-dynakube-activegate.yaml
[support-archive] Collected manifest for manifests/dynatrace/DaemonSet-dynakube-oneagent.yaml
[support-archive] Collected manifest for manifests/dynatrace/DaemonSet-dynatrace-oneagent-csi-driver.yaml
[support-archive] Collected manifest for manifests/dynatrace/DynaKube-dynakube.yaml
Debug configuration and monitoring issues using the Kubernetes Monitoring Statistics extension
The Kubernetes Monitoring Statistics extension can help you:
- Troubleshoot your Kubernetes monitoring setup
- Troubleshoot your Prometheus integration setup
- Get detailed insights into queries from Dynatrace to the Kubernetes API
- Receive alerts when your Kubernetes monitoring setup experiences issues
- Get alerted on slow response times of your Kubernetes API
Monitoring setup errors
Pods stuck in Terminating state after upgrade
If your CSI driver and OneAgent pods get stuck in Terminating state after upgrading from Dynatrace Operator version 0.9.0, you need to manually delete the pods that are stuck.
Run the appropriate command below (Kubernetes or OpenShift).
kubectl delete pod -n dynatrace --selector=app.kubernetes.io/component=csi-driver,app.kubernetes.io/name=dynatrace-operator,app.kubernetes.io/version=0.9.0 --force --grace-period=0
oc delete pod -n dynatrace --selector=app.kubernetes.io/component=csi-driver,app.kubernetes.io/name=dynatrace-operator,app.kubernetes.io/version=0.9.0 --force --grace-period=0
Unable to retrieve the complete list of server APIs
Dynatrace Operator
Example error:
unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
If the Dynatrace Operator pod logs this error, you need to identify and fix the problematic services. To identify them:
- Check the available resources.
kubectl api-resources
- If the command returns the same error, list all API services and make sure none of them report False availability (see the sketch below).
kubectl get apiservice
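For example (a minimal sketch; the API service name below is taken from the example error and may differ in your cluster), you can narrow down and remove a stale API service like this:
# Show only API services that are not available
kubectl get apiservice | grep False
# If the backing service is permanently gone (for example, a removed metrics adapter),
# deleting the stale APIService object resolves the discovery error
kubectl delete apiservice v1beta1.external.metrics.k8s.io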
CrashLoopBackOff: Downgrading OneAgent is not supported, please uninstall the old version first
Dynatrace Operator
If you get this error, the OneAgent version installed on your host is later than the version you're trying to run.
Solution: First uninstall OneAgent from the host, and then select your desired version in the Dynatrace web UI or in DynaKube. To uninstall OneAgent, connect to the host and run the uninstall.sh script (the default location is /opt/dynatrace/oneagent/agent/uninstall.sh).
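For example, on a host reachable over SSH (user and host names are placeholders), the uninstall might look like this:
# Connect to the affected host
ssh <user>@<host>
# Run the OneAgent uninstall script from its default location
sudo /opt/dynatrace/oneagent/agent/uninstall.sh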
For CSI driver deployments, follow these steps instead (a command-line sketch follows the list):
- Delete the DynaKube custom resources.
- Delete the CSI driver manifest.
- Delete the /var/lib/kubelet/plugins/csi.oneagent.dynatrace.com directory from all Kubernetes nodes.
- Reapply the CSI driver and DynaKube custom resources.
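A minimal sketch of these steps, assuming the CSI driver was installed from kubernetes-csi.yaml into the dynatrace namespace and that you can run commands on every node (for example over SSH):
# 1. Delete the DynaKube custom resources
kubectl delete dynakube --all -n dynatrace
# 2. Delete the CSI driver manifest (use openshift-csi.yaml on OpenShift)
kubectl delete -f kubernetes-csi.yaml
# 3. On every node, remove the CSI driver plugin directory
sudo rm -rf /var/lib/kubelet/plugins/csi.oneagent.dynatrace.com
# 4. Reapply the CSI driver manifest and your DynaKube custom resources
kubectl apply -f kubernetes-csi.yaml
kubectl apply -f <your-dynakube>.yaml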
Crash loop on pods when installing OneAgent
Application-only monitoring
If you get a crash loop on the pods when you install OneAgent, you need to increase the CPU and memory limits of the pods.
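As an illustration only (the container name and values are placeholders; size them for your workload), the resources section of the application's pod spec could be raised like this:
apiVersion: apps/v1
kind: Deployment
...
spec:
  template:
    spec:
      containers:
        - name: my-app            # placeholder container name
          resources:
            requests:
              cpu: 200m
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 1Gi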
Deployment seems successful but the dynatrace-oneagent container doesn't show up as ready
DaemonSet
kubectl get ds/dynatrace-oneagent --namespace=kube-system
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE-SELECTOR AGE
dynatrace-oneagent 1 1 0 1 0 beta.kubernetes.io/os=linux 14m
kubectl logs -f dynatrace-oneagent-abcde --namespace=kube-system
09:46:18 Started agent deployment as Docker image, PID 1234.
09:46:18 Agent installer can only be downloaded from secure location. Your installer URL should start with 'https': REPLACE_WITH_YOUR_URL
Change the value REPLACE_WITH_YOUR_URL in the dynatrace-oneagent.yml DaemonSet with the Dynatrace OneAgent installer URL.
Deployment seems successful, however the dynatrace-oneagent image can't be pulled
DaemonSet
Example error:
oc get pods
NAME READY STATUS RESTARTS AGE
dynatrace-oneagent-abcde 0/1 ErrImagePull 0 3s
oc logs -f dynatrace-oneagent-abcde
Error from server (BadRequest): container "dynatrace-oneagent" in pod "dynatrace-oneagent-abcde" is waiting to start: image can't be pulled
This is typically the case if the dynatrace service account hasn't been allowed to pull images from the RHCC.
Deployment seems successful, but the dynatrace-oneagent container doesn't produce meaningful logs
DaemonSet
Example error:
kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
dynatrace-oneagent-abcde 0/1 ContainerCreating 0 3s
kubectl logs -f dynatrace-oneagent-abcde --namespace=kube-system
Error from server (BadRequest): container "dynatrace-oneagent" in pod "dynatrace-oneagent-abcde" is waiting to start: ContainerCreating
oc get pods
NAME READY STATUS RESTARTS AGE
dynatrace-oneagent-abcde 0/1 ContainerCreating 0 3s
oc logs -f dynatrace-oneagent-abcde
Error from server (BadRequest): container "dynatrace-oneagent" in pod "dynatrace-oneagent-abcde" is waiting to start: ContainerCreating
This is typically the case if the container hasn't yet fully started. Simply wait a few more seconds.
Deployment seems successful, but the dynatrace-oneagent container isn't running
DaemonSet
oc process -f dynatrace-oneagent-template.yml ONEAGENT_INSTALLER_SCRIPT_URL="[oneagent-installer-script-url]" | oc apply -f -
daemonset "dynatrace-oneagent" created
Please note that quotes are needed to protect the special shell characters in the OneAgent installer URL.
oc get pods
No resources found.
This is typically the case if the dynatrace service account hasn't been configured to run privileged pods.
oc describe ds/dynatrace-oneagent
Name: dynatrace-oneagent
Image(s): dynatrace/oneagent
Selector: name=dynatrace-oneagent
Node-Selector: <none>
Labels: template=dynatrace-oneagent
Desired Number of Nodes Scheduled: 0
Current Number of Nodes Scheduled: 0
Number of Nodes Misscheduled: 0
Pods Status: 0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------------ -------
6m 3m 17 {daemon-set } Warning FailedCreate Error creating: pods "dynatrace-oneagent-" is forbidden: unable to validate against any security context constraint: [spec.securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.securityContext.hostPID: Invalid value: true: Host PID is not allowed to be used spec.securityContext.hostIPC: Invalid value: true: Host IPC is not allowed to be used spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.containers[0].securityContext.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[0].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[0].securityContext.hostPID: Invalid value: true: Host PID is not allowed to be used spec.containers[0].securityContext.hostIPC: Invalid value: true: Host IPC is not allowed to be used]
Deployment was successful, but monitoring data isn't available in Dynatrace
DaemonSet
Example:
kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
dynatrace-oneagent-abcde 1/1 Running 0 1m
oc get pods
NAME READY STATUS RESTARTS AGE
dynatrace-oneagent-abcde 1/1 Running 0 1m
This is typically caused by a timing issue that occurs if application containers start before OneAgent is fully installed on the system. As a consequence, some parts of your application run uninstrumented. To be on the safe side, make sure OneAgent is fully integrated before you start your application containers. If your application is already running, restarting its containers has the same effect.
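For example, assuming your application runs as a Deployment (name and namespace are placeholders), a rolling restart re-creates the containers so OneAgent can instrument them:
kubectl rollout restart deployment <your-app> -n <your-namespace>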
No pods scheduled on control-plane nodes
DaemonSet
Kubernetes version 1.24+
Kubernetes versions 1.24+ changed the taints on master and control-plane nodes, so the OneAgent DaemonSet is missing the appropriate tolerations in the DynaKube custom resource.
To add the necessary tolerations, edit the DynaKube YAML as follows.
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/master
operator: Exists
- effect: NoSchedule
key: node-role.kubernetes.io/control-plane
operator: Exists
Error when applying the custom resource on GKE
Example error:
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "webhook.dynatrace.com": Post "https://dynatrace-webhook.dynatrace.svc:443/validate?timeout=2s": context deadline exceeded
If you are getting this error when trying to apply the custom resource on your GKE cluster, the firewall is blocking requests from the Kubernetes API to the Dynatrace Webhook because the required port (8443) is blocked by default.
The default allowed ports (443 and 10250) on GCP refer to the ports exposed by your nodes and pods, not the ports exposed by any Kubernetes services. For example, if the cluster control plane attempts to access a service on port 443 such as the Dynatrace webhook, but the service is implemented by a pod using port 8443, this is blocked by the firewall.
To fix this, add a firewall rule to explicitly allow ingress to port 8443.
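A sketch using the gcloud CLI (the rule name is arbitrary; the network, node tag, and control-plane CIDR are placeholders you need to look up for your cluster):
# Allow the GKE control plane to reach the webhook pod on port 8443
gcloud compute firewall-rules create allow-dynatrace-webhook \
  --direction INGRESS \
  --network <your-cluster-network> \
  --source-ranges <control-plane-cidr> \
  --target-tags <your-node-tag> \
  --allow tcp:8443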
For more information about this issue, see API request that triggers admission webhook timing out.
CannotPullContainerError
If you get errors like this on your pods when installing Dynatrace OneAgent, your Docker download rate limit has been exceeded.
CannotPullContainerError: inspect image has been retried [X] time(s): httpReaderSeeker: failed open: unexpected status code
For details, consult the Docker documentation.
Limit log timeframe
cloudNativeFullStack applicationMonitoring
Dynatrace Operator version 0.10.0+
If there's DiskPressure on your nodes, you can configure the CSI driver log garbage collection interval to lower the storage usage of the CSI driver. By default, logs are kept for 7 days before they are deleted from the file system. To edit this timeframe, select one of the options below, depending on your deployment mode.
Be careful when setting this value; you might need the logs to investigate problems.
- Edit the manifests of the CSI driver DaemonSet (kubernetes-csi.yaml, openshift-csi.yaml) by replacing the placeholder (<your_value>) with your value.
apiVersion: apps/v1
kind: DaemonSet
...
spec:
...
template:
...
spec:
...
containers:
...
- name: provisioner
...
env:
- name: MAX_UNMOUNTED_VOLUME_AGE
value: <your_value> # defined in days, must be a plain number. `0` means logs are immediately deleted. If not set, defaults to `7`.
- Apply the changes.
Edit values.yaml to set the maxUnmountedVolumeAge parameter under the csidriver section.
csidriver:
enabled: true
...
maxUnmountedVolumeAge: "" # defined in days, must be a plain number. `0` means logs are immediately deleted. If not set, defaults to `7`.
Connectivity issues between Dynatrace and your cluster
Problem with ActiveGate token
Example error on the ActiveGate deployment status page:
Problem with ActiveGate token (reason:Absent)
Example error on Dynatrace Operator logs:
{"level":"info","ts":"2022-09-22T06:49:17.351Z","logger":"dynakube-controller","msg":"reconciling DynaKube","namespace":"dynatrace","name":"dynakube"}
{"level":"info","ts":"2022-09-22T06:49:17.502Z","logger":"dynakube-controller","msg":"problem with token detected","dynakube":"dynakube","token":"APIToken","msg":"Token on secret dynatrace:dynakube missing scopes [activeGateTokenManagement.create]"}
Example error on DynaKube status:
status:
...
conditions:
- message: Token on secret dynatrace:dynakube missing scopes [activeGateTokenManagement.create]
reason: TokenScopeMissing
status: "False"
type: APIToken
Starting with Dynatrace Operator version 0.9.0, Dynatrace Operator handles the ActiveGate token by default. If you're getting one of these errors, follow the instructions below, according to your Dynatrace Operator version.
- For Dynatrace Operator versions earlier than 0.7.0: you need to upgrade to the latest Dynatrace Operator version.
- For Dynatrace Operator version 0.7.0 or later, but earlier than version 0.9.0: you need to create a new API token. For instructions, see Tokens and permissions required: Dynatrace Operator token.
ImagePullBackoff error on OneAgent and ActiveGate pods
The underlying host's container runtime doesn't contain the certificate presented by your endpoint.
The skipCertCheck field in the DynaKube YAML does not control this certificate check.
Example error (the error message may vary):
desc = failed to pull and unpack image "<environment>/linux/activegate:latest": failed to resolve reference "<environment>/linux/activegate:latest": failed to do request: Head "<environment>/linux/activegate/manifests/latest": x509: certificate signed by unknown authority
Warning Failed ... Error: ErrImagePull
Normal BackOff ... Back-off pulling image "<environment>/linux/activegate:latest"
Warning Failed ... Error: ImagePullBackOff
In this example, if the description on your pod shows x509: certificate signed by unknown authority, you must fix the certificates on your Kubernetes hosts, or use the private repository configuration to store the images.
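As a rough sketch only, assuming Debian/Ubuntu-based nodes running containerd (other distributions use different trust-store paths and tools), adding your CA certificate to a node's trust store could look like this:
# On every affected node: add the CA certificate to the system trust store
sudo cp my-registry-ca.crt /usr/local/share/ca-certificates/
sudo update-ca-certificates
# Restart the container runtime so image pulls use the updated trust store
sudo systemctl restart containerd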
There was an error with the TLS handshake
The certificate for the communication is invalid or expired. If you're using a self-signed certificate, check the mitigation procedures for the ActiveGate.
Invalid bearer token
The bearer token is invalid and the request has been rejected by the Kubernetes API. Verify the bearer token and make sure it doesn't contain any whitespace. If you're connecting to a Kubernetes cluster API via a centralized external role-based access control (RBAC) mechanism, consult the documentation of the Kubernetes cluster manager. For Rancher, see the guidelines on the official Rancher website.
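For instance, if the token is stored in a Kubernetes secret (the secret name below is a placeholder), you can print it with non-printing characters made visible to spot stray whitespace or line breaks:
# "$" marks line ends and "^I" marks tabs; the token should be a single unbroken string
kubectl get secret <your-monitoring-token-secret> -n dynatrace -o jsonpath='{.data.token}' | base64 -d | cat -A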
Could not check credentials. Process is started by other user
There is already a pending request for this integration with an ActiveGate. Wait a couple of minutes and check again.
Internal error occurred: failed calling webhook (…) x509: certificate signed by unknown authority
If you get this error after applying the DynaKube custom resource, your Kubernetes API server may be configured with a proxy. You need to exclude https://dynatrace-webhook.dynatrace.svc from that proxy.
OneAgent unable to connect when using Istio
cloudNativeFullStack applicationMonitoring
Example error in the logs on the OneAgent pods: Initial connect: not successful - retrying after xs.
You can fix this problem by increasing the OneAgent timeout. Add the following feature flag to DynaKube:
Be sure to replace the placeholder (<...>) with the name of your DynaKube custom resource.
kubectl annotate dynakube <name-of-your-DynaKube> feature.dynatrace.com/oneagent-initial-connect-retry-ms=6000 -n dynatrace
Connectivity issues when using Calico
If you use Calico to handle or restrict network connections, you might experience connectivity issues, such as:
- The operator, webhook, and CSI driver pods are constantly restarting
- The operator cannot reach the API
- The CSI driver fails to download OneAgent
- Injection into pods doesn't work
If you experience these or similar problems, use our GitHub sample policies for common problems.
- For the activegate-policy.yaml and dynatrace-policies.yaml policies, if Dynatrace Operator isn't installed in the dynatrace namespace (Kubernetes) or project (OpenShift), you need to adapt the metadata and namespace properties in the YAML files accordingly.
- The purpose of the agent-policy.yaml and agent-policy-external-only.yaml policies is to let OneAgents that are injected into pods open external connections. Only agent-policy-external-only.yaml is required, while agent-policy.yaml additionally allows internal connections to be made, such as pod-to-pod connections, where needed.
- Because these policies are needed for all pods where OneAgent injects, you also need to adapt the podSelector property of the YAML files (a sketch of this follows the list).
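To illustrate what adapting the podSelector might look like (labels and namespace are placeholders, and this is not the actual content of the sample policies), a policy could be scoped to the injected pods like this:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: agent-policy-external-only   # illustrative name, matching the sample policy
  namespace: <your-app-namespace>
spec:
  podSelector:
    matchLabels:
      <your-label-key>: <your-label-value>   # select the pods OneAgent injects into
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0            # allow external connections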
Potential issues when changing the monitoring mode
- Changing the monitoring mode from classicFullStack to cloudNativeFullStack affects the host ID calculations for monitored hosts, leading to new IDs being assigned and no connection between old and new entities.
- If you want to change the monitoring mode from applicationMonitoring or cloudNativeFullStack to classicFullStack or hostMonitoring, you need to restart all the pods that were previously instrumented with applicationMonitoring or cloudNativeFullStack.