Maintenance windows

The latest release of Dynatrace enables you to define maintenance windows using either the Dynatrace API or the Dynatrace web interface.

Even if your organization runs a ‘100% availability’ production environment, there are times when your DevOps team must roll out new software updates. Various release-deployment strategies are available for this, including performing rolling updates and iteratively updating parts of instances. No matter which release-rollout strategy your DevOps team uses, it’s good practice to keep your performance monitoring system informed of these activities to ensure accurate monitoring data.

Maintenance window overview

Dynatrace distinguishes between two types of maintenance windows: planned and unplanned. Planned maintenance windows are configured in advance, while unplanned windows are added retroactively to notify Dynatrace of unexpected downtimes that shouldn’t be factored into overall performance and availability metrics. Dynatrace adapts its baselining, alerting, and problem detection behavior based on the type of the configured maintenance window.

Each maintenance window you configure has a name and description that you can use to provide contextual information about the purpose of the maintenance window.

Once a maintenance window is defined, Dynatrace automatically excludes the configured time period from its baseline calculations. With this approach, any response time anomalies that occur during the corresponding rolling update won’t negatively influence your overall service and application baselines.
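To illustrate why this exclusion matters, here is a minimal Python sketch. It is not Dynatrace's baselining algorithm, which isn't described here. It assumes a plain mean stands in for the baseline, and the sample data and the function name `baseline_excluding_window` are hypothetical.

```python
from datetime import datetime
from statistics import mean

def baseline_excluding_window(samples, window_start, window_end):
    """Mean response time over samples outside [window_start, window_end)."""
    kept = [ms for ts, ms in samples if not (window_start <= ts < window_end)]
    return mean(kept)

# Hypothetical response-time samples: (timestamp, milliseconds).
samples = [
    (datetime(2017, 8, 29, 14, 0), 100),
    (datetime(2017, 8, 29, 14, 30), 110),
    (datetime(2017, 8, 29, 15, 0), 900),   # slow response during a rolling update
    (datetime(2017, 8, 29, 15, 30), 950),  # slow response during a rolling update
    (datetime(2017, 8, 29, 16, 0), 105),
]
window_start = datetime(2017, 8, 29, 14, 43)
window_end = datetime(2017, 8, 29, 15, 43)

# Simplified stand-in for a baseline: a plain mean over the samples.
naive = mean(ms for _, ms in samples)
clean = baseline_excluding_window(samples, window_start, window_end)
print(f"mean with maintenance samples: {naive} ms")
print(f"mean without maintenance samples: {clean} ms")
```

The two slow samples inside the window pull the naive mean far above normal behavior, while the filtered mean stays close to the typical response time.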

With respect to baselining, it’s a good idea to define your maintenance windows before performing any load testing. Using maintenance windows during load testing ensures that any load spikes, longer-than-usual response times, or increased error rates won’t negatively influence your overall baselining.

To define a maintenance window via the Dynatrace UI

  1. Go to Settings > Maintenance > Maintenance windows.
  2. Define a Name for the maintenance window.
  3. Provide a Description of the purpose of the maintenance window.
  4. From the Maintenance type drop list, select Planned or Unplanned.
  5. If the maintenance window is to recur on a regular schedule, use the drop lists to define a daily, weekly, or monthly recurring schedule.
  6. From the Problem detection and alerting drop list, specify the action that Dynatrace should take if a monitored component experiences a problem during a scheduled maintenance window:
    • Detect problems and alert: Dynatrace will automatically detect and report all problems as usual and display a maintenance window icon (wrench and bolt icon, see below) on each problem that is detected during a maintenance window.
    • Detect problems but don’t alert: Problems will be detected but Dynatrace won’t send out alerts for the problems. Each problem will be listed on the Problems page with a maintenance window icon.
    • Disable problem detection: Problem detection and alerting are both disabled. Problems that occur during scheduled maintenance windows will not be included on the Problems page and no alerts will be sent out.
  7. The Scope of maintenance section of the page enables you to further reduce the set of monitored components that are included in the configured maintenance window. You can include entity tags for specific Applications, Services, or Hosts (see host tag example in the image below) or for tagged groups of components (for example, all hosts that have the tag PROD). If no scope filter is defined, the maintenance window affects your entire environment.

Define maintenance windows using the REST API

Most users find it easy to define maintenance windows and downtimes using the settings page detailed above. Your DevOps team, however, will likely prefer to use our automation REST API to define maintenance windows. With our REST API, you get all the functionality you need to read, create, and update maintenance window configurations.

To read all defined maintenance windows, execute an HTTP GET call to /api/v1/maintenance/. The result is shown below:

[
	{
		"id":"New application deployment",
		"type":"Planned",
		"description":"We will deploy a new easyTravel application version",
		"suppressAlerts":false,
		"suppressProblems":false,
		"scope":null,
		"schedule":{
			"type":"Day",
			"timezoneId":"Europe/Vienna",
			"maintenanceStart":"2017-08-29 14:43",
			"maintenanceEnd":"2017-08-29 15:43",
			"recurrence":{
				"start":"14:43",
				"duration":556
			}
		}
	}
]
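As a rough sketch of how a script might issue this call, the Python snippet below builds the GET request with the standard library and summarizes a response like the one shown above. The base URL `https://example.invalid`, the token placeholder, and the `Authorization: Api-Token` header are assumptions, since this post doesn't cover authentication. The helper names `build_list_request` and `summarize` are hypothetical. The request is only constructed here, not sent.

```python
import json
import urllib.request

API_PATH = "/api/v1/maintenance/"  # endpoint path taken from the text above

def build_list_request(base_url, api_token):
    """Build (but do not send) a GET request for all maintenance windows."""
    return urllib.request.Request(
        base_url.rstrip("/") + API_PATH,
        method="GET",
        # Assumption: token-based authentication via an Authorization header.
        headers={"Authorization": f"Api-Token {api_token}"},
    )

def summarize(windows):
    """Return one human-readable line per maintenance window."""
    lines = []
    for w in windows:
        sched = w["schedule"]
        lines.append(
            f'{w["id"]} ({w["type"]}): {sched["maintenanceStart"]} to '
            f'{sched["maintenanceEnd"]} {sched["timezoneId"]}'
        )
    return lines

# Abbreviated copy of the response shown above, used instead of a live call.
sample_response = """[{"id": "New application deployment", "type": "Planned",
  "schedule": {"type": "Day", "timezoneId": "Europe/Vienna",
               "maintenanceStart": "2017-08-29 14:43",
               "maintenanceEnd": "2017-08-29 15:43"}}]"""

request = build_list_request("https://example.invalid", "<your-token>")
print(request.get_method(), request.full_url)
for line in summarize(json.loads(sample_response)):
    print(line)
```

Against a real environment, you would pass the request to `urllib.request.urlopen` and feed the decoded body to `summarize` in place of the sample string.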

An HTTP POST request to /api/v1/maintenance/ with the payload below creates a new maintenance window:

{
	"id" : "theWindowId",
	"type": "Planned",
	"description" : "Again another release",
	"suppressAlerts" : true,
	"suppressProblems" : false,
	"scope" : {
		"entities" : [
			"HOST-0B3371A5AC53FF12", "SERVICE-13FA1F30530CDEE1"
		],
		"matches" : [
			{
				"type" : "HOST",
				"tags" : [ 
					{
						"context" : "AWS",
						"key" : "myTag1", 
						"value" : "myValue1"
					},
					{	"key" : "myTag2" }
				]
			}
		]
	},
	"schedule" : {
		"type" : "Month",
		"timezoneId" : "Universal",
		"maintenanceStart" : "2017-01-01 00:00",
		"maintenanceEnd" : "2017-10-01 00:00",
		"recurrence" : {
			"dayOfMonth" : 4,
			"start" : "11:00",
			"duration" : 30
		}
	}
}
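A release pipeline could assemble this payload programmatically. The Python sketch below is one way to do so under stated assumptions: the mapping from the three Problem detection and alerting options to the `suppressAlerts` and `suppressProblems` flags is inferred (only the first two combinations appear in the examples above), the base URL and the `Authorization: Api-Token` header are placeholders, and the helper names are hypothetical. The scope shown lists a single entity for brevity, and the request is built but not sent.

```python
import json
import urllib.request

# Assumed mapping from the three UI options to the two payload flags.
DETECTION_MODES = {
    "detect_and_alert": {"suppressAlerts": False, "suppressProblems": False},
    "detect_no_alert": {"suppressAlerts": True, "suppressProblems": False},
    "disable_detection": {"suppressAlerts": True, "suppressProblems": True},
}

def build_monthly_window(window_id, description, mode, scope,
                         first_day, last_day, day_of_month, start, duration):
    """Assemble a planned, monthly recurring window shaped like the payload above."""
    payload = {"id": window_id, "type": "Planned", "description": description}
    payload.update(DETECTION_MODES[mode])
    payload["scope"] = scope  # None mirrors "scope": null in the GET example
    payload["schedule"] = {
        "type": "Month",
        "timezoneId": "Universal",
        "maintenanceStart": first_day,
        "maintenanceEnd": last_day,
        "recurrence": {
            "dayOfMonth": day_of_month,
            "start": start,
            "duration": duration,
        },
    }
    return payload

def build_create_request(base_url, api_token, payload):
    """Build (but do not send) the POST request that creates the window."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/maintenance/",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            # Assumption: token-based authentication via an Authorization header.
            "Authorization": f"Api-Token {api_token}",
            "Content-Type": "application/json",
        },
    )

payload = build_monthly_window(
    "theWindowId", "Again another release", "detect_no_alert",
    {"entities": ["HOST-0B3371A5AC53FF12"]},
    "2017-01-01 00:00", "2017-10-01 00:00", 4, "11:00", 30,
)
request = build_create_request("https://example.invalid", "<your-token>", payload)
print(request.get_method(), request.full_url)
print(json.dumps(payload, indent=2))
```

Keeping payload construction separate from request construction lets a pipeline validate or log the window definition before anything is sent.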

Refer to Dynatrace Help for more details about our maintenance window API.

Once you’ve defined your maintenance windows, Dynatrace flags all problems that occur during maintenance windows with a special maintenance (wrench and bolt) icon (see examples below). If you chose to completely disable problem detection during maintenance windows, no detected problems will be displayed here.

The Problems page filters now include an Under maintenance filter that enables you to view a list of problems that occurred during maintenance windows (see below).

If you open a problem that occurred during a maintenance window, Dynatrace shows a header on the Problem page, as shown below.

Even if you aren’t within a problem context and you select a global timeframe in which a selected host was under maintenance, Dynatrace shows you the details on the Maintenance tile. If the host is included in multiple maintenance periods, Dynatrace shows you the most recent window and a count of how many maintenance windows the host experienced during the selected timeframe.
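The selection described above amounts to an interval-overlap check. The following Python sketch shows one way to compute it. It is an illustration rather than Dynatrace's implementation: the window data is hypothetical, and "most recent" is assumed to mean the window with the latest start time.

```python
from datetime import datetime

def windows_in_timeframe(windows, frame_start, frame_end):
    """Return (most_recent_window, count) for windows overlapping the timeframe.

    Each window is a (start, end) tuple of datetimes.
    """
    overlapping = [
        (start, end) for start, end in windows
        if start < frame_end and end > frame_start
    ]
    if not overlapping:
        return None, 0
    # Assumption: the most recent window is the one that started last.
    most_recent = max(overlapping, key=lambda w: w[0])
    return most_recent, len(overlapping)

# Hypothetical maintenance windows for a single host.
host_windows = [
    (datetime(2017, 8, 1, 11, 0), datetime(2017, 8, 1, 11, 30)),
    (datetime(2017, 8, 29, 14, 43), datetime(2017, 8, 29, 15, 43)),
    (datetime(2017, 9, 4, 11, 0), datetime(2017, 9, 4, 11, 30)),
]

latest, count = windows_in_timeframe(
    host_windows, datetime(2017, 8, 15), datetime(2017, 9, 30)
)
print(f"{count} maintenance windows, most recent started {latest[0]}")
```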

The newly introduced maintenance window feature enables you to identify periods of possibly abnormal operation, such as downtimes, reduced performance periods, and high traffic events during load tests. Defining maintenance windows for such periods helps you reduce alert spam and keep your baseline clean for accurate monitoring and alerting. With a convenient and powerful automation API, your DevOps teams can automatically create or modify maintenance windows in sync with your release pipeline.

One additional aspect that will be part of an upcoming release is the exclusion of planned maintenance windows from your SLA reports. So stay tuned.