Alarms

The Alarms section of MyKeynote provides access to the alarm service, which sends messages to designated email addresses when website performance and/or availability breach thresholds of acceptability that you set. The email messages sent by the alarm service are referred to as alerts and can be formatted to serve your needs.

In conjunction with your carrier or other third-party services, you can set up alert emails to be delivered as text messages to mobile devices.

Alarms are available for Mobile Web Monitoring (MWP), Emulated Browser Web Monitoring (ApP), and Real Browser Web Monitoring (TxP) measurements.  

Note

We do not recommend creating alarms for Mobile App Monitoring (MAM) measurements.

The pages in the Alarms section provide these functions:

  • Summary: Shows both threshold and actual values for configured alarms; lists disabled alarms. 
  • Create: Set up your alarms—choose measurements and set up availability and/or performance alerts with thresholds for each, choose the agents or geographical locations from which data points should be considered, and select email formats, maintenance windows, and alert frequency.
  • Configure: Edit, suspend (disable for a period up to 24 hours), or delete alarms.
  • Email Layouts: Specify the information you want in alert emails and how you want them to be formatted.
  • Maintenance Windows: Set aside time periods for site maintenance during which data points do not trigger alarms. 
  • Baselines: Define a time period (in weeks) as a baseline, which allows alarms to be triggered by comparison with performance and availability computed during the baseline period.
  • Alarm Log: See a list of alarms that have been triggered recently.

Check out this video introducing Keynote Alarms at Dynatrace University, where you can find other videos on various aspects of alarms.

Alarm concepts

There are two basic types of alarms, performance alarms and availability alarms.

Performance alarms

Performance alarms are triggered when the average measurement completion (or page download) time, computed over a fixed period of time or a fixed number of data points, exceeds a predefined threshold. You can consider data points from all geographical locations/agents or only from those you specify. If a transaction-level error occurs (e.g., a page is not downloaded at all, or the page is downloaded but an optional keyword is missing), the data point is not included in the calculation of performance time.
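As an illustration only, the check described above can be sketched in Python. The `DataPoint` record, its field names, and both functions are hypothetical and are not part of any Keynote API. The sketch also assumes that errored data points still occupy a slot in a fixed-size window, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DataPoint:
    # Hypothetical record; field names are assumptions, not a Keynote schema.
    location: str
    seconds: Optional[float]  # completion time, or None if nothing was measured
    error: bool = False       # True if a transaction-level error occurred


def average_performance(points: Sequence[DataPoint], last_n: int) -> Optional[float]:
    """Average completion time over the last `last_n` points, skipping errored points."""
    if last_n <= 0:
        raise ValueError("last_n must be positive")
    window = points[-last_n:]
    times = [p.seconds for p in window if not p.error and p.seconds is not None]
    # With no usable data points there is no average to compare.
    return sum(times) / len(times) if times else None


def performance_breached(points: Sequence[DataPoint], last_n: int,
                         threshold_seconds: float) -> bool:
    """True when the average exceeds the threshold."""
    avg = average_performance(points, last_n)
    return avg is not None and avg > threshold_seconds
```

With three data points of 20 seconds, an error, and 16 seconds, the errored point is skipped and the average is 18 seconds, so a 17-second threshold is breached while an 18-second threshold is not.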

Static and dynamic performance alarms

Thresholds can be static (based on absolute values for performance) or dynamic (expressed as a multiple of the baseline average performance over a selected time span). Baselines are recalculated dynamically from the most recent baseline period of 4 to 6 weeks.

An example of a static performance threshold for sending out a warning alert would be 18 seconds (for measurement or component completion). A dynamic threshold for the same measurement could be 1.5 times the baseline performance over the last 4 weeks. If the baseline performance varied between 15 and 17 seconds, the calculated dynamic threshold would vary from 22.5 to 25.5 seconds. 
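The arithmetic in this example can be reproduced with a short Python sketch. The function name `dynamic_threshold` is an illustrative assumption, not a Keynote setting or API.

```python
def dynamic_threshold(baseline_seconds: float, multiplier: float) -> float:
    """Dynamic threshold expressed as a multiple of the baseline performance."""
    return baseline_seconds * multiplier


STATIC_THRESHOLD_SECONDS = 18.0  # the static example: a fixed absolute value

# The dynamic threshold moves with the baseline; the static one does not.
for baseline in (15.0, 17.0):
    print(f"baseline {baseline} s -> dynamic threshold "
          f"{dynamic_threshold(baseline, 1.5)} s "
          f"(static stays at {STATIC_THRESHOLD_SECONDS} s)")
```

A baseline of 15 seconds gives a threshold of 22.5 seconds and a baseline of 17 seconds gives 25.5 seconds, matching the range quoted above.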

Baselines

Using baselines allows you to set a dynamic performance threshold based on the average performance of your measurement over a baseline period of 4 to 6 weeks. If your site performance varies a lot (e.g., during business hours compared to off-peak hours), we recommend using a dynamic rather than a static performance threshold, which requires setting up a baseline period for comparison. The Baselines help page provides more information on setting up baselines.

Availability alarms

Availability alarms are triggered when the percentage of successful measurements (measurements not reporting errors) is at or below a threshold level you define. The threshold level applies to a period of time or to a specified number of data points. For example, a warning threshold of 75% over 2 hours means that warning alerts are triggered if the success rate over the preceding 2-hour period is 75% or less. A warning threshold of 75% of 25 data points means that warning alerts are triggered if the availability (success rate) of the 25 preceding data points is 75% or less.
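A minimal Python sketch of the data-point variant of this check follows. The function names are hypothetical. Each measurement is represented as a boolean (`True` for success), which is an assumption made for brevity.

```python
from typing import Sequence


def availability_percent(successes: Sequence[bool]) -> float:
    """Percentage of successful measurements in the given window."""
    if not successes:
        raise ValueError("at least one data point is required")
    return 100.0 * sum(successes) / len(successes)


def availability_warning(successes: Sequence[bool], last_n: int,
                         threshold_percent: float) -> bool:
    """True when availability over the last `last_n` points is at or below the threshold."""
    if last_n <= 0:
        raise ValueError("last_n must be positive")
    window = successes[-last_n:]
    return availability_percent(window) <= threshold_percent
```

With a 75% threshold over 25 data points, 19 successes (76%) would not trigger a warning, while 18 successes (72%) would.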

You can consider data points from all or certain specified geographical locations/agents. You can also consider all or specified errors when computing availability.

Note on errors removed from availability

Availability is measured by dividing the number of successful measurements by the total number of measurements. Some internal errors are removed from consideration when computing availability because they may be caused by problems within Keynote agents or infrastructure as opposed to problems with customer sites. To see a list of all Keynote internal errors with information on which ones affect availability, see Internal and External Error Code Listing.
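The effect of removing internal errors can be sketched as follows. This assumes that an excluded measurement is dropped from both the successful count and the total, which the text implies but does not state outright. The error codes used are made up for illustration and are not real Keynote codes.

```python
from typing import Iterable, Optional, Set


def availability(error_codes: Iterable[Optional[int]],
                 excluded: Set[int]) -> Optional[float]:
    """Successful measurements divided by total measurements.

    Each entry is None for a successful measurement or an error code otherwise.
    Measurements whose code is in `excluded` are ignored entirely.
    """
    considered = [code for code in error_codes if code not in excluded]
    if not considered:
        return None  # nothing left to compute availability from
    successes = sum(1 for code in considered if code is None)
    return successes / len(considered)


# Hypothetical codes: 9001 stands in for an internal error, 404 for a site error.
codes = [None, None, None, 9001, 404]
print(availability(codes, excluded={9001}))  # internal error removed
print(availability(codes, excluded=set()))   # all errors counted
```

Excluding the internal error raises the computed availability from 3 of 5 to 3 of 4, so only the site-side failure counts against the site.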

Alarm severity levels

You can set two levels of error alerts for both performance and availability: warning and critical. Alerts for each severity level can be sent to a different email address or addresses.

  • Warning alerts are typically set up to be sent when measurement (website) performance or availability is at a worse-than-desired but not yet critical level. Warning alerts can provide early warning of site problems before they reach critical levels.
  • Critical alerts are typically set up to be sent when performance or availability conditions are seriously degraded and require immediate attention. Critical alerts are more severe than warning alerts.
  • Escalated alerts can be sent if a measurement remains in critical state for a specified period of time after critical alerts have been sent out.
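The escalation rule can be pictured with a small Python sketch. The function and parameter names are hypothetical. The 30-minute default mirrors the half-hour escalation period mentioned under Frequency of alarms. Treating the boundary as inclusive is an assumption.

```python
from datetime import datetime, timedelta


def should_escalate(critical_alert_sent_at: datetime, now: datetime,
                    still_critical: bool,
                    escalation_period: timedelta = timedelta(minutes=30)) -> bool:
    """True when the measurement has stayed critical for the whole escalation period."""
    return still_critical and (now - critical_alert_sent_at) >= escalation_period


sent = datetime(2024, 1, 1, 12, 0)
print(should_escalate(sent, sent + timedelta(minutes=10), still_critical=True))
print(should_escalate(sent, sent + timedelta(minutes=45), still_critical=True))
```

The first call is still inside the escalation period, so it returns `False`; the second is past it while the measurement remains critical, so it returns `True`.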

Frequency of alarms

Alarms have three states: OK (no alarm condition is present), Warning, and Critical. Warning and critical are error states.

  • By default, alerts are sent twice for each severity level when the alarm state changes, i.e., from OK to Warning, OK to Critical, Warning to Critical, or Critical to Warning. You can change the default frequency of alerts and the default escalation time period (half an hour).
  • You can enable an OK alert when the alarm condition is no longer present, i.e., when the state changes from Critical to OK or Warning to OK.
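The state changes listed above can be modeled as a small transition function. This is a sketch, not the alarm service's actual logic: the names are hypothetical, and the configurable number of alerts per severity level is deliberately not modeled.

```python
from enum import Enum
from typing import Optional


class AlarmState(Enum):
    OK = "OK"
    WARNING = "Warning"
    CRITICAL = "Critical"


def alert_for_transition(previous: AlarmState, current: AlarmState,
                         ok_alerts_enabled: bool = False) -> Optional[str]:
    """Return the kind of alert a state change produces, or None for no alert."""
    if previous is current:
        return None  # no state change, so nothing new to report
    if current is AlarmState.OK:
        # OK alerts are optional and are sent only when enabled.
        return "OK" if ok_alerts_enabled else None
    # Any change into Warning or Critical is an error-state alert.
    return current.value
```

For example, `alert_for_transition(AlarmState.WARNING, AlarmState.CRITICAL)` yields a critical alert, while a change from Critical to OK yields an alert only when `ok_alerts_enabled` is set.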

Selecting all vs. specific cities or agents

You can set up your alarms based on data points from geographical locations (cities) or agents. If you choose alerts by cities, you can include/exclude data points from certain cities. Likewise, if you choose alerts by agents, you can include/exclude data points collected by certain agents. If your measurement is deployed in such a way that there is one agent per city, there is no substantive difference between choosing geographical locations and choosing agents. However, if your measurement is carried by a number of different agents in the same city, alerting by agents allows more precise control over the data points included in or excluded from alert calculations.
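The difference in granularity can be shown with a short Python sketch. The `DataPoint` fields, the city and agent names, and both helper functions are hypothetical and serve only to illustrate the selection logic.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class DataPoint:
    # Hypothetical record; names are illustrative, not a Keynote schema.
    city: str
    agent: str
    seconds: float


def select_by_city(points: Iterable[DataPoint], exclude: Set[str]) -> List[DataPoint]:
    """Keep data points from every city except the excluded ones."""
    return [p for p in points if p.city not in exclude]


def select_by_agent(points: Iterable[DataPoint], exclude: Set[str]) -> List[DataPoint]:
    """Keep data points from every agent except the excluded ones."""
    return [p for p in points if p.agent not in exclude]


points = [
    DataPoint("New York", "ny-agent-1", 2.0),
    DataPoint("New York", "ny-agent-2", 9.0),
    DataPoint("Chicago", "chi-agent-1", 3.0),
]

# Excluding the city drops both New York agents at once.
print(select_by_city(points, exclude={"New York"}))
# Excluding a single agent keeps the other New York agent in the calculation.
print(select_by_agent(points, exclude={"ny-agent-2"}))
```

When two agents share a city, only the agent-based selection can drop one of them while keeping the other.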

Content and format of alerts

You can customize the format and content of alert emails. For more information, see Email Layout.

Alarm summary

The Summary tab shows:

  • Enabled alarms, with performance and availability values compared to alarm thresholds
  • Disabled alarms, denoted with icons; temporarily disabled alarms are listed with their threshold levels
  • Current performance and availability threshold settings (for enabled and temporarily disabled alarms)
  • The current actual performance and availability values for enabled alarms

Performance or availability alerting can be disabled by unchecking the Performance Alarms Enabled or the Availability Alarms Enabled boxes in alarm settings.

A suspended alarm is listed as temporarily disabled in the alarm summary. Unlike disabling an alarm, suspension is in effect for a maximum of 24 hours, and the alarm is automatically resumed after the suspension period. Disabled alarms must be re-enabled manually for alerting to resume.

Alarms are also listed as temporarily disabled during a maintenance window.

Note

Alarms for measurements that have expired in the Keynote Service Center (KSC) do not show up in the alarm summary. When the measurement is reactivated, its alarm(s) become visible again with the last saved settings.

If an alarm has warning (yellow) or critical (maroon) levels of performance or availability, you can hover over the alarm name to see the sequence of alerts sent. The most recent alerts are listed first. The image below shows a series of warning and OK alerts leading up to a warning alert and then a critical alert.

Click an alarm Alias to view performance and availability trend graphs for the last 24 hours.