Creating or editing alarm settings

This page explains how to set up an alarm, whether you are creating one from scratch or editing an existing alarm.

  • For general information about alarms and the Alarm Summary, see this help page.

  • For information on preparing for setting up alarms, see this help page.

  • For Transaction Perspective, Application Perspective, and Mobile Web Perspective alarm details, see below.

  • For Streaming Perspective alarm details, see below.

  • To learn about Mass Creating alarms, see below.

  • To learn about Managing or Mass Configuring alarms, see this help page.

Creating alarms

Best practice

Before configuring alarms, run your measurements for a few days or even weeks so you can iron out any early issues related to scripting or provisioning. You want to make sure that your measurement runs smoothly and is measuring what you want to measure. This also allows you to get a better sense of performance and availability expectations and set up baselines. The more data you have to work with, the easier it is to identify anomalies and the true variability in your measurement's performance and availability. If you run your measurement in "test" mode over fewer agents/locations than you eventually plan to deploy, the data you see might not be representative enough for setting baselines and thresholds.

To start the alarm configuration process:

  1. Navigate to > Create (in the Alarms section).

  2. Select a measurement (or group of measurements) that you want to create an alarm for. You can filter and sort the measurement list. ApP and TxP measurements are listed under the type "Transaction" while MWP measurements are listed under the type "Mobile Browser." While MAM measurements are listed under the type "Mobile App," we do not recommend creating alarms for them.

  3. Click Create Alarm at top right. You are directed to the alarm configuration page. Use the explanations and best practices below to fill out details (see Alarm Settings for TxP, ApP, or MWP).

  4. Click Save Changes for your alarm settings to go into effect.

Alarms mass create

When creating alarms, you can choose multiple measurements to share the same settings or choose different settings for each measurement.

Note

The mass create feature offers convenience in alarm creation allowing you to configure multiple alarms with the same settings. Note, however, that a separate alarm is created for each measurement. An alarm can only apply to one measurement at a time, even though each measurement can have many different alarms.

  1. Select Alarms > Create.

  2. Select multiple measurements and click Create Alarm.

  3. Choose whether you want to configure each alarm separately or set up shared settings in alarms for all the measurements chosen.

  4. Click Submit and proceed with alarm creation.

Alarm settings for TxP, ApP, or MWP

Setting up alarms involves making optimal decisions about these and other considerations:

  • Whether to set up performance or availability alerts (or both)
  • Setting up and using baselines
  • Threshold types and levels to balance responsiveness to site issues with notification volume
  • Alert frequency
  • Whether to look back over a fixed period of time or a fixed number of data points to compute performance and availability
  • Basing alarms on data points by geographical location or agents
  • Email formats
  • Maintenance windows

You might also want to have different settings for when you first set up alerts versus later when you have had a chance to correct the errors triggered by the initial alarms.

The sections below describe alarm settings along with advice on best practices for using them. Be sure to select Save or Save Changes at the bottom of the page when done with configuration.

Note

When you create or edit alarm settings, changes can take up to 15 minutes to go into effect.

Alarm address configuration

Alarm Alias Enter an alias to refer to this alarm. While you can set up multiple alarms with different settings for the same measurement, each alarm must have a unique alias.

Tip

It is useful to have the alarm alias refer to the measurement name along with some key alarm settings. For example, if your measurement is called "MetLife IE," your alarm alias could be "MetLife IE Availability" or "MetLife IE Dynamic."
Use this alarm for Dashboards Select this option to assign an alarm to your dashboard views; the Warning and Critical levels displayed for a measurement on a dashboard use thresholds configured for this alarm. You can only assign one alarm per measurement using this flag. If this flag is not enabled on any alarms created for a given measurement, dashboards will use the most demanding active alert threshold.
Clone Config From You may select an existing alarm to copy settings from. You can then adjust the copied settings as needed for the new alarm you are creating.

Tip

When you clone an existing alarm, the Alias defaults to "Copy of <alarm copied>." This can be misleading if unchanged, as it could refer to a totally different measurement name. Be sure to change the alias to refer to the measurement you want to track.
Enable Warning Alert This option enables sending a warning alert when warning thresholds are met. If not selected, only critical alerts are sent. This option is checked by default and it is a best practice to receive both warning and critical alerts.
This option only affects whether warning alerts are sent; warning alarms are shown in the Alarm Log if warning thresholds are met, whether or not this option is checked.
Email Addresses, Email Group These radio buttons appear on the alarm configuration screen if email groups have been set up in the menu > Settings > Email Groups. Select Email Addresses to enter individual addresses for sending warning and critical alerts to. Select Email Group to select an email group for sending warning and critical alerts to.
Send Warning Alert To Enter comma-separated email addresses to receive alert emails when performance or availability reaches warning levels. A maximum of 255 characters is allowed, including spaces. Please also refer to email best practices, below.
Send Critical Alert To Enter comma-separated email addresses to receive alert emails when performance or availability reaches critical levels. This list can be different from the warning alert recipient list. A maximum of 255 characters is allowed, including spaces. Please also refer to email best practices, below.

Best practices for email recipients

  • To get around the character limitations of the email address fields for sending warning and critical alerts, you can enter the name of a distribution list set up within your organization.

  • You can also set up and use an email group in MyKeynote consisting of all the addresses you want to send alerts to. Be sure to select the Email Groups radio button in alarm settings.

    • Note, however, that both warning and critical alerts will be sent to the same email group.
  • You can also set up separate escalation email addresses for performance and availability alerts if an alarm severity (warning or critical) persists for a given duration.
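The 255-character limit on the recipient fields can be checked before pasting a list into alarm settings. The following is a minimal sketch only; the function name and the example addresses are hypothetical, and it assumes the limit applies to the full comma-separated string, including the separating spaces.

```python
from typing import List

# Limit stated for the Send Warning Alert To / Send Critical Alert To fields.
MAX_FIELD_CHARS = 255


def fits_recipient_field(addresses: List[str]) -> bool:
    """Return True if the comma-separated address list fits in the field.

    Hypothetical helper: joins addresses with ", " and counts every
    character, spaces included.
    """
    return len(", ".join(addresses)) <= MAX_FIELD_CHARS


short_list = ["ops@example.com", "oncall@example.com"]
long_list = ["user%03d@example.com" % i for i in range(20)]

print(fits_recipient_field(short_list))
print(fits_recipient_field(long_list))
```

If the list does not fit, a distribution list or an email group, as described above, avoids the limit.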

Time zone configuration

Select the Time Zone to be used in alerts. This is the time zone in which alerts are triggered and displayed in the Alarm Summary. This time zone does not have to be the same as the account time zone.

Performance alarm configuration

Performance Alarms Enabled Check this box to turn on a performance alarm for the measurement.

Tip

You might not want to turn on performance alarms if transaction completion (i.e., availability) is more important to you than the time taken for completion.
If unchecked, this performance alarm shows up as disabled in the alarm summary.
Warning Performance Threshold A warning threshold is used to send alerts when performance is worse than desired, but not yet at critical levels. This is the static threshold or absolute value for sending a warning about performance. If the average performance time exceeds this threshold (over the time period or number of data points specified), a warning alert is sent out. Average performance time is recalculated every 5 minutes, when the alarm server polls measurement data. Your warning static threshold should be lower than your critical static threshold, e.g., 40 seconds and 50 seconds, respectively.
The range of acceptable values is 0.001-14400. If basing the alarm on Total Bytes Downloaded, enter the number of kilobytes.
We do not recommend using static thresholds except under very limited circumstances; please see static threshold best practices, below.
Critical Performance Threshold A critical threshold is used to send alerts when performance degradation is serious. Critical alerts are more severe than warning alerts. This is the static threshold or absolute value for sending a critical alert about performance. If the average performance time exceeds this threshold (over the time period or number of data points specified), a critical alert is sent out. Average performance time is recalculated every 5 minutes, when the alarm server polls measurement data. Your warning static threshold should be lower than your critical static threshold, e.g., 40 seconds and 50 seconds, respectively.
The range of acceptable values is 0.001-14400. If basing the alarm on Total Bytes Downloaded, enter the number of kilobytes.
We do not recommend using static thresholds except under very limited circumstances; please see static threshold best practices, below.
Dynamic Threshold Warning

Note

You need to set up a baseline for a measurement in order to see this field in its alarm settings.
Use this drop-down list to specify a dynamic threshold for warning alerts expressed as a multiple of your baseline performance. If the average performance time exceeds this threshold (over the time period or number of data points specified), a warning alert is sent out. Average performance time is recalculated every 5 minutes, when the alarm server polls measurement data. Dynamic thresholds only work in conjunction with baselines. Baselines measure average performance over a 4 to 6 week period for use as the basis for setting a performance threshold as a multiplier, e.g., 1.1 x baseline performance. If you haven't defined one already, click New to set up a baseline for this measurement (or you can Edit an existing baseline). Only one baseline can be defined per measurement. (See Baselines for more information about defining baselines.) The warning threshold signifies degraded but not-yet-critical performance. Your critical dynamic threshold should be a larger multiple of your baseline than your warning threshold, e.g., warning = 1.5x and critical = 2.0x. If you do not wish to use a dynamic warning threshold, set this value to None. Please also check dynamic threshold best practices, below.
Dynamic Threshold Critical

Note

You need to set up a baseline for a measurement in order to see this field in its alarm settings.
Use this drop-down list to specify a dynamic threshold for critical alerts expressed as a multiple of your baseline performance. If the average performance time exceeds this threshold (over the time period or number of data points specified), a critical alert is sent out. Average performance time is recalculated every 5 minutes, when the alarm server polls measurement data. Dynamic thresholds only work in conjunction with baselines. Baselines measure average performance over a 4 to 6 week period for use as the basis for setting a performance threshold as a multiplier, e.g., 1.1 x baseline performance. If you haven't defined one already, click New to set up a baseline for this measurement (or you can Edit an existing baseline). Only one baseline can be defined per measurement. (See Baselines for more information about defining baselines.) The critical threshold is a higher severity level for degraded performance than the warning threshold. Your critical dynamic threshold should be a larger multiple of your baseline than your warning threshold, e.g., warning = 1.5x and critical = 2.0x. If you do not wish to use a dynamic critical threshold, set this value to None. Please also check dynamic threshold best practices, below.
Component Performance alarms can be based on either the total measurement time (the sum of all measurement components) or on specific components—user experience/network or browser events. Note that Custom Components are defined in your script.

Tip

Standard practice is to use Total Measurement Time.

Best practices for static performance thresholds

  • It is better not to rely on a static threshold. The actual performance of your measurement in different locations could vary widely around an average because of variability in CDN or carrier network speeds. It could also be that your performance changes over time, and you would have to update static thresholds manually in order not to receive false or overly frequent alerts.
  • An example of when you would want to use a static threshold is if you had a strict service agreement for absolute website performance.
  • If using static thresholds, you need not set up a baseline. Set the dynamic threshold levels to None.
  • Alarm configuration in MyKeynote requires that you specify a default static threshold. However, if you want to rely on dynamic settings instead, set static thresholds high enough that they are never reached. So if your measurement performance hovers around 15 seconds, set the static warning threshold to 100 seconds and the critical threshold to 150 seconds so that static alerts are never triggered.

Best practices for dynamic performance thresholds

  • We recommend using dynamic over static thresholds, except in cases where you are bound by a strict service agreement about absolute performance levels.
  • Dynamic thresholds require setting up baselines, so we recommend running your measurement over a large enough sample of agents so you can gather baseline performance data before setting up alarms.
  • Common settings for dynamic performance thresholds are 1.5 x for warnings and 3 x for critical alerts.
  • If you see poor performance in your MyKeynote data (e.g., in a scatter plot) and are concerned about non-delivery of alerts, it could be that the agent or city reporting the error is not included in your alarm configuration. Not every error data point generates an alert; check performance thresholds (and the number of errors it would take to set off an alarm) as well as alert frequency.
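As a rough illustration of how static and dynamic thresholds relate, the sketch below classifies an averaged performance value. It is not the alarm server's actual logic: the function and parameter names are hypothetical, and it assumes that static and dynamic thresholds are evaluated independently, so exceeding either one raises the corresponding severity. The numbers come from the guidance above (performance around 15 seconds, static thresholds parked at 100 and 150 seconds, dynamic multipliers of 1.5 x and 3 x).

```python
from typing import Optional


def classify_performance(avg_seconds: float,
                         static_warning: float,
                         static_critical: float,
                         baseline_seconds: Optional[float] = None,
                         dynamic_warning: Optional[float] = None,
                         dynamic_critical: Optional[float] = None) -> str:
    """Return "OK", "Warning", or "Critical" for an averaged value.

    Assumption: a severity is reached when the average exceeds either the
    static threshold or baseline x multiplier (None disables the latter).
    """
    warning_limits = [static_warning]
    critical_limits = [static_critical]
    if baseline_seconds is not None:
        if dynamic_warning is not None:
            warning_limits.append(baseline_seconds * dynamic_warning)
        if dynamic_critical is not None:
            critical_limits.append(baseline_seconds * dynamic_critical)

    if any(avg_seconds > limit for limit in critical_limits):
        return "Critical"
    if any(avg_seconds > limit for limit in warning_limits):
        return "Warning"
    return "OK"


# Baseline of 15 s: dynamic warning at 22.5 s, dynamic critical at 45 s.
for avg in (20, 25, 50):
    print(avg, classify_performance(avg, 100, 150, 15, 1.5, 3.0))
```

With the static thresholds set far above normal performance, only the dynamic limits ever decide the outcome in this example.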

Availability alarm configuration

Availability Alarms Enabled Check to enable availability alarms. If unchecked, this availability alarm shows up as disabled in the alarm summary.
Warning Availability Threshold A warning availability alert is sent if your measurement availability drops to this percentage of successful runs over the time period or number of data points specified. Availability is recalculated every 5 minutes, when the alarm server polls measurement data. The warning threshold signifies degraded but not-yet-critical availability. Your warning threshold should be a larger percentage than your critical threshold, e.g., warning = 90% and critical = 80%. Please also check Availability Alert Best Practices, below.
Critical Availability Threshold A critical availability alert is sent if your measurement availability drops to this percentage of successful runs over the time period or number of data points specified. Availability is recalculated every 5 minutes, when the alarm server polls measurement data. The critical threshold is a higher severity level for degraded availability than the warning threshold. Your critical threshold should be a smaller percentage than your warning threshold, e.g., warning = 90% and critical = 80%. Please also check Availability Alert Best Practices, below.
Include Transaction Errors Errors that count towards data points included in calculating availability. Refer to the Internal and External error code listing for a description of miscellaneous as well as transaction errors. Default and recommended setting is All. Use Ctrl-click to select multiple errors. Please also check Availability Alert Best Practices, below.

Availability alert best practices

  • Setting availability thresholds can vary quite a bit based on considerations such as how sensitive you want the alarms to be and the known quality of your website, e.g., your site has known errors that you are aware of, and you only want to be alerted on hard downtimes.

  • For a site with intermittent errors but reliable availability, we recommend a warning threshold of 70-75% and a critical threshold of 50-60%. This balances responsiveness to downtimes with alert email volume.

    • While these can seem like high tolerance levels (i.e., many errors before alerts are sent out), we recommend a simple calculation to see how many data points it takes to trip an alert. If you run your measurement on 5 agents every 15 minutes, you would have a total of 40 data points in a 2-hour monitoring period (see General Options below). A warning threshold of 75% would trigger alerts when you had a total of 10 error runs in 2 hours, which is not too many errors before you are alerted.
    • You don't have to wait till the end of the 2-hour monitoring period to receive alerts. The alarm server looks back over the last 2 hours and calculates availability every 5 minutes. With a warning threshold of 75%, you wouldn't have to wait too long to receive alerts if availability hits the threshold.
    • If you had the same measurement running on 10 agents, it would take a total of 20 error runs in a 2-hour period to trip a 75% warning alert. To continue to be warned after 10 error runs, you can drop the monitoring period down from 2 hours to 1 hour.
  • Common practice is to set your availability warning threshold no higher than 85% and then lower it to 75% in a week when availability issues have been ironed out.

  • For the critical availability threshold, common practice is to start out with a threshold of 60%. If your site has a lot of errors (that you are aware of and don't want to be alerted on), set the threshold back to 50%. If your site is fairly stable, leave the setting at 60%.

  • Availability alerts are typically based on All errors, with some miscellaneous errors related to Keynote agent issues automatically discounted. Change this setting to deselect specific known errors, e.g., script failure when known error text is found, that you don't wish to be alerted on; be sure to select other errors to be alerted on.

  • If you see errors in your MyKeynote data (e.g., in a scatter plot) but don't receive alerts, it could be that the agent or city reporting the error is not included in your alarm configuration. Not every error data point generates an alert; check availability thresholds (and the number of errors it would take to set off an alarm) as well as alert frequency.

  • Error data points that trigger an availability alert will still show downloaded objects in the waterfall graph; availability alerts are based on thresholds being breached, regardless of which objects load on the page.
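The "simple calculation" recommended above can be sketched as follows. The function names are hypothetical, and the sketch assumes that every agent runs once per interval and that an alert trips as soon as availability drops to the threshold percentage or below.

```python
import math


def data_points_in_window(agents: int, interval_minutes: int,
                          window_minutes: int) -> int:
    """Data points collected in the look-back window (one run per agent per interval)."""
    return agents * (window_minutes // interval_minutes)


def errors_to_trip(total_points: int, threshold_percent: float) -> int:
    """Smallest number of failed runs that brings availability down to the threshold."""
    return math.ceil(total_points * (100 - threshold_percent) / 100)


# 5 agents, every 15 minutes, 2-hour monitoring period, 75% warning threshold
points = data_points_in_window(5, 15, 120)
print(points, errors_to_trip(points, 75))    # 40 data points, 10 error runs

# Same measurement on 10 agents
points = data_points_in_window(10, 15, 120)
print(points, errors_to_trip(points, 75))    # 80 data points, 20 error runs

# 10 agents with the monitoring period reduced to 1 hour
points = data_points_in_window(10, 15, 60)
print(points, errors_to_trip(points, 75))    # 40 data points, 10 error runs
```

Running the same calculation for your own agent count, run frequency, and monitoring period shows how sensitive a given threshold is before you commit to it.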

Notification management

This section describes notification settings for both performance and availability alerts.

While there are separate settings for performance and availability notifications, their configuration and behavior is the same. You can specify a different alarm frequency, time interval, and escalation email address(es) for performance versus availability alert notifications using the instructions below.

# of Alerts per Severity Level Specifies the number of times an alert is sent for each severity level. The default value is 2. For example, if you set this value to 3, an initial alert is sent to warning or critical email recipients as soon as a threshold is reached. The second alert is sent when the alarm server runs again in 5 minutes and the third notification in 15 minutes (from the original alert). You can also set up optional escalations if your measurement crosses critical thresholds.
Set Alarm Escalation The time period (15 minutes - 2 hours) after which escalated critical alerts are sent out. If your measurement continues to be in critical state at the end of this period, an escalated critical alert is sent out. The time interval begins when the specified number of critical alerts have been sent out. This alert is sent in addition to the specified number of alerts for critical performance or availability. Select No additional notification if you do not want to receive escalated alerts. For example, you could choose to send 3 critical alerts with an escalation time period of 30 minutes. The first alert is sent when the critical threshold is crossed, a second after 5 minutes, and a third after 10 minutes. A fourth, escalation alert is sent if the measurement's critical state persists for 30 minutes after the third alert is sent. Thereafter, escalations are sent every 30 minutes if the measurement remains in critical state. Optionally, you can designate escalation email addresses (in addition to the original alert recipients) to receive escalation alerts when a critical state has continued for the designated period.
Send Escalated Email To Optional comma-separated email addresses for critical escalations and OK alerts (in addition to the original alert recipients). If you leave this field empty, escalation (and OK) alerts are sent to the original alert recipients.
Enable Site OK Alerts If this option is selected, alerts will be sent when the alarm state returns to OK from Warning or Critical state. Alerts are sent to:

Alert frequency and escalation best practices

  • For a starting level of alert notifications, use the default setting for alert frequency—2 alerts per severity level and a 30-minute time interval for sending critical escalations.
  • Consider ignoring the escalation setting if you are not in need of escalation alerts or multiple alerts.
  • Remember that you get to define separate email lists for warning alerts (both performance and availability), critical alerts (both performance and availability), performance escalations, and availability escalations. If a user who is not on your warning or critical recipient list receives alerts, check the escalation lists!
  • To limit the number of alert emails you receive, set up a single alert to notify the correct contacts. Because the alarm server works on a “look-back” method, multiple alerts may be sent otherwise for an issue that has been addressed and/or resolved.
  • If you are concerned about non-delivery of alerts based on errors or poor performance in your MyKeynote data, it could be that the agent or city reporting the error is not included in your alarm configuration. Not every error data point generates an alert; check performance or availability thresholds (and the number of errors it would take to set off an alarm) as well as alert frequency.
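The timing in the Set Alarm Escalation example (3 critical alerts with a 30-minute escalation period) can be laid out as a timeline. This is a sketch under stated assumptions, not the alarm server's implementation: the function name is hypothetical, it assumes one alert per 5-minute polling cycle until the configured count is reached, and it assumes the measurement stays in critical state for the whole duration given.

```python
from typing import List, Optional, Tuple

POLL_MINUTES = 5  # the alarm server polls measurement data every 5 minutes


def critical_alert_schedule(alerts_per_level: int,
                            escalation_minutes: Optional[int],
                            critical_duration_minutes: int) -> List[Tuple[int, str]]:
    """List (minute, kind) notifications for a continuous critical state.

    escalation_minutes=None models "No additional notification".
    """
    if alerts_per_level < 1:
        raise ValueError("alerts_per_level must be at least 1")

    schedule: List[Tuple[int, str]] = []
    # Regular critical alerts: one per polling cycle, starting at minute 0.
    for i in range(alerts_per_level):
        minute = i * POLL_MINUTES
        if minute > critical_duration_minutes:
            break
        schedule.append((minute, "critical"))

    # The escalation interval starts once all regular alerts have been sent.
    if escalation_minutes and len(schedule) == alerts_per_level:
        minute = schedule[-1][0] + escalation_minutes
        while minute <= critical_duration_minutes:
            schedule.append((minute, "escalation"))
            minute += escalation_minutes
    return schedule


for minute, kind in critical_alert_schedule(3, 30, 75):
    print(minute, kind)
```

For a critical state lasting 75 minutes, this yields critical alerts at minutes 0, 5, and 10, followed by escalations at minutes 40 and 70.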

General options

These settings apply to both performance and availability calculation and alerts.

Page Sequence You can choose to alarm based on the performance and availability of an individual transaction page or the entire transaction. Change the default Total setting only if you want to configure separate alarms for each page, perhaps with different thresholds or email recipients.
To map transaction page numbers to names, navigate to the menu > Settings > Page Names and click the Edit button next to a measurement.

Tip

Standard practice is to leave this setting at Total to base the alarm on the aggregate performance and availability of all transaction pages.
Networks
This setting only appears in alarm settings for MWP measurements.
Choose the wireless network type(s). Use Ctrl-click to select multiple options. Availability and performance alarm calculation is based only on data points gathered on the selected networks. The default is All networks.
Alarm based on radio buttons Choose whether to look back over a fixed period of time or a fixed number of data points to compute performance and availability.
  • Time - This is the time period over which performance data is averaged and availability percentage is computed. You can look back over a time period from the Last 5 minutes to 2 hours.

Note

  This is not the time period used in calculating the performance baseline, which is used for setting thresholds for acceptable performance. Nor is this the time period for sending escalated alerts.
  • Data points - This is the number of most recent data points over which performance data is averaged and availability percentage is computed. You can choose the Last 1-20 data points.

Tip

Standard practice is to use thresholds based on Time. Use thresholds based on Data points if you have a very aggressive SLA with your customers.
Count content errors as page errors When measuring the performance of a site, Keynote registers page errors, in which the entire page failed to download, and content errors, in which one or more specific content elements of the page failed to download. Content error runs are depicted with a yellow icon in MyKeynote scatter plots. When this option is checked, pages with content errors are not included when determining aggregate performance, but count as errors when determining availability. If unchecked, runs with content errors are included in performance calculations but not counted as errors for calculating availability.
Maintenance Windows You can specify periods of time during which you do not want to have alarms triggered, typically during scheduled maintenance. Use Ctrl-click to select more than one maintenance window to apply to an alarm. Select an alarm and click Edit to change its configuration; click New to create a new maintenance window (a new maintenance window is not automatically applied to an alarm; you must select and apply it in alarm settings).
Select Email Format If you have set up formats for your alerts using the >Email Layout page, you can select the email format that will be used for this alias.
Include Targets
This option is only available if Target Groups have been created for your account by Keynote.
Target group alarm options
If you have Target Groups set up, you can set the number of targets within a target group that will be included in the calculation of performance or availability for alarm thresholds. You can also select specific targets from the target group to be included. Select Aggregate to include an aggregate (average) of measurements from all selected targets within the target group in triggering alarms. You can also select the specific number of targets that you want included. (Choosing 3, for example, will result in alarms being triggered when at least 3 targets from within the target group have performance or availability that match the current thresholds.) Select All to trigger alarms only when ALL of the targets you've selected meet the current alarm thresholds.
Enable State Management
This option is only available for subscribers to the Keynote Adapter and MyKeynote Inside services.
Checking this box enables MyKeynote to do state management regarding the number of alerts that will be sent. Customers who have EMS systems may prefer to deselect this option so that their EMS systems can control state management.
Include Detail Alarm If selected, alerts use HTML format and contain extended error information (if errors are present) and other diagnostic information.
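The interaction between the Alarm based on setting and Count content errors as page errors can be sketched as below. This is an illustration only: the `Run` record and both functions are hypothetical, and it assumes that runs with page errors are always excluded from the performance average.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Run:
    minutes_ago: int             # age of the data point
    seconds: float               # measured performance
    page_error: bool = False     # entire page failed to download
    content_error: bool = False  # one or more content elements failed


def look_back(runs: List[Run], minutes: Optional[int] = None,
              count: Optional[int] = None) -> List[Run]:
    """Select data points by Time (minutes) or by Data points (count)."""
    newest_first = sorted(runs, key=lambda r: r.minutes_ago)
    if minutes is not None:
        return [r for r in newest_first if r.minutes_ago <= minutes]
    return newest_first[:count]


def summarize(runs: List[Run], content_errors_as_page_errors: bool
              ) -> Tuple[Optional[float], Optional[float]]:
    """Return (average seconds, availability percent) for the selected runs."""
    if not runs:
        return None, None

    def failed(run: Run) -> bool:
        return run.page_error or (content_errors_as_page_errors and run.content_error)

    good = [r for r in runs if not failed(r)]
    availability = 100.0 * len(good) / len(runs)
    average = sum(r.seconds for r in good) / len(good) if good else None
    return average, availability


runs = [Run(5, 10.0), Run(10, 20.0, content_error=True), Run(200, 30.0)]
window = look_back(runs, minutes=120)   # the 200-minute-old run is outside 2 hours
print(summarize(window, content_errors_as_page_errors=True))
print(summarize(window, content_errors_as_page_errors=False))
```

With the option checked, the content-error run lowers availability and is left out of the average; unchecked, it counts as a successful run and contributes its time to the average.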

Geography configuration

These settings apply to both performance and availability calculation and alerts.

Alert based on City Select this option to base this alarm on cities, which allows you to include/exclude data points from certain cities. All Keynote agents located in a city are included/excluded. The alternative is to base the alarm on agents. You can also opt to be alerted only when data points in a specified number of cities cross thresholds.
Alert when...cities meet thresholds Set the number of cities required to meet the alarm thresholds before an alarm is triggered. Selecting Aggregate means that there is no minimum number of cities required to meet alarm thresholds. Choosing 3, for example, will result in alarms being triggered when at least 3 cities have performance or availability that match thresholds. Select All to trigger alarms only when all selected cities meet thresholds.
Include data from Cities Select All to have alarms based on the data points collected from all cities. Choose individual cities (using Ctrl-click) if you want your alarms to be based on measurements from specific cities.
Alert based on Agent Select this option to base this alarm on agents, which allows you to include/exclude data points collected by certain agents. The alternative is to base the alarm on cities. You can also opt to be alerted only when data points collected by a specified number of agents cross thresholds.
Alert when...agents meet thresholds Set the number of agents required to meet the alarm thresholds before an alarm is triggered. Selecting Aggregate means that there is no minimum number of agents required to meet alarm thresholds. Choosing 3, for example, will result in alarms being triggered when at least 3 agents have performance or availability that match thresholds. Select All to trigger alarms only when all selected agents meet thresholds.

Note

This field is not available to Enterprise Perspective logins.
Include data from Agents Select All to have alarms based on the data points collected from all agents. Choose individual agents (using Ctrl-click) if you want your alarms to be based on measurements from specific agents.

Geography configuration best practices

  • If your measurement is deployed on one agent per city, you can choose either the Agent or City radio button to the same effect. If, however, your measurement is deployed on multiple agents/carriers per city, the Agent radio button gives you much more control over which data points within a given city to include/exclude.
  • Excluding cities or agents is useful if you find that some locations/agents are skewing data because they consistently have problems that are not properly diagnosed.
  • You might decide to choose a minimum number of cities/agents that cross threshold values before being alerted to ensure that any performance or availability issues are not specific to one location or carrier. On the flip side, if you set a minimum number of cities or agents required to meet thresholds, you will not get alerted when fewer cities experience even critical levels of performance or availability.
  • If you see errors or poor performance in your MyKeynote data (e.g., in a scatter plot) but don't receive alerts, it could be that the agent or city reporting the error is not included in your alarm configuration. Not every error data point generates an alert; check performance or availability thresholds (and the number of errors it would take to set off an alarm) as well as alert frequency.
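The "Alert when...cities/agents meet thresholds" options can be modeled roughly as follows. The function name and the city names are hypothetical, and the sketch assumes that Aggregate evaluates the thresholds once against the combined data from all selected locations, while a number or All counts how many individual locations cross the threshold.

```python
from typing import Dict, Union


def alarm_triggered(location_breaches: Dict[str, bool],
                    required: Union[int, str],
                    aggregate_breach: bool = False) -> bool:
    """Decide whether an alarm fires for the selected cities or agents.

    location_breaches maps each selected location to whether its own data
    crosses the threshold; aggregate_breach is the result for the combined
    data and is only used when required == "Aggregate".
    """
    if required == "Aggregate":
        return aggregate_breach
    breaching = sum(location_breaches.values())
    if required == "All":
        return len(location_breaches) > 0 and breaching == len(location_breaches)
    return breaching >= int(required)


cities = {"Boston": True, "Dallas": True, "Seattle": False, "Denver": True}
print(alarm_triggered(cities, 3))        # at least 3 cities cross the threshold
print(alarm_triggered(cities, "All"))    # Seattle is still within thresholds
```

The second call shows the flip side described above: with All selected, three cities at critical levels still do not trigger an alarm.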