Overview
Collect, view, and analyze metrics from your NetApp OnTap clusters in context with the hosts, applications, and services already monitored by OneAgents. Use the charting and dashboarding capabilities, and let the Davis® AI causation engine generate baselines and alert you when anomalies are detected in designated metrics.
Metrics are collected for the OnTap cluster as a whole as well as for each of your nodes and storage virtual machines (SVMs).
Compatibility information
The extension connects to and collects data from the NetApp OnTap REST API, which is available in OnTap 9.6+.
Details
Assets
A NetApp OnTap Overview dashboard is included with the extension. This includes links to access the various OnTap entities detected.

Metric events
Three metric event configurations are included with the extension. These must be enabled in the Metric events for alerting settings before they will be active:
- OnTap Cluster monitoring unavailable
- OnTap FRU in error state
- High Temperature on OnTap Node
Metrics
Metrics are associated with different feature sets that can be enabled or disabled as needed. Metrics will be collected once per minute.
Default
Cannot be disabled.
- Cluster availability: Connectivity to the configured OnTap cluster URL as detected by the extension
Aggregates
- Aggregate state: Current aggregate state: online, onlining, offline, offlining, relocating, unmounted, restricted, inconsistent, failed, or unknown
- Aggregate block storage used: Space used or reserved in bytes. Includes volume guarantees and aggregate metadata.
- Aggregate block storage available: Space available in bytes
- Aggregate block storage size: Total usable space in bytes, not including WAFL reserve and aggregate Snapshot copy reserve
- Aggregate block storage used percentage: Percentage of block storage used
Clusters
- Cluster IOPS (other, read, write, and total): The cluster's number of I/O operations observed at the storage object
- Cluster throughput (other, read, write, and total): The cluster's rate of throughput bytes observed at the storage object
- Cluster latency (other, read, write, and total): The cluster's raw latency in microseconds observed at the storage object
- Cluster block storage size: The size of the cluster's block storage
- Cluster block storage used: Amount of block storage on the cluster in use
- Cluster block storage used percentage: The percentage of the cluster's block storage that is currently in use
Disks
- Rated life used: Percentage of rated life used
- Disk state: Current disk state: broken, copy, maintenance, partner, pending, present, reconstructing, removed, spare, unfail, or zeroing
Field Replaceable Units (FRUs)
- FRU state: State of the field replaceable unit (100% for OK, 0% for ERROR)
Nodes
- Node uptime: How long the node reports it has been running
- Over temperature status: Specifies whether the hardware is currently operating outside of its recommended temperature range (1 = "normal", 2 = "over").
Storage Virtual Machines (SVMs)
- SVM state: Current SVM state: starting, running, stopping, stopped, or deleting
Volumes
- Volume state: Volume state: error, mixed, offline, or online
- Volume throughput (other, read, write, and total): The volume's rate of throughput bytes observed at the storage object
- Volume IOPS (other, read, write, and total): The volume's number of I/O operations observed at the storage object
- Volume latency (other, read, write, and total): The volume's raw latency in microseconds observed at the storage object
- Volume size: Total provisioned size
- Volume space available: The available space
- Volume space used: Volume space used (including data and metadata)
- Volume space used percentage: Percentage of volume space used (including data and metadata)
LUNs
- LUN state: The state of the LUN. Normal states for a LUN are online and offline. Other states indicate errors
- LUN container state: The state of the volume and aggregate that contain the LUN. LUNs are only available when their containers are available
- LUN enabled state: The enabled state of the LUN. LUNs can be disabled to prevent access to the LUN. 1 = enabled, 0 = disabled
- LUN space used: The amount of space consumed by the main data stream of the LUN
- LUN size: The total provisioned size of the LUN
- LUN space used percentage: Space used in the LUN as a percentage
Installation
Requirements
- NetApp OnTap version 9.6+ with REST API reachable
- OnTap user with 'http' application access that is assigned a rest-role with at least readonly access to the following API paths (may vary depending on enabled feature sets):
- /api/cluster
- /api/svm/svms
- /api/storage/cluster
- /api/storage/aggregates
- /api/storage/disks
- /api/storage/volumes
- /api/storage/luns
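Before creating a monitoring configuration, it can help to confirm that the OnTap user can actually read each of these paths. The following is a minimal sketch, not part of the extension: it assumes HTTP Basic authentication against the REST API, and the helper names `build_urls` and `check_access` are hypothetical.

```python
import base64
import ssl
import urllib.error
import urllib.request

# API paths listed in the requirements above.
REQUIRED_PATHS = [
    "/api/cluster",
    "/api/svm/svms",
    "/api/storage/cluster",
    "/api/storage/aggregates",
    "/api/storage/disks",
    "/api/storage/volumes",
    "/api/storage/luns",
]


def build_urls(base_url):
    """Join the cluster base URL with each required API path."""
    root = base_url.rstrip("/")
    return [root + path for path in REQUIRED_PATHS]


def check_access(base_url, username, password, verify_ssl=True, timeout=10):
    """Return {url: HTTP status code or error text} for each required path.

    Assumption: the cluster accepts HTTP Basic authentication.
    """
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    context = ssl.create_default_context()
    if not verify_ssl:
        # Mirrors turning off certificate verification; use with care.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    results = {}
    for url in build_urls(base_url):
        request = urllib.request.Request(
            url,
            headers={
                "Authorization": f"Basic {token}",
                "Accept": "application/json",
            },
        )
        try:
            with urllib.request.urlopen(
                request, timeout=timeout, context=context
            ) as response:
                results[url] = response.status
        except urllib.error.HTTPError as err:
            results[url] = err.code  # e.g. 401 or 403 for missing access
        except urllib.error.URLError as err:
            results[url] = str(err.reason)  # network or TLS problem
    return results
```

Calling `check_access("https://ontap-prod/", "monitor-user", "secret")` (placeholder credentials) returns one entry per path. Any result other than 200 suggests the rest-role is missing readonly access to that path, or that the cluster is not reachable from where the check runs.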
Dynatrace configuration
Find 'NetApp OnTap' on the in-product Extensions or Hub page and activate it. If your environment is offline, you can download the extension from the 'Versions' section of this Hub page and install it as a custom extension.
Monitoring configurations
Once activated in your environment you can create monitoring configurations. Each monitoring configuration can have one or more OnTap clusters configured.
First select the desired ActiveGate group that will run the monitoring configuration.
For each cluster configure a NetApp OnTap Extension Endpoint:
- OnTap REST API URL: URL (including protocol) where the OnTap API is available (e.g. https://ontap-prod/)
- Cluster name: Used in naming the cluster entity (default is detected hostname)
- Username: For API access
- Password: For API access (check requirements section for needed permissions)
- Proxy
- Verify SSL certificate
The Frequency can be used to collect metrics less frequently than the default of once per minute. You may need to use this in large clusters where collecting all requested data would take longer than 1 minute.
The Log level is set at the monitoring configuration level and applies to all endpoints. The default is INFO; DEBUG logging is only needed when investigating issues with support.
Finally, enable the desired feature sets (refer to the Details tab for what metrics are associated with which feature sets).

Licensing
There is no charge for obtaining the extension, only for the data (metrics & events) that the extension ingests. The details of license consumption will depend on which licensing model you are using. This will either be Dynatrace classic licensing or the Dynatrace Platform Subscription (DPS) model.
Metrics
License consumption is based on the number of metric data points ingested. The following formula will provide approximate annual data points ingested assuming all feature sets are enabled:
(16 + (3 x nodes) + (1 x frus) + (1 x svms) + (2 x disks) + (5 x aggregates) + (17 x volumes) + (6 x LUNs)) x 60 min x 24 h x 365 days data points/year
Classic licensing
In the classic licensing model, metric ingestion will consume Davis Data Units (DDUs) at the rate of 0.001 DDUs per metric data point.
Multiply the above formula for annual data points by 0.001 to estimate annual DDU usage.
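The estimate above can be worked through in a few lines of Python. This is an illustrative sketch only: the function names and the example entity counts are made up, while the per-entity coefficients and the DDU rate are taken from the text above.

```python
MINUTES_PER_YEAR = 60 * 24 * 365  # one collection per minute for a year
DDU_PER_DATA_POINT = 0.001        # classic licensing rate stated above


def annual_data_points(nodes, frus, svms, disks, aggregates, volumes, luns):
    """Approximate metric data points per year with all feature sets enabled."""
    per_minute = (
        16
        + 3 * nodes
        + 1 * frus
        + 1 * svms
        + 2 * disks
        + 5 * aggregates
        + 17 * volumes
        + 6 * luns
    )
    return per_minute * MINUTES_PER_YEAR


def annual_ddus(data_points):
    """Convert annual data points to approximate annual DDUs (classic model)."""
    return data_points * DDU_PER_DATA_POINT


# Hypothetical cluster: 2 nodes, 10 FRUs, 4 SVMs, 24 disks,
# 4 aggregates, 50 volumes, 20 LUNs.
points = annual_data_points(
    nodes=2, frus=10, svms=4, disks=24, aggregates=4, volumes=50, luns=20
)
print(points)  # 564494400
print(round(annual_ddus(points)))
```

For this hypothetical cluster the formula gives 1,074 data points per minute, or about 564 million per year, which corresponds to roughly 564,494 DDUs. Volumes dominate the total, so volume count is the figure worth estimating most carefully.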
Log records
This extension reports log events in two situations:
- When a cluster node restart is detected
- When the extension cannot connect to the configured cluster API endpoint. Another event is reported each minute until the issue is resolved and a successful connection occurs
Log management and analytics (powered by Grail)
License consumption is based on the size (in bytes) of data ingested & processed, retained, and queried, so there is no single formula to estimate the total consumption from this extension. Consult the log management and analytics documentation for details on the other dimensions that will affect license consumption.
Classic licensing
In the classic licensing model, log record ingestion will consume Davis Data Units (DDUs) at the rate of 100 DDUs per Gigabyte of log records ingested.
Log monitoring classic
In log monitoring classic, license consumption is based on the number of ingested log records.
Classic licensing
In the classic licensing model, log record ingestion will consume Davis Data Units (DDUs) at the rate of 0.0005 DDUs per ingested log record.
Multiply estimated ingested log records by 0.0005 to estimate DDU usage from log records.
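As a rough illustration of the per-record rate, the sketch below estimates log records and DDUs for a period. It assumes one record per detected node restart and one record per minute of failed connectivity (the default collection frequency); the function names and example numbers are hypothetical.

```python
DDU_PER_LOG_RECORD = 0.0005  # log monitoring classic rate stated above


def estimated_log_records(node_restarts, outage_minutes):
    """Estimate log records produced by the extension.

    Assumptions: one record per detected node restart, and one record
    per minute while the cluster API endpoint cannot be reached.
    """
    return node_restarts + outage_minutes


def log_record_ddus(records):
    """Convert ingested log records to approximate DDUs."""
    return records * DDU_PER_LOG_RECORD


# Hypothetical year: 12 node restarts and 180 minutes of lost connectivity.
records = estimated_log_records(node_restarts=12, outage_minutes=180)
print(records)  # 192
print(round(log_record_ddus(records), 4))
```

Under these assumptions the extension's log records contribute well under one DDU per year, so metric ingestion is normally the dominant cost.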