Microsoft SQL Server monitoring

Monitor a Microsoft SQL Server instance remotely using the ActiveGate extension.

Prerequisites

  • Microsoft SQL Server with active extended support by Microsoft. Check the Microsoft documentation for end-of-life details.
  • Dynatrace Environment ActiveGate 1.171+ (we recommend the lastest)
  • ActiveGate host OS: Windows Server 2012 R2+, Debian 9+, or Red Hat 8+
  • Visual C++ Redistributable (Windows only). See the Microsoft documentation required for dll dependencies of extension.
  • Database user permissions as described below.

Database user permissions

Permissions specified in the Microsoft documentation for sys.dm_os_performance_counters are required.

  • On SQL Server and SQL Managed Instance, requires VIEW SERVER STATE permission.
  • On SQL Database Basic, S0, and S1 service objectives, and for databases in elastic pools, the server admin account or the Azure Active Directory admin account is required.
  • On all other SQL Database service objectives, the VIEW DATABASE STATE permission is required in the database.

Note: Windows Auth and Active Directory (AD) are not supported for connection credentials for this extension.

Extension installation

To install the extension

  1. In the Software Intelligence Hub, find and select Microsoft SQL Server (MSSQL - remote monitoring).

  2. On the Microsoft SQL Server (MSSQL - remote monitoring) page, select Download for the appropriate target platform to get the extension ZIP file. Don't rename the file.

    • The operating system specifies the system of the host on which ActiveGate is running, not the operating system on which the database is running.
    • The Red Hat 7.x option is for a legacy version that is no longer supported or updated due to incompatibility between the extension drivers and the older Linux kernel (Red Hat 7.x generation). For Red Hat 8+, download the Linux option.
  3. Unzip the ZIP file to the plugin_deployment directory of your ActiveGate host.

    plugin_deployment/custom.remote.python.mssql/<files>

  4. In the Dynatrace menu, go to Settings and select Monitoring > Monitored technologies.

  5. Select Add new technology monitoring.

  6. Select Add ActiveGate extension.

  7. Select Upload extension and upload the ZIP file.

  8. After you upload the extension, go to Settings > Monitoring > Monitored technologies and switch to the Custom extensions tab.

  9. Find the extension and select it to open it to display Add new endpoint.

  10. Configure the endpoint as described below.

Endpoint configuration

Field Details

Endpoint name

A meaningful endpoint name. Example value: Management01

Connection host

The address of the Microsoft SQL Server host. Can be an IP address or a domain. If this field contains a backslash \, then the part after the backslash is used as a named instance. Example values:

  • mssql.example.com
  • 204.146.46.209
  • mssql.example.com\mgt01 (host mssql.example.com and named instance mgt01)

The value of the named instance is used for queries to statistics objects.

  • If no instance name is supplied (for example, mssql.example.com), SQLServer:<Statistics object> is queried.
  • If the name of the named instance is supplied (for example, mssql.example.com\mgt01), MSSQL$mgt01:<Statistics object> is queried.

Connection port

Port on the host on which SQL Server is ready to accept connections from ActiveGate. Example value: 1433.

Database name

Name of the logical database to connect to. This can be any database to which the supplied database user can connect. Only one logical database name can be specified here. Example value: primary

Connection user

Database user whose credentials are going to be used to establish a connection to the database and query performance counters.

  • See database user permissions above for required permissions.
  • Windows Auth and Active Directory credentials are not supported.

A good practice is to create a separate user for Dynatrace to query required views.

User password

Password for the database user.

Source from which custom device name is pulled

Determines the name of the custom device associated with the monitored server. For details, see Custom device and custom device group below.

Manual name for the custom device

When Source from which custom device name is pulled is set to Custom, this value is used as a custom device name. For details, see Custom device and custom device group below.

Source from which custom device group name is pulled

Determines the name of the custom device group associated with the monitored server. For details, see Custom device and custom device group below.

Manual name for the custom device group

When Source from which custom device group name is pulled is set to Custom, this value is used as a custom device group name. For details, see Custom device and custom device group below.

Choose ActiveGate

The ActiveGate that will run the extension (where the extension is installed).

Custom device and custom device group

This section describes how to configure:

  • Source from which custom device name is pulled
  • Manual name for the custom device
  • Source from which custom device group name is pulled
  • Manual name for the custom device group

When the extension first registers the monitored database with the tenant, it creates a custom device with which all of the collected metrics for this endpoint are going to be associated.

These fields control the names of the custom device and the custom device groups associated with the monitored instance.

Each SQL Server database monitored through this extension is associated with its own custom device. By default, the name of that custom device the same as the name of the logical database entered into the Database name connection field in the configuration endpoint.

Custom devices are organized into custom device groups. Each custom device can only belong to a single group. But every group may contain several devices. By default, the name of the custom device group for the monitored database is Host:Port, where values for host and port are taken from the connection fields Connection host and Connection port.

Possible configurable sources for names of the custom device and group:

  • Database Name: value from the field Database name is used as a name.
  • Host: value from Connection host is used as a name.
  • Host:Port: combination of connection host and port (separated by a colon) is used.
  • Custom: Custom name from the neighboring Manual name… field is used.

Configuration of the custom device name

Custom device name configuration

Configuration of the custom device group name

Custom device group name configuration

Metric collection configuration

To configure metric collection

  1. Select or clear Only collect metrics from a logical database extension is connected to.

    When the extension connects to SQL Server from ActiveGate, it connects to a specific logical database. The database user specified in the connection fields queries serverwide performance counters and not the logical database specified in the Database name field.

    • Most metrics collected from the server are serverwide: they don't belong to any specific logical database, but rather to the whole server.
    • Some metrics, however, are collected per logical database, in which case a group of metrics is collected for each logical database on the server.

    When this checkbox is selected, you force the extension to collect only systemwide metrics and a single group of per-database metrics just for the database specified in the Database name field.

  2. Select the Collect <MetricName> checkbox for each metric you want to collect.

    Individual checkboxes enable you to trigger the collection of individual metrics for fine-grained control of the extension.

    • When a metric checkbox is cleared, the metric is not sent to the tenant and does not appear on graphs in the web UI.
    • If none of the metrics that should be present on a graph is reported, the graph is not rendered.

Metrics

The following metrics are available.

General statistics

The SQLServer:General Statistics object in SQL Server provides counters to monitor general server-wide activity, such as the number of current connections and the number of users connecting and disconnecting per second from computers running an instance of SQL Server. This can be useful when you are working on large online transaction processing (OLTP) type systems where there are many clients connecting and disconnecting from an instance of SQL Server.

Metric name Metric description

Database service availability (%)

Availability of the database host.

User Connections (count)

The number of users currently connected to SQL Server.

Logins/sec

Total number of logins started per second. This does not include pooled connections.

Processes blocked (count)

Number of currently blocked processes.

Latch metrics

The SQLServer:Latches object in Microsoft SQL Server provides counters to monitor internal SQL Server resource locks called latches. Monitoring the latches to determine user activity and resource usage can help you to identify performance bottlenecks.

Metric name Metric description

Average Latch Wait Time (ms)

Average latch wait time (in milliseconds) for latch requests that had to wait.

Latch Waits/sec

Number of latch requests that could not be granted immediately.

Lock statistics

The SQLServer:Locks object in Microsoft SQL Server provides information about SQL Server locks on individual resource types. Locks are held on SQL Server resources, such as rows read or modified during a transaction, to prevent concurrent use of resources by different transactions. For example, if an exclusive (X) lock is held on a row within a table by a transaction, no other transaction can modify that row until the lock is released. Minimizing locks increases concurrency, which can improve performance. Multiple instances of the Locks object can be monitored at the same time, with each instance representing a lock on a resource type.

Metric name Metric description

Lock Timeouts/sec

Number of lock requests per second that timed out, including requests for NOWAIT locks.

Lock Wait Time (ms)

Total wait time (in milliseconds) for locks in the last second.

Lock Waits/sec

Number of lock requests per second that required the caller to wait.

Deadlocks/sec

Number of lock requests per second that resulted in a deadlock.

SQL statistics

The SQLServer:SQL Statistics object in SQL Server provides counters to monitor compilation and the type of requests sent to an instance of SQL Server. Monitoring the number of query compilations and recompilations and the number of batches received by an instance of SQL Server gives you an indication of how quickly SQL Server is processing user queries and how effectively the query optimizer is processing the queries.

Compilation is a significant part of a query's turnaround time. To save compilation costs, the database engine saves the compiled query plan in a query cache. The objective of the cache is to reduce compilation by storing compiled queries for later reuse so they don't have to be recompiled if executed again later. However, each unique query must be compiled at least one time. Query recompilations can be caused by the following factors:

  • Schema changes, including base schema changes such as adding columns or indexes to a table, or statistics schema changes such as inserting or deleting a significant number of rows from a table.
  • Environment (SET statement) changes. Changes in session settings such as ANSI_PADDING or ANSI_NULLS can cause a query to be recompiled.
Metric name Metric description

Batch Requests/sec

Number of Transact-SQL command batches received per second. This statistic is affected by all constraints (such as I/O, number of users, cache size, and complexity of requests). High batch requests means good throughput.

SQL Compilations/sec

Number of SQL compilations per second. Indicates the number of times the compile code path is entered. Includes compiles caused by statement-level recompilations in SQL Server. After SQL Server user activity is stable, this value reaches a steady state.

SQL Re-Compilations/sec

Number of statement recompilations per second. Counts the number of times statement recompiles are triggered. Generally, you want this to be low.

Memory statistics

Monitoring server and process memory usage to gauge user activity and resource usage can help you to identify performance bottlenecks. Monitoring the memory used by an instance of SQL Server can help determine:

  • If bottlenecks exist from inadequate physical memory for storing frequently accessed data in cache. If memory is inadequate, SQL Server must retrieve the data from disk.
  • If query performance can be improved by adding more memory or by making more memory available to the data cache or SQL Server internal structures.
Metric name Metric description

Target Server Memory (KB)

Indicates the ideal amount of memory the server can consume.

Total Server Memory (KB)

Specifies the amount of memory the server has committed using the memory manager.

Connection Memory (KB)

Specifies the total amount of dynamic memory the server is using for maintaining connections.

Memory Grants Outstanding (count)

Specifies the total number of processes that have successfully acquired a workspace memory grant.

Memory Grants Pending (count)

Specifies the total number of processes waiting for a workspace memory grant.

Database statistics

Monitor transactions and the transaction log to determine how much user activity is occurring in the database and how full the transaction log is becoming. The amount of user activity can determine the performance of the database and affect log size, locking, and replication. Monitoring low-level log activity to gauge user activity and resource usage can help you to identify performance bottlenecks.

Multiple instances of the Databases object, each representing a single database, can be monitored at the same time.

Note that the metrics listed below are collected for each logical database present on the instance to which the endpoint connects. Distinct metrics for each logical database will be available as dimensions on the Further details tab of the monitored technology.

To limit collection of the metrics in this section to a single logical database, select the Only collect metrics from a logical database extension is connected to checkbox in endpoint configuration.

Metric name Metric description

Log File(s) Size (KB)

Cumulative size (in kilobytes) of all the transaction log files in the database.

Log File(s) Used Size (KB)

The cumulative used size of all the log files in the database.

Log Flush Waits/sec

Number of commits per second waiting for the log flush.

Log Growths

Total number of times the transaction log for the database has been expanded.

Log Shrinks

Total number of log shrinks for this database.

Log Truncations

The number of times the transaction log has been truncated (in Simple Recovery Model).

Log Used (%)

Percentage of space in the log that is in use.

Transactions/sec

Number of transactions started for the database per second. Transactions/sec does not count XTP-only transactions (transactions started by a natively compiled stored procedure).

Buffer Manager statistics

The Buffer Manager object provides counters to monitor how SQL Server uses:

  • Memory to store data pages.
  • Counters to monitor the physical I/O as SQL Server reads and writes database pages.
  • Buffer pool extension to extend the buffer cache by using fast non-volatile storage such as solid-state drives (SSDs).

Monitoring the memory and the counters used by SQL Server helps you determine:

  • If bottlenecks exist from inadequate physical memory. If it cannot store frequently accessed data in cache, SQL Server must retrieve the data from disk.
  • If query performance can be improved by adding more memory, or by making more memory available to the data cache or SQL Server internal structures.
  • How often SQL Server needs to read data from disk. Compared with other operations, such as memory access, physical I/O consumes a lot of time. Minimizing physical I/O can improve query performance.
Metric name Metric description

Page reads/sec

Indicates the number of physical database page reads that are issued per second. This statistic displays the total number of physical page reads across all databases. Because physical I/O is expensive, you may be able to minimize costs by changing the database design or by using a larger data cache, intelligent indexes, and more efficient queries.

Page writes/sec

Indicates the number of physical database page writes that are issued per second.

Page life expectancy (seconds)

Indicates the number of seconds a page will stay in the buffer pool without references.

Free list stalls/sec

Indicates the number of requests per second that had to wait for a free page.

Checkpoint pages/sec

Indicates the number of pages flushed to disk per second by a checkpoint or other operation that require all dirty pages to be flushed.

Buffer cache hit (%)

Indicates the percentage of pages found in the buffer cache without having to read from disk. The ratio is the total number of cache hits divided by the total number of cache lookups over the last few thousand page accesses. After a long period of time, the ratio moves very little. Because reading from the cache is much less expensive than reading from disk, you want this ratio to be high. Generally, you can increase the buffer cache hit ratio by increasing the amount of memory available to SQL Server or by using the buffer pool extension feature.

Availability Replica

The SQLServer:Availability Replica performance object contains performance counters that report information about the availability replicas in Always On availability groups in SQL Server.

All availability replica performance counters apply to both the primary replica and the secondary replicas, with send/receive counters reflecting the local replica.

  • For the most part, the primary replica sends most of the data, and the secondary replicas receive the data.
  • However, secondary replicas send ACKs and some other background traffic to the primary replicas.

Note that, on a given availability replica, some counters will show a zero value, depending on the current role—primary or secondary—of the local replica.

Metric name Metric description

Bytes sent to replica/sec

In SQL Server 2012 and 2014: Actual number of bytes (compressed) sent per second over the network to the remote availability replica (sync or async). Compression is enabled for both sync and async replica by default.

In SQL Server 2016 (13.x) and later: Number of bytes sent to the remote availability replica per second. Before compression for async replica. (Actual number of bytes for sync replica, which has no compression.)

Resource footprint

The Microsoft SQL Server remote extension has a low memory footprint, requiring just a few MiB per endpoint after the extension is initialized. Multiple endpoints running on the same host reuse the same objects in memory.

Each endpoint runs once per minute and performs 5-6 lightweight SQL queries to the monitored database to collect the metrics.

DDU consumption

By default, the extension collects and reports the following number of metrics each minute:

  • 24 instance-wide metrics
  • 8 metrics per logical database on the server (Database statistics metrics)

For example, if an endpoint monitors an instance with 10 logical databases, the number of metrics reported per minute is:

24 instance-wide metrics + 8 logical database metrics * 10 logical databases on instance = 104 metrics reported every minute

Troubleshoot

General

For general ActiveGate extension troubleshooting information, see Troubleshoot ActiveGate extensions.

The remote monitoring extension for Microsoft SQL Server employs JDBC connections to query performance metrics from SQL Server instance by connecting to it from ActiveGate.

Logs

To take a look at the logs, check the following locations:

  • On Linux, logs are stored at /var/lib/dynatrace/remotepluginmodule/log/remoteplugin
  • On Windows, logs are stored at %PROGRAMDATA%\Dynatrace\remotepluginmodule\log\remoteplugin

When the extension is started successfully by ActiveGate, a separate log directory named custom.remote.python.mssql is created for it. The logs for the MS SQL extension contain all details about connection problems, SQL Server issues, and issues with the extension itself.

Limitations

Topic Limitation

Events and alerts

Alerts and events must be configured manually for remote Microsoft SQL Server monitoring extension. The extension does not come with predefined setups or alert configurations out of the box.

Links to other monitored entities, SmartScape

Being built on top of the Dynatrace Extension Framework 1.0, the SQL Server extension does not support interlinking of custom devices monitored by custom extension like this one and the processes monitored by OneAgent.

That means that a custom device that represents a SQL Server monitored by this extension remotely will be presented as a separate entity from the database process monitored by OneAgent on the same SQL server. In SmartScape, they will look like two distinct nodes. This is a limitation of the Extension Framework 1.0 and will be fixed with the next generation of the framework.

Query plans, top queries

The extension only collects metrics available in the performance counters views in the database. It does not analyze query plans or most frequent and slowest queries. However, such information can be collected by the OneAgent using local monitoring for SQL Server.

Known issues

Restart extension

To restart the extension, use one of these options:

  • Restart remote plugin module
  • Restart ActiveGate