I’ve been on a SharePoint Performance Evangelist Tour for the past couple of months. Here are some of my contributions to making Users – and, as a result, SharePoint Admins – happier than they are right now:

As part of my tour, we extended the Share Your PurePath program to SharePoint users. In exchange for a free trial extension, these users send me Dynatrace performance data captured from their environment and I analyze it for them.

In this blog I want to highlight some of the key takeaways from the data that was shared with me. The first step of analysis is always the transaction flow that highlights how components communicate with each other and where the performance and functional issues are:

Dynatrace Monitoring Dashboard highlighting where the hotspots in your SharePoint Infrastructure are

#1: How to validate user complaints

When somebody calls in and says: “SharePoint is slow – AGAIN!” it’s not easy for a SharePoint Administrator to validate whether there is a real problem or not. The following screenshot shows a Dynatrace User Action, which gets captured for every single user on your SharePoint installation. Picking the actions of a complaining user, with all technical details, makes troubleshooting much easier. In fact – this user was complaining for a good reason.

Every User Action is captured in full technical detail showing where time is really spent when users navigate through SharePoint pages.
Root cause is excessive and slow SQL Execution. A single click brings us to the actual SQL Statements

#2: Gaining insight into the Impact of SQL Server on SharePoint Performance

Access to the SharePoint Content Database happens transparently to SharePoint Administrators and Users. But – depending on how many Web Parts you add to your SharePoint Sites and Pages, which ones you choose, and how they are configured – Database Performance impacts your SharePoint performance and End User Experience. Knowing what is really going on between SharePoint and SQL Server is essential before you start optimizing SQL Server or the web pages you designed.

Here is an overview of the Dynatrace Database Dashlet, which highlights problematic SQL Executions. By problematic I mean those that execute slowly and those that are executed very often. Both cases can be optimized either on SQL Server or in SharePoint.

Slow SQLs as input for the SQL Server Admin or the Web Part developer to optimize indices or the statements themselves
SQLs that are called multiple times per request are typically caused by bad coding of Web Parts or overloaded SharePoint pages.
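To make that second pattern more concrete, here is a minimal sketch of the classic “N+1 query” mistake in a Web Part, next to a batched alternative. The list names (Orders, Customers), the CustomerId field and the query are hypothetical and only for illustration – the point is that the first version causes one extra content-database roundtrip per item, which is exactly what shows up in the Database Dashlet as the same statement executed over and over within a single page request.

```csharp
using System;
using Microsoft.SharePoint;   // SharePoint server object model

namespace PerfSamples
{
    public static class OrderRendering
    {
        // Anti-pattern: one extra content-database query per item ("N+1 queries").
        public static void RenderOrdersChatty(SPWeb web)
        {
            SPList orders = web.Lists["Orders"];          // hypothetical list
            foreach (SPListItem order in orders.Items)    // enumerates every item in the list
            {
                // Each GetItemById call is another roundtrip to SQL Server.
                SPListItem customer =
                    web.Lists["Customers"].GetItemById(Convert.ToInt32(order["CustomerId"]));
                // ... render order + customer ...
            }
        }

        // Better: one bounded CAML query that fetches only the fields that are rendered.
        public static void RenderOrdersBatched(SPWeb web)
        {
            SPQuery query = new SPQuery
            {
                ViewFields = "<FieldRef Name='Title'/><FieldRef Name='CustomerId'/>",
                RowLimit = 100   // never pull the whole list onto a single page
            };
            SPListItemCollection items = web.Lists["Orders"].GetItems(query);
            foreach (SPListItem item in items)
            {
                // ... render from the already-loaded fields ...
            }
        }
    }
}
```

The first variant is easy to spot in the Database Dashlet because the same statement appears dozens of times within one request; the second variant shows up as a single, bounded query.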

#3: Understanding Load Distribution across SharePoint AppPools

If you run SharePoint in a cluster or host several SharePoint Sites, you need to check how the load is actually distributed and whether you need to scale your environment out or down depending on usage. Zooming into the Dynatrace Transaction Flow visualizes every single Site and AppPool with performance and load details that help you decide what to do next:

How is load distributed? Which AppPools are running low on resources? Where do we have issues? Which ones can we take out of IIS as they are not used?

#4: Identifying killer Proxy Lookups

SharePoint can make web service calls to any other SharePoint site. In the following case SharePoint is calling a Web Service on its own Site Instance. Because the call used the FQDN (Fully Qualified Domain Name) of the SharePoint Server, the .NET Framework first tried to figure out which proxy to use to reach that URL – and for that hostname the proxy lookup alone took 20 seconds. On top of that, every web service call to the same SharePoint instance takes the “roundtrip” via SOAP and IIS only to end up in the same ASP.NET Instance. That’s a lot of overhead for a local call that could also be achieved through the SharePoint API.

Resolving a proxy for the FQDN results in a 20s performance penalty followed by a roundtrip via IIS instead of making a local SharePoint API call
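There are two ways out of this. If the remote call is really needed, the proxy detection delay can be avoided by adding the SharePoint hostname to the bypasslist of the defaultProxy section in web.config, or by assigning an explicit Proxy on the generated web service proxy class. For data that lives on the same SharePoint instance, though, the server object model avoids both the proxy lookup and the SOAP/IIS roundtrip altogether. A minimal sketch, assuming a hypothetical list named Announcements:

```csharp
using Microsoft.SharePoint;   // server object model - runs in-process, no HTTP hop

public static class LocalListAccess
{
    // Instead of calling a SOAP web service on our own FQDN (proxy lookup + IIS roundtrip),
    // read the data directly through the SharePoint API of the current site.
    // "Announcements" is a hypothetical list name used for illustration.
    public static SPListItemCollection GetAnnouncements()
    {
        SPWeb web = SPContext.Current.Web;                    // current site, no new connection
        SPQuery query = new SPQuery { RowLimit = 20 };        // keep the result bounded
        return web.Lists["Announcements"].GetItems(query);
    }
}
```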

#5: Getting visibility into the Resource Impact of Taxonomy Cache Updates

Have you heard about the Taxonomy Cache? If not – check out this summary blog post from Daniel Webster on Troubleshooting SharePoint’s Hidden Lists and Managed Metadata Columns.

SharePoint constantly checks whether there are updates for this cache. It does this by frequently calling the MetaDataService’s GetChange service from a background thread, sleeping between these checks. Dynatrace gives great insight into what that means from a resource perspective (CPU, Memory, Bound Threads, …).

Transaction Flow highlights that the TaxonomyCache refresh implementation makes constant calls to the MetaDataService
The Taxonomy Cache gets updated through a background thread that keeps calling the MetaDataService and sleeps between these calls
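Conceptually, the pattern visible in the PurePaths is a plain polling loop: a background thread asks the MetaDataService whether anything changed, refreshes the cache if needed, and sleeps until the next check. The following sketch is purely illustrative – it is not SharePoint’s actual implementation – but it shows why this design keeps a thread bound and generates a steady stream of service calls even when the taxonomy data never changes:

```csharp
using System;
using System.Threading;

// Purely illustrative polling loop - NOT SharePoint's code.
// It mirrors the pattern visible in the PurePaths: call the metadata service,
// sleep, repeat - which keeps one thread bound and produces constant service calls.
public static class TaxonomyCachePollingSketch
{
    public static void StartRefreshLoop(Func<DateTime, bool> getChangesSince,
                                        TimeSpan pollInterval,
                                        CancellationToken token)
    {
        new Thread(() =>
        {
            DateTime lastCheck = DateTime.UtcNow;
            while (!token.IsCancellationRequested)
            {
                // One remote call per iteration, whether or not anything changed.
                bool hasChanges = getChangesSince(lastCheck);
                if (hasChanges)
                {
                    // ... refresh the local cache ...
                }
                lastCheck = DateTime.UtcNow;
                Thread.Sleep(pollInterval);   // the thread stays bound between checks
            }
        }) { IsBackground = true }.Start();
    }
}
```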