More and more people are talking about the end of synthetic monitoring. It is associated with high costs and a lack of insight into real user performance. This view is supported by the evolving standards of the W3C Web Performance Working Group, which will make it possible to collect more accurate data, with deeper insight, directly in end users’ browsers. Will User Experience Management using JavaScript agents eventually replace synthetic monitoring, or will both approaches coexist in the end?

I think it is a good idea to compare these two approaches in a number of categories that I see as important from a performance management perspective. Having worked intensively with both approaches, I will present my personal experience. Some judgments might be subjective, but this is what comments are for 😉

Real User Perspective

One of the most important requirements of end-user monitoring, if not the most important, is to capture performance exactly as real users experience it. The question is how close the monitoring results are to what real application users see.

Synthetic monitoring collects measurements using pre-defined scripts executed from a number of locations. How close this is to what users see depends on the actual measurement approach. Only solutions that use real browsers rather than just emulating them provide reliable results. Some approaches only monitor from high-speed backbones like Amazon EC2 and merely emulate different connection speeds, which makes their measurements only an approximation of real user performance. Solutions like Dynatrace Load, in contrast, measure from real user machines spread out across the world, resulting in more precise results.

Agent-based approaches like Dynatrace UEM measure directly in the user’s browser, taking actual connection speed and browser behavior into account. Therefore, they provide the most accurate metrics on actual user performance.

Transactional Coverage

Transactional coverage defines how many types of business transactions, or pieces of application functionality, are covered. The goal of monitoring is to cover 100 percent of all transactions. The minimum requirement is to cover all business-critical transactions.

For synthetic monitoring, this directly relates to the number of transactions that are modeled by scripts. The more scripts, the higher the coverage. This comes at the cost of additional development and maintenance effort.
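
To make the relationship between scripts and coverage concrete, here is a minimal sketch of the arithmetic. The transaction names and the three sets are hypothetical and only serve as an illustration; a real inventory would come from your own application.

```python
# Hypothetical transaction inventory, for illustration only.
ALL_TRANSACTIONS = {"login", "search", "add_to_cart", "checkout", "update_profile"}
BUSINESS_CRITICAL = {"login", "checkout"}
SCRIPTED = {"login", "search"}  # transactions modeled by synthetic scripts


def coverage(all_transactions: set, scripted: set) -> float:
    """Fraction of known transaction types exercised by at least one script."""
    if not all_transactions:
        return 0.0
    return len(all_transactions & scripted) / len(all_transactions)


def uncovered_critical(critical: set, scripted: set) -> list:
    """Business-critical transactions that no script covers yet, sorted by name."""
    return sorted(critical - scripted)
```

With these example sets, two of five transactions are scripted, and `checkout` shows up as a business-critical gap, so the minimum requirement is not yet met.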

Agent-based approaches measure using JavaScript code that is injected into every page automatically. This results in 100 percent transactional coverage. The only content not covered by this approach is streaming content, as agent-based monitoring relies on JavaScript being executed.

SLA Monitoring

SLA monitoring is central to ensuring service quality at the technical and business level. For SLA management to be effective, not only internal services but also third-party services like ads have to be monitored.

While agent-based approaches provide rich information on end-user performance, they are not well suited for SLA management. Agent-based measurement depends on the user’s network speed, local machine and so on, which makes for a very volatile environment. SLA management, however, requires a well-defined and stable environment. Another issue with agent-based approaches is that third parties like CDNs or external content providers are very hard to monitor.

Synthetic monitoring uses pre-defined scripts and provides a stable and predictable environment. The use of real browsers and the resulting deeper diagnostics capabilities enable more fine-grained diagnostics and monitoring, especially for third-party content. Synthetic monitoring can also check SLAs for services that are currently not used by actual users.
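
As an illustration of why a stable measurement environment matters, here is a minimal sketch of a percentile-based SLA check over synthetic response times. The nearest-rank percentile, the 95 percent default and the function names are my own assumptions for this example, not part of any particular product.

```python
import math


def percentile(samples_ms: list, pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct percent of samples at or below it."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def sla_met(samples_ms: list, threshold_ms: float, pct: float = 95.0) -> bool:
    """True if the chosen percentile of response times stays within the threshold."""
    return percentile(samples_ms, pct) <= threshold_ms
```

With samples collected from a fixed location and a fixed script, a single slow run stands out clearly. The same check over real user data would mix in every user's network and machine, which is exactly the volatility described above.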

Availability Monitoring

Availability monitoring is an aspect of SLA monitoring. We look at it separately as availability monitoring comes with some specific technical prerequisites which are very different between agent-based and synthetic monitoring approaches.

For availability monitoring, only synthetic script-based approaches can be used. They do not rely on JavaScript code being injected into the page but measure from points of presence instead. This enables them to measure even when a site is down, which is essential for availability monitoring.

Agent-based approaches will not collect any monitoring data if a site is actually down. The only exception is an agent-based solution that also runs inside the web server or proxy, like Dynatrace UEM. Availability problems resulting from application server problems can then be detected based on HTTP response codes.
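
The idea of deriving availability from HTTP response codes can be sketched as follows. This is a simplified illustration under my own assumptions: only 5xx codes count as failures, and the function names are hypothetical.

```python
from typing import Optional


def is_server_failure(status: int) -> bool:
    """Treat 5xx responses as availability failures (an assumption of this sketch)."""
    return 500 <= status <= 599


def availability(statuses: list) -> Optional[float]:
    """Share of responses that were not server failures, or None if nothing was observed."""
    if not statuses:
        return None  # no traffic means no data, not 100 percent availability
    ok = sum(1 for status in statuses if not is_server_failure(status))
    return ok / len(statuses)
```

Note the `None` case: a web server or proxy agent only sees requests that actually arrive, so an empty window says nothing about availability. A synthetic check always produces a data point.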

Understanding User-Specific Problems

In some cases, especially in a SaaS environment, the actual application functionality heavily depends on user-specific data. In case of functional or performance problems, information on a specific request of a user is required to diagnose a problem.

Synthetic monitoring is limited to the transactions covered by scripts. In most cases, these scripts are based on test users rather than real user accounts (you would not want a monitoring system to operate a real banking account). For an eCommerce site where a lot of functionality does not depend on an actual user, synthetic monitoring provides reasonable insight here. For many SaaS applications, however, this is not the case.

Agent-based approaches are able to monitor every single user click resulting in a better ability to diagnose user-specific problems. They also collect metrics for actual user requests instead of synthetic duplicates. This makes them the preferred solution for websites where functionality heavily depends on the actual user.

Third Party Diagnostics

Monitoring of third party content poses a special challenge. As the resources are not served from our own infrastructure we only have limited monitoring capabilities.

Synthetic monitoring using real browsers provides the best insight here. All the diagnostics capabilities available within browsers can be used to monitor third-party content. In fact, the possibilities for third-party and own content are the same. Besides the actual content, networking or DNS problems can also be diagnosed.

Agent-based approaches have to rely on the capabilities accessible via JavaScript in the browser. While new standards from the W3C Web Performance Working Group will make this easier in the future, it is hard to do in older browsers. It requires a lot of tricks to find out whether third-party content loads and performs well.
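
Where a browser does expose per-resource timings, the analysis side could look roughly like the sketch below. It assumes that an agent has already exported entries as plain dictionaries with `name` (the resource URL) and `duration` (in milliseconds), mirroring the fields of the same name on browser performance entries. The export format and the function name are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def third_party_durations(entries: list, first_party_host: str) -> dict:
    """Group resource durations by host, skipping the first-party host and its subdomains."""
    by_host = defaultdict(list)
    for entry in entries:
        host = urlparse(entry["name"]).hostname
        if host is None:
            continue  # not an absolute URL, nothing to attribute
        if host == first_party_host or host.endswith("." + first_party_host):
            continue  # own content
        by_host[host].append(entry["duration"])
    return dict(by_host)
```

This only covers what the browser is willing to report. Networking or DNS problems on the third party's side remain much easier to diagnose from a synthetic real-browser measurement.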

Proactive Problem Detection

Proactive problem detection aims to find problems before users do. This not only gives you the ability to react faster but also helps to minimize business impact.

Synthetic monitoring tests functionality continuously in production. This ensures that problems are detected and reported immediately, irrespective of whether someone is using the site or not.

Agent-based approaches only collect data when a user actually accesses your site. If, for example, you are experiencing a problem with a CDN from a certain location in the middle of the night when nobody uses your site, you will not see the problem before the first user accesses your site in the morning.

Maintenance Effort

Cost of ownership is always an important aspect of software operation. So the effort needed to adjust monitoring to changes in the application must be taken into consideration as well.

As synthetic monitoring is script based it is likely that changes to the application require changes to scripts. Depending on the scripting language and the script design the effort will vary. In any case there is continuous manual effort required to keep scripts up-to-date.

Agent-based monitoring, on the other hand, does not require any changes when the application changes. Automatic instrumentation of event handlers and similar hooks ensures zero effort for new functionality. In addition, modern solutions automatically inject the HTML fragments required to collect performance data into the HTML content at runtime.

Suitability for Application Support

Besides operations and business monitoring, support is the third main user of end user data. In case a customer complains that a web application is not working properly, information on what this user was doing and why it is not working is required.

Synthetic monitoring can help here in case of general functional or performance issues like a slow network from a certain location or broken functionality. It is, however, not possible to get information on what exactly a user was doing or to follow that user’s click path.

Agent-based solutions provide much better insight. As they collect data for real user interactions, they have all the information required for understanding potential issues users are experiencing. This means that even problems experienced by a single user can be discovered.

Conclusion

Putting all these points together, we can see that both synthetic monitoring and agent-based approaches have their strengths and weaknesses. One cannot simply choose one over the other. This is confirmed by the fact that many companies use a combination of both approaches, and that APM vendors provide products in both spaces. The advantage of using both approaches is that modern agent-based approaches perfectly compensate for the weaknesses of synthetic monitoring, leading to an ideal solution.