I think it is a good idea to compare these two approaches across a number of categories that I see as important from a performance-management perspective. Having worked intensively with both approaches, I will present my personal experience. Some judgments might be subjective – but that is what comments are for 😉
Real User Perspective
One of the most important requirements of real user monitoring – if not the most important one – is to experience performance exactly as real users do: how close the monitoring results are to what real application users actually see.
Synthetic monitoring collects measurements using pre-defined scripts executed from a number of locations. How close this gets to what users see depends on the actual measurement approach. Only solutions that use real browsers, rather than merely emulating them, provide reliable results. Some approaches monitor only from high-speed backbones like Amazon EC2 and merely emulate different connection speeds, making their measurements only an approximation of real user performance. Solutions like Dynatrace Load, in contrast, measure from real user machines spread across the world, resulting in more precise results.
Agent-based approaches like Dynatrace UEM measure directly in the user’s browser, taking the actual connection speed and browser behavior into account. They therefore provide the most accurate metrics on actual user performance.
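As a concrete illustration, the timings an in-browser agent works with come from the browser's own Navigation Timing API. The sketch below is a minimal illustration, not any vendor's actual agent code; the input fields follow the standard PerformanceNavigationTiming interface, while the derived metric names are my own assumptions.

```javascript
// Minimal sketch: deriving user-perceived metrics from a Navigation Timing
// entry. Field names follow the W3C PerformanceNavigationTiming spec; the
// derived metric names are illustrative.
function deriveMetrics(nav) {
  return {
    dns:  nav.domainLookupEnd - nav.domainLookupStart, // DNS lookup time
    tcp:  nav.connectEnd - nav.connectStart,           // TCP connect time
    ttfb: nav.responseStart - nav.requestStart,        // time to first byte
    load: nav.loadEventEnd - nav.startTime,            // full page load as the user saw it
  };
}

// In a real browser, an agent would read the entry like this:
// const [nav] = performance.getEntriesByType('navigation');
// and ship deriveMetrics(nav) to its collector.
```

Because this runs in the user's own browser, the numbers automatically reflect that user's real connection and machine – exactly the property described above.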
Transactional Coverage
Transactional coverage defines how many types of business transactions – or how much application functionality – are covered. The goal of monitoring is to cover 100 percent of all transactions; the minimum requirement is to cover at least all business-critical transactions.
For synthetic monitoring, coverage directly relates to the number of transactions modeled by scripts: the more scripts, the higher the coverage. This comes at the cost of additional development and maintenance effort.
SLA Monitoring
SLA monitoring is central to ensuring service quality at both the technical and the business level. For SLA management to be effective, not only internal services but also third-party services like ads have to be monitored.
While agent-based approaches provide rich information on end-user performance, they are not well suited for SLA management. Agent-based measurements depend on the user’s network speed, local machine and so on – a very volatile environment. SLA management, however, requires a well-defined and stable environment. Another issue with agent-based approaches is that third parties like CDNs or external content providers are very hard to monitor.
Synthetic monitoring, using pre-defined scripts, provides a stable and predictable environment. The use of real browsers and the resulting deeper diagnostics capabilities enables more fine-grained diagnostics and monitoring, especially for third-party content. Synthetic monitoring can also check SLAs for services which are currently not used by actual users.
Availability Monitoring
Availability monitoring is an aspect of SLA monitoring. We look at it separately because availability monitoring comes with specific technical prerequisites which differ greatly between agent-based and synthetic monitoring approaches.
Agent-based approaches will not collect any monitoring data if a site is actually down. The only exception is an agent-based solution that also runs inside the web server or a proxy, as user experience management solutions do. Availability problems resulting from application-server problems can then be detected based on HTTP response codes.
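A server-side agent of this kind essentially maps HTTP response codes to availability states. The sketch below is a simplified illustration under assumed category names, not any product's actual logic.

```javascript
// Simplified sketch: classify an HTTP status code into an availability state,
// as a server-side or proxy-resident agent might. Category names are assumptions.
function availabilityState(statusCode) {
  if (statusCode >= 200 && statusCode < 400) return 'available';  // success or redirect
  if (statusCode >= 500) return 'server-error';                   // application-server problem
  return 'client-error';                                          // 4xx: request-level issue
}
```

Note that even such an agent only sees requests that reach the web server; a network outage in front of it still goes undetected, which is where synthetic checks come in.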
Understanding user-specific problems
In some cases – especially in a SaaS environment – the actual application functionality heavily depends on user-specific data. In case of functional or performance problems, information on a specific request of a specific user is required to diagnose the problem.
Synthetic monitoring is limited to the transactions covered by scripts. In most cases, these are based on test users rather than real user accounts (you would not want a monitoring system to operate a real banking account). For an eCommerce site where a lot of functionality does not depend on an actual user, synthetic monitoring provides reasonable insight here. For many SaaS applications, however, this is not the case.
Agent-based approaches are able to monitor every single user click, resulting in a better ability to diagnose user-specific problems. They also collect metrics for actual user requests instead of synthetic duplicates. This makes them the preferred solution for websites whose functionality heavily depends on the actual user.
Third Party Diagnostics
Monitoring third-party content poses a special challenge. As these resources are not served from our own infrastructure, we only have limited monitoring capabilities.
Synthetic monitoring using real browsers provides the best insight here. All the diagnostics capabilities available within the browser can be used to monitor third-party content; in fact, the possibilities for third-party content and our own content are the same. Besides content problems, networking or DNS problems can also be diagnosed.
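In a real browser, such third-party diagnostics can build on the Resource Timing API, which reports timings for every loaded resource regardless of where it was served from. The sketch below is an illustrative filter, not a vendor implementation; the simple origin-prefix check is an assumption.

```javascript
// Sketch: split out third-party resources from Resource Timing entries.
// `name`, `startTime` and `responseEnd` are standard PerformanceResourceTiming
// fields; the prefix-based origin check is a simplification.
function thirdPartyResources(entries, ownOrigin) {
  return entries
    .filter(e => !e.name.startsWith(ownOrigin))
    .map(e => ({ url: e.name, ms: e.responseEnd - e.startTime }));
}

// In a browser a check would call:
// thirdPartyResources(performance.getEntriesByType('resource'), location.origin)
```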
Proactive Problem Detection
Proactive problem detection aims to find problems before users do. This not only gives you the ability to react faster but also helps to minimize business impact.
Synthetic monitoring tests functionality continuously in production. This ensures that problems are detected and reported immediately, regardless of whether anyone is using the site or not.
Agent-based approaches only collect data when a user actually accesses your site. If, for example, a CDN problem affects a certain location in the middle of the night when nobody uses your site, you will not see the problem before the first user accesses your site in the morning.
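Conceptually, a synthetic check is just a probe run on a schedule with alerting on failure. The sketch below shows that shape under assumed names – `probe` and `alert` are placeholder callbacks, not any monitoring product's API.

```javascript
// Sketch of one proactive check cycle: run a probe and raise an alert on
// failure. `probe` and `alert` are placeholder callbacks, not a real API.
async function runCheck(probe, alert) {
  try {
    const result = await probe();                        // e.g. fetch a key transaction
    if (!result.ok) alert(`check failed: HTTP ${result.status}`);
    return result.ok;
  } catch (err) {
    alert(`check errored: ${err.message}`);              // network or DNS failure
    return false;
  }
}

// In production this would run continuously from several locations, e.g.:
// setInterval(() => runCheck(() => fetch('https://shop.example/login'), notifyOps), 60000);
```

Because the loop runs whether or not anyone visits the site, the nighttime CDN problem above would be reported immediately rather than at the first morning visit.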
Cost of Ownership
Cost of ownership is always an important aspect of software operation, so the effort needed to adjust monitoring to changes in the application must be taken into consideration as well.
As synthetic monitoring is script-based, changes to the application are likely to require changes to scripts. Depending on the scripting language and the script design, the effort will vary, but in any case continuous manual effort is required to keep scripts up to date.
Agent-based monitoring, on the other hand, does not require any changes when the application changes. Automatic instrumentation of event handlers and the like ensures zero effort for new functionality. At the same time, modern solutions automatically inject the HTML fragments required to collect performance data into the page content at runtime.
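Runtime injection of this kind can be pictured as a small HTML rewrite performed by a web-server module or proxy. The sketch below is illustrative only; the snippet URL and insertion point are assumptions, not how any specific product works.

```javascript
// Sketch: inject a monitoring <script> tag into HTML at runtime, as a
// web-server module or proxy might. The src URL is a placeholder.
function injectAgent(html, src) {
  const tag = `<script src="${src}"></script>`;
  const i = html.indexOf('</head>');
  // Insert just before </head> so the agent loads early; if the page has no
  // closing head tag, fall back to prepending the tag.
  return i === -1 ? tag + html : html.slice(0, i) + tag + html.slice(i);
}
```

Because the rewrite happens on every response, new pages and new functionality are instrumented automatically without touching the application itself.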
Suitability for Application Support
Besides operations and business monitoring, support is the third main consumer of end-user data. When a customer complains that a web application is not working properly, information on what this user was doing and why it did not work is required.
Synthetic monitoring can help here in case of general functional or performance issues, like a slow network from a certain location or broken functionality. It is, however, not possible to see what a specific user was doing exactly or to follow that user’s click path.
Agent-based solutions provide much better insight. As they collect data for real user interactions, they have all the information required for understanding potential issues users are experiencing, so even problems experienced by a single user can be discovered.
Conclusion
Putting all these points together, we can see that both synthetic monitoring and agent-based approaches have their strengths and weaknesses. One cannot simply choose one over the other. This is validated by the fact that many companies use a combination of both approaches – as do APM vendors, who provide products in both spaces. The advantage of combining them is that modern agent-based approaches compensate well for the weaknesses of synthetic monitoring, leading to an ideal solution.