Performance monitoring

What is performance monitoring?

Performance monitoring is an IT operations practice that automatically observes key metrics of application and system operations to ensure services are available, reliable, and performing within the agreed-upon service-level objectives (SLOs). Metrics are the first of three components. The second is the collection of distributed traces, which record the time required to complete tasks, such as a series of service calls that implement an application function. The third is log analytics, which provides insight into the specific state of applications and infrastructure.
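To make the idea of performing "within the agreed-upon SLOs" concrete, the following is a minimal sketch of an SLO check over a batch of request records. The `Request` type, the `check_slo` function, and the choice of a 95th-percentile latency target and a success-ratio target are illustrative assumptions, not part of any particular monitoring product.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    latency_ms: float
    ok: bool


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct is in 0..100)."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_slo(requests, max_p95_ms, min_success_ratio):
    """Compare observed latency and success ratio with assumed SLO targets."""
    p95 = percentile([r.latency_ms for r in requests], 95)
    success_ratio = sum(r.ok for r in requests) / len(requests)
    return {
        "p95_ms": p95,
        "success_ratio": success_ratio,
        "latency_ok": p95 <= max_p95_ms,
        "availability_ok": success_ratio >= min_success_ratio,
    }


# Example: 19 fast, successful requests and 1 slow, failed request.
sample = [Request(100.0, True)] * 19 + [Request(900.0, False)]
report = check_slo(sample, max_p95_ms=300.0, min_success_ratio=0.99)
```

In this example the latency target is met but the success-ratio target is not, which is the kind of result a monitoring system would surface on a dashboard or turn into an alert.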

Cloud monitoring tools continuously collect data on the resources that software uses, such as CPU utilization and the amount of persistent storage used. Monitoring data is often used to create visualizations, such as dashboards, to provide a system performance summary and to generate alerts about performance or resource utilization anomalies that may need human intervention. Distributed traces help identify bottlenecks in processes that can create unacceptably long latencies or high error rates. Logging offers insights into the operations a system performs, errors encountered, and additional debugging details.
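As a rough illustration of how collected metrics can drive alerts, the sketch below keeps a sliding window of samples, such as CPU utilization percentages, and signals when the window average exceeds a threshold. The class name `SlidingWindowAlert`, the window size, and the threshold are assumptions chosen for the example; real monitoring tools typically offer richer alerting rules.

```python
from collections import deque


class SlidingWindowAlert:
    """Signal when the mean of the last `window_size` samples exceeds `threshold`."""

    def __init__(self, threshold, window_size):
        self.threshold = threshold
        # A bounded deque drops the oldest sample automatically.
        self.samples = deque(maxlen=window_size)

    def observe(self, value):
        """Record one sample and return True if an alert should fire."""
        self.samples.append(value)
        window_full = len(self.samples) == self.samples.maxlen
        mean = sum(self.samples) / len(self.samples)
        # Wait for a full window so a single early spike does not alert.
        return window_full and mean > self.threshold


# Example: hypothetical CPU utilization samples, in percent.
alert = SlidingWindowAlert(threshold=80.0, window_size=3)
results = [alert.observe(v) for v in [50, 95, 99, 97, 40]]
```

Averaging over a window rather than alerting on individual samples is one simple way to reduce noise from short-lived spikes.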

Metrics, distributed traces, and logs are important for understanding the state of applications and infrastructure, especially in highly dynamic cloud environments. Without automated performance monitoring, creating and maintaining services that meet complicated operational requirements can be challenging.

Problems at any level of a typical application stack can degrade performance or cause failures. For example, a workload spike can cause a backlog of operations to build up in a message queue. Normally, this would trigger auto-scaling to scale out the application. But if the Kubernetes cluster can't add more nodes for some reason, the backlog will continue to grow, and performance will degrade. Performance monitoring enables application managers and developers to quickly identify the root causes of performance issues and address problems before they adversely affect overall system operations.
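The message queue scenario above can be sketched as a simple check that correlates two metric series: queue depth and cluster node count. The function name `backlog_stalled`, its parameters, and the sample values are hypothetical; the sketch assumes both series are sampled at the same times.

```python
def backlog_stalled(queue_depths, node_counts, min_growth_steps=3):
    """Return True when queue depth rose for `min_growth_steps` consecutive
    samples while the node count did not increase over the same period."""
    if len(queue_depths) != len(node_counts):
        raise ValueError("both series must have the same number of samples")
    if len(queue_depths) <= min_growth_steps:
        return False  # not enough history to decide

    recent_depths = queue_depths[-(min_growth_steps + 1):]
    recent_nodes = node_counts[-(min_growth_steps + 1):]

    # Strictly increasing depth across every consecutive pair of samples.
    growing = all(b > a for a, b in zip(recent_depths, recent_depths[1:]))
    # Auto-scaling counts as effective if nodes were added in the period.
    scaled = recent_nodes[-1] > recent_nodes[0]
    return growing and not scaled


# Backlog grows and the cluster stays at 3 nodes: worth investigating.
stalled = backlog_stalled([10, 40, 90, 160], [3, 3, 3, 3])
# Backlog grows but nodes are being added: auto-scaling is responding.
scaling = backlog_stalled([10, 40, 90, 160], [3, 3, 4, 5])
```

A check like this only flags the symptom. Finding why nodes could not be added would still rely on the logs and traces described earlier.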

To learn more about performance monitoring, see Application Observability.