Apache Spark performance

All key performance metrics for your Apache Spark deployment, available in minutes

  • In-depth cluster performance analysis

    Dynatrace presents Apache Spark metrics alongside other infrastructure measurements, which enables in-depth cluster performance analysis of both current and historical data.

  • Details about all of your Apache Spark components

    Apache Spark performance monitoring provides insight into the resource usage, job status, and performance of Spark Standalone clusters, and more.

  • Pinpoint problems at the code level

    Dynatrace automatically pinpoints problem-causing components through big-data analysis of billions of dependencies across your entire application stack.

What is Apache Spark?

Apache Spark is an open-source cluster-computing framework. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance. As a fast, in-memory data processing engine, Apache Spark allows data workers to efficiently execute streaming, machine learning, and SQL workloads that require fast, iterative access to datasets.
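
To make this concrete, here's a minimal PySpark sketch of the kind of workload Spark parallelizes. The application name, sample data, and local master URL are illustrative assumptions only.

    # Minimal PySpark sketch: a SQL-style aggregation over a small in-memory
    # dataset. In practice the data would come from distributed storage, and
    # the master URL would point at a real cluster rather than local threads.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("example-aggregation")  # hypothetical app name
             .master("local[*]")              # all local cores; use a cluster URL in production
             .getOrCreate())

    df = spark.createDataFrame(
        [("checkout", 120), ("search", 45), ("checkout", 80)],
        ["endpoint", "latency_ms"])

    # Spark distributes this aggregation across the cluster transparently.
    df.groupBy("endpoint").avg("latency_ms").show()

    spark.stop()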

Dynatrace analyzes the activity of your Apache Spark processes, providing Spark-specific metrics alongside all infrastructure measurements. With Apache Spark monitoring enabled globally, Dynatrace automatically collects Spark metrics whenever a new host running Spark is detected in your environment. Because these metrics are gathered as soon as new Spark hosts appear, you get an immediate, big-picture view of your entire IT system.

Optimizing your Spark components in minutes

Dynatrace immediately detects your Apache Spark processes and shows key metrics such as CPU, connectivity, retransmissions, suspension rate, and garbage-collection time; a sketch of how Spark itself exposes comparable metrics follows the list below.

  • Manual configuration of your monitoring setup is no longer necessary.
  • Auto-detection starts monitoring new hosts running Spark.
  • All data and metrics are retrieved immediately.
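
For context, Spark itself exposes comparable per-executor figures (cores, memory, garbage-collection time) through its built-in monitoring REST API. The sketch below assumes a driver UI on the default port 4040 and field names from Spark's ExecutorSummary; verify both against your deployment. It illustrates what Spark exposes, not how Dynatrace collects its data.

    # Sketch: poll Spark's monitoring REST API for per-executor metrics such
    # as memory usage and garbage-collection time. Host and port are assumed;
    # the driver UI listens on 4040 by default.
    import requests

    DRIVER_UI = "http://localhost:4040"  # hypothetical driver host/port

    apps = requests.get(f"{DRIVER_UI}/api/v1/applications", timeout=5).json()
    app_id = apps[0]["id"]  # first running application

    executors = requests.get(
        f"{DRIVER_UI}/api/v1/applications/{app_id}/executors", timeout=5).json()

    for ex in executors:
        # Field names follow Spark's ExecutorSummary; check your Spark version.
        print(ex["id"], "cores:", ex["totalCores"],
              "memory used:", ex["memoryUsed"],
              "GC time (ms):", ex["totalGCTime"])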

Improve your Spark performance

Dynatrace shows performance metrics for the main Spark components. The three main components of a Spark deployment are the cluster manager, driver programs, and worker nodes.

Apache Spark monitoring provides insight into the resource usage, job status, and performance of Spark Standalone clusters. The Cluster charts section provides all the information you need regarding Jobs, Stages, Messages, Workers, and Message processing.

For the full list of the provided cluster metrics, please see our detailed blog post about Apache Spark monitoring.
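
To show where such Jobs and Stages figures originate, the sketch below reads job and stage status from Spark's application REST API. The host, port, and application index are assumptions for illustration.

    # Sketch: read job and stage status from Spark's application REST API,
    # the same data that backs Jobs/Stages charts. These endpoints exist on
    # the driver UI (port 4040 by default) and on the history server.
    import requests

    BASE = "http://localhost:4040/api/v1"  # hypothetical driver UI

    apps = requests.get(f"{BASE}/applications", timeout=5).json()
    app_id = apps[0]["id"]

    for job in requests.get(f"{BASE}/applications/{app_id}/jobs",
                            timeout=5).json():
        print(f"job {job['jobId']} [{job['status']}]: "
              f"{job['numCompletedTasks']}/{job['numTasks']} tasks")

    for stage in requests.get(f"{BASE}/applications/{app_id}/stages",
                              timeout=5).json():
        print(f"stage {stage['stageId']} [{stage['status']}]")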

Access valuable Spark worker metrics

Apache Spark metrics are presented alongside other infrastructure measurements, enabling in-depth cluster performance analysis of both current and historical data.

Spark node and worker monitoring provides metrics including the number of:

  • free cores
  • free cores per worker
  • cores used
  • cores used per worker
  • executors
  • executors per worker

For the full list of the provided worker metrics, please see our detailed blog post about Apache Spark monitoring.
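
As an illustration of where these per-worker counts come from in a Standalone cluster, the sketch below reads the master's JSON status endpoint. The master hostname is hypothetical, and field names may differ across Spark versions.

    # Sketch: read per-worker core usage from the Spark Standalone master,
    # whose web UI (port 8080 by default) serves cluster state as JSON.
    import requests

    MASTER_UI = "http://spark-master:8080"  # hypothetical master host

    state = requests.get(f"{MASTER_UI}/json", timeout=5).json()
    for w in state["workers"]:
        total, used = w["cores"], w["coresused"]
        # Free cores per worker are simply total minus used.
        print(f"{w['id']}: {total - used} free, {used} used of {total} cores")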

Try Spark performance monitoring now!

You’ll be up and running in under 5 minutes: sign up, deploy our agent, and get unmatched insights out of the box.
Dynatrace Free Trial
