Enhanced insights for Spark Components
This extension collects JMX metrics to provide insights into the resource usage, job and application status, and performance of your Spark components.
Apache Spark metrics are presented alongside other infrastructure measurements, enabling in-depth cluster performance analysis of both current and historical data.
The extension provides insight into the overall health of Spark component instances.
Activate this extension in your Dynatrace environment from the in-product Hub, then select the OneAgents on which to enable it.
You must configure the required component metrics to be reported to the JMX sink in the following file:
$SPARK_HOME/conf/metrics.properties
For example, the following command enables metric collection to the JmxSink for the master, worker, driver, and executor components. Note that it overwrites any existing metrics.properties file:
cat << 'EOF' > "$SPARK_HOME/conf/metrics.properties"
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
EOF
Refer to the Spark documentation for more details.
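If a metrics.properties file already exists, redirecting a heredoc into it replaces its contents. The following is a minimal sketch of a non-destructive alternative that appends only the lines that are missing. The helper names `ensure_line` and `enable_jmx_sink` are hypothetical and are not part of Spark or Dynatrace; the property lines themselves are the ones shown in the example above.

```shell
#!/usr/bin/env bash
# Sketch only: ensure_line and enable_jmx_sink are hypothetical helpers,
# not part of Spark or Dynatrace.

ensure_line() {
  # $1 = file to edit, $2 = exact line that must be present
  local file="$1" line="$2"
  touch "$file"
  # -x: match the whole line, -F: treat the pattern as a fixed string
  if ! grep -qxF -- "$line" "$file"; then
    # If the file is non-empty and lacks a trailing newline, add one first
    # so the new setting does not get glued onto the last existing line.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
      printf '\n' >> "$file"
    fi
    printf '%s\n' "$line" >> "$file"
  fi
}

enable_jmx_sink() {
  # $1 = path to metrics.properties
  local file="$1" component
  ensure_line "$file" '*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink'
  for component in master worker driver executor; do
    ensure_line "$file" "${component}.source.jvm.class=org.apache.spark.metrics.source.JvmSource"
  done
}
```

Assuming `SPARK_HOME` is set, this could be invoked as `enable_jmx_sink "$SPARK_HOME/conf/metrics.properties"`. Running it more than once leaves the file unchanged after the first run, and any settings already in the file are preserved.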
Details: This extension can query and collect almost all of the component instance and namespace metrics defined under Spark Metric Providers.
Below is a complete list of the feature sets provided in this version. To ensure a good fit for your needs, individual feature sets can be activated and deactivated by your administrator during configuration.
Metric name | Metric key | Description | Unit |
---|---|---|---|
streaming.inputRate-total | spark.streaming.inputRate-total | - | Count |
streaming.latency | spark.streaming.latency | - | MilliSecond |
streaming.processingRate-total | spark.streaming.processingRate-total | - | Count |
streaming.states-rowsTotal | spark.streaming.states-rowsTotal | - | Count |
streaming.states-usedBytes | spark.streaming.states-usedBytes | - | Byte |
Metric name | Metric key | Description | Unit |
---|---|---|---|
shuffleService.numActiveConnections.count | spark.shuffleService.numActiveConnections.count | - | Count |
shuffleService.numRegisteredConnections.count | spark.shuffleService.numRegisteredConnections.count | - | Count |
shuffleService.numCaughtExceptions.count | spark.shuffleService.numCaughtExceptions.count | - | Count |
shuffleService.registeredExecutorsSize | spark.shuffleService.registeredExecutorsSize | - | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
appStatus.stages.failedStages.count | spark.appStatus.stages.failedStages.count | - | Count |
appStatus.stages.skippedStages.count | spark.appStatus.stages.skippedStages.count | - | Count |
appStatus.stages.completedStages.count | spark.appStatus.stages.completedStages.count | - | Count |
appStatus.tasks.excludedExecutors.count | spark.appStatus.tasks.excludedExecutors.count | - | Count |
appStatus.tasks.completedTasks.count | spark.appStatus.tasks.completedTasks.count | - | Count |
appStatus.tasks.failedTasks.count | spark.appStatus.tasks.failedTasks.count | - | Count |
appStatus.tasks.killedTasks.count | spark.appStatus.tasks.killedTasks.count | - | Count |
appStatus.tasks.skippedTasks.count | spark.appStatus.tasks.skippedTasks.count | - | Count |
spark.appStatus.tasks.unexcludedExecutors.count | spark.appStatus.tasks.unexcludedExecutors.count | - | Count |
appStatus.jobs.succeededJobs.count | spark.appStatus.jobs.succeededJobs.count | - | Count |
appStatus.jobs.failedJobs.count | spark.appStatus.jobs.failedJobs.count | - | Count |
appStatus.jobs.jobDuration | spark.appStatus.jobs.jobDuration | - | MilliSecond |
Metric name | Metric key | Description | Unit |
---|---|---|---|
HiveExternalCatalog.fileCacheHits.count | spark.HiveExternalCatalog.fileCacheHits.count | - | Count |
HiveExternalCatalog.filesDiscovered.count | spark.HiveExternalCatalog.filesDiscovered.count | - | Count |
HiveExternalCatalog.hiveClientCalls.count | spark.HiveExternalCatalog.hiveClientCalls.count | - | Count |
HiveExternalCatalog.parallelListingJobCount.count | spark.HiveExternalCatalog.parallelListingJobCount.count | - | Count |
HiveExternalCatalog.partitionsFetched.count | spark.HiveExternalCatalog.partitionsFetched.count | - | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
ApplicationSource.status | spark.ApplicationSource.status | - | Count |
ApplicationSource.runtime_ms | spark.ApplicationSource.runtime_ms | - | Count |
ApplicationSource.cores | spark.ApplicationSource.cores | - | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
BlockManager.disk.diskSpaceUsed_MB | spark.BlockManager.disk.diskSpaceUsed_MB | - | MegaByte |
BlockManager.memory.maxMem_MB | spark.BlockManager.memory.maxMem_MB | - | MegaByte |
BlockManager.memory.maxOffHeapMem_MB | spark.BlockManager.memory.maxOffHeapMem_MB | - | MegaByte |
BlockManager.memory.maxOnHeapMem_MB | spark.BlockManager.memory.maxOnHeapMem_MB | - | MegaByte |
BlockManager.memory.memUsed_MB | spark.BlockManager.memory.memUsed_MB | - | MegaByte |
BlockManager.memory.offHeapMemUsed_MB | spark.BlockManager.memory.offHeapMemUsed_MB | - | MegaByte |
BlockManager.memory.onHeapMemUsed_MB | spark.BlockManager.memory.onHeapMemUsed_MB | - | MegaByte |
BlockManager.memory.remainingMem_MB | spark.BlockManager.memory.remainingMem_MB | - | MegaByte |
BlockManager.memory.remainingOffHeapMem_MB | spark.BlockManager.memory.remainingOffHeapMem_MB | - | MegaByte |
BlockManager.memory.remainingOnHeapMem_MB | spark.BlockManager.memory.remainingOnHeapMem_MB | - | MegaByte |
Metric name | Metric key | Description | Unit |
---|---|---|---|
LiveListenerBus.numEventsPosted.count | spark.LiveListenerBus.numEventsPosted.count | - | Count |
LiveListenerBus.queue.appStatus.numDroppedEvents.count | spark.LiveListenerBus.queue.appStatus.numDroppedEvents.count | - | Count |
LiveListenerBus.queue.appStatus.size | spark.LiveListenerBus.queue.appStatus.size | - | Count |
LiveListenerBus.queue.eventLog.numDroppedEvents.count | spark.LiveListenerBus.queue.eventLog.numDroppedEvents.count | - | Count |
LiveListenerBus.queue.eventLog.size | spark.LiveListenerBus.queue.eventLog.size | - | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
DAGScheduler.job.activeJobs | spark.DAGScheduler.job.activeJobs | - | Count |
DAGScheduler.job.allJobs | spark.DAGScheduler.job.allJobs | - | Count |
DAGScheduler.messageProcessingTime.count | spark.DAGScheduler.messageProcessingTime.count | - | Count |
DAGScheduler.messageProcessingTime.oneminuterate | spark.DAGScheduler.messageProcessingTime.oneminuterate | - | PerMinute |
DAGScheduler.messageProcessingTime.mean | spark.DAGScheduler.messageProcessingTime.mean | - | MilliSecond |
DAGScheduler.stage.failedStages | spark.DAGScheduler.stage.failedStages | - | Count |
spark.DAGScheduler.stage.runningStages | spark.DAGScheduler.stage.runningStages | - | Count |
DAGScheduler.stage.waitingStages | spark.DAGScheduler.stage.waitingStages | - | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
master.workers | spark.master.workers | - | Count |
master.aliveWorkers | spark.master.aliveWorkers | - | Count |
master.apps | spark.master.apps | - | Count |
master.waitingApps | spark.master.waitingApps | - | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
worker.executors | spark.worker.executors | - | Count |
spark.worker.coresUsed | spark.worker.coresUsed | - | Count |
spark.worker.memUsed_MB | spark.worker.memUsed_MB | - | MegaByte |
spark.worker.coresFree | spark.worker.coresFree | - | Count |
spark.worker.memFree_MB | spark.worker.memFree_MB | - | MegaByte |
Metric name | Metric key | Description | Unit |
---|---|---|---|
mesos_cluster.waitingDrivers | spark.mesos_cluster.waitingDrivers | - | Count |
mesos_cluster.launchedDrivers | spark.mesos_cluster.launchedDrivers | - | Count |
mesos_cluster.retryDrivers | spark.mesos_cluster.retryDrivers | - | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
executor.bytesRead.count | spark.executor.bytesRead.count | - | Byte |
executor.bytesWritten.count | spark.executor.bytesWritten.count | - | Byte |
executor.cpuTime.count | spark.executor.cpuTime.count | - | Count |
executor.filesystem.file.largeRead_ops | spark.executor.filesystem.file.largeRead_ops | - | Count |
spark.executor.filesystem.file.read_bytes | spark.executor.filesystem.file.read_bytes | - | Byte |
spark.executor.filesystem.file.read_ops | spark.executor.filesystem.file.read_ops | - | Count |
spark.executor.filesystem.file.write_bytes | spark.executor.filesystem.file.write_bytes | - | Byte |
executor.filesystem.file.write_ops | spark.executor.filesystem.file.write_ops | - | Count |
executor.recordsRead.count | spark.executor.recordsRead.count | - | Count |
executor.recordsWritten.count | spark.executor.recordsWritten.count | - | Count |
executor.succeededTasks.count | spark.executor.succeededTasks.count | - | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
applicationMaster.numContainersPendingAllocate | spark.applicationMaster.numContainersPendingAllocate | - | Count |
spark.applicationMaster.numExecutorsFailed | spark.applicationMaster.numExecutorsFailed | - | Count |
applicationMaster.numExecutorsRunning | spark.applicationMaster.numExecutorsRunning | - | Count |
applicationMaster.numLocalityAwareTasks | spark.applicationMaster.numLocalityAwareTasks | - | Count |
applicationMaster.numReleasedContainers | spark.applicationMaster.numReleasedContainers | - | Count |
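In every row of the tables above, the metric key is the metric name with a `spark.` prefix; a few names already carry the prefix and are identical to their keys. The sketch below captures that observed pattern for use in scripts. The function name `metric_key` is hypothetical, and the mapping is inferred from the tables rather than from a documented rule, so it may not hold for metrics outside this list.

```shell
#!/usr/bin/env bash
# Sketch only: metric_key is a hypothetical helper. The "spark." prefix rule
# is inferred from the tables in this document, not from a documented API.

metric_key() {
  # $1 = value from the "Metric name" column
  local name="$1"
  case "$name" in
    spark.*) printf '%s\n' "$name" ;;        # name already includes the prefix
    *)       printf 'spark.%s\n' "$name" ;;  # otherwise prepend it
  esac
}
```

For example, `metric_key master.aliveWorkers` yields the key listed for that row, `spark.master.aliveWorkers`.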