Processes crash for a multitude of reasons and it’s often difficult to understand the root causes that contribute to such crashes. When a monitored process crashes, you’ll see a process crash entry in the Events section of each affected process and host page. The example process below has some availability problems (shown in red on the timeline). By selecting the affected timeframe in the timeline, the Events section shows you the number of process crashes that occurred during that timeframe (1 crash in this example).
Click the Process crash details button to view a detailed list of the crashes that occurred during the selected timeframe. Here you’ll find all details related to why each process crashed.
The provided crash details include the signal that killed the process (for example, Segmentation fault or Abort), the execution stack frame that crashed, and more. The available details vary based on the type of crash and may include a native core dump, a Java core dump, or an abnormal program exit due to exceptions.
Note: This functionality works for all processes on each monitored host.
Analyze additional crash artifacts
Crash details often include a Download button that provides access to additional crash artifacts, such as hs_err_pid files for Java crashes, text files that provide analysis of Linux and Windows core dumps, or files containing the .NET, Java, or Node.js exceptions that were potentially responsible for the crashes. For example, the Segmentation fault crash report above resulted in a core dump. Dynatrace OneAgent analyzed the core dump automatically and then produced the following report as a log artifact:
dumpproc version 188.8.131.5261025-115919, installer version 184.108.40.20661025-121046
2016-11-09 18:00:44: Application 'CreditCardAutho', inner pid '15891', outer pid '0', signal: 'Segmentation fault' (11)
process group ID: 0x441b2cb89962033d
process group instance ID: 0xfe58bab23100f42c
process group Name: easytravel-*-x*
threadCount: 1
thread: 0 - stack range: 0x7ffeda572000-0x7ffeda594000, size: 136 kB
0x00007ffeda592be0 0x00007f4de477604d libpthread-2.15.so!<imagebase>+0xf04d
0x00007ffeda592bf0 0x00000000004038d8 CreditCardAuthorizationS64!main+0x1b8
0x00007ffeda592c60 0x00007f4de41c676d libc-2.15.so!__libc_start_main+0xed
0x00007ffeda592d20 0x000000000040329a CreditCardAuthorizationS64!<imagebase>+0x329a
mapped files:
0000000000400000-000000000041e000 0 /home/labuser/easytravel-2.0.0-x64/CreditCardAuthorizationS64 (MD5: da5992daf5ba3b76c633c853c7da5e87)
000000000051d000-000000000051e000 1d /home/labuser/easytravel-2.0.0-x64/CreditCardAuthorizationS64 (MD5: da5992daf5ba3b76c633c853c7da5e87)
00007f4de41a5000-00007f4de4359000 0 /lib/x86_64-linux-gnu/libc-2.15.so (GNU Build-Id: aa64a66ac46bff200848c0a0694011bd0140ab4e)
00007f4de4359000-00007f4de4558000 1b4 /lib/x86_64-linux-gnu/libc-2.15.so (GNU Build-Id: aa64a66ac46bff200848c0a0694011bd0140ab4e)
00007f4de4558000-00007f4de455c000 1b3 /lib/x86_64-linux-gnu/libc-2.15.so (GNU Build-Id: aa64a66ac46bff200848c0a0694011bd0140ab4e)
00007f4de455c000-00007f4de455e000 1b7 /lib/x86_64-linux-gnu/libc-2.15.so (GNU Build-Id: aa64a66ac46bff200848c0a0694011bd0140ab4e)
00007f4de4563000-00007f4de4565000 0 /lib/x86_64-linux-gnu/libdl-2.15.so (GNU Build-Id: d181af551dbbc43e9d55913d532635fde18e7c4e)
00007f4de4565000-00007f4de4765000 2 /lib/x86_64-linux-gnu/libdl-2.15.so (GNU Build-Id: d181af551dbbc43e9d55913d532635fde18e7c4e)
00007f4de4765000-00007f4de4766000 2 /lib/x86_64-linux-gnu/libdl-2.15.so (GNU Build-Id: d181af551dbbc43e9d55913d532635fde18e7c4e)
00007f4de4766000-00007f4de4767000 3 /lib/x86_64-linux-gnu/libdl-2.15.so (GNU Build-Id: d181af551dbbc43e9d55913d532635fde18e7c4e)
00007f4de4767000-00007f4de477f000 0 /lib/x86_64-linux-gnu/libpthread-2.15.so (GNU Build-Id: c340af9dee97c17c730f7d03693286c5194a46b8)
00007f4de477f000-00007f4de497e000 18 /lib/x86_64-linux-gnu/libpthread-2.15.so (GNU Build-Id: c340af9dee97c17c730f7d03693286c5194a46b8)
00007f4de497e000-00007f4de497f000 17 /lib/x86_64-linux-gnu/libpthread-2.15.so (GNU Build-Id: c340af9dee97c17c730f7d03693286c5194a46b8)
00007f4de497f000-00007f4de4980000 18 /lib/x86_64-linux-gnu/libpthread-2.15.so (GNU Build-Id: c340af9dee97c17c730f7d03693286c5194a46b8)
00007f4de4984000-00007f4de4a02000 0 /lib/x86_64-linux-gnu/liboneagentproc.so (220.127.116.1161025-115919)
00007f4de4a02000-00007f4de4c01000 7e /lib/x86_64-linux-gnu/liboneagentproc.so (18.104.22.16861025-115919)
00007f4de4c01000-00007f4de4c03000 7d /lib/x86_64-linux-gnu/liboneagentproc.so (22.214.171.12461025-115919)
00007f4de4c03000-00007f4de4c05000 7f /lib/x86_64-linux-gnu/liboneagentproc.so (126.96.36.19961025-115919)
00007f4de4cc0000-00007f4de4ce2000 0 /lib/x86_64-linux-gnu/ld-2.15.so (GNU Build-Id: e25ad1a11ccf57e734116b8ec9c69f643dca9f18)
00007f4de4ee2000-00007f4de4ee3000 22 /lib/x86_64-linux-gnu/ld-2.15.so (GNU Build-Id: e25ad1a11ccf57e734116b8ec9c69f643dca9f18)
00007f4de4ee3000-00007f4de4ee5000 23 /lib/x86_64-linux-gnu/ld-2.15.so (GNU Build-Id: e25ad1a11ccf57e734116b8ec9c69f643dca9f18)
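If you download such a report and want to summarize the crashing stack programmatically, a small script can pull out the module!symbol+offset entries. The following is a minimal sketch, not part of Dynatrace: the names Frame, FRAME_RE, and parse_frames are hypothetical, and the regular expression assumes that each stack entry consists of two 16-digit hexadecimal addresses followed by module!symbol+0xoffset, as in the report above.

```python
import re
from typing import NamedTuple


class Frame(NamedTuple):
    module: str   # binary or shared library that contains the frame
    symbol: str   # function name, or "<imagebase>" when no symbol is known
    offset: int   # offset from the symbol, parsed from hexadecimal


# Assumed layout: <16-digit hex> <16-digit hex> <module>!<symbol>+0x<offset>
FRAME_RE = re.compile(
    r"0x[0-9a-f]{16}\s+0x[0-9a-f]{16}\s+"
    r"(?P<module>\S+?)!(?P<symbol>\S+?)\+0x(?P<offset>[0-9a-f]+)"
)


def parse_frames(report: str) -> list[Frame]:
    """Return every stack entry found in the report text, in order."""
    return [
        Frame(m["module"], m["symbol"], int(m["offset"], 16))
        for m in FRAME_RE.finditer(report)
    ]
```

Because the pattern is not anchored to line starts, it can be applied to the whole report text at once, whether or not line breaks were preserved when the artifact was saved.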
Protect sensitive user data
Crash reports may contain sensitive personal information that should not be viewed by all users. For this reason, your Dynatrace administrator must enable the View logs account-security option and the View sensitive request data permissions in your user profile before you can view personal data. This option is disabled by default for all non-admin users and must be explicitly enabled before you can access log contents.
How Dynatrace handles crashes on Windows and core dumps on Linux
Crash handling on Windows
In order for a generic Windows process crash (core dump) to be visible to Dynatrace, the crash must be detected by Windows Error Reporting. For this reason, the Windows Error Reporting service must be enabled.
When a crash occurs on Windows, a dialog appears, asking if you want to debug or close the crashed application. This is not desirable for headless systems. You can disable this dialog by adding a value to the registry, as shown below:
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting] "DontShowUI"=dword:00000001
You can learn about other valuable settings related to Windows Error Reporting by visiting Microsoft documentation.
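On hosts that are provisioned by script, the same registry value can be set programmatically. The sketch below uses Python's standard winreg module and is only one possible way to do this; the function name disable_wer_dialog and the constants are hypothetical, and the script assumes it runs with administrator rights on Windows.

```python
import sys

# Key and value taken from the registry entry shown above.
WER_KEY_PATH = r"SOFTWARE\Microsoft\Windows\Windows Error Reporting"
VALUE_NAME = "DontShowUI"


def disable_wer_dialog() -> None:
    """Set DontShowUI to 1 under HKEY_LOCAL_MACHINE (needs admin rights)."""
    if sys.platform != "win32":
        raise OSError("Windows Error Reporting settings exist only on Windows")
    import winreg  # Windows-only standard-library module

    with winreg.CreateKeyEx(
        winreg.HKEY_LOCAL_MACHINE, WER_KEY_PATH, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, 1)
```

The import of winreg is deferred into the function so that the module can still be loaded on other platforms, where the call fails with a clear error.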
Linux core dump handling
In Linux, the way the kernel handles a core dump is set in /proc/sys/kernel/core_pattern. Beginning with kernel 2.6.19 (1), there are two methods of dealing with application crashes. The core dump may either be written to a file pointed to by the /proc/sys/kernel/core_pattern entry or pushed to an application, in which case the entry must be prefixed with a vertical bar character (|).
SUSE Linux uses the first method, so the entry is similar to /proc/sys/kernel/core_pattern: core. This means that a file with the name core is written in the current working directory of the crashed process.
Ubuntu and Red Hat generally rely on their own tools for reporting crash dumps, so the lines appear as follows:
|/usr/share/apport/apport %p %s %c %P
|/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e
With this second method, when a program crashes, the core dump output is pushed to stdin of the application given in the first parameter. Moreover, the kernel fills in the values of any parameters formatted as a percent sign followed by a letter (for example, %p or %s). On Ubuntu, the apport reporting service overwrites the file /proc/sys/kernel/core_pattern: if apport is enabled (in /etc/default/apport), then the /proc/sys/kernel/core_pattern configuration setting is set when the apport crash reporting service starts on system boot.
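To check which of the two methods a host currently uses, you can inspect the core_pattern entry directly. The following is a minimal sketch; the function name describe_core_pattern and the shape of the returned dictionary are illustrative choices, not part of any Dynatrace or kernel tooling.

```python
import re
from pathlib import Path

CORE_PATTERN_PATH = Path("/proc/sys/kernel/core_pattern")


def describe_core_pattern(pattern: str) -> dict:
    """Classify a core_pattern entry as a file template or a piped handler."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # Piped handler: the first token is the program, the rest are its
        # arguments, which the kernel expands before starting the program.
        parts = pattern[1:].split()
        program = parts[0] if parts else ""
        return {"mode": "pipe", "target": program, "args": parts[1:]}
    # File template: collect the % specifiers used in the file name.
    specifiers = re.findall(r"%[A-Za-z]", pattern)
    return {"mode": "file", "target": pattern, "args": specifiers}


# Example usage on a Linux host:
#   print(describe_core_pattern(CORE_PATTERN_PATH.read_text()))
```

For the SUSE-style entry core this reports the file mode, and for the apport and abrt entries shown above it reports the pipe mode together with the handler path and its arguments.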
Dynatrace installer core_pattern handling
The Dynatrace installer overwrites the core pattern with its own command but preserves the original pattern.
- The content of the original /proc/sys/kernel/core_pattern file is copied to /opt/dynatrace/oneagent/agent/conf/original_core_pattern. When Dynatrace OneAgent is uninstalled, the uninstaller restores the original core pattern stored in this file.
- The content of the original /etc/sysctl.conf is copied to /opt/dynatrace/oneagent/agent/conf/original.sysctl.corepattern. When Dynatrace OneAgent is uninstalled, the uninstaller restores the original core pattern stored in this file.
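The save-and-restore behavior described here can be pictured with a short sketch. This is not the installer's actual code: the function names install_handler and uninstall_handler are hypothetical, and the file locations are passed in as parameters so that the idea can be tried on ordinary files rather than on /proc.

```python
from pathlib import Path


def install_handler(core_pattern: Path, backup: Path, new_pattern: str) -> None:
    """Save the current pattern to `backup`, then write `new_pattern`."""
    backup.write_text(core_pattern.read_text())
    core_pattern.write_text(new_pattern + "\n")


def uninstall_handler(core_pattern: Path, backup: Path) -> None:
    """Restore the pattern saved by install_handler and remove the backup."""
    core_pattern.write_text(backup.read_text())
    backup.unlink()
```

Keeping the original entry in a separate file is what allows both the later restore and the replay of the original core handling while the replacement handler is active.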
Depending on the original entry in core_pattern, Dynatrace will write different patterns to core_pattern. The possible configurations and expected entries after installation are listed below:
| Original core_pattern entry | core_pattern after ruxitdumpproc installation | Comment |
| core | |/opt/dynatrace/oneagent/agent/rdp -p %p -e %e -s %s | Simple core dump without parameters. |
| core_%s_%e | |/opt/dynatrace/oneagent/agent/rdp -p %p -e %e -s %s -kp %s,%e | Simple core dump with parameters in the filename. |
| |/usr/share/apport/apport | |/opt/dynatrace/oneagent/agent/rdp -p %p -e %e -s %s | Core dump piped to another application, without parameters. |
| |/usr/share/apport/apport %p %s %c %P | |/opt/dynatrace/oneagent/agent/rdp -p %p -e %e -s %s -a %p %s %c %P | Core dump piped to another application, with parameters. |
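The mapping in the table can be expressed as a small function, which is handy for checking a host after installation. This is a sketch that only reproduces the four listed cases. The function name expected_core_pattern is hypothetical, and the generalization (comma-joined specifiers after -kp for file patterns, the original arguments after -a for piped patterns) is an assumption inferred from the table rather than documented installer behavior.

```python
import re

# Command prefix common to all rows of the table above.
RDP_PREFIX = "|/opt/dynatrace/oneagent/agent/rdp -p %p -e %e -s %s"


def expected_core_pattern(original: str) -> str:
    """Return the entry the table lists for a given original core_pattern."""
    original = original.strip()
    if original.startswith("|"):
        # Piped handler: keep only its arguments (assumed to follow -a).
        parts = original[1:].split(None, 1)
        args = parts[1].strip() if len(parts) > 1 else ""
        return f"{RDP_PREFIX} -a {args}" if args else RDP_PREFIX
    # File template: pass the specifiers used in the file name (assumed -kp).
    specifiers = re.findall(r"%[A-Za-z]", original)
    if specifiers:
        return f"{RDP_PREFIX} -kp {','.join(specifiers)}"
    return RDP_PREFIX
```

Comparing the result for the saved original pattern with the current content of /proc/sys/kernel/core_pattern gives a quick indication of whether the entry matches what the table predicts.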
Core handling by OneAgent dumpproc
When a crash occurs, rdp is called first to dump the core to OneAgent folders. This core is used by the Crash Reporting functionality. In the next step, OneAgent reads /opt/dynatrace/oneagent/agent/conf/original_core_pattern and generates a core according to the settings there. This means that if the original setup wrote the core file to a specific place, this still happens after OneAgent is installed.
Next, the core dump is analyzed to check whether Dynatrace could have been the root cause of the crash. If that is the case, a support alert is generated and reported to our DevOps team. In such a case, the core dump is zipped and retained together with all involved libraries, which are needed for later offline analysis.
If OneAgent determines that Dynatrace is not at fault, the crash is reported to the user via the Dynatrace UI. If the crash has any impact on the customer's application, a problem is opened, and an appropriate event is generated for the involved processes as described above.
The log and support alert directories are cleaned up automatically.
- For support alerts, we process the core dump, then zip it and keep it so that it can be sent to the cluster.
- For crashes (non-instrumented processes or instrumented ones where we decide Dynatrace is not at fault), we process and then delete the copy of the core dump.
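The decision flow described in this section can be summarized in a few lines of code. The sketch below is purely illustrative and does not reflect OneAgent internals: the names Outcome, Handling, and handle_analyzed_crash are hypothetical, and the assumption that no problem is opened for support alerts is ours, since the text only describes problems for crashes where Dynatrace is not at fault.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    SUPPORT_ALERT = "support alert"  # Dynatrace may be the root cause
    CRASH_EVENT = "crash event"      # reported to the user in the UI


@dataclass
class Handling:
    outcome: Outcome
    keep_core_dump: bool  # zipped and retained vs. deleted after processing
    open_problem: bool    # whether a problem is opened for the application


def handle_analyzed_crash(dynatrace_at_fault: bool, impacts_application: bool) -> Handling:
    """Map the result of the core dump analysis to the follow-up actions."""
    if dynatrace_at_fault:
        # Core dump is zipped and kept for offline analysis (assumed: no problem).
        return Handling(Outcome.SUPPORT_ALERT, keep_core_dump=True, open_problem=False)
    # Otherwise the copy of the core dump is deleted after processing.
    return Handling(Outcome.CRASH_EVENT, keep_core_dump=False, open_problem=impacts_application)
```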