When do we add logging?

In order to analyze problems, developers tend to add additional logging information to the source code. There are different logging frameworks available, e.g.: log4j, log4net, EnterpriseLibrary Logging, … which provide an easy and extensible way to create log files. Logging offers additional insight into what’s actually going on in in the application when it is under load or when it is deployed in production without having a great impact on performance.

Why and when is Logging turned on?

When problems come up in a load-testing, staging or in the live environment, developers either provide a new version of the software including additional logging outputs or ask the Tester/IT Admin to turn up the log-level. The additional logging will then provide more contextual information about why a certain problem, e.g: performance problem or error occurred.

What’s the problem with logging?

There is a big downside of logging – which is the amount of data that is produced in a non-contextual manner. Once the log files are available for developers they need to be analyzed and correlated with other log files or other metrics that have been gathered by other tools used in the system, e.g.: load testing or monitoring solutions. Browsing through large log files – looking for log entries with particular timestamps and correlating those entries to certain events (performance or functional issues) usually takes a long time and not always result in the discovery of the root cause of the problem.

Transactional Logging: Getting the log information in the right context!

In order to improve root cause analysis based on log file entries log entries should be viewed based on a single transaction of the application, e.g.: a single page request or a click on a button. Knowing which transaction had a problem allows us to only focus on those log entries that are essential for the error analysis.
On the other side we should also be able to analyze certain log entries which indicate a problematic situation – and trace it to an individual transaction. This allows us to identify the transaction flow and why the transaction execution ended up in a problematic situation causing the log entry to be created.

Dynatrace captures log entries of the major logging frameworks along every transaction that is recorded. The log entries show up as PurePath nodes right at the location where the log messages have been created including the log message and severity. With this information its possible to analyze all log messages created for a single transaction – or identify those transactions that logged a certain message. The following image shows how to identify an individual transaction based on a certain log message.

Conclusion

Developers spend a lot of time analyzing problems based on log file entries. Analyzing and correlating values to other measures is a tedious job and takes a lot of time. Offering log entries in the context of a single transaction (web request, user interaction, …) will speed up the root cause analysis.