Production Monitoring

Tibco Business Events Memory Leak Analysis in Live Production

As a performance architect I get called into various production performance issues. One of our recent production issues involved a Tibco Business Events (BE) service that constantly violated our Service Level Agreements (SLAs) after running for 10-15 hours since the last restart. If we kept the services running longer, we would see them crash with an “out of memory” exception. This is a typical sign of a classic memory leak! In this blog … read more

Choose an APM Tool for the Solution – not for the Problem!

Just last week a senior Hybris consultant shared the story of a customer engagement on which he was working. This customer had problems, serious problems! We are talking about response times far beyond the most liberal acceptable standard! They were unable to solve the issue in their eCommerce platform – specifically Hybris. Although the eCommerce project was delivered by a System Integrator/Implementation Partner, the vendor still gets involved when things go really wrong! After all, the vendor … read more

Diagnosing Common Database Performance Hotspots in our Java Code

When I help developers or architects analyze and optimize the performance of their Java application, it is not about tweaking individual methods to squeeze out another millisecond or two in execution time. While for certain software it is important to optimize on milliseconds, I think this is not where we should start looking. I analyzed hundreds of applications in 2015 and found most performance and scalability issues around bad architectural … read more

How to Automate Enterprise Application Monitoring with Chef

In a previous article, I demonstrated how to effectively and efficiently install the Dynatrace Application Monitoring solution using Ansible. In this post, I am going to explain how to achieve the same results using Chef with our official dynatrace cookbook available on GitHub and on the Chef Supermarket. In the following hands-on tutorial, we’ll also apply what we see as good practice on working with and extending our deployment automation blueprints to … read more

Use Visibility & Facts to Avoid Lengthy War Rooms & Miscommunication

A few days back I was called into a war room situation with the hosted services group of our partner hybris. They were facing issues with the eCommerce site of one of their customers, a large UK-based luxury clothing retailer. The situation was quite critical. Even though load tests were conducted and everything appeared to be optimal, the eCommerce site encountered issues during a recent customer promotion which required the implementation of customer … read more

Production: Performance where it REALLY matters!

“Production is where performance matters most, as it directly impacts our end users and ultimately decides whether our software will be successful or not. Efforts to create test conditions and environments exactly like Production will always fall short; nothing compares to production!” These were the opening lines of my invitation encouraging performance practitioners to apply for the recent WOPR24 (Workshop On Performance and Reliability). Thirteen performance gurus answered the call … read more

Last-Minute Black Friday Rescue & Cyber Monday Readiness

To be ready for the Christmas season, online retailers typically bring their shops into shape right before Black Friday. Together with Cyber Monday, this is the most important day in the retailer’s year. Stilnest.com (@Stilnest) is a publishing house for designer jewelry, running their online shop on Magento. While the guys at Stilnest did a good job in preparing their environment, the interest in their products and, therefore, the traffic … read more

Network App Performance: Application Deceleration Controller & DC RUM

Not too long ago I had an opportunity to work with a customer who was experiencing performance problems with their web-based HR application. Users at the headquarters location – about 30 milliseconds away from the data center – would occasionally experience page load times of 10 or 15 seconds – instead of the normal 2 or 3 seconds. Dynatrace Data Center Real User Monitoring (DC RUM) reported both a pattern … read more

7 Reasons why APM is a No-Brainer for all Organizations

Why bother? That is the question many IT professionals face when trying to sell the value of application performance management internally to their organizations. As a working IT manager for a Fortune 500 company, I like to save my company money, work more efficiently and ensure that my system users are happy. It sometimes feels easier to not bother, but then inevitably there is a system failure, a group of … read more

Unlocking Critical SAP Performance Insight

SAP performance issues can be extremely complex and painful – for users, for administrators, and for IT teams alike. My colleagues and I know this first-hand as our experience as Dynatrace Guardian Consultants gives us unique insight into some rather difficult performance challenges. The seemingly simple question – “Why are users experiencing poor performance?” – evolves into many more. Is it all users, or just some? Is the problem location-dependent? … read more