Production

Monitoring is key to Docker success in production

Recently, O’Reilly and Dynatrace Ruxit conducted a survey about the container and Docker ecosystem. The goal of the survey was to understand technology adoption across the lifecycle. Besides offering some surprising stats on Docker adoption, the survey identifies some key challenges faced by early adopters of Docker. Docker is the delivery stack of the future: adoption of Docker and related container technologies in production environments is on the rise. A majority of survey… read more

Fighting Technical Debt: Memory Leak Detection in Production

Thanks to our friends from Prep Sportswear, who let me share their memory leak detection story with you. It is a story about “fighting technical debt” in software that matured over the years, with the initial developers no longer on board to optimize or fix their code mistakes. Check out their online store and browse through their pages – especially cool are the product details pages where they use some really nice… read more

Velocity 2015 – Highlights from Day 2

Day 1 at Velocity is over and provided quite useful information in the tutorial sessions. We just went through the schedule for today’s sessions and found some very interesting talks on the list. We – Andreas Grabner (@grabnerandi) and Harald Zeitlhofer (@hzeitlhofer) – will keep you updated here. If you are here at Velocity, stop by our booth in the exhibition hall for a chat or track… read more

Identify Bad Service Oriented Architectures Through Metrics

There are many advantages to breaking an application into smaller services. When APIs and interfaces are well defined, it allows more independent development on separate code bases, keeping the risk of breaking the whole app with a single code change low. It allows for more flexible and scalable deployments when done right, and it is theoretically possible to replace services when a better service is available that provides the same… read more

How to Performance Monitor All Your Applications on a Single Dashboard

It’s become easy to monitor applications that are deployed on hundreds of servers, thanks to the advances in application performance management tools. But the more data you collect, the harder it is to visualize the health state so that a single dashboard shows both the overall status and the problematic component. Eugene Turetsky (Dynatrace) and Stephan Levesque (SSQ Financial Group) shared their… read more

Top 10 WebLogic Performance Metrics to Proactively Monitor a Server Farm

In my years of experience with WebLogic monitoring, I came up with a list of key metrics that can help determine the health of my server farm. These early indicators allow me to take proactive steps instead of waiting for end users to complain. The following screenshot shows one of my Dynatrace dashboards containing key health metrics captured through JMX: Dynatrace Dashboard showing the health status of… read more

How to Optimize the Good and Exclude the Bad/Bot Traffic that Impacts your Web Analytics and Performance

This blog is about how a new generation of bots impacted our application performance, exploited problems in our deployment and skewed our web analytics. I explain how we dealt with it and what you can learn to protect your own systems. Another positive side effect of identifying these requests is that we can adjust the web analytics metrics we report to management. Tools like Google Analytics can’t exclude all of these… read more

Are we getting attacked? No, it’s just Google indexing our site

Friday morning at 7:40 AM we received the first error from our APMaaS Monitors informing us about our Community Portal being unavailable. It “magically recovered itself” within 20 minutes, but about an hour later it was down again. The Potential Root Cause was reported by dynaTrace, which captured an Out-of-Memory (OOM) Exception in the Confluence JVM that hosts our community. First Analysis Step: Availability Monitor highlighted the problem. dynaTrace identified… read more

How to accurately identify impact of system issues on end-user response time

Triggered by current expected load projections for our community portal, our Apps Team was tasked to run a stress test on our production system to verify whether we can handle 10 times the load we currently experience on our existing infrastructure. In order to have the least impact in the event the site crumbled under the load, we decided to run the first test on a Sunday afternoon. Before we ran… read more

Make PHP requests “Sleep” to stop bad behavior. Smart or not?

In our previous post we showed how we hooked up our blog’s WordPress application with the new Compuware APMaaS offering. Since WordPress is a PHP application, we use PurePath for PHP to monitor it. We highlighted that we got an alert about a response time violation on some of our blog posts, which is shown in the following screenshot. Dynamic Baselining detect a… read more