Velocity 2015 – Highlights from Day 2

Day 1 at Velocity is over and the tutorial sessions provided quite useful information. We just went through the schedule for today’s sessions and found some very interesting talks on the list. We – Andreas Grabner (@grabnerandi) and Harald Zeitlhofer (@hzeitlhofer) – will keep you updated here. If you are here at Velocity, stop by our booth in the exhibition hall for a chat or track us down in the hallways. We wear the lovely “I ♥ dynatrace” shirts 🙂

4:10PM: Crafting Performance Alerting Tools by Etsy

Allison McKnight (@aemcknig) from Etsy walked us through the history of how they arrived at a good performance alerting system. Etsy is always a guarantee for good content and high attendance – this was definitely the most packed room I’ve seen so far at Velocity 🙂

She explained how they come up with “semi-automated” thresholds for their different pages, which they check every day to identify regressions. I thought it was also very interesting what data they put into their alert notifications: graphs that visually show what’s wrong for each page, as well as additional context information, e.g. the page’s share of visits compared to the overall web site, which tells them how critical the problem is.

Additionally they figured out a way to detect dependencies between services. So if one service is slow, they only alert on that service but not on the performance of the pages that use that service. This results in actionable alerts.
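To make the idea of dependency-aware alerting more concrete, here is a minimal Python sketch. It is our own illustration and not Etsy's implementation: the function name `actionable_alerts`, the `depends_on` map and the page and service names are all hypothetical. It assumes you already know which components are currently slow and which components each one depends on, and it only keeps the slow components that have no slow dependency of their own.

```python
from typing import Dict, List, Set


def actionable_alerts(slow: Set[str], depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return the slow components that have no slow (direct or transitive) dependency.

    `slow` and `depends_on` are hypothetical inputs for this sketch; how Etsy
    detects slowness and dependencies was not part of these notes.
    """

    def has_slow_dependency(name: str, seen: Set[str]) -> bool:
        for dep in depends_on.get(name, []):
            if dep in seen:  # guard against dependency cycles
                continue
            seen.add(dep)
            if dep in slow or has_slow_dependency(dep, seen):
                return True
        return False

    return {name for name in slow if not has_slow_dependency(name, {name})}


if __name__ == "__main__":
    deps = {"search_page": ["search_service"], "cart_page": ["cart_service"]}
    # The page is slow only because its service is slow: alert on the service.
    print(actionable_alerts({"search_page", "search_service"}, deps))
```

A page that is slow without any slow dependency still produces an alert of its own, so regressions in the page itself are not hidden.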

A really cool thing she showed was an easy way to pull performance data & graphs into IRC, allowing them to easily share data with other team members and across teams. They also monitor improvements, so they are “Celebrating Improvements” instead of “Punishing bad Performance” 🙂

Summary: Great input on how they improved alerting through more contextual data and “semi-automated” thresholds. Check out her slides once available!

2:40PM: Continuous Delivery in Financial Trading

David Genn (@david_genn) from IG told the story of Continuous Delivery at a financial company. They deploy all changes across all apps once a month, during update windows after trading hours. Interesting challenges their industry faces: regulations around uptime SLAs & auditing of all changes; mainly physical data centers; already deploying once a month – isn’t that good enough?; developer attitude!

Their conclusion after their journey was: Every Company CAN DO Continuous Delivery!

Here are their 4 principles: Separate Deployment from Release (using the Blue/Green Pattern); Automate Everything; Trust your Tests; Every Commit is potentially a Merge to Master

Best Practices: Say Thank You (to those teams that go through the change); Start Small and be brave quickly; 80% is good enough!

Summary: A consistent message across the conference on how Continuous Delivery can work in any type of organization!

1:45PM: Design and Performance

by Steve Souders (@souders)
Steve started his talk by confirming that DevOps is largely about increasing communication and the awareness of skills. This applies not only to Developers and Operations, but also to Developers and Designers. Both groups are after the same thing: building cool, modern and performant apps. But when Developers and Designers have to work together, it often seems they are working in silos and toward different goals.
Designers create awesome results, but sometimes they have never heard anything about performance. That might result in a design that just can’t be made fast. For the user, however, performance generally has a much higher priority than design. That makes it important to bring these teams together.
  • create small interdisciplinary teams, make sure you have the right people working together.
  • set guiding principles – and performance has to be a major part of that
  • create a prototype as early as possible – then you can integrate the assets created by your designers in an early stage
  • measure performance from the start
    • set performance budgets
    • set clear baselines for what’s acceptable
    • make performance data available for all
    • make results easily visible for the developer (in the browser)
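
As an illustration of what a performance budget check could look like, here is a small Python sketch. It is not from Steve's talk: the metric names (`speed_index`, `page_weight_kb`) and the limits are made-up examples, and the function `budget_violations` is hypothetical. The idea is simply to compare measured values against agreed limits and report every metric that is over budget or missing.

```python
from typing import Dict, List


def budget_violations(measured: Dict[str, float], budget: Dict[str, float]) -> List[str]:
    """Compare measurements against a budget and describe every violation.

    Metric names and limits are illustrative assumptions; lower is better
    for every metric in this sketch.
    """
    violations = []
    for metric, limit in sorted(budget.items()):
        value = measured.get(metric)
        if value is None:
            # A budgeted metric that is not measured should also be visible.
            violations.append(f"{metric}: no measurement")
        elif value > limit:
            violations.append(f"{metric}: {value:g} exceeds budget {limit:g}")
    return violations


if __name__ == "__main__":
    budget = {"speed_index": 3000, "page_weight_kb": 1000}
    measured = {"speed_index": 3400, "page_weight_kb": 900}
    for line in budget_violations(measured, budget):
        print(line)
```

Running a check like this on every build and showing the result to the whole team covers both "set performance budgets" and "make performance data available for all".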

Performance metrics in detail

The industry standard for measuring performance is page load time. But that is not a good metric to describe the user experience.
Steve gives 2 examples: Gmail and Amazon. In both examples the page load time does not reflect the real user experience: Gmail is actually slower, and the Amazon site faster, than the page load time would suggest. We need to find other metrics that are more closely related to user experience. Speed Index is a great metric for the overall rendering performance of a website.

Steve gives an example of a common problem: hero images

On many websites the hero image takes quite a while to load. 2-3 seconds is too slow for the main image on your site. The image is not necessarily that large, but even after the file is completely loaded it takes a while until it is displayed. The bottleneck here is the browser’s preload functionality, which scans your HTML and gives scripts the highest priority for downloading, even when the script is at the end of the file. Rendering of the image then only starts after all scripts have been downloaded from the server.

Best practices for bringing together design and performance

  • use custom metrics
  • define most important elements of the page
  • measure using user timing
  • track with RUM and synthetics


  • identify what matters most
  • focus on UX performance

1:45PM: How LinkedIn uses RUM and PoPs to optimize Page Speed

Ritesh Maheshwari (@ritesh) from LinkedIn started out with a promising story on how they improved user experience compared to where they stood in 2013. He gave some interesting insight into how they tuned their PoPs (Points of Presence): overall user experience improves when each geo location is assigned to its best PoP.

Instead of using a synthetic approach to figure out the best connectivity per country, they decided to let their real users give them that answer. They “converted” their real users into “synthetic agents” and let them capture performance data through their own RUM implementation. They let every user download small resources from different PoPs and report that data back to their monitoring services. This showed them that their current PoP assignment was not optimal for many regions. They achieved a 10% improvement in page load time by optimizing that assignment based on their RUM data. Very cool approach and very impressive!
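
To illustrate how such RUM beacons could be turned into a PoP assignment, here is a minimal Python sketch. It is our own simplification and not LinkedIn's implementation: the beacon format `(region, pop, download time in ms)`, the function `best_pop_per_region`, the use of the median and the PoP names are all assumptions. For every region it picks the PoP with the lowest median download time reported by real users.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple

# Hypothetical beacon format: (region, pop, download time in milliseconds)
Beacon = Tuple[str, str, float]


def best_pop_per_region(beacons: Iterable[Beacon]) -> Dict[str, str]:
    """Pick, for every region, the PoP with the lowest median download time."""
    samples: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for region, pop, ms in beacons:
        samples[(region, pop)].append(ms)

    best: Dict[str, Tuple[float, str]] = {}
    for (region, pop), times in samples.items():
        candidate = (median(times), pop)
        if region not in best or candidate < best[region]:
            best[region] = candidate
    return {region: pop for region, (_, pop) in best.items()}


if __name__ == "__main__":
    beacons = [
        ("BR", "miami", 180), ("BR", "miami", 220),
        ("BR", "sao_paulo", 90), ("BR", "sao_paulo", 110),
        ("IN", "singapore", 140), ("IN", "london", 260),
    ]
    print(best_pop_per_region(beacons))
```

The median is used here so that a few very slow users do not dominate the decision. A real system would also need a minimum sample size per region and PoP before changing any assignment.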

They extended their analysis by also monitoring which PoP users actually get connected to, and identified that 31% of US traffic got assigned to a suboptimal PoP. The cause: DNS-based assignment uses the IP of the DNS resolver instead of the actual client IP, which led to these bad assignments.

After optimizing their PoP assignments it’s now time to build more PoPs. They pick the next locations based on the RUM data!

Summary: Really interesting approach to optimize user experience by finding the optimal PoP assignment. Stay tuned for his write-ups on his blog!

10:50AM: (Re)building an engineering culture: DevOps @ Target

Heather Mickman (@hmmickman) and Ross Clanton (@RossClanton) told the story of how Target transformed its way of developing and deploying software. With thousands of people in their IT organization, spread across more than 100 teams, they had a lot to change culturally as well as organizationally.

They went through 4 Phases of DevOps Maturity: Change Agent(s); Grassroots; Tops Down; Scale

Change Agent(s): Start in a smaller group and get your first successes. Key is to embrace a new Engineering Culture vs. Process Culture. Mandatory attributes of change agents: passion, vision, tenacity, willingness to take risks and challenge the status quo, active in the tech community. First successes: “As our # of deployments went up the # of incidents dropped :-)”

In order to get more people on board they started an internal DevOpsDays event with 160 attendees, which has grown to 400+ as of today.

Grassroots: sharing the story with others; badges/tshirts/…; encourage public communication about it; continue the improvements!!

Tops Down: pairing champions with executives to move forward! continue sharing with peers. moving grassroots to mainstream

Scale: most challenging step is to scale across the whole organization. Still trying to figure out all the details. Enterprise Coaching!

Summary: Great story on their steps towards DevOps. Start small with dedicated people -> grow from there until you get top down buy-in!

9:30AM: Highlights of Sessions given during the Opening Keynote

Kicked off by Laura Bell (@lady_nerd) on Security and how everyone needs to become a security expert. She is the founder of and answers questions on Twitter via #betteroffbad. One of her quotes: “We are all responsible for security. We are all doing bad things. Let’s not assume the world is a good place!”

Her 3 steps to become a security expert

  1. Think like a villain … and be objective
  2. Create a safe place … to create a little chaos
  3. Don’t be afraid to play … like you never read the rulebook

Dana Quinn from Intuit (@dquinn_devops) talked about their journey to the cloud

Lessons learned

  • What workloads are good to start with? Build environments & Load Testing (spike on April 15)
  • Cloud-native or Hybrid? Go cloud-native to avoid friction and use all cloud capabilities!
  • Don’t treat your Cloud like your Data Center. New Metrics to track: Avg instance Age (keep low), Utilization
  • Watch your spending: remember to “shut the cloud off”!

Intuit’s Results

  • Small teams are successful at getting new products out quickly
  • Elastic capabilities gave them more output for the $$. Especially around load testing!

They are open sourcing the Load Testing System they use for TurboTax! -> really cool!!

Dave McCrory (@mccrory) on Building a Faster, Highly Available Data Tier

Scale or Fail: The amount of data is expected to double every two years through 2020

Challenges: Data Isolation, Data Consistency, Data Gravity

There were more presentations during the keynote that we haven’t covered. Really good tips came from Astrid Atkinson (Google) on “Engineering for the long run”. Go ahead and watch all of them through the Velocity Conference website, as these general sessions were recorded and will be made available at a later stage.
