Day 1 at Velocity is over and the tutorial sessions provided quite useful information. We just went through the schedule for today’s sessions and found some very interesting talks on the list. We, Andreas Grabner (@grabnerandi) and Harald Zeitlhofer (@hzeitlhofer), will keep you updated here. If you are here at Velocity, stop by our booth in the exhibition hall for a chat or track us down in the hallways. We wear the lovely “I ♥ dynatrace” shirts 🙂
4:10PM: Crafting Performance Alerting Tools by Etsy
Allison McKnight (@aemcknig) from Etsy walked us through the history of how they came up with a good performance alerting system. Etsy is always a guarantee for good content and high attendance, and this was definitely the most packed room I’ve seen so far at Velocity 🙂
She explained how they come up with “semi-automated” thresholds for their different pages, which they check every day to identify any regressions. I thought it was very interesting what data they put into their alert notifications: graphs that visually show what’s wrong for each page, plus additional context information, e.g. the % of page visits compared to the overall web site, which tells them how critical the problem really is.
Additionally they figured out a way to detect dependencies between services. So if one service is slow, they only alert on that service and not on the performance of the pages that use that service. This results in actionable alerts.
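The dependency idea above can be sketched in a few lines. This is a minimal illustration, not Etsy's implementation: the function name `actionable_alerts`, the component names and the shape of the dependency map are all assumptions made up for this example.

```python
from typing import Dict, List, Set


def actionable_alerts(slow: Set[str], depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return the slow components whose slowness is NOT explained by a slow dependency.

    `slow` holds components (pages or services) that breached their threshold.
    `depends_on` maps a component to the components it calls (hypothetical data).
    """

    def has_slow_dependency(component: str, seen: Set[str]) -> bool:
        # Walk the dependency graph; `seen` guards against cycles.
        for dep in depends_on.get(component, []):
            if dep in seen:
                continue
            seen.add(dep)
            if dep in slow or has_slow_dependency(dep, seen):
                return True
        return False

    return {c for c in slow if not has_slow_dependency(c, {c})}


# Hypothetical example: the listing page is slow only because search is slow.
deps = {"listing_page": ["search_service"], "search_service": []}
alerts = actionable_alerts({"listing_page", "search_service"}, deps)
print(sorted(alerts))
```

With this data only `search_service` is reported, so the team that can actually fix the problem gets the alert.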
A really cool thing she showed was an easy way to pull performance data & graphs into IRC, which lets them share data easily with other team members and across teams. They also monitor improvements, so they are “Celebrating Improvements” instead of “Punishing bad Performance” 🙂
Summary: Great input on how they improved alerting through more contextual data and “semi-automated” thresholds. Check out her slides once available!
2:40PM: Continuous Delivery in Financial Trading
David Genn (@david_genn) from IG. As a financial company they used to deploy all changes across all apps once a month, during update windows after trading hours. Their industry faces some interesting challenges: regulations around uptime SLAs & auditing of all changes; mainly physical data centers; already deploying once a month, isn’t that good enough?; developer attitude!
Their conclusion after their journey was: Every Company CAN DO Continuous Delivery!
Here are their 4 principles: Separate Deployment from Release (using the Blue/Green pattern); Automate Everything; Trust your Tests; Every Commit is potentially a Merge to Master
Best Practices: Say Thank You (to those teams that go through the change); Start Small and be brave quickly; 80% is good enough!
Summary: A consistent message across the conference on how Continuous Delivery can work in any type of organization!
1:45pm: Design and Performance
- create small interdisciplinary teams, make sure you have the right people working together.
- set guiding principles – and performance has to be a major part of that
- create a prototype as early as possible – then you can integrate the assets created by your designers in an early stage
- measure performance from the start
- set performance budgets
- set clear baselines for what’s acceptable
- make performance data available for all
- make results easily visible for the developer (in the browser)
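A performance budget from the list above boils down to comparing measured values against agreed limits. The following is a minimal sketch only; the metric names and limits are invented for illustration and were not part of the talk.

```python
from typing import Dict


def check_budget(measured: Dict[str, float], budget: Dict[str, float]) -> Dict[str, float]:
    """Return each metric that exceeds its budget, mapped to the amount over."""
    return {
        metric: measured[metric] - limit
        for metric, limit in budget.items()
        if metric in measured and measured[metric] > limit
    }


# Hypothetical budget and measurement for a single page.
budget = {"page_weight_kb": 800, "load_time_ms": 3000}
measured = {"page_weight_kb": 950, "load_time_ms": 2400}

violations = check_budget(measured, budget)
print(violations)  # {'page_weight_kb': 150}
```

Running such a check in the build and showing the result to the whole team covers both the “set performance budgets” and the “make performance data available for all” points.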
Performance metrics in detail
Steve gives an example of a common problem: hero images
best practices for bringing together design and performance
- use custom metrics
- define most important elements of the page
- measure using user timing
- track with RUM and synthetics
Summary:
- identify what matters most
- focus on UX performance
1:45PM: How LinkedIn uses RUM and PoPs to optimize Page Speed
Ritesh Maheshwari (@ritesh) from LinkedIn started out with a promising story on how they improved user experience from where they stood in 2013. He gave some interesting insight into how they optimized their PoPs (Points of Presence) to improve overall user experience by assigning each geo location to the best PoP.
Instead of using a synthetic approach to figure out the best connectivity per country, they decided to let their real users give them that answer. They “converted” their real users into “synthetic agents” and let them capture performance data through their own RUM implementation. Every user downloaded small resources from different PoPs and reported that data back to their monitoring services. This showed them that their current PoP assignment was not optimal for many regions. They achieved a 10% improvement in page load time by optimizing that assignment based on their RUM data. Very cool approach and very impressive!
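To make the idea concrete, here is a small sketch of how RUM beacons could be turned into a PoP assignment. It is an assumption-laden illustration, not LinkedIn's actual pipeline: the sample format, the region and PoP names, and the choice of the median as the comparison metric are all made up for this example.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Tuple

# One RUM sample: (user region, PoP the test resource came from, download time in ms)
Sample = Tuple[str, str, float]


def best_pop_per_region(samples: Iterable[Sample]) -> Dict[str, str]:
    """Pick, for every region, the PoP with the lowest median download time."""
    grouped = defaultdict(lambda: defaultdict(list))
    for region, pop, download_ms in samples:
        grouped[region][pop].append(download_ms)
    return {
        region: min(pops, key=lambda pop: median(pops[pop]))
        for region, pops in grouped.items()
    }


# Hypothetical beacons reported by real users.
samples = [
    ("BR", "us-east", 180), ("BR", "us-east", 220),
    ("BR", "sa-east", 90), ("BR", "sa-east", 110),
    ("DE", "eu-west", 40), ("DE", "us-east", 130),
]
print(best_pop_per_region(samples))  # {'BR': 'sa-east', 'DE': 'eu-west'}
```

The median is used here because a few very slow users should not decide the assignment for a whole region; a percentile would work just as well.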
They extended their analysis by also monitoring which PoP users really get connected to, and found that 31% of US traffic got assigned to a suboptimal PoP. The cause was DNS: the assignment is based on the IP of the DNS resolver instead of the actual client IP, which led to these bad assignments.
After optimizing their PoP assignments it’s now time to build more PoPs. They pick the next locations based on the RUM data!
Summary: Really interesting approach to optimize user experience by finding optimal PoP assignment. Stay tuned for his write ups on his blog!
10:50AM: (Re)building an engineering culture: DevOps @ Target
Heather Mickman (@hmmickman) and Ross Clanton (@RossClanton) told the story of how Target transformed its way of developing and deploying software. With thousands of people in their IT organization spread across hundreds of teams, they had a lot to change, both culturally and organizationally.
They went through 4 phases of DevOps maturity: Change Agent(s); Grassroots; Tops Down; Scale
Change Agent(s): Start in a smaller group and get your first successes. The key is to embrace a new engineering culture instead of a process culture. Mandatory attributes of change agents: passion, vision, tenacity, willingness to take risks, challenging the status quo, being active in the tech community. First successes: “As our # of deployments went up the # of incidents dropped :-)”
To get more people on board they started an internal DevOpsDays event with 160 attendees, which has grown to 400+ as of today.
Grassroots: sharing the story with others; badges/tshirts/…; encourage public communication about it; continue the improvements!!
Tops Down: pairing champions with executives to move forward! continue sharing with peers. moving grassroots to mainstream
Scale: most challenging step is to scale across the whole organization. Still trying to figure out all the details. Enterprise Coaching!
Summary: Great story on their steps towards DevOps. Start small with dedicated people -> grow from there until you get top-down buy-in!
9:30AM – Highlights of Sessions given during the Opening Keynote
Kicked off by Laura Bell (@lady_nerd) on security and how everyone needs to become a security expert. She is the founder of safestack.io and answers questions on Twitter via #betteroffbad. One of her quotes: “We are all responsible for security. We are all doing bad things. Let’s not assume the world is a good place!”
Her 3 steps to become a security expert
- Think like a villain … and be objective
- Create a safe place … to create a little chaos
- Don’t be afraid to play … like you never read the rulebook
Dana Quinn from Intuit (@dquinn_devops) talked about their journey to the cloud
Lessons learned
- What workloads are good to start with? Build environments & Load Testing (spike on April 15)
- Cloud-native or Hybrid? Choose cloud-native to avoid friction and use all cloud capabilities!
- Don’t treat your Cloud like your Data Center. New Metrics to track: Avg instance Age (keep low), Utilization
- Watch your spending: remember to “shut the cloud off”!
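The “average instance age” metric from the lessons above is simple to compute once you have the launch times of your running instances. This is a minimal sketch with made-up timestamps; how the launch times are fetched from a cloud provider is deliberately left out, and the function name is an assumption for illustration.

```python
from datetime import datetime, timedelta
from typing import List


def average_instance_age(launch_times: List[datetime], now: datetime) -> timedelta:
    """Average age of the running instances; zero if nothing is running."""
    if not launch_times:
        return timedelta(0)
    total = sum((now - launched for launched in launch_times), timedelta(0))
    return total / len(launch_times)


# Hypothetical fleet: one instance launched 2 hours ago, one 6 hours ago.
now = datetime(2015, 1, 1, 12, 0)
fleet = [now - timedelta(hours=2), now - timedelta(hours=6)]
print(average_instance_age(fleet, now))  # 4:00:00
```

A low and stable value suggests instances are regularly replaced instead of being patched in place like servers in a data center.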
Intuit’s Results
- Small teams are successful at getting new products out quickly
- Elastic capabilities gave them more output for the $$. Especially around load testing!
They are open sourcing the load testing system they use for TurboTax! -> really cool!!
Dave McCrory (@mccrory) on building a faster, highly available data tier
Scale or Fail: The amount of data is expected to double every two years through 2020
Challenges: Data Isolation, Data Consistency, Data Gravity
There were more presentations during the keynote that we haven’t covered, including really good tips from Astrid Atkinson (Google) on “Engineering for the long run”. Go ahead and watch all of them on the Velocity Conference website, as these general sessions were recorded and will be made available at a later stage.