Last week at Velocity Conference – New York I had the opportunity to sit in on the keynote address by Mikey Dickerson on the topic “One Year After healthcare.gov: Where Are We Now?”
Mikey Dickerson is the Administrator/Deputy CIO of USDS. In October 2013 he took a leave of absence from Google to join what became known as the “ad hoc” team that rescued healthcare.gov after its disastrous launch on October 1. Mikey talked about how healthcare.gov was built and launched with no monitoring in place, so the only way to hear about problems was through news networks like CNN. He recalled that there were 55 contractors and none of them was ultimately responsible for monitoring.
What stood out for me was that there was no sense of urgency about fixing issues, or preventing them in the first place, until the failure became a national media story. Mikey attended more than 130 war room meetings and clocked 17-hour days, 7 days a week, with no days off. The highlight of the talk was how the “ad hoc” surge team got everyone to communicate and collaborate effectively and urgently in order to resolve issues every day.
As Matthew Heusser (@mheusser) pointed out in a CIO magazine article on the lessons learned from the launch, testing was not part of the delivery process. The article states: “The entire delivery process was firefighting where testing wasn’t seriously considered.”
In today’s increasingly complex world, where users expect websites to respond in under a millisecond, “failure is just one click away”. Jason Bloomberg (@theebizwizard) at Forbes summarizes his impressions of Velocity as the confab where performance engineers learn how to build resilience, and says performance engineering is about how to “respond to and recover from failure. Knowing when something is about to fail, quickly identifying the problems, understanding the root cause of each issue, gracefully degrading performance while people are working on solutions, and delivering those solutions quickly and permanently are all important pages in the web and mobile performance playbook.”
Another interesting session I attended was by Rodrigo Campos (@xinu), the Operations Manager at Walmart in Brazil. Rodrigo shared some insights in “Real World #Devops at Walmart”. Devops is a culture at Walmart that fosters communication, transparency, and collaboration. No more emails or infosec dossiers, no more throwing things over the wall, and no more “consider this as a favor.”
So here are my most important takeaways from Mikey Dickerson’s keynote address, Rodrigo Campos’s session, and the other sessions at Velocity New York.
#1 Continuous Testing
We have talked about continuous testing before, and testing is not negotiable. By incorporating continuous automated testing, you can increase the number of builds per day, the number of features tested per build, and the number of deployments per day, and you get instant feedback on failures, whether they are unit, functional, or architectural.
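As a rough illustration of the instant feedback described above, here is a minimal sketch of an automated check that a build server could run on every commit. The `monthly_premium` function and its surcharge rule are hypothetical, invented only for this example; they do not come from healthcare.gov or from any of the talks.

```python
import unittest


def monthly_premium(base_rate: float, age: int) -> float:
    """Hypothetical business rule: add a 50% surcharge at age 50 and over."""
    if base_rate < 0 or age < 0:
        raise ValueError("base_rate and age must be non-negative")
    surcharge = 0.5 if age >= 50 else 0.0
    return round(base_rate * (1 + surcharge), 2)


class MonthlyPremiumTest(unittest.TestCase):
    def test_no_surcharge_under_50(self):
        self.assertEqual(monthly_premium(100.0, 30), 100.0)

    def test_surcharge_at_50(self):
        self.assertEqual(monthly_premium(100.0, 50), 150.0)

    def test_rejects_negative_input(self):
        with self.assertRaises(ValueError):
            monthly_premium(-1.0, 30)


def run_suite() -> bool:
    """Run the tests and report success, as a build step would."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MonthlyPremiumTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    # A non-zero exit code lets the build server mark the build as failed.
    raise SystemExit(0 if run_suite() else 1)
```

Wiring a script like this into the build means a broken rule fails the build within minutes of the commit, rather than surfacing in production.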
#2 Culture of Performance
Secondly, you should build a culture of performance within the organization. This requires breaking down silos and fostering collaboration and communication between development and operations. Testing cannot be an afterthought or something left for production, and it cannot be someone else’s problem.
Finally, the United States Digital Service (USDS) has created playbooks in order to increase the success rate of federal projects. The automated testing and deployment playbook #10 provides the key questions and checklists for delivering successful digital services. Every organization should create playbooks to increase the success rate of its digital services.