Guest blog from Dynatrace customer Mark Forrester, Digital Readiness Manager at Mitchells & Butlers.
Working in a digital delivery environment, I hear a lot of people talking about “stability of the platform”, to which I respond: define your interpretation of stability.
Stability: the state of being stable.
Since changing the monitoring solution a few years ago we have enjoyed continued uptime. In fact, we have not had a full outage of the digital platform for over 18 months now. But is this really stability? Is the stability people refer to a mystical belief that everything is running without error? Where do service degradation and API call failures fit into such a blanket term?
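To make that question concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption: the `Transaction` record, the `summarise` helper, the 1,000 ms threshold and the sample numbers are hypothetical and are not taken from our platform or from any monitoring product. It simply shows how a platform can have no outage at all while a share of individual transactions still fail or run slowly.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """One hypothetical digital transaction (API call, page load, DB call)."""
    name: str
    succeeded: bool
    response_ms: float


def summarise(transactions, slow_threshold_ms=1000.0):
    """Return the share of failed and degraded (successful but slow) transactions."""
    total = len(transactions)
    if total == 0:
        return {"total": 0, "failure_rate": 0.0, "degraded_rate": 0.0}
    failed = sum(1 for t in transactions if not t.succeeded)
    # A transaction counts as degraded when it succeeds but exceeds the threshold.
    degraded = sum(
        1 for t in transactions
        if t.succeeded and t.response_ms > slow_threshold_ms
    )
    return {
        "total": total,
        "failure_rate": failed / total,
        "degraded_rate": degraded / total,
    }


# Made-up sample: the platform never went down, yet not every transaction went well.
sample = (
    [Transaction("booking-api", True, 180.0)] * 95
    + [Transaction("booking-api", True, 2400.0)] * 3
    + [Transaction("booking-api", False, 30000.0)] * 2
)
report = summarise(sample)
print(report)
```

In this made-up sample the platform would still count as "up" for the whole period, which is exactly why uptime alone is too blunt a measure of stability.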
My current ethos is “every transaction matters”. I’m talking about background digital transactions, response times of APIs, loading of web pages, database calls and so on. When these come together it is a truly amazing digital experience for our guests, considering what we have delivered as a business in the last few years, and the roadmap is just as exciting.
It’s just a matter of time
Digital services are now so complex that it is inevitable that parts of the platform are going to break. Is infrastructure really infrastructure, or has it just become more code? Agile delivery means continual releases to the platform, and our suppliers are continually updating integration platforms. So I’m going to say it:
Failure is inevitable; it’s how you monitor it, and the speed at which you recover from it, that defines how good your services are.
Monitoring our services has become much easier in the last few years, as we reviewed the monitoring platform and moved to a solution that better fitted our requirements: not just the core monitoring product but additional services that became enablers, such as business reporting. Switching to the new monitoring was a breeze and took less than 10 working days to have everything set up like for like; then we began to enhance and dig deeper within the applications and services. The AI began to consume our data and turned the team into “virtual gardeners”: all the issues that were causing failures were weeded out and applications pruned, one by one, until the landscape turned from a jungle into a well-kept garden.
Recovery has become simpler but the services have become more complex. Illogical, I know, but that is now the reality. Legacy services and applications have been streamlined as the inefficiencies have been identified (using the monitoring) and removed as technical debt, and new services are now monitored at a much deeper level before production delivery. These changes, made at huge effort by the team, mean that alerts are more granular and therefore we can detect failures much earlier. The monitoring helps pinpoint the root cause, the team responds and end users are blissfully unaware. Well, most of the time.
It’s good when it just works
Success can be very tangible. People talk about return on investment, while others use success criteria or achieving objectives; I guess it depends on your company’s approach. I personally think of an old saying my grandmother used when I was a child: “if you look after the pennies, the pounds will look after themselves”. Translate this to my digital world and I have:
Concentrate on small wins and the success will come by itself
With the help of a great team and a very flexible monitoring product, we are able to continually break records, help the business move faster into the digital world and offer more services.
What I would say is that success is not static, it evolves. Delivering this in a digital world is becoming more challenging every day.
Technology and service enhancement wait for nobody. Speed of alerts and notifications to the right people at the right time and automatic healing are all on the horizon. Shifting the monitoring left and detecting issues before they reach production makes logical sense now.
Working in collaboration with 3rd parties and giving their monitoring tools the ability to push alerts into our monitoring platform will provide a “single pane of glass” view of the whole customer journey. Do you really know if an upstream or downstream supplier is having issues affecting your customers?
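As a rough illustration of that idea, here is a minimal sketch in Python of merging supplier alerts with our own into one view. It is only a sketch under stated assumptions: the `Alert` shape, the `normalise_supplier_alert` and `single_pane` helpers, the payload field names and the sample alerts are all hypothetical, and none of it reflects the API of any real monitoring product or supplier tool.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    """Common alert shape used by the combined view (hypothetical)."""
    source: str
    service: str
    severity: str
    message: str


def normalise_supplier_alert(source, payload):
    """Map a supplier's alert payload (a dict) onto the common Alert shape.

    The field names read from `payload` are assumptions for illustration;
    a real supplier tool would define its own schema.
    """
    return Alert(
        source=source,
        service=payload.get("service", "unknown"),
        severity=str(payload.get("severity", "info")).lower(),
        message=payload.get("message", ""),
    )


def single_pane(internal_alerts, supplier_alerts):
    """Merge internal and supplier alerts, most severe first."""
    order = {"critical": 0, "warning": 1, "info": 2}
    combined = list(internal_alerts) + list(supplier_alerts)
    # Unknown severities sort after all known ones.
    return sorted(combined, key=lambda a: order.get(a.severity, len(order)))


# Made-up alerts: one raised internally, one pushed by a supplier's tool.
internal = [
    Alert("internal", "booking-api", "warning", "Response time above baseline"),
]
supplier = [
    normalise_supplier_alert(
        "payments-supplier",
        {"service": "card-auth", "severity": "CRITICAL", "message": "Elevated decline rate"},
    ),
]
view = single_pane(internal, supplier)
for alert in view:
    print(f"[{alert.severity}] {alert.source}/{alert.service}: {alert.message}")
```

The design choice worth noting is the normalisation step: once every alert has the same shape, an upstream or downstream supplier issue can sit in the same list as our own alerts and be ranked alongside them.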
We have the roadmap, we have the desire. Let the journey continue…