Piksel overcomes surging cloud complexity with Dynatrace
Growth in new platform creates largest AWS media footprint in Europe
After a decade providing global media and broadcast organizations with bespoke video-on-demand platforms, Piksel developed its own video delivery solution in 2014 – Piksel Palette. This cloud-native, microservices-based platform has since been adopted by major media providers seeking to digitally transform their operations.
But as Steve Gran, senior technical architect at Piksel, explains, the growth of Piksel Palette resulted in an explosion of cloud complexity that quickly became difficult to manage. “In the early days, the platform was quite simple. It had some APIs, microservices and databases, nothing we couldn’t handle. But over time, it evolved to become more complex, with lots of moving parts. As the platform grew, the complexity and breadth of technology grew as well, making it increasingly difficult to manage.”
Gran continues, “Piksel has one of the largest Amazon Web Services footprints in Europe – we’ve got multiple environments for testing, development and production, as well as a multitude of dedicated customer environments. We’re running our microservices architecture in Docker on anywhere between 600 to 700 virtual machines in AWS.”
Lack of visibility impacting innovation and performance
It became clear to Piksel that this level of complexity was limiting visibility into its cloud environment. Piksel had neither insight into nor control over the performance of the mission-critical third-party cloud services underpinning its platform. This meant that when issues did arise, it was difficult to pinpoint the root cause. “We’ve got automation built into the platform to fix problems that we’ve seen before, but you can only automate for what you know,” explains Gran. “If we had a new problem we’ve not seen before, it would take time to find the needle in the haystack to determine the root cause. Typically, we’d jump on the vendor merry-go-round and share logs to try and narrow down parameters, but this was very time consuming.”
This lack of visibility also extended to the development team deploying new code. Without deep insight into the quality and scalability of that code, the team didn’t know how it would affect performance once pushed into production. This lack of confidence slowed innovation by lengthening development cycles.
Customer recommendation triggers adoption of Dynatrace
Piksel was already looking for a vendor to remedy this lack of visibility, but a customer performance event accelerated the timeline. “One of our global customers was migrating to Piksel Palette and began sending us production levels of traffic,” explains Gran. “The platform started experiencing performance problems, with latency increasing. Even with the team working long hours troubleshooting the problem, the lack of visibility meant we only saw the performance degradation, and not the root cause. We contacted our vendors and provided logs to determine the root cause, but it took a week to discover that one of our Java databases was at fault.”
Gran explains that the customer was already working with Dynatrace: “When we were encountering problems, they were using Dynatrace at their end to identify which users were suffering the impact, helping us to narrow down the root cause. They suggested we use Dynatrace ourselves to help overcome any future problems. When one of the world’s largest telecoms providers makes a recommendation, then it’s an easy decision.”
Having already conducted a thorough audit of the APM solutions available on the market, Piksel realized it needed something far more advanced. As a result, Piksel selected Dynatrace to help improve visibility over its cloud-native platform. Dynatrace’s software intelligence capabilities and tight integration with AWS and Docker gave Piksel immediate visibility over its cloud environment, covering all services, applications and dependencies.
Quicker innovation and identification, more automation
Dynatrace also helped Piksel to improve its speed of innovation, as Gran explains: “Our development loops have shortened, as we’re getting instant diagnostics on each line of code as it’s completed. As a result, our developers are more productive and innovate faster.”
The complex cloud environment supporting Piksel Palette has become much simpler to manage with Dynatrace. Its AI capabilities mean that Piksel gets answers about performance problems rather than just data. This quicker diagnosis lets the team get straight into troubleshooting problems with code, cloud services or applications and identify the root cause much faster. “We’ve seen the time it takes to identify the root cause go from a week to minutes, meaning we can identify and solve problems before they impact customers and end-users.”
Gran concludes, “Dynatrace shines a light into our cloud environment, helping us fine tune areas of the platform we didn’t even know there was a problem with and drastically reduce MTTR. It has removed the debugging time that sucked up most of our day and freed up more space for people to innovate and do their jobs. We’ve become better and more efficient, and able to create more value for the business, meeting SLAs by stopping the deployment of bad code and getting better quality code in the pipeline. In the future, we anticipate that Dynatrace will drive further automation in our platform. Out of the box it gave us all the functionality we needed, and as we take on new services we know they will automatically appear in Dynatrace.”