Business innovation costs money. But organizations don’t always have insight into how their innovation efforts generate costs as well as revenue. That’s why cloud cost optimization is becoming a major priority regardless of where organizations are on their digital transformation journeys.
From managing cloud costs with the major providers to the rising compute costs of generative AI and large language models (LLMs), not to mention the carbon footprint of these resources, the cost of innovation is affecting bottom lines across the industry.
In fact, Gartner’s 2023 forecast is for worldwide public cloud spending to reach nearly $600 billion. To put that into perspective, analysts at venture capital firm Andreessen Horowitz report that many companies spend more than 80% of their total capital raised on compute resources.
Generative AI and LLMs are compounding these figures. For example, a CNBC report found that training just one LLM can cost millions of dollars, and then millions more to update.
These costs also have an environmental impact. A Cloud Carbon Footprint report found that the technology sector’s global greenhouse gas emissions are on par with, or larger than, those of the aviation industry.
Bernd Greifeneder, Dynatrace founder and CTO, acknowledged in his opening remarks at Dynatrace Perform 2024 that even his team suffers from tool sprawl.
“CNCFs, thousands of tools, they pick up everything,” Greifeneder said. Having a platform like Dynatrace with the Grail data lakehouse and AppEngine helps teams view and manage the cloud costs associated with these tools and services. “We had many do-it-yourself tools that integrated with security and did some other import tasks,” Greifeneder said. “We shifted all that onto the [Dynatrace] platform so we have compliance and privacy issues solved.”
The cost of tool sprawl
The cost of tool sprawl is not just the price of the tools themselves. Keeping data consistent, current, and shared across the teams that use the various tools means engineers waste time building and maintaining integrations while risking the loss of important data and its context.
“Every dollar we spend on cloud [infrastructure] is a dollar less we can spend on innovation and customer experience,” said Matthias Dollentz-Scharer, Dynatrace chief customer officer. Dollentz-Scharer outlined four phases of cloud cost optimization and how Dynatrace helps with each.
- Financial. Dynatrace enhances platform optimizations negotiated with cloud providers by automating savings plans and cost allocations and giving teams insights into how they use cloud resources.
- Utilization. By tracking over- and underutilized machines, Dynatrace helps teams right-size their infrastructure and minimize overprovisioning.
- Architecture. The Dynatrace platform maps workloads and the dependencies among resources. With this topology mapping and dependency tracking, analysts can determine which processes use which resources, so they can troubleshoot and optimize at the process level.
- Smart orchestration. Dynatrace monitors automated workloads, such as those in Kubernetes environments, to ensure they run cost-effectively and to help teams eliminate redundant services for further savings.
How cloud cost optimization mitigates the effects of tool sprawl
For example, the Dynatrace team investigated its own Amazon Elastic Block Store (EBS) usage. To discover why EBS usage was growing relative to the Dynatrace architecture, the team used Dynatrace Query Language (DQL) and instrumentation notebooks to determine which processes were using the resources. “The team tweaked the Dynatrace [AWS] deployment and automated discrete sizings in Kafka, which helped us save a couple of million dollars,” Dollentz-Scharer said.
“When we started deploying Grail in several hyperscaler locations, we were able to use more predictive AI from our Davis AI capabilities to make our orchestration even smarter,” Dollentz-Scharer continued. That enabled the Dynatrace team to reduce its EBS usage by 50%.
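For teams that want to run a similar investigation, the sketch below shows the general shape of such a query in a Dynatrace Notebook. It is illustrative only, not the Dynatrace team’s actual query: the dt.host.disk.used metric and the grouping stand in for whichever storage metrics an environment exposes.

```
// Hedged sketch: rank hosts by average disk usage over the last seven days
// to see where storage growth is concentrated. The metric key and grouping
// are illustrative; substitute the volume metrics available in your
// environment.
timeseries diskUsed = avg(dt.host.disk.used), by: { dt.entity.host }, from: now() - 7d
| fieldsAdd avgUsed = arrayAvg(diskUsed)
| sort avgUsed desc
| limit 10
```

From a result like this, an analyst can drill into the top hosts’ process-level metrics to see which workloads are driving the growth.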
Likewise, when an automation glitch deployed a batch of unneeded Kubernetes workloads into a new Grail region, Dynatrace instrumentation and tooling surfaced the problem, and the team fixed it within 48 hours, containing cloud costs and saving resources.
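In an incident like that, even a simple inventory query helps. The hedged sketch below lists the Kubernetes workloads Dynatrace has discovered so a team can compare them against the deployments it expects in a region; the entity type is standard, but any further filtering would depend on the environment.

```
// Hedged sketch: list discovered Kubernetes workloads for review against
// the expected deployment inventory. Add filters for your own clusters or
// namespaces as needed.
fetch dt.entity.cloud_application
| fields id, entity.name
| sort entity.name asc
| limit 100
```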
Reducing cloud carbon footprint
A driving motivation for cloud cost optimization and resource utilization awareness is climate change. The effects of global climate change are evident everywhere, from wildfires to catastrophic flooding.
With its unique vantage point over the entire multicloud landscape, its workloads, and their interdependencies, Dynatrace introduced the Carbon Impact app in 2023 to help organizations pinpoint and optimize their carbon usage.
“Everyone has a part to play,” said a representative of a major banking group. “Our commitment is to achieve net-zero carbon operations and reduce our direct carbon emissions by at least 75%, and reduce our total energy consumption by 50%, all by 2030.” The organization has already met its commitment to switch to 100% renewable energy.
The company’s IT ecosystem uses thousands of services across traditional data centers and hybrid cloud environments, and it’s continually growing. “It’s critical that we have observability of our energy consumption and footprint to monitor the impact of this growth over time,” he said.
You can’t manage what you can’t measure
The organization uses Dynatrace to optimize carbon consumption at the data center, host, and application levels. In practice, that means modernizing data centers from the ground up with sustainability and efficiency in mind and identifying underutilized infrastructure.
“But it’s very important as we do this to find that sweet spot,” the banking group representative said. “We can’t risk the stability or performance of the services.” Critically, Dynatrace helps the group’s team observe how reducing energy consumption relates to resilience.
Implementing green-coding principles
At the application level, the banking group is beginning to apply green coding, a practice that minimizes the compute energy consumption of software. “We are optimizing and analyzing the source code of our applications to run more energy efficiently in terms of CPU cycles and memory utilization,” the representative explained.
“We’ll be introducing quality gates at the software development and testing stages to ensure that any new code is as efficient as possible.” This includes writing code in the most efficient languages for each use case, a process that will be phased in over time. Dynatrace is a key enabler of this approach, allowing the group to compare baseline CO2 measurements of its application code before and after each change.
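As a hedged illustration of what such a gate could measure (a sketch, not the banking group’s actual implementation), a DQL query can capture a CPU baseline for the process groups behind an application before a release; re-running the same query after the release gives the comparison point.

```
// Hedged sketch: capture a seven-day CPU baseline per process group before
// a release, then re-run after the release and compare the results. The
// metric key dt.process.cpu.usage is assumed; use whichever CPU, memory,
// or carbon metric your environment exposes.
timeseries cpu = avg(dt.process.cpu.usage), by: { dt.entity.process_group }, from: now() - 7d
| fieldsAdd baselineCpu = arrayAvg(cpu)
| sort baselineCpu desc
```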
The organization’s proof of concept, a single application programming interface (API), reduced emissions by around 2 tons of CO2 per year, which promises significant savings when applied across all of the group’s applications.
End-to-end observability makes it possible to measure both performance and carbon consumption and to manage cloud costs. “In performance optimization, you’re hunting down milliseconds, and with carbon optimization, you’re hunting down grams,” Dollentz-Scharer observed.
For more about managing and optimizing Kubernetes workloads, see the blog Kubernetes health at a glance: One experience to rule it all.
For all Perform coverage, check out the Dynatrace Perform 2024 guide.