Optimizing AI ROI from DevOps and IT Operations: The rising need for AI/LLM observability

Every organization is adopting GenAI across its infrastructure and application stacks, and IT operations teams should seek a seat at the table because models will be deployed across every layer of the technology estate. Large language models, small language models, and specialized models, together with ongoing cloud migrations, will drive productivity, cost savings, and business returns. Every customer is trying to measure the business return on its AI investments, and transparency into data, system and model performance, drift, security, and quality are areas where IT operations, DevOps, SRE, and platform engineering teams can play a critical role in optimizing returns and reducing risk. So, where should you start the conversation?

Executives can use observability to reduce business risk and increase AI ROI by understanding how observability contributes to the core AI value categories of productivity, customer impact, cost optimization, innovation, and quality. For example, observability improves customer satisfaction by reducing mean time to resolution and mean time to understanding, and it improves cross-team collaboration and data access, which delivers cost efficiencies.

To reduce business risk and increase ROI in GenAI use cases, technology executives should plan to manage rising complexity and, as part of continuous evaluation, consider GenAI performance across the following dimensions:

  • System performance: Monitoring the system performance of a GenAI application means measuring operational characteristics much as you would for a traditional application, across the software and infrastructure layers as well as the model itself. Model system-performance monitoring covers metrics such as response latency, error rates (including failures to respond), and API failures; the instrumentation sketch after this list shows one way to capture them.
  • Quality performance: Organizations must monitor the output quality of GenAI and AI applications. Quality includes the accuracy of responses and model drift, where the data a model was trained on no longer reflects real-world inputs, so results become inaccurate or irrelevant.
  • Governance: Model governance of GenAI often encompasses monitoring for, and enforcing, legal requirements and the organization’s ethics policies. Ongoing monitoring is necessary, including guardrails that prevent the delivery of outputs that violate laws or company policies.
  • Security: Beyond the theft of private information or loss of intellectual property, organizations must protect against security risks specific to GenAI applications. Prompt injection and jailbreaks are two emerging attack classes, and monitoring tools that detect these and other security issues are critical to risk management; a minimal guardrail sketch follows this list.
  • Cost: Monitoring the cost of delivering a GenAI application is a multitiered undertaking. Depending on the application, organizations may incur a cost for each query and response to a model, on top of the cost of the underlying infrastructure required to deliver the application. Collecting the right cost data and analyzing it per application is key to determining ROI; the instrumentation sketch below shows one way to tag each call with its per-query cost.
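
To make the system-performance and cost dimensions concrete, here is a minimal sketch of instrumenting a model call with OpenTelemetry, which observability backends such as Dynatrace can ingest. The `call_model` helper, the attribute names, and the per-token prices are hypothetical placeholders; substitute your provider’s client and rates.

```python
import time

from opentelemetry import trace

tracer = trace.get_tracer("genai.monitoring")

# Hypothetical per-1K-token prices; substitute your provider's actual rates.
PRICE_PER_1K_INPUT_TOKENS = 0.0005
PRICE_PER_1K_OUTPUT_TOKENS = 0.0015


def observed_completion(prompt: str, app_name: str) -> str:
    """Call the model and record latency, errors, token usage, and cost."""
    with tracer.start_as_current_span("llm.completion") as span:
        span.set_attribute("app.name", app_name)  # enables per-application rollups
        start = time.perf_counter()
        try:
            response = call_model(prompt)  # hypothetical provider client
        except Exception as exc:
            span.record_exception(exc)  # captures API failures and timeouts
            span.set_attribute("llm.error", True)
            raise
        span.set_attribute("llm.latency_seconds", time.perf_counter() - start)
        span.set_attribute("llm.input_tokens", response.input_tokens)
        span.set_attribute("llm.output_tokens", response.output_tokens)
        span.set_attribute(
            "llm.cost_usd",
            (response.input_tokens * PRICE_PER_1K_INPUT_TOKENS
             + response.output_tokens * PRICE_PER_1K_OUTPUT_TOKENS) / 1000,
        )
        return response.text
```

Attributes such as `app.name` and `llm.cost_usd` let the backend aggregate spend per application, which is exactly what per-application ROI analysis requires.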

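On the security dimension, a pre-flight guardrail is one common monitoring hook. The sketch below is a deliberately simple, pattern-based screen for well-known injection phrasings; the patterns are illustrative assumptions, not a complete detection method, and production systems typically layer model-based classifiers on top.

```python
import re

# Illustrative patterns for well-known injection phrasings; a real deployment
# would maintain and evaluate this list continuously.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"disregard (the )?(system|previous) prompt",
    r"reveal (your|the) (system prompt|hidden instructions)",
    r"you are now in developer mode",
]


def screen_prompt(prompt: str) -> bool:
    """Return True if the prompt may proceed, False if it should be blocked.

    Every block should also be logged so security teams can trend
    attempted prompt injections over time.
    """
    lowered = prompt.lower()
    return not any(re.search(p, lowered) for p in INJECTION_PATTERNS)
```
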
Organizations should tie the measurement of each performance dimension to the outcomes that drive business value. Each GenAI application should support a targeted outcome, such as improved productivity, increased revenue, new revenue streams, or enhanced customer satisfaction. Connecting the dots between GenAI performance dimensions and business value means defining measurements that matter to the business, then collecting, correlating, and analyzing the data to understand each application’s ability to deliver that value.
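
As a purely illustrative example of connecting those dots, suppose a support-assistant application is meant to deflect help-desk tickets. Every figure below is hypothetical; the point is that an outcome metric plus the per-application cost data from the dimensions above yields an ROI number.

```python
# Hypothetical monthly figures for a support-assistant application.
tickets_deflected = 4_000        # outcome metric from product analytics
cost_per_human_ticket = 6.50     # fully loaded cost of a human-handled ticket, USD
llm_spend = 9_000.00             # summed per-query model costs for the app, USD
infra_spend = 3_500.00           # infrastructure allocated to the app, USD

value_delivered = tickets_deflected * cost_per_human_ticket
total_cost = llm_spend + infra_spend
roi = (value_delivered - total_cost) / total_cost

print(f"Value delivered: ${value_delivered:,.0f}")  # $26,000
print(f"Total cost:      ${total_cost:,.0f}")       # $12,500
print(f"ROI:             {roi:.0%}")                # 108%
```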

For technology executives, AI observability is fast becoming essential for managing the operational complexity of AI initiatives and tying them to business outcomes. It provides the visibility needed to demonstrate ROI, keep AI applications reliable, and make informed decisions based on the data behind every AI use case.

Monitor, optimize, and secure Generative AI applications, LLMs, and agentic workflows — improving performance, explainability, and compliance.

Learn more, or try Dynatrace for free!