Knowledge Base
Topics
- DevOps
A/B testing
A/B testing is a technique that compares two versions of a software application, or of a specific feature within it, to determine which performs better against specific metrics or user feedback. The results provide insight into user preferences and behavior and help optimize the user experience. - AIOps
AI observability
AI observability is the practice of applying monitoring and continuous analysis techniques to gain real-time insights into AI systems' behavior and performance. It's critical to building and maintaining reliable, transparent, and accountable AI systems. - AIOps
AIOps
AIOps (artificial intelligence for IT operations) is an IT practice that combines big data and machine learning to automate IT operations, such as event correlation, anomaly detection, and root-cause analysis. A modern approach to AIOps serves the full software delivery lifecycle. - Observability
Alerting
Alerting is the process of notifying relevant parties when a predefined condition or event occurs. - AIOps
Anomaly detection
Anomaly detection is a technique that uses AI to identify abnormal behavior as compared to an established pattern. Anything that deviates from an established baseline pattern is considered an anomaly. - Digital Experience
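As a rough illustration of "deviates from an established baseline", the sketch below flags values that sit more than a chosen number of standard deviations from a baseline mean. This is a plain statistical z-score check, not an AI model; the function name, the threshold of 3.0, and the sample latencies are assumptions made for the example.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observations, threshold=3.0):
    """Return observations that deviate from the baseline mean by more
    than `threshold` sample standard deviations (a simple z-score test)."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: anything different is a deviation.
        return [x for x in observations if x != mu]
    return [x for x in observations if abs(x - mu) / sigma > threshold]


# Hypothetical response times in milliseconds.
baseline_ms = [100, 102, 98, 101, 99, 100, 103, 97]
print(find_anomalies(baseline_ms, [101, 250, 99]))  # prints [250]
```

Production systems typically learn seasonal or multidimensional baselines rather than a single mean, so treat this only as the simplest instance of the idea.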
Apdex
Apdex is a performance-measurement standard that shows the relationship between recorded performance measurements and real-user satisfaction. - Apps and Microservices
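To make the relationship concrete, here is a minimal sketch of the standard Apdex calculation: samples at or below a target threshold T count as satisfied, samples up to 4T count as tolerating (weighted at one half), and slower samples count as frustrated. The threshold of 0.5 seconds and the sample values are assumptions for the example.

```python
def apdex(response_times, t):
    """Compute an Apdex score in [0, 1] for response times in seconds.

    satisfied:  response time <= t
    tolerating: t < response time <= 4 * t  (counted at half weight)
    frustrated: response time > 4 * t       (counted as zero)
    """
    if not response_times:
        raise ValueError("at least one sample is required")
    satisfied = sum(1 for r in response_times if r <= t)
    tolerating = sum(1 for r in response_times if t < r <= 4 * t)
    return (satisfied + tolerating / 2) / len(response_times)


# Hypothetical samples: 2 satisfied, 2 tolerating, 1 frustrated.
samples = [0.2, 0.4, 0.6, 1.5, 2.5]
print(apdex(samples, t=0.5))  # prints 0.6
```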
API monitoring
API monitoring is the process of collecting and analyzing data about the performance of an API in order to identify problems that impact users. - Apps and Microservices
APM
Application performance monitoring (APM) is the practice of tracking key software application performance metrics using monitoring software and telemetry data. - Apps and Microservices
Application mapping
Application mapping is the process of mapping out elements across your entire IT environment, through a visual graph, and then looking at how these applications are interconnected and dependent on each other. AI can be used to automate these actions. - Apps and Microservices
Application modernization
Application modernization takes existing legacy applications and modernizes their platform infrastructure, internal architecture, or features. Application modernization centers on bringing monolithic, on-premises applications into cloud architecture and release patterns. - Application Security
Application programming interface (API) security
Application programming interface (API) security is the practice of securing APIs to prevent unauthorized access, data leakage, and other security vulnerabilities. - Apps and Microservices
Application topology discovery
Application topology discovery is the ability to discover all components and dependencies of your entire technology stack, end-to-end. Application mapping is the process of mapping out these elements across your entire IT environment, through a visual graph, and then looking at how these applications are interconnected and dependent on each other. - AIOps
Artificial intelligence
Artificial intelligence (AI) refers to a system’s ability to mimic human cognitive function. AI applies advanced analytics and logic-based techniques to interpret data and events, support and automate decisions, and even take intelligent actions. - Infrastructure
Autoscaling
Autoscaling is the process of automatically adjusting the number of resources allocated to a system or component based on traffic and demand. - Infrastructure
AWS Lambda
AWS Lambda is a serverless compute service that can run code in response to predetermined events or conditions and automatically manage all the computing resources required for those processes. - Infrastructure
Azure Functions
Azure Functions is a serverless compute service by Microsoft that can run code in response to predetermined events or conditions (triggers), such as an order arriving on an IoT system, or a specific queue receiving a new message. - DevOps
BizDevOps
BizDevOps is a software development methodology that integrates business principles and practices into the DevOps process. BizDevOps helps organizations achieve better collaboration and alignment between business stakeholders, developers, and operations teams. - DevOps
Blue/green deployment
Blue/green deployment is a deployment strategy where two identical production environments, one "blue" and one "green," are used to reduce downtime and minimize risks during software updates. - Digital Experience
Business analytics
Business analytics is the process of applying statistical analysis to historical data to gain new insight and improve strategic decision making. Business analytics solutions derive insights from disparate data, uncovering hidden patterns and relationships. They help you to see what is happening, to predict what might happen, and to understand why. - DevOps
Canary deployment
A canary deployment is a deployment strategy where new software changes are gradually rolled out to a small subset of users to test for any issues or bugs before a full deployment is done, reducing the risk of major issues affecting all users. - DevOps
Change impact analysis
Change impact analysis is a process for assessing the potential effects of a proposed change to a software system or application. It involves analyzing the relationships between components and dependencies to determine the change's scope and risks and can help mitigate negative impacts and ensure successful outcomes. - DevOps
Chargeback
In cloud FinOps, a chargeback is the process of connecting cloud costs with the teams or users that incurred them. Effective chargeback strategies increase cloud visibility and improve accountability. - Application Security
CIS Benchmarks
CIS Benchmarks are globally recognized standards for securing IT systems and data. Developed by the Center for Internet Security (CIS), these benchmarks aim to reduce the attack surface of IT systems and mitigate cybersecurity risks. - DevOps
Closed loop remediation
Closed loop remediation is a practice that extends automated problem remediation with an automated observation of the executed remediation actions' results. The closed loop uses the observed impact and automates the next remediation actions until the problem is automatically resolved while keeping all stakeholders informed about the progress. - AIOps
Cloud automation
Cloud automation enables Development, DevOps, and SRE teams to build better quality software faster by bringing observability, automation, and intelligence to DevOps processes. - Infrastructure
Cloud computing
Cloud computing is the delivery of computing services, such as servers, storage, databases, and software, over the internet. - Infrastructure
Cloud cost modeling
Cloud cost modeling is a process for estimating and analyzing the expenses associated with using cloud computing services to run applications, store data, and perform various computing tasks. Estimating cloud spend uses machine learning and predictive analytic models. - Infrastructure
Cloud migration
Cloud migration is the process of transferring some or all data, software, and operations to a cloud-based computing environment that offers unlimited scale and high availability. Cloud migration involves moving from on-premises infrastructure to cloud-based services or migrating from one cloud to another. - Infrastructure
Cloud monitoring
Cloud monitoring is a set of solutions and practices used to observe, measure, analyze, and manage the health of cloud-based IT infrastructure. - Application Security
Cloud security
Cloud security is the practice of securing cloud-based applications and infrastructure to protect data and prevent unauthorized access. - Infrastructure
Cloud-native
Cloud-native is an approach to software development and deployment that leverages the scalability and flexibility of cloud computing. - Infrastructure
Cloud-native architecture
Cloud-native architecture is a structural approach to planning and implementing an environment for software development and deployment that uses resources and processes common with public clouds like Amazon Web Services, Microsoft Azure, and Google Cloud Platform. - Apps and Microservices
Code profiling
Code profiling helps identify the root causes of potential system issues by analyzing a desktop or mobile application's performance while it's running. - Application Security
Compliance
Compliance is the process of adhering to regulatory requirements and industry standards related to security, privacy, and data protection. - AIOps
Composite AI
Composite AI integrates multiple AI models and technologies to create a more comprehensive and advanced AI system. It combines multiple AI models to enable more advanced reasoning and bring precision, context, and meaning to the outputs produced by generative AI. - DevOps
Configuration management
Configuration management is the process of managing and tracking changes to system configurations over time. - Infrastructure
Container as a Service
Container as a Service (CaaS) is a cloud-based service that allows companies to manage and deploy containers at scale. Container environments enable enterprises to quickly deploy and develop cloud-native applications that can run anywhere. - Infrastructure
Container monitoring
Container monitoring is the process of collecting metrics, traces, logs, and other observability data to improve the health and performance of containerized applications. - Infrastructure
Container orchestration
Container orchestration is a process that automates the deployment and management of containerized applications and services at scale. This orchestration includes provisioning, scheduling, networking, ensuring availability, and monitoring container lifecycles. - Application Security
Container security
Container security is the practice of applying security tools, processes, and policies to protect container-based workloads. Container security has two main functions: securing the container image and securing the container runtime configuration. - DevOps
Continuous delivery
Continuous delivery (CD) is a series of processes for delivering software in which DevOps teams use automation to deliver complete portions of software in short, controlled cycles to different environments as part of a software delivery pipeline. - DevOps
Continuous integration
Continuous integration (CI) is a practice that involves merging code changes into a central repository frequently to detect and resolve conflicts and ensure software quality. - Digital Experience
Core web vitals
Core Web Vitals are three key metrics of web page performance that measure a page’s loading performance, interactivity, and visual stability. They are part of Web Vitals, a quality standards initiative by Google that helps web developers deliver great user experiences. - Digital Experience
Customer experience analytics
Customer experience analytics is the systematic collection, integration, and analysis of data related to customer interactions and behavior with an organization and its products. - Infrastructure
Data lakehouse
A data lakehouse features the flexibility and cost-efficiency of a data lake with the contextual and high-speed querying capabilities of a data warehouse. - Observability
Data minimization
Data minimization is the practice of collecting and using the least amount of data possible to fulfill a specific purpose. Minimization improves data privacy and is a key component of compliance with government data privacy regulations. - Observability
Data observability
Data observability is a discipline that aims to address the needs of organizations to ensure data availability, reliability, and quality throughout the data lifecycle—from ingestion to analytics and automation. - Infrastructure
Database monitoring
Database monitoring tracks the database performance and resources to create and maintain a high performing and available application infrastructure. To carry out monitoring, the database system collects information from the database manager, its database, and any connected applications. - DevOps
Delivery pipelines
Delivery pipelines are automated workflows that enable organizations to build, test, and deploy software changes quickly and reliably. They typically consist of multiple stages, from code commits to production deployments, and are designed to increase efficiency, reduce errors, and promote collaboration across development, testing, and operations teams. - Observability
Dependency mapping
Dependency mapping is a process that identifies relationships among system applications, processes, services, hosts, and data centers. IT professionals use dependency mapping to understand application and system availability, performance metrics, service flows, and to analyze hotspots. - DevOps
Deployment
Deployment is the process of moving software code from one environment to another, such as from a development environment to a test environment or from a test environment to a production environment. It involves configuring the necessary infrastructure, such as servers and databases, and installing the software code to make it available for use. - DevOps
DevOps
DevOps is a collection of flexible practices and processes organizations use to create and deliver applications and services by aligning and coordinating software development with IT operations. - DevOps
DevOps automation
DevOps automation is a set of tools and technologies that perform routine, repeatable tasks that engineers would otherwise do manually. Automating tasks throughout the SDLC helps software development and operations teams collaborate and improve. - DevOps
DevOps metrics
DevOps metrics and DevOps KPIs are essential for ensuring your DevOps processes, pipelines, and tooling meet their intended goal. Like any IT or business project, you’ll need to track critical key metrics. - DevOps
DevOps orchestration
DevOps orchestration tames the complexity of DevOps toolchains by automatically managing workflows and their dependencies. - Application Security
DevSecOps
DevSecOps is a tactical trifecta that connects three disciplines: development, security, and operations. The goal is to seamlessly integrate security into your continuous integration and continuous delivery (CI/CD) pipeline in both pre-production (dev) and production (ops) environments. - Digital Experience
Digital experience
A digital experience (DX) is a user’s interaction with a digital touchpoint — whether it’s purchasing an item online, receiving updates from a mobile app, or power-using a business platform. A digital touchpoint may be a mobile application, a website, a smart TV, ATM, and so on. - Digital Experience
Digital experience monitoring
Digital experience monitoring (DEM) is the practice of using tools and technologies to evaluate metrics from multiple sources that affect end users—such as applications, distributed cloud networks, user behavior, Internet of Things (IoT) devices, location-based data, and more—to determine the quality of a user's interaction with a digital touchpoint. - Observability
Digital immunity
Digital immunity is an approach to software development that results in secure and resilient software applications and promotes a positive user experience. The methodology combines software design, development, automation, operations, and analytics. - Digital Experience
Digital transformation
Digital transformation is the integration of digital technology into all areas of a business. This process reinvents existing processes, operations, customer services, and organizational culture. Digital transformation requires modernization and change management so employees can embrace digitization. - Application Security
DISA STIG
The Defense Information Systems Agency (DISA) Security Technical Implementation Guide (STIG) is a framework of security protocols designed to safeguard the U.S. Department of Defense (DoD) systems and networks from cybersecurity threats. - Application Security
Disaster recovery
Disaster recovery is the process of restoring a system or application after a catastrophic event, such as a natural disaster or cyberattack. - Apps and Microservices
Distributed tracing
Distributed tracing is a method of observing requests as they propagate through distributed cloud environments. Distributed tracing follows an interaction by tagging it with a unique identifier. - Application Security
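The idea of tagging an interaction with a unique identifier can be sketched as follows: the first service creates an ID, and every downstream service reuses the ID it receives. The header name `X-Trace-Id`, the service names, and the in-memory `log` list are hypothetical and chosen only for illustration; real tracing systems use their own propagation formats.

```python
import uuid

TRACE_HEADER = "X-Trace-Id"  # hypothetical header name for this sketch


def handle_request(headers, service_name, log):
    """Reuse the incoming trace ID, or start a new trace if none exists,
    record which service handled the request, and return the headers to
    pass on to the next service."""
    trace_id = headers.get(TRACE_HEADER) or uuid.uuid4().hex
    log.append((trace_id, service_name))
    return {**headers, TRACE_HEADER: trace_id}


# Simulate one request passing through three services.
log = []
outgoing = handle_request({}, "frontend", log)
outgoing = handle_request(outgoing, "checkout", log)
outgoing = handle_request(outgoing, "payments", log)

# Every log entry carries the same trace ID, so the hops can be stitched
# back together into a single end-to-end trace.
print(len({trace_id for trace_id, _ in log}))  # prints 1
```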
DORA compliance
The Digital Operational Resilience Act (DORA) is a regulatory framework designed to enhance the digital resilience of financial entities within the European Union. - DevOps
DORA’s Four Keys
Google’s DevOps Research and Assessment (DORA) team established four main DevOps metrics known as “The Four Keys.” These metrics are Deployment Frequency, Lead Time for Changes, Change Failure Rate, and Time to Restore Service. - DevOps
Drift detection
Drift detection is the process of analyzing changes in the quality of software over time and alerting on them. It involves continuously monitoring the software to ensure that it meets its intended quality standards and detecting any deviations, or drifts, from those standards. - DevOps
Error budget
An error budget is the number of acceptable errors that a service can experience before it violates its SLO (service-level objective). An error budget is typically used for proactive alerting based on the error budget burndown rate. - Infrastructure
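A small worked example of the arithmetic, assuming a request-based SLO: with a 99.9% success objective over 1,000,000 requests, the budget is 1,000 failed requests. The function names and the figures are assumptions for the sketch; the SLO is passed as a string so the calculation avoids floating-point rounding.

```python
from fractions import Fraction


def error_budget(total_requests, slo_percent):
    """Number of failed requests the SLO tolerates in the window.

    slo_percent is given as a string such as "99.9" so that it can be
    converted to an exact fraction.
    """
    slo = Fraction(slo_percent) / 100
    return int(total_requests * (1 - slo))


def budget_remaining(total_requests, failed_requests, slo_percent):
    """Errors still allowed before the SLO is violated (negative if exceeded)."""
    return error_budget(total_requests, slo_percent) - failed_requests


print(error_budget(1_000_000, "99.9"))           # prints 1000
print(budget_remaining(1_000_000, 250, "99.9"))  # prints 750
```

Alerting on how quickly the remaining budget shrinks, rather than on individual errors, is what makes the budget useful for proactive alerts.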
Event logs
Event logs are individual records of system activities and notable occurrences. Often referred to simply as events, these records are generated by various IT solutions and often cover a range of system components. - DevOps
Everything as Code
Everything as Code is a software development approach that involves managing and automating all aspects of the software development lifecycle using code and version control systems. This includes infrastructure, configuration, deployment processes, observability, remediation, and more, allowing for greater consistency, repeatability, and scalability of software development and operations. - Application Security
Exposure management
Exposure management identifies, assesses, and addresses potential security risks that could lead to digital compromise. Effective management reduces an organization's digital attack surface, making it more difficult for attackers to gain network access. - DevOps
Feature flagging
Feature flagging is a technique that allows developers to turn certain features or functionality on and off in production environments without deploying new code, giving them more control and flexibility over the software release process, allowing for staged rollouts, and testing of features in production. - DevOps
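One common way to implement a staged rollout is to hash each user into a stable bucket and enable the feature for buckets below the rollout percentage. The sketch below assumes that approach; the flag name, user IDs, and function name are hypothetical, and a real system would read the percentage from a flag service rather than a function argument.

```python
import hashlib


def is_enabled(flag_name, user_id, rollout_percent):
    """Return True if the flag is on for this user.

    Each (flag, user) pair hashes to a stable bucket from 0 to 99, so a
    user keeps the same answer across requests, and raising
    rollout_percent only ever adds users.
    """
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent


# Roll the hypothetical "new-checkout" flag out to roughly 10% of users.
users = [f"user-{n}" for n in range(1000)]
enabled = [u for u in users if is_enabled("new-checkout", u, 10)]
print(len(enabled))  # roughly 100; the exact count depends on the hash
```

Changing the percentage changes behavior in production without deploying new code, which is the property the definition describes.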
FinOps
FinOps is short for "financial operations," and is an emerging practice that focuses on optimizing cloud spending by bringing financial accountability and transparency to cloud usage. This involves collaboration between finance, operations, and engineering teams to manage cloud costs and drive business value. - Observability
Full-stack observability
Full-stack observability is the ability to determine the state of every endpoint in a distributed IT environment based on its telemetry data. Endpoints include on-premises servers, Kubernetes infrastructure, cloud-hosted infrastructure and services, and open-source technologies. - Infrastructure
Function as a Service
Function as a Service (FaaS) is a cloud computing model that runs code in small modular pieces, or microservices. FaaS enables developers to create and run a single function in the cloud using a serverless compute model. - AIOps
Generative AI
Generative AI is a class of AI models and algorithms that produce new content—such as images, text, audio, and other synthetic data—based on patterns and examples from existing data. These models use machine learning to understand the ingested data and create similar material. - DevOps
Git
Git is a distributed version control system used in software development to track changes in source code and collaborate with other developers on a project. - DevOps
GitOps
GitOps is a specialized, prescriptive discipline of DevOps used to achieve similar goals of speed and efficiency. Building on the success of DevOps practices, GitOps is a relatively new way to manage infrastructure through code and automation, around a single Git repository. - DevOps
Golden signals
Golden signals are a set of four key metrics that provide a holistic view of a system's performance, reliability, and capacity: latency, traffic, errors, and saturation. By tracking these signals, organizations can proactively address potential problems, optimize system performance, and deliver a better user experience. - Infrastructure
Google Cloud Functions
Google Cloud Functions is a serverless compute service for creating and launching microservices. The service pairs ideally with single-use functions that tie into other services and is intended to simplify application development and accelerate innovation. GCF is part of the Google Cloud Platform. - Infrastructure
High availability
High availability (HA) is a characteristic of a system or component that ensures it is always operational and accessible. - Infrastructure
Hybrid cloud
Hybrid cloud architecture is a computing environment that shares data and applications on a combination of public clouds and on-premises private clouds. - Infrastructure
Hyperconverged infrastructure
Hyperconverged infrastructure (HCI) is an IT architecture that combines servers, storage, and networking functions into a unified, software-centric platform to streamline resource management. HCI typically includes an on-premises component. - Infrastructure
Hyperscale computing
Hyperscale refers to an architecture's ability to scale appropriately as demand on the system increases. Hyperscalers are cloud providers that offer services and seamless delivery to build robust and scalable application environments. Some examples are AWS, Microsoft, and Google. - Application Security
Identity and access management (IAM)
Identity and access management (IAM) is the process of managing and controlling user access to a software system or application to ensure security and prevent unauthorized access. - DevOps
Incident management
Incident management is the process of detecting, responding to, and resolving incidents to minimize the impact on users and systems. - Infrastructure
Infrastructure as a Service (IaaS)
Infrastructure as a Service (IaaS) is used to manage low-level resources like VMs and disks. The end user is responsible for what is running within the VM, starting with the OS. IaaS is most closely related to a regular automated virtualized system. - Infrastructure
Infrastructure as Code
Infrastructure as Code (IaC) is a practice that automates IT infrastructure provisioning and management by codifying it as software. IaC uses descriptive code that, in many ways, mimics the DevOps approach to source code. - Infrastructure
Infrastructure monitoring
Infrastructure monitoring is the process of collecting and analyzing data from IT infrastructure, systems, and processes, and using that data to improve business outcomes and drive value across the whole organization. - DevOps
Integrated development environment (IDE)
An integrated development environment (IDE) provides a unified environment for writing source code, building executables, and debugging in a wide range of programming languages and platforms. IDEs help streamline the software development process and increase developer productivity. - Application Security
Interactive application security testing (IAST)
Interactive application security testing (IAST) combines static application security testing (SAST) and dynamic application security testing (DAST) and improves on them by instrumenting applications to support deeper vulnerability analysis beyond exposed surfaces. - DevOps
Internal developer platforms
An internal developer platform (IDP) is an integrated set of tools, services, and infrastructure that DevOps teams use to streamline, automate, and enhance the software development process. IDPs make recurring tasks, such as spinning up application environments and resources, easier. - AIOps
IT automation
IT automation is the practice of using coded instructions to carry out IT tasks without human intervention. IT admins can automate virtually any time-consuming task that requires regular application. The range of use cases for automating IT is as broad as IT itself. - AIOps
ITOps
ITOps is an IT discipline involving actions and decisions made by the operations team responsible for an organization’s IT infrastructure. ITOps refers to the process of acquiring, designing, deploying, configuring, and maintaining equipment and services. - DevOps
Keptn
Keptn is an open source enterprise-grade control plane for cloud-native continuous delivery and automated operations. - Infrastructure
Kubernetes
Kubernetes (aka K8s) is an open source platform used to run and manage containerized applications and services on clusters of physical or virtual machines across on-premises, public, private, and hybrid clouds. It automates complex tasks during the container’s lifecycle. - Infrastructure
Kubernetes architecture
Kubernetes architecture is a collection of core components in the Kubernetes container management system that run and manage containerized applications and services. Kubernetes architecture manages containers and workloads, distributed storage, and control planes that manage global functions. - Observability
LLM monitoring
LLM monitoring refers to the processes and tools used to oversee and manage the performance of large language models (LLMs) during their deployment and operation. - Observability
LLM observability
LLM observability provides visibility into all aspects of large language models, including applications, prompts, data sources, and outputs, which is critical to ensure accuracy and reliability. - Infrastructure
Log aggregation
Log aggregation is a software function that collects, stores, and analyzes log data produced by applications and infrastructure in a central repository. By consolidating logs into a unified data store, log aggregation can make it easier to detect bottlenecks, measure resource utilization, and predict trends over time. - Infrastructure
Log analysis
Log analysis is the process of examining computer-generated records known as logs. - Infrastructure
Log analytics
Log analytics is the process of viewing, interpreting, and querying log data so developers and IT teams can quickly detect and resolve application and system issues. - Infrastructure
Log management
Log management is an organization’s rules and policies for managing and enabling the creation, transmission, analysis, storage, and other tasks related to IT systems’ and applications’ log data. - Infrastructure
Log monitoring
Log monitoring is a process by which developers and administrators continuously observe logs as they’re recorded. With log monitoring software, teams can collect information and trigger alerts if something affects system performance and health. - Infrastructure
Log parsing
Log parsing is a process that converts structured or unstructured log file data into a common format so a computer can analyze it. - Infrastructure
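As a minimal sketch, the snippet below turns an unstructured log line into a dictionary with named fields using a regular expression. The line layout (timestamp, level, service, message) and the field names are assumptions for the example; real log formats vary and usually need one pattern per source.

```python
import re

# Hypothetical line layout: "<timestamp> <LEVEL> <service>: <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<service>[\w-]+): (?P<message>.*)$"
)


def parse_line(line):
    """Return the line's fields as a dict, or None if it does not match."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


parsed = parse_line(
    "2024-05-01T12:00:00Z ERROR checkout-service: payment timeout after 30s"
)
print(parsed["level"], parsed["service"])  # prints ERROR checkout-service
```

Returning None for lines that do not match lets a pipeline count or quarantine unparsed lines instead of failing on them.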
Log preparation
Log preparation is the process of ensuring log data is accurate, reliable, and properly formatted for use in log management and monitoring tools. This includes validating data, eliminating redundant data, and more to help IT teams detect, identify, and remediate issues across IT environments. - Application Security
Log4Shell
Log4Shell is a software vulnerability in Apache Log4j 2, a popular Java library for logging error messages in applications. The vulnerability, published as CVE-2021-44228, enables a remote attacker to take control of a device on the internet if the device is running certain versions of Log4j 2. - Apps and Microservices
Message queue
A message queue is a form of middleware used in software development to enable communications between services, programs, and dissimilar components, such as operating systems and communication protocols. A message queue enables the smooth flow of information to make complex systems work. - Apps and Microservices
Microservices
Microservices are small, flexible, modular units of software that fit together with other services to deliver complete applications. This method of structuring, developing, and operating software as a collection of smaller independent services is known as a microservices architecture. - Infrastructure
Microsoft Azure
Microsoft Azure is a cloud platform with an ever-expanding set of cloud services that help organizations build and run applications. With a wide range of tools and functions, Azure lets you build, test, deploy, and manage applications and services hosted in the cloud. - Digital Experience
Mobile app monitoring
Mobile app monitoring is the process of collecting and analyzing data about application performance. Mobile analytics and monitoring provide context around your mobile application performance—the better the performance, the better for your bottom line. - DevOps
Monaco
Monaco is a configuration as code tool that allows you to create, update and version your observability and security configurations in Dynatrace efficiently and at scale. Initially the focus was on "monitoring as code" hence the name Monaco. - Observability
Monitoring
Monitoring is the continuous observation and measurement of a system or service to identify issues or anomalies. - DevOps
MTTR
MTTR can stand for "mean time to respond", "mean time to repair", "mean time to resolve", or "mean time to recovery". Each is a distinct metric with its own place in the incident management framework. - Apps and Microservices
Network performance monitoring
Network performance monitoring is the practice of tracking and analyzing the performance of network-related resources and operations to maintain secure and stable connections. It uses elements of IT security, resource utilization, alerting, and analysis to maintain healthy network communications. - Observability
Observability
Observability is the ability to measure a system’s current state based on the data it generates, such as logs, metrics, and traces. Observability relies on telemetry derived from instrumentation that comes from the endpoints and services in your multicloud computing environments. - Observability
Observability engineering
Observability engineering is an advanced and systematic approach to understanding complex systems' internal states by examining their outputs. - Observability
Open observability
Open observability uses open source tools to expand observability capabilities to more environments. It delivers actionable insights via open source software while promoting faster and more reliable responses to issues and empowering teams to deliver high-quality software and services. - Apps and Microservices
OpenCensus
Google made the OpenCensus project open source in 2018 with the goal to give developers a vendor-agnostic library for collecting traces and metrics. The OpenTracing and OpenCensus projects converged into one project called OpenTelemetry. - Infrastructure
OpenShift
Red Hat OpenShift is a cloud-based Kubernetes platform that helps developers build applications. It offers automated installation, upgrades, and life cycle management throughout the container stack on any cloud. - Apps and Microservices
OpenTelemetry
OpenTelemetry (also referred to as OTel) is an open source observability framework made up of a collection of tools, APIs, and SDKs. OTel enables IT teams to instrument, generate, collect, and export telemetry data for analysis and to understand software performance and behavior. - Apps and Microservices
OpenTracing
OpenTracing is an open source CNCF (Cloud Native Computing Foundation) project that provides vendor-neutral APIs and instrumentation for distributed tracing. OpenTracing and OpenCensus merged to form OpenTelemetry in 2019. - Infrastructure
Orchestration
Orchestration refers to coordinating the execution of multiple steps in a more complex workflow or pipeline. Orchestration leverages DevOps tools that allow for rapid updates and releases, version control, and other best practices for software engineering. - Infrastructure
OTLP
OTLP, or OpenTelemetry protocol, is a set of rules, conventions, and standards that specify how components exchange telemetry data. Using metrics, logs, and traces, OTLP serves as a vendor-neutral open standard for collecting and transmitting telemetry data from distributed systems. - Application Security
Penetration testing
Penetration testing is the process of simulating a cyberattack on a software system or application to identify vulnerabilities and assess the effectiveness of security measures. - Apps and Microservices
Performance monitoring
Performance monitoring is the process of collecting metrics, logs, and traces to understand the state of applications or infrastructure. Performance monitoring aims to ensure services are highly available and reliable. - Infrastructure
Platform as a Service (PaaS)
Platform as a Service (PaaS) speeds up development and deployment by abstracting the OS away from the user while exposing well-defined APIs for many essential services (such as web, databases, mail, queues, and storage) that developers build on. - DevOps
Platform engineering
Platform engineering is a practice that outlines how development teams build internal platforms to create self-service capabilities for software engineering teams and enable a cloud-native approach. - DevOps
Problem remediation
Problem remediation is the approach of identifying, addressing, and resolving issues or incidents that arise within a given context. It involves analysis to determine the root cause of the problem, followed by the implementation of appropriate solutions to mitigate or eliminate its impact. The goal of problem remediation is to restore stability, efficiency, and functionality to the affected system, process, or situation. - DevOps
Progressive delivery
Progressive delivery is a delivery technique that reduces the risk failed deployments pose to SLOs by decoupling the deployment of changes from the release of new features. Blue/green deployments, canary releases, and feature flagging are the most common implementations of progressive delivery. - Infrastructure
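As a rough illustration of the canary idea mentioned above, the sketch below routes a fixed percentage of users to a new version while everyone else stays on the stable one. The function names (`in_canary`, `choose_version`) and the hash-bucket scheme are assumptions made for this example, not part of any specific product.

```python
import hashlib


def in_canary(user_id: str, rollout_percent: int) -> bool:
    """Return True if this user falls into the canary cohort.

    Hypothetical helper: hashes the user ID into one of 100 buckets so the
    same user always gets the same answer for a given rollout percentage.
    """
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < rollout_percent


def choose_version(user_id: str, rollout_percent: int) -> str:
    """Pick which deployed version serves this user."""
    return "canary" if in_canary(user_id, rollout_percent) else "stable"
```

Because the bucket depends only on the user ID, raising the percentage only ever adds users to the canary cohort; nobody who already sees the new version is moved back to the stable one.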
Prometheus
Prometheus is an open-source monitoring and alerting toolkit that has been widely adopted by companies and organizations. Its popularity has grown due to the large number of exporters built by the community. - DevOps
Quality gates
Quality gates are automated checkpoints within the software development and delivery lifecycle that ensure code quality and compliance with established standards and best practices. These gates help to catch functional, performance, and resiliency defects and issues earlier in the development process, reducing the risk of downstream problems and improving the overall quality of the software. - Digital Experience
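A quality gate can be pictured as a set of threshold checks that a build must pass before it moves on. The sketch below is a minimal illustration under that assumption; the `Check` type, the `evaluate_gate` function, and the metric names in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Check:
    """One hypothetical gate criterion: a metric name and its threshold."""
    metric: str
    threshold: float
    higher_is_better: bool


def evaluate_gate(measurements: Dict[str, float],
                  checks: List[Check]) -> Tuple[bool, List[str]]:
    """Return (passed, failures) for a set of measurements."""
    failures = []
    for check in checks:
        value = measurements.get(check.metric)
        if value is None:
            # A missing measurement is treated as a failure, not a pass.
            failures.append(f"{check.metric}: no measurement")
            continue
        if check.higher_is_better:
            ok = value >= check.threshold
        else:
            ok = value <= check.threshold
        if not ok:
            failures.append(
                f"{check.metric}: {value} violates threshold {check.threshold}"
            )
    return (not failures, failures)
```

For example, a pipeline step might call `evaluate_gate({"test_pass_rate": 98.0}, [Check("test_pass_rate", 95.0, True)])` and stop the build whenever the first element of the result is `False`.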
Real user monitoring
Real user monitoring (RUM) is a performance monitoring process that collects detailed data about a user’s interaction with an application. Real user monitoring collects data on a variety of metrics. - DevOps
Release
A release is the process of making software code available to end users or customers. It involves determining when the software is ready to be used and delivering it to users. A release may involve deploying the software to production environments, but deployment does not necessarily imply a release. - DevOps
Release validation
Release validation is the process of automatically validating key health objectives as part of the release process. As deployment and release frequency increase, an automated approach is needed to confirm that each new release succeeds. This mitigates the risk of introducing bugs and issues into production environments, ensuring a seamless and reliable user experience. - DevOps
Resiliency engineering
Resiliency engineering is the practice of designing systems that can withstand and recover from failures. Resiliency engineers often use chaos engineering as a method to test the resiliency of a system by artificially injecting failure into all layers of the deployed software stack. - Application Security
Risk assessment
Risk assessment is the process of identifying, evaluating, and prioritizing potential security risks to a software system or application. - DevOps
Root-cause analysis
Root-cause analysis is the process of investigating the source of a problem so teams can identify a solution and take remedial action. This analytical method is an essential part of incident management for achieving continuous improvement and system stability for DevOps and CloudOps teams. - Infrastructure
Scalability
Scalability is the ability of a system or component to handle an increasing amount of work or traffic without impacting performance or availability. - Application Security
SecDevOps
SecDevOps is a collaboration framework that expands the impact of DevOps by adding security practices to the software development and delivery process. It resolves the tension between DevOps teams that want to release software quickly and security teams that prioritize security over all else. - Application Security
Secure coding
Secure coding is the practice of writing code with security considerations in mind to prevent vulnerabilities and ensure the overall security of a software system or application. - Application Security
Security
Security is the practice of protecting computer systems, networks, and data from unauthorized access, theft, and damage. - Application Security
Security analytics
Security analytics is a process that uses a combination of data collection, data aggregation, and AI to proactively detect, identify, and defend against security threats. - Application Security
Security by design
Security by design is a development approach that prioritizes integrating security processes across the entire software development life cycle, from initial design to testing to deployment and upgrading. The goal of this framework is to identify and address security risks as soon as possible. - Application Security
Security operations center (SOC)
A security operations center (SOC) is a team or department responsible for monitoring and responding to security incidents and threats to a software system or application. - Infrastructure
Serverless monitoring
Serverless computing is a cloud-based, on-demand execution model where customers consume resources solely based on their application usage. Serverless computing is a newer approach that simplifies manageability and reduces costs. - Observability
Service discovery
Service discovery is the process of automatically identifying the location and availability of services in a distributed system. - Infrastructure
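One common way to picture service discovery is a registry in which instances announce themselves with heartbeats and clients look up whichever instances are still alive. The sketch below is a minimal in-memory illustration of that idea; the `ServiceRegistry` class, its method names, and the TTL-based expiry are assumptions made for this example rather than the API of any real tool.

```python
import time


class ServiceRegistry:
    """Hypothetical in-memory registry keyed by (service name, address)."""

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._last_seen = {}  # (service, address) -> time of last heartbeat

    def heartbeat(self, service: str, address: str) -> None:
        """Register an instance, or refresh it if already known."""
        self._last_seen[(service, address)] = self._clock()

    def lookup(self, service: str) -> list:
        """Return addresses whose last heartbeat is within the TTL."""
        now = self._clock()
        return sorted(
            address
            for (name, address), seen in self._last_seen.items()
            if name == service and now - seen <= self._ttl
        )
```

An instance that stops sending heartbeats simply drops out of `lookup` results once its TTL has elapsed, which covers both the location and the availability parts of the definition.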
Service mesh
A service mesh is a dedicated infrastructure layer built into an application that controls service-to-service communication in a microservices architecture. It controls the delivery of service requests to other services, performs load balancing, encrypts data, and discovers other services. - Digital Experience
Session replay
Session replay is an IT technology that creates anonymized video-like recordings of actions taken by users interacting with your website or mobile application. Analysts can then watch the user’s mouse movements. - DevOps
Shift left
Shift left is the practice of moving testing, quality, and performance evaluation earlier in the development process, often before any code is written. Shift left testing helps teams anticipate changes that arise during the development process that can affect performance or other delivery processes. - DevOps
Shift right
Shift right is the practice of performing testing, quality, and performance evaluation in production under real-world conditions. Shift right methods ensure that applications running in production can withstand real user load while ensuring the same high levels of quality. - Application Security
SIEM
Security information and event management (SIEM) is a holistic security management system used to detect, monitor, analyze, and respond to IT infrastructure events. SIEM systems identify abnormal behaviors and other threats for advanced monitoring and detection, forensics, and rapid remediation. - DevOps
Site reliability engineering
Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems. Those who perform the tasks involved are known as site reliability engineers. - DevOps
SLA
SLAs, or service-level agreements, are contracts signed between a vendor and a customer that guarantee a certain measurable level of service. - DevOps
SLI
SLIs (service-level indicators) provide the actual metrics and measurements that indicate whether you are meeting your service-level objective. Most SLIs are measured in percentages to express the service level delivered. - DevOps
SLO
SLOs (service-level objectives) are agreed-upon targets within an SLA that must be achieved for each activity, function, and process to provide the best opportunity for customer success. In layman’s terms, service-level objectives represent the performance or health of a service. - Application Security
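To make the relationship between an SLI and an SLO concrete, the sketch below computes an availability SLI as a percentage and compares it with an SLO target. The function names and the "good events over total events" formula are assumptions chosen for illustration; real services may define their indicators differently.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """Return the share of good events as a percentage (the SLI)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= good_events <= total_events:
        raise ValueError("good_events must be between 0 and total_events")
    return 100.0 * good_events / total_events


def meets_slo(sli_percent: float, slo_target_percent: float) -> bool:
    """Return True when the measured SLI reaches the SLO target."""
    return sli_percent >= slo_target_percent
```

With 999 successful requests out of 1,000, `availability_sli(999, 1000)` gives an SLI of 99.9 percent, which satisfies a 99.5 percent SLO target but would miss a 99.95 percent one.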
Software composition analysis
Software composition analysis is an application security methodology that tracks and analyzes open source software components. Fundamentally, SCA tools provide insight into open source license limitations and possible vulnerabilities in your projects. - DevOps
Software quality
Software quality refers to the degree to which software meets its specified requirements and is fit for its intended purpose. It encompasses various aspects such as functionality, reliability, usability, performance, security, and maintainability, which are essential for delivering software that meets user expectations and provides value to the business. - Application Security
Software supply chain security
Software supply chain security helps protect companies against compromise from malicious actors. It aims to reduce the risk of a supply chain attack by monitoring and managing the software development process across all stages. - Application Security
Spring4Shell
Spring4Shell is a critical vulnerability in the Spring Framework, an open source platform for Java-based application development. Because 60% of developers use Spring for their main Java applications, many applications are potentially affected. - Digital Experience
Synthetic monitoring
Synthetic monitoring is an application performance monitoring practice that emulates the paths users might take when engaging with an application. It uses scripts to generate simulated user behavior for different scenarios, geographic locations, device types, and other variables. - Observability
Telegraf
Telegraf is an open-source agent written by InfluxData. It’s a plugin-based system for collecting, processing, aggregating, and writing metrics. - Observability
Telemetry data
Telemetry data refers to the automatic recording and transmission of data from all hosts to an IT system for monitoring and analysis. - AIOps
Telemetry pipeline
A telemetry pipeline involves the collection and routing of telemetry data from applications, servers, databases, and more. Additionally, it supports the enrichment and transformation of this telemetry data from a source to its destination. - DevOps
Test automation
Test automation involves the use of special software (separate from the software being tested) to control the execution of tests and the comparison of actual outcomes with predicted outcomes. - Application Security
Threat
A threat is any potential danger or risk to the security of a software system or application, including cyberattacks, malware, and human error. - Application Security
Threat intelligence
Threat intelligence is the information and analysis about potential security threats and vulnerabilities, including current and emerging threats and trends. - Application Security
Threat modeling
Threat modeling is the process of identifying and analyzing potential security threats and vulnerabilities in a software system or application and developing strategies to mitigate them. - Observability
Tool sprawl
Tool sprawl is the accumulation of multiple IT management and monitoring tools for the same or similar purposes. Sprawl can lead to redundant processes and added expenses, as tools are often purchased and then abandoned due to their redundancy. - DevOps
Toolchain orchestration
Toolchain orchestration is the process of automating and integrating different tools and technologies used in the software development lifecycle, enabling seamless collaboration between teams and reducing the time and effort required to deliver high-quality software. It involves managing the flow of data and information between different tools and ensuring that they work together efficiently and effectively. - Infrastructure
Virtualization
Virtualization is the process of creating a virtual version of a physical resource, such as a server or operating system. - Application Security
Vulnerability
A vulnerability is a weakness or flaw in a software system or application that can be exploited by attackers to compromise the security of the system. - Application Security
Vulnerability assessment
Vulnerability assessment is the process of identifying, quantifying, and prioritizing the cybersecurity vulnerabilities in a given IT system. The goal of an assessment is to locate weaknesses that can be exploited to compromise systems. - Application Security
Vulnerability management
Vulnerability management is the practice of identifying, prioritizing, correcting, and reporting software vulnerabilities. - Application Security
Web application security
Web application security is the process of protecting web applications against various types of threats that are designed to exploit vulnerabilities in an application’s code. - Application Security
Zero-day vulnerability
A zero-day vulnerability is an unknown software vulnerability that has been discovered by attackers before the organization is aware of it.