Our disaster recovery and business continuity strategy revolves around three risk scenarios:
- Unavailability or service discontinuity of the Dynatrace SaaS environment
- Data center unavailability
- Regional disaster
All of these scenarios are covered by our management platform, which provides failover for all of our SaaS customers.
One major advantage over other APM-as-a-Service solutions is that we host each customer in its own environment, which allows us to address security or availability problems in a focused way without impacting other customers.
In the first scenario, we constantly monitor the customer’s environment. If it becomes unresponsive, we take one of the following actions, depending on the root cause:
- Restart the AppMon Server process.
- Recreate the Server instance.
- Recreate single components that are failing.
- Recreate the environment in a new data center without data loss.
- Recreate the environment in a new data center. Live data may be lost.
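The list above can be read as an escalation ladder, ordered from least to most disruptive. The following is a minimal sketch of that reading only. The function `recover`, the step names, and the health check are hypothetical and are not part of the management platform's actual interface, and in practice the action is chosen based on the root cause rather than tried blindly in sequence.

```python
from typing import Callable, List, Optional, Tuple

# A recovery step is a (name, action) pair. The names and actions are
# illustrative assumptions that mirror the bullet list above.
RecoveryStep = Tuple[str, Callable[[], None]]


def recover(steps: List[RecoveryStep],
            is_healthy: Callable[[], bool]) -> Optional[str]:
    """Run recovery steps in order until the environment responds again.

    Returns the name of the step that restored service, "none needed" if
    the environment was already healthy, or None if every step failed.
    """
    if is_healthy():
        return "none needed"
    for name, action in steps:
        action()              # perform the (hypothetical) recovery action
        if is_healthy():      # re-check before escalating further
            return name
    return None
```

Ordering the steps from the cheapest (a process restart) to the most drastic (recreating the environment in a new data center) keeps the impact on the customer as small as the failure allows.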
Our failover capabilities are based on Amazon Web Services (AWS) offerings such as Amazon EC2. We use failover-safe data storage for all AppMon-relevant data: server configurations and live data are both stored on Elastic Block Store (EBS), and historic data is stored in an Oracle database on Amazon RDS. Together with our collector technology, which can buffer data, this ensures that no data is actually lost even if a service interruption occurs. The only exception is the very unlikely event in which a whole Amazon data center complex (Region) goes down, which has yet to occur.
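To illustrate why collector-side buffering protects against data loss during a short interruption, here is a minimal sketch of the general technique. The class `BufferingCollector`, its `capacity` parameter, and the `send` callback are assumptions made for illustration and do not describe the actual AppMon collector implementation or its limits.

```python
from collections import deque
from typing import Callable, Deque


class BufferingCollector:
    """Toy collector that holds data while the server is unreachable."""

    def __init__(self, send: Callable[[str], bool], capacity: int = 1000) -> None:
        self._send = send                 # returns True if the server accepted the item
        self._buffer: Deque[str] = deque()
        self._capacity = capacity         # hypothetical limit, not a documented value

    def submit(self, item: str) -> None:
        """Queue the item, then try to deliver everything in order."""
        if len(self._buffer) >= self._capacity:
            raise OverflowError("buffer full; further data would be lost")
        self._buffer.append(item)
        self.flush()

    def flush(self) -> int:
        """Send buffered items oldest-first and stop at the first failure."""
        sent = 0
        while self._buffer:
            if not self._send(self._buffer[0]):
                break                     # server still down; keep the rest buffered
            self._buffer.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        """Number of items still waiting for delivery."""
        return len(self._buffer)
```

While `send` keeps returning `False`, submitted items accumulate in the buffer. Once the server is reachable again, the next `submit` or `flush` call delivers them in their original order, so a short outage delays data rather than dropping it. A buffer of finite size also shows why a sufficiently long outage can still cause loss.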