Disaster recovery

Short outages (up to three hours) of one data center do not require any recovery actions. When the inaccessible data center becomes available again, the Managed cluster automatically synchronizes data and restores cluster operations.

For longer outages (up to three days), first make sure that the cluster nodes are operational, and then run the following command sequentially on all nodes in the recovering data center:

/opt/dynatrace-managed/utils/repair-cassandra-data.sh
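
The sequential run can be sketched as a small shell loop. This is a minimal sketch, not an official procedure: it assumes SSH access to every node and sufficient privileges to run the script there. The function name repair_nodes, the host names, and the DRY_RUN variable (which only prints the commands instead of running them) are hypothetical and introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run the repair script on one node at a time, stopping at the first failure.
# Assumptions (not from the product documentation): SSH access to each node,
# hypothetical host names, and a DRY_RUN variable that only prints the commands.

REPAIR_CMD="/opt/dynatrace-managed/utils/repair-cassandra-data.sh"

repair_nodes() {
  local node
  for node in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Dry run: show what would be executed, without contacting the node.
      echo "ssh $node $REPAIR_CMD"
    else
      # Wait for this node to finish before moving on to the next one.
      ssh "$node" "$REPAIR_CMD" || { echo "repair failed on $node" >&2; return 1; }
    fi
  done
}

# Example with hypothetical host names:
# repair_nodes dc2-node1.example.com dc2-node2.example.com
```

Running the nodes one after another, rather than in parallel, matches the "sequentially" requirement above; stopping at the first failure leaves the remaining nodes untouched until the problem is understood.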

For outages of a second data center that last more than three days, some data is lost and cannot be repaired. As a result, you must perform a recovery from either an operational data center or the backup.
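
The three outage tiers described above can be summarized as a small decision helper. This is an illustrative sketch only: recovery_action is a hypothetical function, not a Dynatrace tool, and it assumes the outage duration is known in whole hours, with three days taken as 72 hours.

```shell
#!/usr/bin/env bash
# Sketch: map an outage duration (whole hours) to the action described in the text.
# recovery_action is a hypothetical helper introduced for illustration.

recovery_action() {
  local hours="$1"
  if [ "$hours" -le 3 ]; then
    echo "none"      # the cluster resynchronizes automatically
  elif [ "$hours" -le 72 ]; then
    echo "repair"    # run repair-cassandra-data.sh on each node, one at a time
  else
    echo "recover"   # recover from an operational data center or from the backup
  fi
}

# Example: recovery_action 48
```

For instance, a 48-hour outage falls into the second tier, so the helper reports that a repair is needed rather than a full recovery.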

Recover a data center from another data center

To recover a data center from another data center, you will:

  1. Remove unavailable nodes from the cluster.
  2. Update the configuration of the existing (surviving) data center.
  3. Reinstall nodes in the recovered data center.
  4. Replicate Cassandra to the recovered data center.
  5. Replicate Elasticsearch to the recovered data center.
  6. Recreate the server, start ActiveGate, and start Nginx in the recovered data center.
  7. Enable the recovered data center.

For the detailed procedure, see Rebuild data center.

Restore a cluster from a backup

To restore a cluster from a backup, you need to uninstall Dynatrace Managed from all nodes and restore the first data center from the backup. Then follow the procedure for recovering a data center from another data center. For the detailed procedure, see Disaster recovery from backup.