Back up and restore a cluster

This page describes how to back up and restore a Dynatrace Managed cluster.

Prerequisites

  • The nodes you restore must use the same IP addresses as the nodes that were backed up.
  • Storage configured for backup must be readable and writable by all cluster nodes (for example, via the NFS protocol). For reliable backup storage, we recommend an external, dedicated storage volume with a replication mechanism.

Back up a cluster

All important Dynatrace Managed configuration files and monitoring data can be backed up automatically on a daily basis. For maximum security, it's a good idea to save your backup files to an off-site location.

The configuration files and internal database are contained in an uncompressed tar archive. The overall size of the archive can be roughly estimated as the size of Metrics storage plus twice the size of Elasticsearch storage, as shown on the node details page, summed over all nodes.

Each node must have the NFS share mounted at the same shared directory path, and the Dynatrace server process must have read/write permissions on it. The protocol used to transmit data depends on your configuration; we recommend NFSv4 and advise against CIFS.
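As an illustration, the shared mount could be declared in /etc/fstab on every node. The server name, export path, and mount point below are assumptions for this sketch, not values from your environment:

```shell
# Hypothetical /etc/fstab entry -- mount the same shared directory at the
# same path on every cluster node, using NFSv4 as recommended:
backup-nfs.example.com:/exports/dynatrace-backup  /mnt/backup  nfs4  rw,hard  0  0
```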

Notes:

  1. Backup history isn't preserved; Dynatrace Managed keeps only the latest backup.
  2. Transaction storage isn't backed up, so when you restore backups you may see some gaps in deep monitoring data.

Restore a cluster

To restore a cluster, follow the steps below.

On each node successively, execute the Dynatrace Managed installer using the following arguments:
--restore --backup-file <path-to-backup-file>/backup-001.tar

Note:

  • Use the same version of the installer that was used for the backup (the installer is available in <path-to-backup>).
  • It's recommended that you restore all nodes from the cluster.

On each node successively, start the firewall using the launcher script via the following command:
<full-path-to-Dynatrace-binaries-directory>/launcher/firewall.sh start

On each node successively, start Cassandra using the launcher script:

  • Execute the command:
    <full-path-to-Dynatrace-binaries-directory>/launcher/cassandra.sh start

  • On the last node, check if Cassandra is running using the command:
    <full-path-to-Dynatrace-binaries-directory>/utils/cassandra-nodetool.sh status

  • You should get the following response for each node (shown as UN in the output):
    Status = Up
    State = Normal
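The check above can be scripted. The sketch below confirms that every node line in the nodetool status output reports UN (Up/Normal); STATUS_OUTPUT is sample output for illustration, not real cluster data:

```shell
# Sample output; in practice this would come from:
#   <full-path-to-Dynatrace-binaries-directory>/utils/cassandra-nodetool.sh status
STATUS_OUTPUT='UN  10.0.0.1  512.2 GiB  256  ?  node-a  rack1
UN  10.0.0.2  498.7 GiB  256  ?  node-b  rack1
UN  10.0.0.3  505.1 GiB  256  ?  node-c  rack1'

# Node lines start with a status letter (U/D) and a state letter (N/L/J/M).
TOTAL=$(printf '%s\n' "$STATUS_OUTPUT" | grep -c '^[UD][NLJM] ')
UP_NORMAL=$(printf '%s\n' "$STATUS_OUTPUT" | grep -c '^UN ')

if [ "$TOTAL" -eq "$UP_NORMAL" ]; then
  echo "all nodes Up/Normal"
else
  echo "some nodes are not ready"
fi
```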

On each node successively, run nodetool repair.

  • Execute the command:
    <full-path-to-Dynatrace-binaries-directory>/utils/cassandra-nodetool.sh repair

On each node successively, start Elasticsearch using a launcher script.

  • Execute the command:
    <full-path-to-Dynatrace-binaries-directory>/launcher/elasticsearch.sh start

  • On the last node, check if Elasticsearch is running using the command:
    curl -s -N -XGET 'http://localhost:9200/_cluster/health?pretty' | grep status

  • You should get the following response:
    "status" : "green"
    or, for a single-node setup:
    "status" : "yellow"
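If you prefer the bare status value rather than the raw JSON line, it can be extracted as sketched below. HEALTH is a sample response for illustration; with a live cluster it would come from curl -s -N -XGET 'http://localhost:9200/_cluster/health?pretty':

```shell
# Sample cluster health response for illustration only.
HEALTH='{
  "cluster_name" : "dynatrace",
  "status" : "green",
  "number_of_nodes" : 3
}'

# Pull out just the status value (green, yellow, or red).
STATUS=$(printf '%s\n' "$HEALTH" | grep '"status"' | sed 's/.*: *"\([a-z]*\)".*/\1/')
echo "$STATUS"
```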

Create the dynatrace_repository.

  • On one of the nodes, execute the following command (set the location to match the path.repo property in the configuration file <full-path-to-Dynatrace-binaries-directory>/elasticsearch/config/elasticsearch.yml):
    curl -s -N -XPUT 'http://localhost:9200/_snapshot/dynatrace_repository' -H 'Content-Type: application/json' -d'
    {
        "type": "fs",
        "settings": {
            "location": "enter-here-full-path-to-elasticsearch-backup-location",
            "compress": true
        }
    }'
    
  • You should get the following response:
    {"acknowledged":true}

Close indices.

  • On one of the nodes execute the command:
    for index in `curl -s -N -XGET 'http://localhost:9200/_cat/indices/?h=index'`; do
      curl -s -N -XPOST "http://localhost:9200/$index/_close"
    done
    

Find the latest snapshot.

  • On one of the nodes execute the command:
    curl -s -N -XGET 'http://localhost:9200/_cat/snapshots/dynatrace_repository?h=id&s=end_epoch:desc' | head -n 1
  • In response, you should get the name of the latest snapshot, for example:
    snapshot_2018-01-04-08-58-utc
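The snapshot name can be captured in a variable so the restore request in the next step can reuse it. SNAPSHOTS below is sample output for illustration; with a live cluster it would come from the _cat/snapshots command above:

```shell
# Sample snapshot list, newest first (as produced by s=end_epoch:desc).
SNAPSHOTS='snapshot_2018-01-04-08-58-utc
snapshot_2018-01-03-08-58-utc'

# Keep only the newest snapshot name.
LATEST=$(printf '%s\n' "$SNAPSHOTS" | head -n 1)
echo "$LATEST"

# The restore request could then be issued as:
#   curl -s -N -XPOST "http://localhost:9200/_snapshot/dynatrace_repository/$LATEST/_restore"
```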

Restore the Elasticsearch database.

  • On one of the nodes execute the command:
    curl -s -N -XPOST 'http://localhost:9200/_snapshot/dynatrace_repository/<put-snapshot-name-here>/_restore'

Monitor the progress of restoring a snapshot.

  • On one of the nodes execute the command:
    curl -s -N -XGET 'http://localhost:9200/_snapshot/dynatrace_repository/<put-snapshot-name-here>/_status?pretty' | grep state
  • When the restore is complete, the response contains:
    "state" : "SUCCESS"
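Extracting the bare state value makes it easy to repeat the status call until the restore finishes. STATUS_JSON below is a sample response for illustration; with a live cluster it would come from the _status endpoint shown above:

```shell
# Sample snapshot status response for illustration only.
STATUS_JSON='{
  "snapshots" : [ {
    "snapshot" : "snapshot_2018-01-04-08-58-utc",
    "state" : "SUCCESS"
  } ]
}'

# Pull out just the state value; anything other than SUCCESS means the
# restore is still running (or failed), so repeat the status call.
STATE=$(printf '%s\n' "$STATUS_JSON" | grep '"state"' | head -n 1 | sed 's/.*: *"\([A-Z_]*\)".*/\1/')
echo "$STATE"
```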

On each node successively, start Dynatrace Server and the other components using the launcher script via the following command:
<full-path-to-Dynatrace-binaries-directory>/launcher/dynatrace.sh start

The cluster is now ready.