Multi-Cluster Backup and Restore Example

Example demonstrating multi-cluster backup and restore workflow with external PostgreSQL database

Multi-Cluster Backup and Restore Example

This example demonstrates a complete multi-cluster backup and restore workflow for a Hiero network using Solo’s config ops commands. It showcases an advanced deployment pattern with:

Dual-cluster deployment - Consensus nodes distributed across two Kubernetes clusters
External PostgreSQL database - Mirror node using external database for production-like setup
Complete component stack - Consensus, block, mirror, relay, and explorer nodes
Full backup/restore cycle - ConfigMaps, Secrets, Logs, State files, and database dumps
Disaster recovery - Complete cluster recreation and restoration from backup

Getting This Example

Download Archive

You can download this example as a standalone archive from the Solo releases page:

https://github.com/hiero-ledger/solo/releases/download/<release_version>/multicluster-backup-restore.zip

View on GitHub

Browse the source code and configuration files for this example in the GitHub repository.

Architecture

Cluster Distribution

Cluster 1 (solo-e2e-c1): node1 (first consensus node)
Cluster 2 (solo-e2e-c2): node2 (second consensus node), block node, mirror node, explorer, relay, PostgreSQL database

This demonstrates a realistic multi-cluster deployment where components are distributed across different Kubernetes clusters for high availability and fault tolerance.

Self-Contained Example: All configuration files and scripts are included in this directory - no external dependencies required.

Prerequisites

Task installed (brew install go-task/tap/go-task on macOS)
Kind installed (brew install kind on macOS)
kubectl installed
Helm installed (brew install helm on macOS)
Node.js 22+ and npm installed
Docker Desktop running with sufficient resources (Memory: 12GB+ recommended for dual clusters)

Quick Start

Run Complete Workflow

Execute the entire backup/restore workflow with a single command:

task

This will:

✅ Create two Kind clusters with Docker networking
✅ Deploy consensus nodes across both clusters (node1 on cluster 1, node2 on cluster 2)
✅ Deploy PostgreSQL database on cluster 2
✅ Deploy block node, mirror node (with external DB), explorer, and relay on cluster 2
✅ Generate test transactions to create network state
✅ Backup the entire network (ConfigMaps, Secrets, Logs, State, Database)
✅ Destroy both clusters completely
✅ Recreate clusters and restore all components from backup
✅ Verify the network is operational with new transactions

Clean Up

Remove the cluster and all backup files:

task destroy

Available Tasks

Main Workflow Tasks

Task	Description
`task` (default)	Run complete multi-cluster backup/restore workflow
`task initial-deploy`	Create dual clusters and deploy complete network
`task generate-transactions`	Create test transactions
`task backup`	Freeze network and create backup (including database)
`task restore-clusters`	Recreate Kind clusters from backup
`task restore-network`	Restore network components from backup
`task restore-config`	Restore ConfigMaps, Secrets, Logs, State, and database
`task verify`	Verify restored network functionality
`task destroy`	Remove clusters and backup files

Component Tasks

Task	Description
`task deploy-external-database`	Deploy PostgreSQL database with Helm
`task deploy-mirror-external`	Seed database for mirror node
`task destroy-cluster`	Delete all Kind clusters

Step-by-Step Workflow

1. Deploy Initial Multi-Cluster Network

task initial-deploy

This creates and configures:

Infrastructure:

2 Kind clusters (solo-e2e-c1, solo-e2e-c2)
Docker network for inter-cluster communication
MetalLB load balancer on both clusters
PostgreSQL database on cluster 2

Network Components:

node1 on cluster 1
node2 on cluster 2
1 block node on cluster 2
1 mirror node (with external PostgreSQL) on cluster 2
1 explorer node on cluster 2
Relay nodes on cluster 2

2. Generate Network State

task generate-transactions

Creates 3 test accounts with 100 HBAR each to generate network state.

3. Create Backup

task backup

This will:

Freeze the network
Backup ConfigMaps, Secrets, Logs, and State files using solo config ops backup
Export PostgreSQL database to SQL dump

Backup Location:

All backup files: ./solo-backup/
Database dump: ./solo-backup/database-dump.sql

4. Destroy Clusters

task destroy-cluster

This deletes both Kind clusters completely, simulating a complete disaster recovery scenario.

5. Restore Clusters

task restore-clusters

This will:

Clean Solo cache and temporary files
Recreate both Kind clusters from backup metadata
Setup Docker networking and MetalLB
Initialize cluster configurations

6. Restore Network

task restore-network

This will:

Deploy PostgreSQL database on cluster 2
Initialize cluster configurations
Deploy all network components (consensus, block, mirror, explorer, relay)

7. Restore Configuration and State

task restore-config

This will:

Freeze the network
Restore ConfigMaps, Secrets, Logs, and State files using solo config ops restore-config
Restore PostgreSQL database from SQL dump
Start consensus nodes

8. Verify Restored Network

task verify

Verifies:

All pods are running across both clusters
Previously created accounts exist (e.g., account 3.2.3)
Network can process new transactions
Database has been restored correctly

Configuration

Edit variables in Taskfile.yml to customize:

vars:
  NETWORK_SIZE: "2"                           # Number of consensus nodes
  NODE_ALIASES: "node1,node2"                 # Node identifiers
  DEPLOYMENT: "external-database-test-deployment"
  NAMESPACE: "external-database-test"
  BACKUP_DIR: "./solo-backup"                 # All backup files location
  
  # PostgreSQL Configuration
  POSTGRES_USERNAME: "postgres"
  POSTGRES_PASSWORD: "XXXXXXXX"
  POSTGRES_READONLY_USERNAME: "readonlyuser"
  POSTGRES_READONLY_PASSWORD: "XXXXXXXX"
  POSTGRES_NAME: "my-postgresql"
  POSTGRES_DATABASE_NAMESPACE: "database"
  POSTGRES_HOST_FQDN: "my-postgresql.database.svc.cluster.local"

Cluster Configuration

The Kind cluster configurations in kind-cluster-1.yaml and kind-cluster-2.yaml can be customized:

Node count - Add more worker nodes per cluster
Port mappings - Expose additional ports for services
Resource limits - Adjust CPU and memory constraints
Volume mounts - Add persistent storage options

MetalLB Configuration

The MetalLB configurations in metallb-cluster-1.yaml and metallb-cluster-2.yaml define:

IP address ranges for load balancer services
Load balancer type (Layer 2 mode)
Address allocation per cluster

What Gets Backed Up?

The backup process captures:

ConfigMaps (via `solo config ops backup`)

Network configuration (network-node-data-config-cm)
Bootstrap properties
Application properties
Genesis network configuration
Address book

Secrets (via `solo config ops backup`)

Node keys (TLS, signing, agreement)
Consensus keys
All Opaque secrets in the namespace

Logs (from each pod via `solo config ops backup`)

Account balances
Record streams
Statistics
Application logs
Network logs

State Files (from each consensus node via `solo config ops backup`)

Consensus state
Merkle tree state
Platform state
Swirlds state

PostgreSQL Database

Complete database dump (via pg_dump)
Mirror node schema and data
Account balances and transaction history

Backup Directory Structure

solo-backup/
├── solo-e2e-c1/                    # Cluster 1 backup
│   ├── configmaps/
│   │   ├── network-node-data-config-cm.yaml
│   │   └── ...
│   ├── secrets/
│   │   ├── node1-keys.yaml
│   │   └── ...
│   ├── logs/
│   │   └── network-node1-0.zip  (includes state files)
│   └── solo-remote-config.yaml
├── solo-e2e-c2/                    # Cluster 2 backup
│   ├── configmaps/
│   ├── secrets/
│   ├── logs/
│   │   └── network-node2-0.zip  (includes state files)
│   └── solo-remote-config.yaml
└── database-dump.sql               # PostgreSQL database dump

Troubleshooting

View Cluster Status

# Cluster 1
kubectl cluster-info --context kind-solo-e2e-c1
kubectl get pods -n external-database-test -o wide --context kind-solo-e2e-c1

# Cluster 2
kubectl cluster-info --context kind-solo-e2e-c2
kubectl get pods -n external-database-test -o wide --context kind-solo-e2e-c2
kubectl get pods -n database -o wide --context kind-solo-e2e-c2

View Pod Logs

# Node 1 (Cluster 1)
kubectl logs -n external-database-test network-node1-0 -c root-container --tail=100 --context kind-solo-e2e-c1

# Node 2 (Cluster 2)
kubectl logs -n external-database-test network-node2-0 -c root-container --tail=100 --context kind-solo-e2e-c2

# PostgreSQL
kubectl logs -n database my-postgresql-0 --tail=100 --context kind-solo-e2e-c2

Open Shell in Pod

# Consensus node
kubectl exec -it -n external-database-test network-node1-0 -c root-container --context kind-solo-e2e-c1 -- /bin/bash

# PostgreSQL
kubectl exec -it -n database my-postgresql-0 --context kind-solo-e2e-c2 -- /bin/bash

Check Database

# Connect to PostgreSQL
kubectl exec -it -n database my-postgresql-0 --context kind-solo-e2e-c2 -- \
  env PGPASSWORD=XXXXXXXX psql -U postgres -d mirror_node

# List tables
\dt

# Check account balances
SELECT * FROM account_balance LIMIT 10;

Manual Cleanup

# Delete clusters
kind delete cluster -n solo-e2e-c1
kind delete cluster -n solo-e2e-c2

# Remove Docker network
docker network rm kind

# Remove backup files
rm -rf ./solo-backup

# Clean Solo cache
rm -rf ~/.solo/*
rm -rf test/data/tmp/*

Advanced Usage

Run Individual Steps

# Deploy dual-cluster network
task initial-deploy

# Generate test data
task generate-transactions

# Create backup (includes database)
task backup

# Manually inspect backup
ls -lh ./solo-backup/
ls -lh ./solo-backup/solo-e2e-c1/
ls -lh ./solo-backup/solo-e2e-c2/

# Destroy clusters
task destroy-cluster

# Restore clusters only
task restore-clusters

# Restore network components
task restore-network

# Restore configuration and state
task restore-config

# Verify
task verify

Use Released Version of Solo

By default, the Taskfile uses the development version (npm run solo-test --). To use the released version:

USE_RELEASED_VERSION=true task

Customize Component Options

Edit command.yaml to customize mirror node deployment options:

mirror:
  - --deployment
  - external-database-test-deployment
  - --cluster-ref
  - solo-e2e-c2
  - --enable-ingress
  - --pinger
  - --dev
  - --quiet-mode
  - --use-external-database
  - --external-database-host
  - my-postgresql.database.svc.cluster.local
  # Add more options as needed

Modify Cluster Configuration

Since all configuration files are local, you can easily customize the clusters:

# Edit cluster configurations
vim kind-cluster-1.yaml    # Modify cluster 1 setup
vim kind-cluster-2.yaml    # Modify cluster 2 setup

# Edit MetalLB configurations  
vim metallb-cluster-1.yaml # Adjust IP ranges for cluster 1
vim metallb-cluster-2.yaml # Adjust IP ranges for cluster 2

# Then run the deployment
task initial-deploy

Custom MetalLB Configuration

You can specify custom MetalLB configuration files during restore operations:

# Use custom metallb configuration files
$SOLO_COMMAND config ops restore-clusters \
  --input-dir ./solo-backup \
  --metallb-config custom-metallb-{index}.yaml

# The {index} placeholder gets replaced with the cluster number (1, 2, etc.)
# Result: custom-metallb-1.yaml, custom-metallb-2.yaml, etc.

The metallb configuration files use the {index} placeholder to support multiple clusters:

metallb-cluster-{index}.yaml → metallb-cluster-1.yaml, metallb-cluster-2.yaml
Custom patterns like custom/loadbalancer-{index}.yaml also work

Key Commands Used

This example demonstrates the following Solo commands:

Backup/Restore Commands

solo config ops backup - Backs up ConfigMaps, Secrets, Logs, and State files
solo config ops restore-clusters - Recreates clusters from backup metadata (supports --metallb-config flag)
solo config ops restore-network - Restores network components from backup
solo config ops restore-config - Restores ConfigMaps, Secrets, Logs, and State files
solo consensus network freeze - Freezes the network before backup

Multi-Cluster Commands

solo cluster-ref config setup - Setup cluster reference configuration
solo cluster-ref config connect - Connect cluster reference to kubectl context
solo deployment config create - Create deployment with realm and shard
solo deployment cluster attach - Attach cluster to deployment with node count

Component Deployment Commands

solo consensus network deploy - Deploy consensus network with load balancer
solo consensus node setup/start - Setup and start consensus nodes
solo block node add - Add block node to specific cluster
solo mirror node add - Add mirror node with external database
solo explorer node add - Add explorer node with TLS
solo relay node add - Add relay node

Important Notes

Multi-cluster networking - Docker network enables communication between Kind clusters
External database - PostgreSQL must be backed up and restored separately
Network must be frozen before backup to ensure consistent state
Backup includes database - PostgreSQL dump is part of the backup process
Restore is multi-step - Clusters → Network → Configuration (in order)
Backup files can be large - Ensure sufficient disk space (2GB+ for dual clusters)
Realm and shard - Configured as realm 2, shard 3 for testing non-zero values

Files

Taskfile.yml - Main automation tasks and configuration
command.yaml - Component deployment options for restore
scripts/init.sh - PostgreSQL database initialization script
kind-cluster-1.yaml - Kind cluster 1 configuration
kind-cluster-2.yaml - Kind cluster 2 configuration
metallb-cluster-1.yaml - MetalLB configuration for cluster 1
metallb-cluster-2.yaml - MetalLB configuration for cluster 2

external-database-test - External database setup
state-save-and-restore - State file management

Support

For issues or questions:

Solo Documentation: https://github.com/hashgraph/solo
Task Documentation: https://taskfile.dev/
Hiero Documentation: https://docs.hedera.com/

Multi-Cluster Backup and Restore Example

Multi-Cluster Backup and Restore Example

Getting This Example

Download Archive

View on GitHub

Architecture

Cluster Distribution

Prerequisites

Quick Start

Run Complete Workflow

Clean Up

Available Tasks

Main Workflow Tasks

Component Tasks

Step-by-Step Workflow

1. Deploy Initial Multi-Cluster Network

2. Generate Network State

3. Create Backup

4. Destroy Clusters

5. Restore Clusters

6. Restore Network

7. Restore Configuration and State

8. Verify Restored Network

Configuration

Cluster Configuration

MetalLB Configuration

What Gets Backed Up?

ConfigMaps (via solo config ops backup)

Secrets (via solo config ops backup)

Logs (from each pod via solo config ops backup)

State Files (from each consensus node via solo config ops backup)

PostgreSQL Database

Backup Directory Structure

Troubleshooting

View Cluster Status

View Pod Logs

Open Shell in Pod

Check Database

Manual Cleanup

Advanced Usage

Run Individual Steps

Use Released Version of Solo

Customize Component Options

Modify Cluster Configuration

Custom MetalLB Configuration

Key Commands Used

Backup/Restore Commands

Multi-Cluster Commands

Component Deployment Commands

Important Notes

Files

Related Examples

Support

ConfigMaps (via `solo config ops backup`)

Secrets (via `solo config ops backup`)

Logs (from each pod via `solo config ops backup`)

State Files (from each consensus node via `solo config ops backup`)