Multi-Cluster Backup and Restore Example
This example demonstrates a complete multi-cluster backup and restore workflow for a Hiero network using Solo’s config ops commands. It showcases an advanced deployment pattern with:
- Dual-cluster deployment - Consensus nodes distributed across two Kubernetes clusters
- External PostgreSQL database - Mirror node using external database for production-like setup
- Complete component stack - Consensus, block, mirror, relay, and explorer nodes
- Full backup/restore cycle - ConfigMaps, Secrets, Logs, State files, and database dumps
- Disaster recovery - Complete cluster recreation and restoration from backup
Getting This Example
Download Archive
You can download this example as a standalone archive from the Solo releases page:
https://github.com/hiero-ledger/solo/releases/download/<release_version>/multicluster-backup-restore.zip
View on GitHub
Browse the source code and configuration files for this example in the GitHub repository.
Architecture
Cluster Distribution
- Cluster 1 (solo-e2e-c1): node1 (first consensus node)
- Cluster 2 (solo-e2e-c2): node2 (second consensus node), block node, mirror node, explorer, relay, PostgreSQL database
This demonstrates a realistic multi-cluster deployment where components are distributed across different Kubernetes clusters for high availability and fault tolerance.
Self-Contained Example: All configuration files and scripts are included in this directory, so no files from outside it are required. The tools listed under Prerequisites must still be installed.
Prerequisites
- Task installed (`brew install go-task/tap/go-task` on macOS)
- Kind installed (`brew install kind` on macOS)
- kubectl installed
- Helm installed (`brew install helm` on macOS)
- Node.js 22+ and npm installed
- Docker Desktop running with sufficient resources (Memory: 12GB+ recommended for dual clusters)
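Before running the workflow, it can help to confirm that the tools listed above are on the `PATH`. The following is a minimal sketch, not part of the example's Taskfile: the helper name `missing_tools` is hypothetical, and the binary names in the usage comment assume default installs.

```shell
#!/bin/sh
# Print the name of every tool that is not found on PATH (hypothetical helper).
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
  return 0
}

# Example (binary names are assumptions based on the list above):
#   missing_tools task kind kubectl helm node npm docker
```

An empty result means every listed tool was found. The sketch does not check versions (for example Node.js 22+) or Docker memory settings.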
Quick Start
Run Complete Workflow
Execute the entire backup/restore workflow with a single command:
task
This will:
- ✅ Create two Kind clusters with Docker networking
- ✅ Deploy consensus nodes across both clusters (node1 on cluster 1, node2 on cluster 2)
- ✅ Deploy PostgreSQL database on cluster 2
- ✅ Deploy block node, mirror node (with external DB), explorer, and relay on cluster 2
- ✅ Generate test transactions to create network state
- ✅ Back up the entire network (ConfigMaps, Secrets, Logs, State, Database)
- ✅ Destroy both clusters completely
- ✅ Recreate clusters and restore all components from backup
- ✅ Verify the network is operational with new transactions
Clean Up
Remove both clusters and all backup files:
task destroy
Available Tasks
Main Workflow Tasks
| Task | Description |
|---|---|
| `task` (default) | Run complete multi-cluster backup/restore workflow |
| `task initial-deploy` | Create dual clusters and deploy complete network |
| `task generate-transactions` | Create test transactions |
| `task backup` | Freeze network and create backup (including database) |
| `task restore-clusters` | Recreate Kind clusters from backup |
| `task restore-network` | Restore network components from backup |
| `task restore-config` | Restore ConfigMaps, Secrets, Logs, State, and database |
| `task verify` | Verify restored network functionality |
| `task destroy` | Remove clusters and backup files |
Component Tasks
| Task | Description |
|---|---|
| `task deploy-external-database` | Deploy PostgreSQL database with Helm |
| `task deploy-mirror-external` | Seed database for mirror node |
| `task destroy-cluster` | Delete all Kind clusters |
Step-by-Step Workflow
1. Deploy Initial Multi-Cluster Network
task initial-deploy
This creates and configures:
Infrastructure:
- 2 Kind clusters (solo-e2e-c1, solo-e2e-c2)
- Docker network for inter-cluster communication
- MetalLB load balancer on both clusters
- PostgreSQL database on cluster 2
Network Components:
- node1 on cluster 1
- node2 on cluster 2
- 1 block node on cluster 2
- 1 mirror node (with external PostgreSQL) on cluster 2
- 1 explorer node on cluster 2
- Relay nodes on cluster 2
2. Generate Network State
task generate-transactions
Creates 3 test accounts with 100 HBAR each to generate network state.
3. Create Backup
task backup
This will:
- Freeze the network
- Back up ConfigMaps, Secrets, Logs, and State files using `solo config ops backup`
- Export the PostgreSQL database to a SQL dump
Backup Location:
- All backup files: `./solo-backup/`
- Database dump: `./solo-backup/database-dump.sql`
4. Destroy Clusters
task destroy-cluster
This deletes both Kind clusters completely, simulating the total loss of the environment that the restore steps must recover from.
5. Restore Clusters
task restore-clusters
This will:
- Clean Solo cache and temporary files
- Recreate both Kind clusters from backup metadata
- Set up Docker networking and MetalLB
- Initialize cluster configurations
6. Restore Network
task restore-network
This will:
- Deploy PostgreSQL database on cluster 2
- Initialize cluster configurations
- Deploy all network components (consensus, block, mirror, explorer, relay)
7. Restore Configuration and State
task restore-config
This will:
- Freeze the network
- Restore ConfigMaps, Secrets, Logs, and State files using `solo config ops restore-config`
- Restore PostgreSQL database from SQL dump
- Start consensus nodes
8. Verify Restored Network
task verify
Verifies:
- All pods are running across both clusters
- Previously created accounts exist (e.g., account 3.2.3)
- Network can process new transactions
- Database has been restored correctly
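The first check, that all pods are running, can be approximated by hand by filtering `kubectl get pods --no-headers` output. This is a sketch only: the helper name `not_running` is hypothetical, it assumes the default column layout (NAME, READY, STATUS), and treating `Completed` as healthy is an assumption that may differ from what `task verify` actually checks.

```shell
#!/bin/sh
# Read `kubectl get pods --no-headers` output on stdin and print the names of
# pods whose STATUS column (third field) is neither Running nor Completed.
not_running() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Example (requires the clusters from this example to exist):
#   kubectl get pods -n external-database-test --no-headers \
#     --context kind-solo-e2e-c1 | not_running
```

No output means every pod in that namespace and context reported `Running` or `Completed`. Repeat the command for `kind-solo-e2e-c2` and for the `database` namespace to cover both clusters.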
Configuration
Edit variables in Taskfile.yml to customize:
```yaml
vars:
  NETWORK_SIZE: "2"                 # Number of consensus nodes
  NODE_ALIASES: "node1,node2"       # Node identifiers
  DEPLOYMENT: "external-database-test-deployment"
  NAMESPACE: "external-database-test"
  BACKUP_DIR: "./solo-backup"       # All backup files location
  # PostgreSQL Configuration
  POSTGRES_USERNAME: "postgres"
  POSTGRES_PASSWORD: "XXXXXXXX"
  POSTGRES_READONLY_USERNAME: "readonlyuser"
  POSTGRES_READONLY_PASSWORD: "XXXXXXXX"
  POSTGRES_NAME: "my-postgresql"
  POSTGRES_DATABASE_NAMESPACE: "database"
  POSTGRES_HOST_FQDN: "my-postgresql.database.svc.cluster.local"
```
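The `POSTGRES_HOST_FQDN` value follows the standard Kubernetes service DNS form `<service>.<namespace>.svc.<cluster-domain>`, so it should be updated whenever `POSTGRES_NAME` or `POSTGRES_DATABASE_NAMESPACE` changes. A small sketch of that relationship, assuming the service is named after `POSTGRES_NAME` and the cluster uses the default `cluster.local` domain (the helper name `service_fqdn` is hypothetical):

```shell
#!/bin/sh
# Build a Kubernetes service FQDN.
# $1 = service name, $2 = namespace, $3 = cluster domain (optional)
service_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Example: service_fqdn my-postgresql database
```

With the values shown above, the result matches the `POSTGRES_HOST_FQDN` default in the Taskfile.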
Cluster Configuration
The Kind cluster configurations in kind-cluster-1.yaml and kind-cluster-2.yaml can be customized:
- Node count - Add more worker nodes per cluster
- Port mappings - Expose additional ports for services
- Resource limits - Adjust CPU and memory constraints
- Volume mounts - Add persistent storage options
MetalLB Configuration
The MetalLB configurations in metallb-cluster-1.yaml and metallb-cluster-2.yaml define:
- IP address ranges for load balancer services
- Load balancer type (Layer 2 mode)
- Address allocation per cluster
What Gets Backed Up?
The backup process captures:
ConfigMaps (via solo config ops backup)
- Network configuration (`network-node-data-config-cm`)
- Bootstrap properties
- Application properties
- Genesis network configuration
- Address book
Secrets (via solo config ops backup)
- Node keys (TLS, signing, agreement)
- Consensus keys
- All Opaque secrets in the namespace
Logs (from each pod via solo config ops backup)
- Account balances
- Record streams
- Statistics
- Application logs
- Network logs
State Files (from each consensus node via solo config ops backup)
- Consensus state
- Merkle tree state
- Platform state
- Swirlds state
PostgreSQL Database
- Complete database dump (via `pg_dump`)
- Mirror node schema and data
- Account balances and transaction history
Backup Directory Structure
```
solo-backup/
├── solo-e2e-c1/                    # Cluster 1 backup
│   ├── configmaps/
│   │   ├── network-node-data-config-cm.yaml
│   │   └── ...
│   ├── secrets/
│   │   ├── node1-keys.yaml
│   │   └── ...
│   ├── logs/
│   │   └── network-node1-0.zip     (includes state files)
│   └── solo-remote-config.yaml
├── solo-e2e-c2/                    # Cluster 2 backup
│   ├── configmaps/
│   ├── secrets/
│   ├── logs/
│   │   └── network-node2-0.zip     (includes state files)
│   └── solo-remote-config.yaml
└── database-dump.sql               # PostgreSQL database dump
```
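Before destroying the clusters, a quick structural check of the backup directory can catch an incomplete backup. The sketch below only tests for the entries shown in the tree above. It does not validate their contents, and the helper name `check_backup_layout` is hypothetical rather than something Solo provides.

```shell
#!/bin/sh
# Report entries missing from a backup directory laid out as shown above.
# Usage: check_backup_layout BACKUP_DIR CLUSTER...
check_backup_layout() {
  backup_dir=$1
  shift
  status=0
  for cluster in "$@"; do
    for sub in configmaps secrets logs; do
      if [ ! -d "$backup_dir/$cluster/$sub" ]; then
        echo "missing directory: $cluster/$sub"
        status=1
      fi
    done
    if [ ! -f "$backup_dir/$cluster/solo-remote-config.yaml" ]; then
      echo "missing file: $cluster/solo-remote-config.yaml"
      status=1
    fi
  done
  # -s is true only for a file that exists and is not empty
  if [ ! -s "$backup_dir/database-dump.sql" ]; then
    echo "missing or empty: database-dump.sql"
    status=1
  fi
  return "$status"
}

# Example: check_backup_layout ./solo-backup solo-e2e-c1 solo-e2e-c2
```

The function returns a non-zero status if anything is missing, so it can gate a destructive step such as `task destroy-cluster`.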
Troubleshooting
View Cluster Status
```shell
# Cluster 1
kubectl cluster-info --context kind-solo-e2e-c1
kubectl get pods -n external-database-test -o wide --context kind-solo-e2e-c1

# Cluster 2
kubectl cluster-info --context kind-solo-e2e-c2
kubectl get pods -n external-database-test -o wide --context kind-solo-e2e-c2
kubectl get pods -n database -o wide --context kind-solo-e2e-c2
```
View Pod Logs
```shell
# Node 1 (Cluster 1)
kubectl logs -n external-database-test network-node1-0 -c root-container --tail=100 --context kind-solo-e2e-c1

# Node 2 (Cluster 2)
kubectl logs -n external-database-test network-node2-0 -c root-container --tail=100 --context kind-solo-e2e-c2

# PostgreSQL
kubectl logs -n database my-postgresql-0 --tail=100 --context kind-solo-e2e-c2
```
Open Shell in Pod
```shell
# Consensus node
kubectl exec -it -n external-database-test network-node1-0 -c root-container --context kind-solo-e2e-c1 -- /bin/bash

# PostgreSQL
kubectl exec -it -n database my-postgresql-0 --context kind-solo-e2e-c2 -- /bin/bash
```
Check Database
```shell
# Connect to PostgreSQL
kubectl exec -it -n database my-postgresql-0 --context kind-solo-e2e-c2 -- \
  env PGPASSWORD=XXXXXXXX psql -U postgres -d mirror_node

-- The remaining commands are entered at the psql prompt
-- List tables
\dt

-- Check account balances
SELECT * FROM account_balance LIMIT 10;
```
Manual Cleanup
```shell
# Delete clusters
kind delete cluster -n solo-e2e-c1
kind delete cluster -n solo-e2e-c2

# Remove Docker network
docker network rm kind

# Remove backup files
rm -rf ./solo-backup

# Clean Solo cache
rm -rf ~/.solo/*
rm -rf test/data/tmp/*
```
Advanced Usage
Run Individual Steps
```shell
# Deploy dual-cluster network
task initial-deploy

# Generate test data
task generate-transactions

# Create backup (includes database)
task backup

# Manually inspect backup
ls -lh ./solo-backup/
ls -lh ./solo-backup/solo-e2e-c1/
ls -lh ./solo-backup/solo-e2e-c2/

# Destroy clusters
task destroy-cluster

# Restore clusters only
task restore-clusters

# Restore network components
task restore-network

# Restore configuration and state
task restore-config

# Verify
task verify
```
Use Released Version of Solo
By default, the Taskfile uses the development version (npm run solo-test --). To use the released version:
USE_RELEASED_VERSION=true task
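One way such a switch could select the command prefix is sketched below. This illustrates the idea and is not the Taskfile's actual logic: the helper name `solo_command` is hypothetical, and invoking the released CLI as plain `solo` on the `PATH` is an assumption.

```shell
#!/bin/sh
# Pick the Solo command prefix from USE_RELEASED_VERSION (illustrative only).
solo_command() {
  if [ "${USE_RELEASED_VERSION:-false}" = "true" ]; then
    echo "solo"                  # assumption: released CLI is on PATH as `solo`
  else
    echo "npm run solo-test --"  # development version, as stated above
  fi
}

# Example: SOLO_COMMAND=$(solo_command)
```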
Customize Component Options
Edit command.yaml to customize mirror node deployment options:
```yaml
mirror:
  - --deployment
  - external-database-test-deployment
  - --cluster-ref
  - solo-e2e-c2
  - --enable-ingress
  - --pinger
  - --dev
  - --quiet-mode
  - --use-external-database
  - --external-database-host
  - my-postgresql.database.svc.cluster.local
  # Add more options as needed
```
Modify Cluster Configuration
Since all configuration files are local, you can easily customize the clusters:
```shell
# Edit cluster configurations
vim kind-cluster-1.yaml  # Modify cluster 1 setup
vim kind-cluster-2.yaml  # Modify cluster 2 setup

# Edit MetalLB configurations
vim metallb-cluster-1.yaml  # Adjust IP ranges for cluster 1
vim metallb-cluster-2.yaml  # Adjust IP ranges for cluster 2

# Then run the deployment
task initial-deploy
```
Custom MetalLB Configuration
You can specify custom MetalLB configuration files during restore operations:
```shell
# Use custom MetalLB configuration files
$SOLO_COMMAND config ops restore-clusters \
  --input-dir ./solo-backup \
  --metallb-config custom-metallb-{index}.yaml

# The {index} placeholder gets replaced with the cluster number (1, 2, etc.)
# Result: custom-metallb-1.yaml, custom-metallb-2.yaml, etc.
```
The MetalLB configuration file names use the `{index}` placeholder to support multiple clusters:
- `metallb-cluster-{index}.yaml` → `metallb-cluster-1.yaml`, `metallb-cluster-2.yaml`
- Custom patterns like `custom/loadbalancer-{index}.yaml` also work
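To preview which file names a given pattern resolves to, the substitution described above can be reproduced in a few lines of shell. This sketch mirrors the documented behavior (1-based cluster numbers) but is not Solo's implementation, and the helper name `expand_index_pattern` is hypothetical.

```shell
#!/bin/sh
# Expand an {index} pattern into one file name per cluster (1..count).
# $1 = pattern containing {index}, $2 = number of clusters
expand_index_pattern() {
  pattern=$1
  count=$2
  i=1
  while [ "$i" -le "$count" ]; do
    printf '%s\n' "$pattern" | sed "s/{index}/$i/g"
    i=$((i + 1))
  done
}

# Example: expand_index_pattern 'custom-metallb-{index}.yaml' 2
```

Checking that each resulting file exists before running the restore avoids a failure partway through cluster recreation.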
Key Commands Used
This example demonstrates the following Solo commands:
Backup/Restore Commands
- `solo config ops backup` - Backs up ConfigMaps, Secrets, Logs, and State files
- `solo config ops restore-clusters` - Recreates clusters from backup metadata (supports `--metallb-config` flag)
- `solo config ops restore-network` - Restores network components from backup
- `solo config ops restore-config` - Restores ConfigMaps, Secrets, Logs, and State files
- `solo consensus network freeze` - Freezes the network before backup
Multi-Cluster Commands
- `solo cluster-ref config setup` - Set up cluster reference configuration
- `solo cluster-ref config connect` - Connect cluster reference to kubectl context
- `solo deployment config create` - Create deployment with realm and shard
- `solo deployment cluster attach` - Attach cluster to deployment with node count
Component Deployment Commands
- `solo consensus network deploy` - Deploy consensus network with load balancer
- `solo consensus node setup/start` - Set up and start consensus nodes
- `solo block node add` - Add block node to specific cluster
- `solo mirror node add` - Add mirror node with external database
- `solo explorer node add` - Add explorer node with TLS
- `solo relay node add` - Add relay node
Important Notes
- Multi-cluster networking - Docker network enables communication between Kind clusters
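- External database - PostgreSQL is not covered by `solo config ops backup`; `task backup` exports it separately with `pg_dump`, and `task restore-config` reloads the dump
- Network must be frozen before backup to ensure consistent state
- Backup includes database - The SQL dump is stored with the rest of the backup at `./solo-backup/database-dump.sql`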
- Restore is multi-step - Clusters → Network → Configuration (in order)
- Backup files can be large - Ensure sufficient disk space (2GB+ for dual clusters)
- Realm and shard - Configured as realm 2, shard 3 for testing non-zero values
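Because the restore steps depend on each other, running them by hand benefits from stopping at the first failure. The wrapper below is a minimal sketch of that idea. The helper name `run_in_order` is hypothetical, and the default `task` target already chains these steps for the full workflow.

```shell
#!/bin/sh
# Run each command string in order and stop at the first one that fails.
run_in_order() {
  for step in "$@"; do
    echo "==> $step"
    if ! sh -c "$step"; then
      echo "step failed: $step" >&2
      return 1
    fi
  done
}

# Example (Clusters -> Network -> Configuration, then verification):
#   run_in_order "task restore-clusters" "task restore-network" \
#                "task restore-config" "task verify"
```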
Files
- `Taskfile.yml` - Main automation tasks and configuration
- `command.yaml` - Component deployment options for restore
- `scripts/init.sh` - PostgreSQL database initialization script
- `kind-cluster-1.yaml` - Kind cluster 1 configuration
- `kind-cluster-2.yaml` - Kind cluster 2 configuration
- `metallb-cluster-1.yaml` - MetalLB configuration for cluster 1
- `metallb-cluster-2.yaml` - MetalLB configuration for cluster 2
Related Examples
- external-database-test - External database setup
- state-save-and-restore - State file management
Support
For issues or questions:
- Solo Documentation: https://github.com/hashgraph/solo
- Task Documentation: https://taskfile.dev/
- Hiero Documentation: https://docs.hedera.com/