SOLO-2019

PrometheusInstallFailedSoloError — Deployment

PrometheusInstallFailedSoloError

CodeSOLO-2019
CategoryDeployment
OwnershipInfrastructure
RetryableNo

Description

Thrown during cluster setup when the Prometheus stack Helm chart fails to install; the underlying failure is wrapped in cause. The Prometheus stack supplies the monitoring and metrics collection for the cluster-level stack installed by solo cluster-ref config setup. The failure is typically a Helm error (bad chart version or values), an image that cannot be pulled, or a cluster without the resources/connectivity to schedule its pods.

Troubleshooting Steps

  1. Check solo logs: tail -n 100 ~/.solo/logs/solo.log
  2. Inspect cluster pods: kubectl get pods -A
  3. Check Helm release status: helm list -A
  4. Verify cluster connectivity: kubectl cluster-info