How to Monitor Your VKE Cluster with tobs

Updated on November 21, 2023

Introduction

tobs (The Observability Stack for Kubernetes) is a Kubernetes monitoring stack that collects metrics with Prometheus, visualizes them with Grafana, and stores them in TimescaleDB for long-term storage. Some key components of tobs are:

  • Prometheus to collect metrics
  • Grafana to visualize metrics from Prometheus
  • Promscale to store metrics in long-term storage and allow analysis with both PromQL and SQL
  • TimescaleDB as the long-term database for Promscale
  • Other useful components such as AlertManager, Node-Exporter, and Kube-State-Metrics

This tutorial explains how to:

  • Install tobs in your Vultr Kubernetes Engine (VKE) cluster
  • Use Prometheus to store your metrics in a long-term database
  • Use Vultr Object Storage as the backup for your metrics database
  • Use Grafana to visualize your metrics
  • Access the database and perform SQL queries for complex analysis on metrics

Prerequisites

Create Vultr Kubernetes Cluster

  • Create a new VKE cluster with at least two 4GB nodes. 2GB nodes are not recommended for the tobs stack.
  • Download the cluster configuration file and configure kubectl on your local machine.
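
    A quick sketch of the kubectl setup, assuming you saved the downloaded configuration file as vke-config.yaml in your Downloads folder (adjust the path to wherever you stored it):

     $ export KUBECONFIG=$HOME/Downloads/vke-config.yaml
     $ kubectl cluster-info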

Create Vultr Object Storage

  • Create a new Vultr Object Storage to use as an S3-compatible backup for TimescaleDB.
  • Create a new bucket inside your Vultr Object Storage. The bucket name is demo-tobs in this tutorial.
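
    If you prefer the command line, any S3-compatible client can create the bucket. A minimal sketch with the AWS CLI, assuming the Object Storage is in the ewr1 region and your access key and secret key are configured as the default profile:

     $ aws s3 mb s3://demo-tobs --endpoint-url https://ewr1.vultrobjects.com
     $ aws s3 ls --endpoint-url https://ewr1.vultrobjects.com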

Install tobs

  1. Check your Kubernetes version.

     $ kubectl get nodes

    The output looks like this:

     NAME                STATUS   ROLES    AGE     VERSION
     node-02e926515d59   Ready    <none>   2m20s   v1.22.6
     node-85a58c1750d1   Ready    <none>   2m17s   v1.22.6
  2. Check the compatibility matrix to see which tobs version supports your Kubernetes version.

  3. Install tobs with Helm.

    Using tobs to install the full observability stack with OpenTelemetry support requires cert-manager. To install it, follow the cert-manager documentation; a minimal example is also shown below.

    cert-manager is not required when you use tobs with OpenTelemetry support disabled.
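
    A minimal sketch of the cert-manager installation using its static manifest (the version shown is only an example; replace it with the current release listed in the cert-manager documentation):

      $ kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.11.0/cert-manager.yaml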

    The following command will install Kube-Prometheus, TimescaleDB, OpenTelemetry Operator, and Promscale into your Kubernetes cluster:

     $ helm repo add timescale https://charts.timescale.com/
     $ helm repo update
     $ helm install --wait <release_name> timescale/tobs

    > Note: The --wait flag is necessary for a successful installation because the tobs Helm chart can create OpenTelemetry Custom Resources only after the opentelemetry-operator is up and running. You can omit this flag when using tobs without OpenTelemetry support.

  4. To deploy tobs on your Vultr Kubernetes cluster, you need to customize the configuration of the tobs stack. Run the following command to export the default chart values to a file named my_values.yml:

     $ helm show values timescale/tobs > my_values.yml
  5. Edit my_values.yml with your text editor and replace storage: 8Gi with storage: 10Gi. This change is essential because Vultr Block Storage volumes must be at least 10GB in size.

  6. Under timescaledb-single, you can change the storage volume from size: 150Gi to a smaller number to optimize the resource cost.
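
    If you prefer to make both edits from the command line, here is a quick sketch using GNU sed (on macOS use sed -i '' instead of sed -i; the 50Gi value is only an example):

      $ sed -i 's/storage: 8Gi/storage: 10Gi/' my_values.yml
      $ sed -i 's/size: 150Gi/size: 50Gi/' my_values.yml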

  7. Install the tobs stack with S3 backup using the following command. You will be asked for your Vultr Object Storage details (bucket name, hostname, access key, and secret key).

     $ helm upgrade --wait --install <release_name> --values my_values.yml timescale/tobs

    Your output should look like this. When prompted for the S3 credentials, use the values from your Vultr Object Storage bucket.

     WARNING: Using a generated self-signed certificate for TLS access to TimescaleDB.
              This should only be used for development and demonstration purposes.
              To use a signed certificate, use the "--tls-timescaledb-cert" and "--tls-timescaledb-key"
              flags when issuing the tobs install command.
    
     Creating TimescaleDB tobs-certificate secret
     Creating TimescaleDB tobs-credentials secret
    
     We'll be asking a few questions about S3 buckets, keys, secrets and endpoints.
    
     For background information, visit these pages:
    
     Amazon Web Services:
     - https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html
     - https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html
    
     Digital Ocean:
     - https://developers.digitalocean.com/documentation/spaces/#aws-s3-compatibility
    
     Google Cloud:
     - https://cloud.google.com/storage/docs/migrating#migration-simple
    
     What is the name of the S3 bucket?
     demo-tobs
    
     What is the name of the S3 endpoint? (leave blank for default)
     ewr1.vultrobjects.com
    
     What is the region of the S3 endpoint? (leave blank for default)
    
     What is the S3 Key to use?
     NBWOK-redacted-1IMQSU
    
     What is the S3 Secret to use?
     tdCV0vN-redacted-Kx3OIVy
    
     Creating TimescaleDB tobs-pgbackrest secret
     Installing The Observability Stack
  8. The installation takes a few minutes to complete. You can check the progress with the kubectl get pods command. Keep in mind that you may see some CrashLoopBackOff statuses while the deployment is still in progress.
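
     For example, to watch the pods come up in real time:

      $ kubectl get pods -w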

  9. The following output of kubectl get pods indicates a successful installation. If you have any errors, check the Troubleshooting section at the end of this tutorial.

     NAME                                               READY   STATUS      RESTARTS       AGE
     alertmanager-tobs-kube-prometheus-alertmanager-0   2/2     Running     0              2m42s
     prometheus-tobs-kube-prometheus-prometheus-0       2/2     Running     0              2m42s
     tobs-grafana-95df94dc9-qpnsm                       3/3     Running     4 (95s ago)    2m54s
     tobs-grafana-db--1-g7pz7                           0/1     Completed   4              2m53s
     tobs-kube-prometheus-operator-d65c55845-xjz6h      1/1     Running     0              2m54s
     tobs-kube-state-metrics-56c568fdcc-6f8hp           1/1     Running     0              2m54s
     tobs-prometheus-node-exporter-2v74v                1/1     Running     0              2m54s
     tobs-prometheus-node-exporter-fj7x9                1/1     Running     0              2m54s
     tobs-promlens-7f778cc958-wb9dj                     1/1     Running     0              2m54s
     tobs-promscale-57497c97cf-htv2b                    1/1     Running     4 (119s ago)   2m54s
     tobs-timescaledb-0                                 2/2     Running     3 (40s ago)    2m53s 

Configuration

All component configuration happens through the Helm values file (my_values.yml in this tutorial). You can view the self-documenting default values.yaml in the project repository. See more documentation about individual configuration settings in the Helm chart docs.
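
For example, after changing a value in my_values.yml, apply it to the running stack by repeating the upgrade command from the installation step:

     $ helm upgrade --wait --install <release_name> --values my_values.yml timescale/tobs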

Troubleshooting

  • Get all events:

      $ kubectl get events --sort-by='.metadata.creationTimestamp'
  • Describe any specific pod to find the problem:

      $ kubectl describe pod <pod name>
  • View logs of any pod:

      $ kubectl logs <pod name>
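  • View the logs of the previous (crashed) container, which is often more useful for a pod stuck in CrashLoopBackOff:

       $ kubectl logs --previous <pod name>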

The Promscale pod can enter a CrashLoopBackOff state due to failed password authentication. Follow these steps to resolve the problem:

  1. View the logs of the Promscale pod. Replace the pod name with your own.

     $ kubectl logs tobs-promscale-f984f8bf7-t4jts
  2. You may see the following error:

     password authentication failed for user \"postgres\" (SQLSTATE 28P01))
  3. Get the database password from the tobs-credentials secret. Note that the returned value is base64-encoded.

     $ kubectl get secrets tobs-credentials -o jsonpath="{.data.PATRONI_SUPERUSER_PASSWORD}"
  4. Edit the tobs-promscale secret and set the value of the PROMSCALE_DB_PASSWORD key to match. Because kubectl edit also displays secret values base64-encoded, you can paste the encoded value from the previous step as-is.

     $ kubectl edit secrets tobs-promscale
  5. Delete the Promscale pod so that it restarts with the corrected credentials. Replace the pod name with your own.

     $ kubectl delete pod tobs-promscale-f984f8bf7-jlxw8
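
     The Promscale deployment recreates the pod automatically. You can confirm that the new pod reaches the Running state, for example by filtering the pod list (the grep pattern is just an illustration):

      $ kubectl get pods | grep promscale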

More Information