How to Deploy Apache Cassandra on Kubernetes

Updated on 06 May, 2025
Learn to deploy a scalable Apache Cassandra cluster on Kubernetes using K8ssandra Operator for efficient data management and high performance.

Apache Cassandra is a highly scalable, distributed NoSQL database designed to handle large volumes of data across multiple commodity servers. Its distributed architecture avoids single points of failure and enables horizontal scalability. Cassandra excels at write-heavy workloads and delivers high read and write throughput, making it ideal for data-intensive applications. It also provides tunable consistency, letting you balance availability against data consistency per operation.

K8ssandra is an open-source project that simplifies the deployment and management of Apache Cassandra on Kubernetes. It includes the K8ssandra Operator, which automates tasks such as cluster provisioning, scaling, backups, and repairs.

This article explains how to deploy a multi-node Apache Cassandra cluster on a Kubernetes Engine cluster using the K8ssandra Operator.

Prerequisites

Before you begin, you need to:

  • Deploy a Kubernetes Engine cluster, such as a Vultr Kubernetes Engine (VKE) cluster.
  • Deploy a Linux management workstation with kubectl installed and configured to access the cluster.
  • Install the Helm package manager on the workstation.

Install Cassandra CLI (cqlsh)

The Cassandra CLI tool cqlsh is a Python-based command-line utility used to interact with Cassandra databases. Follow the steps below to install it.

  1. Update the server package index.

    console
    $ sudo apt update
    
  2. Install Python and Pip.

    console
    $ sudo apt install -y python3 python3-pip python3.12-venv
    
  3. View the installed Python version.

    console
    $ python3 --version
    

    Your output should be similar to the one below:

    Python 3.12.7
  4. View the installed Pip version.

    console
    $ pip --version
    

    Your output should be similar to the one below:

    pip 24.2 from /usr/lib/python3/dist-packages/pip (python 3.12)
  5. Create a Python virtual environment.

    console
    $ python3 -m venv cassandra
    
  6. Activate the Python virtual environment.

    console
    $ source cassandra/bin/activate
    
  7. Install the latest version of the cqlsh command-line interface.

    console
    $ pip install -U cqlsh
    
  8. View the installed cqlsh version.

    console
    $ cqlsh --version
    

    Your output should be similar to the one below:

    cqlsh 6.2.0
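
    Note
    cqlsh is installed inside the cassandra virtual environment. If you open a new terminal session later, reactivate the environment before running cqlsh:

    console
    $ source cassandra/bin/activate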

Install Cert-Manager

Cert-Manager is a Kubernetes operator that manages and issues TLS/SSL certificates within a cluster from trusted authorities such as Let's Encrypt. K8ssandra uses Cert-Manager to automate certificate management within a Cassandra cluster, including creating the Java keystores and truststores that the cluster requires from those certificates. Follow the steps in this section to install the Cert-Manager resources required by the K8ssandra Operator.

  1. Add the Jetstack Helm repository, which provides Cert-Manager, to your local Helm repositories.

    console
    $ helm repo add jetstack https://charts.jetstack.io
    
  2. Update the local Helm charts index.

    console
    $ helm repo update
    
  3. Install Cert-Manager to your VKE cluster.

    console
    $ helm install cert-manager jetstack/cert-manager \
      --namespace cert-manager \
      --create-namespace \
      --set crds.enabled=true
    
  4. When successful, verify that all Cert-Manager resources are available in the cluster.

    console
    $ kubectl get all -n cert-manager
    

    Your output should be similar to the one below:

    NAME                                           READY   STATUS    RESTARTS   AGE
    pod/cert-manager-cainjector-686546c9f7-m9gp7   1/1     Running   0          43s
    pod/cert-manager-d6746cf45-sjjs6               1/1     Running   0          43s
    ...
    NAME                              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)            AGE
    service/cert-manager              ClusterIP   10.110.17.176   <none>        9402/TCP           44s
    ...
    NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/cert-manager              1/1     1            1           43s
    ...
    NAME                                                 DESIRED   CURRENT   READY   AGE
    replicaset.apps/cert-manager-cainjector-686546c9f7   1         1         1       43s
    ...
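
    Optionally, confirm that Cert-Manager can issue certificates before you continue. The following is a minimal self-signed test; the resource names test-selfsigned and test-cert are placeholders used only for this check.

    console
    $ kubectl apply -f - <<EOF
    apiVersion: cert-manager.io/v1
    kind: Issuer
    metadata:
      name: test-selfsigned
      namespace: cert-manager
    spec:
      selfSigned: {}
    ---
    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
      name: test-cert
      namespace: cert-manager
    spec:
      secretName: test-cert-tls
      dnsNames:
        - test.example.internal
      issuerRef:
        name: test-selfsigned
    EOF
    

    When kubectl get certificate test-cert -n cert-manager reports READY as True, remove the test resources with kubectl delete certificate test-cert -n cert-manager and kubectl delete issuer test-selfsigned -n cert-manager.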

Install the K8ssandra Operator

To manage Apache Cassandra clusters on Kubernetes, install the K8ssandra Operator using Helm. Follow the steps below to install the operator.

  1. Add the K8ssandra operator repository to your Helm sources.

    console
    $ helm repo add k8ssandra https://helm.k8ssandra.io/stable
    
  2. Install the K8ssandra operator in your cluster.

    console
    $ helm install k8ssandra-operator k8ssandra/k8ssandra-operator \
      --namespace k8ssandra-operator \
      --create-namespace
    
  3. Wait a few minutes and view the cluster deployment to verify that the K8ssandra operator is available.

    console
    $ kubectl -n k8ssandra-operator get deployment
    

    Your output should look like the one below:

    NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
    k8ssandra-operator                 1/1     1            1           20s
    k8ssandra-operator-cass-operator   1/1     1            1           20s
  4. Verify that the K8ssandra operator pods are ready and running.

    console
    $ kubectl get pods -n k8ssandra-operator
    

    Your output should look like the one below:

    NAME                                                READY   STATUS    RESTARTS   AGE
    k8ssandra-operator-65b9c7c9c-km28b                  1/1     Running   0          46s
    k8ssandra-operator-cass-operator-54845bc4f6-hsqds   1/1     Running   0          46s
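
    You can also confirm that the operator registered its Custom Resource Definitions (CRDs); exact names can vary by operator version.

    console
    $ kubectl get crds | grep -E 'k8ssandra|cassandra|stargate'
    

    The output should include entries such as k8ssandraclusters.k8ssandra.io and cassandradatacenters.cassandra.datastax.com.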

Set Up a Multi-Node Apache Cassandra Cluster on Kubernetes Engine

Use the K8ssandra Operator to deploy a highly available Cassandra cluster on Kubernetes Engine. Follow the steps below to set up the Apache Cassandra cluster.

  1. Check available StorageClasses in your cluster.

    console
    $ kubectl get storageclass
    

    Your output should look like the one below:

    NAME                             PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
    vultr-block-storage (default)    block.csi.vultr.com   Delete          Immediate           true                   6m24s
    vultr-block-storage-hdd          block.csi.vultr.com   Delete          Immediate           true                   6m25s
    vultr-block-storage-hdd-retain   block.csi.vultr.com   Retain          Immediate           true                   6m25s
    ...
  2. Using a text editor such as nano, create a new manifest file cluster.yaml.

    console
    $ nano cluster.yaml
    
  3. Add the following contents to the file. Replace vultr-block-storage with the available StorageClass name in your cluster (as listed in the previous step).

    yaml
    apiVersion: k8ssandra.io/v1alpha1
    kind: K8ssandraCluster
    metadata:
      name: demo
    spec:
      cassandra:
        serverVersion: "4.0.1"
        datacenters:
          - metadata:
              name: dc1
            size: 3
            storageConfig:
              cassandraDataVolumeClaimSpec:
                storageClassName: vultr-block-storage
                accessModes:
                  - ReadWriteOnce
                resources:
                  requests:
                    storage: 10Gi
            config:
              jvmOptions:
                heapSize: 512M
            stargate:
              size: 1
              heapSize: 256M
    

    Save and close the file.

    The above manifest file defines the Cassandra cluster configuration with the following values:

    • Cassandra version: 4.0.1
    • Three Cassandra nodes in the dc1 datacenter (size: 3).
    • The vultr-block-storage storage class with a 10 GB persistent volume per node.
    • The Cassandra node JVM heap size is 512 MB.
    • The Stargate node JVM is allocated 256 MB heap.
    Note
    If you are deploying this on Vultr Kubernetes Engine (VKE), make sure to configure the Vultr CSI driver with your API key before deploying the cluster. This enables dynamic provisioning of block storage volumes. Refer to the Vultr CSI documentation for detailed setup instructions.
  4. Apply the deployment to your cluster.

    console
    $ kubectl apply -n k8ssandra-operator -f cluster.yaml
    
  5. Wait for the cluster to deploy, which can take 15 minutes or more, and watch the cluster pods.

    console
    $ kubectl get pods -n k8ssandra-operator --watch
    

    Verify that all pods are ready and running similar to the output below:

    NAME                                                    READY   STATUS    RESTARTS   AGE
    demo-dc1-default-stargate-deployment-64747477d7-hfck9   1/1     Running   0          78s
    demo-dc1-default-sts-0                                  2/2     Running   0          6m5s
    demo-dc1-default-sts-1                                  2/2     Running   0          6m5s
    demo-dc1-default-sts-2                                  2/2     Running   0          6m5s
    k8ssandra-operator-65b9c7c9c-km28b                      1/1     Running   0          17m
    k8ssandra-operator-cass-operator-54845bc4f6-hsqds       1/1     Running   0          17m

    When all Cassandra database pods are ready, the Stargate Pod creation is initiated. Stargate provides a data gateway with REST, GraphQL, and Document APIs in front of the Cassandra database. The name of the Stargate Pod should be similar to: demo-dc1-default-stargate-deployment-xxxxxxxxx-xxxxx.
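
    Optionally, you can test the Stargate APIs without exposing them publicly by port-forwarding the Stargate service and requesting an authentication token. The service name demo-dc1-stargate-service and the auth port 8081 below follow K8ssandra's naming and port conventions; confirm them in your cluster with kubectl get svc -n k8ssandra-operator before running the commands.

    console
    $ kubectl -n k8ssandra-operator port-forward svc/demo-dc1-stargate-service 8081:8081
    

    In a second terminal, request a token with the superuser credentials (retrieved in the Test the Apache Cassandra Cluster section below). Replace the placeholder values with your own:

    console
    $ curl -s -X POST http://127.0.0.1:8081/v1/auth \
        -H 'Content-Type: application/json' \
        -d '{"username": "<CASS_USERNAME>", "password": "<CASS_PASSWORD>"}'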

Verify the Linked Block Storage PVCs for Cassandra Cluster Persistence

The K8ssandra operator deploys the Apache Cassandra pods as a StatefulSet to ensure stable network identities and persistent storage. Each pod runs a Cassandra node and is backed by its own Persistent Volume Claim (PVC) for durable data storage. This section helps you verify that the PVCs are provisioned through the Block Storage service for your Kubernetes Engine cluster.

  1. Verify that the StatefulSet is available and ready in your cluster.

    console
    $ kubectl get statefulset -n k8ssandra-operator
    

    Your output should look like the one below:

    NAME                   READY   AGE
    demo-dc1-default-sts   3/3     7m14s

    The above output confirms that all three Cassandra nodes are initialized and the StatefulSet is active.

  2. Verify the available StorageClass.

    console
    $ kubectl get sc vultr-block-storage
    

    Your output should look like the one below:

    NAME                            PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
    vultr-block-storage (default)   block.csi.vultr.com   Delete          Immediate           true                   24m
  3. List all Persistent Volume Claims (PVCs) across namespaces to confirm they are bound.

    console
    $ kubectl get pvc --all-namespaces
    

    Your output should look like the one below:

    NAMESPACE            NAME                                 STATUS   VOLUME                 CAPACITY   ACCESS MODES   STORAGECLASS          VOLUMEATTRIBUTESCLASS   AGE
    k8ssandra-operator   server-data-demo-dc1-default-sts-0   Bound    pvc-a62852bae9d24dad   10Gi       RWO            vultr-block-storage   <unset>                 13m
    k8ssandra-operator   server-data-demo-dc1-default-sts-1   Bound    pvc-cfb279b19d0c4a55   10Gi       RWO            vultr-block-storage   <unset>                 13m
    k8ssandra-operator   server-data-demo-dc1-default-sts-2   Bound    pvc-b09e184f4d7741f6   10Gi       RWO            vultr-block-storage   <unset>                 13m
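
    To inspect an individual claim and confirm which CSI driver provisioned its volume, describe one of the PVCs:

    console
    $ kubectl -n k8ssandra-operator describe pvc server-data-demo-dc1-default-sts-0
    

    The StorageClass and Volume fields confirm the bound Block Storage volume, and the Events section shows the provisioning history.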

Create a Kubernetes Service To Access the Cassandra Cluster

To expose your Cassandra cluster externally and allow connections using the native Cassandra protocol (CQL) on port 9042, you must create a Kubernetes Service resource of type LoadBalancer. This will assign a public IP via Load Balancer integration.

  1. Create a new service resource file service.yaml.

    console
    $ nano service.yaml
    
  2. Add the following contents to the file.

    yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: cassandra
      labels:
        app: cassandra
    spec:
      type: LoadBalancer
      externalTrafficPolicy: Local
      ports:
        - port: 9042
          targetPort: 9042
      selector:
        app.kubernetes.io/name: cassandra
    

    Save and close the file.

    The above configuration defines a Kubernetes service of type LoadBalancer to access the Cassandra cluster on port 9042.

  3. Apply the service in the k8ssandra-operator namespace.

    console
    $ kubectl apply -n k8ssandra-operator -f service.yaml
    
  4. Wait a few minutes for the Load Balancer resource to deploy, then view the Cassandra service.

    console
    $ kubectl get svc/cassandra -n k8ssandra-operator
    

    Note the EXTERNAL-IP value in the output below; you use this IP address to access the cluster.

    NAME        TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
    cassandra   LoadBalancer   10.103.92.169   192.0.2.1     9042:32444/TCP   2m12s
    Note
    If the EXTERNAL-IP field shows <pending>, wait a few more minutes and re-run the command. After an IP is assigned, use this IP to connect to Cassandra with a CQL client such as cqlsh.
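
    As an alternative for testing, you can also reach the cluster without the public IP by port-forwarding the cassandra service to your workstation and pointing cqlsh at 127.0.0.1:

    console
    $ kubectl -n k8ssandra-operator port-forward svc/cassandra 9042:9042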

Test the Apache Cassandra Cluster

cqlsh is a command-line interface that allows users to connect to the Cassandra cluster. Follow the steps in this section to execute CQL (Cassandra Query Language) statements and perform database operations such as creating, modifying, and querying data.

  1. Export your Cassandra service Load Balancer IP to the CASS_IP variable.

    console
    $ CASS_IP=$(kubectl get svc cassandra -n k8ssandra-operator -o jsonpath="{.status.loadBalancer.ingress[*].ip}")
    
  2. View the assigned Cassandra IP.

    console
    $ echo $CASS_IP
    
  3. Export the cluster access username to the CASS_USERNAME variable.

    console
    $ CASS_USERNAME=$(kubectl get secret demo-superuser -n k8ssandra-operator -o=jsonpath='{.data.username}' | base64 --decode)
    
  4. View the Cassandra username.

    console
    $ echo $CASS_USERNAME
    
  5. Export the cluster password to the CASS_PASSWORD variable.

    console
    $ CASS_PASSWORD=$(kubectl get secret demo-superuser -n k8ssandra-operator -o=jsonpath='{.data.password}' | base64 --decode)
    
  6. View the Cassandra password.

    console
    $ echo $CASS_PASSWORD
    
  7. Using cqlsh, log in to the Cassandra cluster using your variable values.

    console
    $ cqlsh -u $CASS_USERNAME -p $CASS_PASSWORD $CASS_IP 9042
    
  8. Create a new keyspace demo in the Cassandra database. This example uses SimpleStrategy replication, which is suitable for testing; see the production note after these steps.

    sql
    demo-superuser@cqlsh> CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
    
  9. Create a new table users in the demo keyspace.

    sql
    demo-superuser@cqlsh> CREATE TABLE demo.users (id text primary key, name text, country text);
    
  10. Insert records into the users table.

    sql
    demo-superuser@cqlsh> BEGIN BATCH
                             INSERT INTO demo.users (id, name, country) VALUES ('42', 'John Doe', 'UK');
                             INSERT INTO demo.users (id, name, country) VALUES ('43', 'Joe Smith', 'US');
                          APPLY BATCH;
    
  11. Query the table data to view the stored values.

    sql
    demo-superuser@cqlsh> SELECT * FROM demo.users;
    

    Your output should be similar to the one below:

     id | country | name
    ----+---------+-----------
     43 |      US | Joe Smith
     42 |      UK |  John Doe
    
    (2 rows)
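
    The demo keyspace uses SimpleStrategy replication, which is suitable for testing only. In production, the datacenter-aware NetworkTopologyStrategy is generally recommended, and you can tune the consistency level per cqlsh session. The keyspace name demo_prod below is illustrative:

    sql
    demo-superuser@cqlsh> CREATE KEYSPACE demo_prod WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};
    demo-superuser@cqlsh> CONSISTENCY QUORUM
    demo-superuser@cqlsh> SELECT * FROM demo.users WHERE id = '42';
    

    With a replication factor of 3 and QUORUM consistency, each read or write must be acknowledged by at least two of the three replicas.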

Conclusion

You have deployed an Apache Cassandra cluster on a Kubernetes Engine environment using the open-source K8ssandra operator. You configured persistent data storage with Block Storage and accessed the Cassandra cluster using the cqlsh CLI. For more information on advanced configuration and usage, refer to the official Cassandra documentation.
