How to Deploy Apache Cassandra on Vultr Kubernetes Engine

Updated on July 25, 2024

Introduction

Apache Cassandra is an open-source distributed NoSQL database designed to handle large volumes of data across multiple commodity servers. Its distributed architecture avoids single points of failure and enables horizontal scalability. Cassandra excels at write-heavy workloads and offers high write and read throughput, making it ideal for data-intensive applications. It also provides tunable consistency, accommodating varying data consistency needs.

The K8ssandra project is a collection of components, including a Kubernetes operator, that automates the management of Apache Cassandra clusters running in Kubernetes. In this article, you set up a multi-node Apache Cassandra cluster in a Vultr Kubernetes Engine (VKE) cluster using the K8ssandra Operator.

Prerequisites

Before you begin:

Install Cert-Manager

Cert-Manager is a Kubernetes operator that manages and issues TLS/SSL certificates within a cluster from trusted authorities such as Let's Encrypt. K8ssandra uses cert-manager to automate certificate management within a Cassandra cluster, including creating the Java keystores and truststores needed from those certificates. Follow the steps in this section to install the cert-manager resources required by the K8ssandra Operator.

  1. Using Helm, add the Cert-Manager Helm repository to your local repositories.

    console
    $ helm repo add jetstack https://charts.jetstack.io
    
  2. Update the local Helm charts index.

    console
    $ helm repo update
    
  3. Install Cert-Manager to your VKE cluster.

    console
    $ helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --set installCRDs=true
    
  4. When the installation completes, verify that all Cert-Manager resources are available in the cluster.

    console
    $ kubectl get all -n cert-manager
    

    Output:

    NAME                                           READY   STATUS    RESTARTS   AGE
    pod/cert-manager-5f68c9c6dd-stmp6              1/1     Running   0          35h
    pod/cert-manager-cainjector-57d6fc9f7d-gwqr5   1/1     Running   0          35h
    pod/cert-manager-webhook-5b7ffbdc98-sq8kg      1/1     Running   0          35h
    
    NAME                           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
    service/cert-manager           ClusterIP   10.102.38.47   <none>        9402/TCP   35h
    service/cert-manager-webhook   ClusterIP   10.97.255.91   <none>        443/TCP    35h
    
    NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/cert-manager              1/1     1            1           35h
    deployment.apps/cert-manager-cainjector   1/1     1            1           35h
    deployment.apps/cert-manager-webhook      1/1     1            1           35h
    
    NAME                                                 DESIRED   CURRENT   READY   AGE
    replicaset.apps/cert-manager-5f68c9c6dd              1         1         1       35h
    replicaset.apps/cert-manager-cainjector-57d6fc9f7d   1         1         1       35h
    replicaset.apps/cert-manager-webhook-5b7ffbdc98      1         1         1       35h
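
    Optionally, confirm that the cert-manager Custom Resource Definitions (CRDs) required by the K8ssandra Operator are registered in the cluster. This is a minimal check, assuming the default installation with installCRDs=true shown above.

    console
    $ kubectl get crds | grep cert-manager.io
    

    The output should list CRDs such as certificates.cert-manager.io and issuers.cert-manager.io.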

Install the K8ssandra Operator

  1. Add the K8ssandra operator repository to your Helm sources.

    console
    $ helm repo add k8ssandra https://helm.k8ssandra.io/stable
    
  2. Install the K8ssandra operator in your cluster.

    console
    $ helm install k8ssandra-operator k8ssandra/k8ssandra-operator -n k8ssandra-operator --create-namespace
    
  3. Wait for at least 3 minutes and view the cluster deployments to verify that the K8ssandra operator is available. Alternatively, wait on the rollouts directly, as shown in the sketch after this list.

    console
    $ kubectl -n k8ssandra-operator get deployment
    

    Your output should look like the one below:

    NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
    k8ssandra-operator                 0/1     1            1           10s
    k8ssandra-operator-cass-operator   0/1     1            1           10s
  4. View the K8ssandra operator pods and verify that they progress to a ready and running state.

    console
    $ kubectl get pods -n k8ssandra-operator
    

    Your output should look like the one below:

    NAME                                               READY   STATUS              RESTARTS   AGE
    k8ssandra-operator-765bcf99bf-7jfmj                0/1     ContainerCreating   0          11s
    k8ssandra-operator-cass-operator-b9cc84556-hb6jv   0/1     Running             0          11s
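
    Instead of waiting a fixed amount of time, you can block until the operator deployments finish rolling out. The commands below are a minimal sketch, assuming the default deployment names shown above and a 5-minute timeout.

    console
    $ kubectl -n k8ssandra-operator rollout status deployment/k8ssandra-operator --timeout=300s
    $ kubectl -n k8ssandra-operator rollout status deployment/k8ssandra-operator-cass-operator --timeout=300s
    

    Each command returns when its deployment reports a successful rollout.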

Set Up a Multi-node Apache Cassandra Cluster on VKE

  1. Using a text editor such as nano, create a new manifest file cluster.yaml.

    console
    $ nano cluster.yaml
    
  2. Add the following contents to the file.

    yaml
    apiVersion: k8ssandra.io/v1alpha1
    kind: K8ssandraCluster
    metadata:
      name: demo
    spec:
      cassandra:
        serverVersion: "4.0.1"
        datacenters:
          - metadata:
              name: dc1
            size: 3
            storageConfig:
              cassandraDataVolumeClaimSpec:
                storageClassName: vultr-block-storage
                accessModes:
                  - ReadWriteOnce
                resources:
                  requests:
                    storage: 10Gi
            config:
              jvmOptions:
                heapSize: 512M
            stargate:
              size: 1
              heapSize: 256M
    

    Save and close the file.

    The above configuration file defines the Cassandra cluster configuration with the following values:

    • Cassandra version: 4.0.1
    • Three Cassandra nodes in the dc1 datacenter.
    • The vultr-block-storage storage class with a 10 GB volume size per PVC.
    • The Cassandra node JVM heap size is 512 MB.
    • The Stargate node JVM is allocated a 256 MB heap.
  3. Apply the manifest to your cluster.

    console
    $ kubectl apply -n k8ssandra-operator -f cluster.yaml
    
  4. Wait for at least 15 minutes and view the cluster pods.

    console
    $ kubectl get pods -n k8ssandra-operator --watch
    

    Watch the pods until all of them are ready and running. Shortly after you apply the manifest, the output is similar to the one below:

    NAME                                               READY   STATUS    RESTARTS   AGE
    demo-dc1-default-sts-0                             0/2     Pending   0          3s
    demo-dc1-default-sts-1                             0/2     Pending   0          3s
    demo-dc1-default-sts-2                             0/2     Pending   0          3s
    k8ssandra-operator-765bcf99bf-7jfmj                1/1     Running   0          6m21s
    k8ssandra-operator-cass-operator-b9cc84556-hb6jv   1/1     Running   0          6m21s

    When all Cassandra database pods are ready, the Stargate Pod creation is initiated. Stargate provides a data gateway with REST, GraphQL, and Document APIs in front of the Cassandra database. The name of the Stargate Pod should be similar to: demo-dc1-default-stargate-deployment-597b876d8f-559pt.
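
    To follow the deployment progress in more detail, you can also inspect the custom resources the operator manages. The commands below are a sketch, assuming the demo cluster name and dc1 datacenter defined in cluster.yaml.

    console
    $ kubectl get k8ssandracluster demo -n k8ssandra-operator
    $ kubectl describe cassandradatacenter dc1 -n k8ssandra-operator
    

    The status conditions in the output indicate when the datacenter is fully ready.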

Verify the Linked Vultr Block Storage PVCs For Cassandra Cluster Persistence

The K8ssandra operator creates the Cassandra cluster pods through a Kubernetes StatefulSet during cluster deployment. The StatefulSet requests a PersistentVolumeClaim (PVC) for each Cassandra pod, which is the key to data persistence for every node in the cluster. Follow the steps in this section to verify the StatefulSet and the Vultr Block Storage PVCs that back the Cassandra cluster.

  1. Verify that the StatefulSet is available and ready in your cluster.

    console
    $ kubectl get statefulset -n k8ssandra-operator
    

    Output:

    NAME                   READY   AGE
    demo-dc1-default-sts   3/3     15m
  2. Verify the available Storage class.

    console
    $ kubectl get sc vultr-block-storage
    

    Output:

    NAME                  PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
    vultr-block-storage   block.csi.vultr.com   Delete          Immediate           true                   36m

    As displayed in the above output, the vultr-block-storage storage class is provisioned by the Vultr Container Storage Interface (CSI), which connects your VKE cluster to Vultr Block Storage and deploys the high-performance NVMe storage class. Vultr offers both HDD and NVMe block storage technologies, and the CSI driver is automatically deployed by the managed control plane in the VKE cluster. You can also inspect the underlying PersistentVolumes, as shown in the sketch after this list.

  3. To verify the deployed Vultr Block Storage volumes attached to your VKE cluster, view all cluster PVCs.

    console
    $ kubectl get pvc --all-namespaces
    

    In addition, open the Vultr Customer Portal and navigate to the Linked Resources tab in your VKE cluster control panel:

    View VKE Cluster Linked Resources

    You can also verify the linked volumes by navigating to your Vultr Account Block Storage Page.

    View Vultr Block Storage Volumes
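
    Optionally, view the PersistentVolumes that the Vultr CSI driver provisioned to satisfy the claims above. This is a minimal check; the volume names in your cluster differ.

    console
    $ kubectl get pv
    

    Each listed volume uses the vultr-block-storage storage class and is bound to one of the Cassandra PVCs.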

Create a Kubernetes Service to Access the Cassandra Cluster

  1. Create a new service resource file service.yaml.

    console
    $ nano service.yaml
    
  2. Add the following contents to the file.

    yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: cassandra
      labels:
        app: cassandra
    spec:
      type: LoadBalancer
      externalTrafficPolicy: Local
      ports:
        - port: 9042
      selector:
        statefulset.kubernetes.io/pod-name: demo-dc1-default-sts-1
    

    Save and close the file.

    The above configuration defines a Kubernetes Service of type LoadBalancer that exposes port 9042, the Cassandra CQL native transport port, and routes traffic to the demo-dc1-default-sts-1 pod.

  3. Apply the service to your cluster.

    console
    $ kubectl apply -n k8ssandra-operator -f service.yaml
    
  4. Wait for at least 5 minutes for the cluster Load Balancer resource to deploy, then view the Cassandra service.

    console
    $ kubectl get svc/cassandra -n k8ssandra-operator
    

    Note the IP address in the EXTERNAL-IP column to use for accessing the cluster. If the value is still <pending>, as in the output below, wait for the Load Balancer to finish provisioning and run the command again, or watch for the assignment as shown in the sketch after this list.

    NAME        TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
    cassandra   LoadBalancer   10.110.178.86   <pending>     9042:32313/TCP   3s
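
    To wait for the external IP assignment instead of re-running the command manually, you can watch the service. This is a sketch using the same service name and namespace as above; press Ctrl+C to stop watching once the EXTERNAL-IP column shows an address.

    console
    $ kubectl get svc/cassandra -n k8ssandra-operator --watch
    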

Test the Apache Cassandra Cluster

cqlsh is a command-line interface that allows users to connect to a Cassandra cluster. Follow the steps in this section to execute CQL (Cassandra Query Language) statements and perform database operations such as creating, modifying, and querying data.
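
If cqlsh is not installed on your workstation, one option is to run it from inside one of the Cassandra pods instead. The command below is a sketch that assumes the pod naming from the earlier output, that the main container is named cassandra, and that you substitute the superuser credentials retrieved in the steps that follow.

    console
    $ kubectl exec -it demo-dc1-default-sts-0 -n k8ssandra-operator -c cassandra -- cqlsh -u <username> -p <password>
    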

  1. Export your Cassandra service Load Balancer IP to the CASS_IP variable.

    console
    $ CASS_IP=$(kubectl get svc cassandra -n k8ssandra-operator -o jsonpath="{.status.loadBalancer.ingress[*].ip}")
    

    View the variable value.

    console
    $ echo $CASS_IP
    
  2. Export the cluster access username to the CASS_USERNAME variable.

    console
    $ CASS_USERNAME=$(kubectl get secret demo-superuser -n k8ssandra-operator -o=jsonpath='{.data.username}' | base64 --decode)
    

    View the username value.

    console
    $ echo $CASS_USERNAME
    
  3. Export the cluster password to the CASS_PASSWORD variable.

    console
    $ CASS_PASSWORD=$(kubectl get secret demo-superuser -n k8ssandra-operator -o=jsonpath='{.data.password}' | base64 --decode)
    

    View the password value.

    console
    $ echo $CASS_PASSWORD
    
  4. Using cqlsh, log in to the Cassandra cluster using your variable values.

    console
    $ cqlsh -u $CASS_USERNAME -p $CASS_PASSWORD $CASS_IP 9042
    
  5. Create a new keyspace demo in the Cassandra database.

    sql
    > CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
    
  6. Create a new table users in the demo keyspace.

    sql
    > CREATE TABLE demo.users (id text primary key, name text, country text);
    
  7. Add data to the users table.

    sql
    > INSERT INTO demo.users (id, name, country) values ('42', 'John Doe', 'UK');
    > INSERT INTO demo.users (id, name, country) values ('43', 'Joe Smith', 'US');
    
  8. Query the table data to view the stored values.

    sql
    > SELECT * FROM demo.users;
    

    Output:

    id | country | name
    ----+---------+-----------
    43 |      US | Joe Smith
    42 |      UK | John Doe
    
    (2 rows)
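
    After you exit the cqlsh prompt, you can additionally confirm that all three Cassandra nodes joined the ring using nodetool from inside a pod. The command below is a sketch that reuses the exported credential variables and assumes nodetool requires the same superuser authentication in this deployment.

    console
    $ kubectl exec -it demo-dc1-default-sts-0 -n k8ssandra-operator -c cassandra -- nodetool -u $CASS_USERNAME -pw $CASS_PASSWORD status
    

    Each node should report a UN (Up/Normal) status.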

Conclusion

You have deployed Apache Cassandra in a Vultr Kubernetes Engine (VKE) cluster using the open-source K8ssandra Operator. In addition, you set up Vultr Block Storage for data persistence and accessed the Cassandra cluster using the cqlsh CLI tool. For more information and Cassandra operations, visit the official documentation.

More Information