How to Deploy Apache Cassandra on Vultr Kubernetes Engine
Introduction
Apache Cassandra is an open-source distributed NoSQL database designed to handle large volumes of data across multiple commodity servers. Its distributed architecture avoids single points of failure and enables horizontal scalability. Cassandra excels at write-heavy workloads and offers high write and read throughput, making it ideal for data-intensive applications. It also provides tunable consistency, accommodating varying data consistency needs.
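Tunable consistency reduces to simple replica arithmetic: a QUORUM operation involves floor(RF/2) + 1 of the RF replicas, and any combination where the read and write replica counts satisfy R + W > RF forces a read to overlap the most recent write. A minimal Python illustration of that arithmetic (not a Cassandra API):

```python
# Illustration of Cassandra's tunable-consistency arithmetic (not a client library).

def quorum(rf: int) -> int:
    """Replicas required for a QUORUM operation at replication factor rf."""
    return rf // 2 + 1

def is_strongly_consistent(r: int, w: int, rf: int) -> bool:
    """Reads see the latest write whenever read and write replica sets must overlap."""
    return r + w > rf

rf = 3  # matches the replication_factor used later in this article
print(quorum(rf))                                          # 2
print(is_strongly_consistent(quorum(rf), quorum(rf), rf))  # True: QUORUM/QUORUM overlaps
print(is_strongly_consistent(1, 1, rf))                    # False: ONE/ONE may read stale data
```

Lowering the consistency level trades this overlap guarantee for lower latency and higher availability, which is what "tunable" means in practice.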
The K8ssandra project is a collection of components, including a Kubernetes operator that automates the management of Apache Cassandra clusters running in Kubernetes. In this article, you set up a multi-node Apache Cassandra cluster in a Vultr Kubernetes Engine (VKE) cluster using the K8ssandra Operator.
Prerequisites
Before you begin:
- Deploy a Vultr Kubernetes Engine (VKE) cluster with at least 4 nodes and 4 GB RAM per node
- Deploy a Ubuntu Server on Vultr to use as the development machine
- Access the server using SSH as a non-root user with sudo privileges
- Install and Configure Kubectl to access the cluster
- Using the Python `pip` package manager, install the Cassandra `cqlsh` CLI tool:

  ```console
  $ pip3 install -U cqlsh
  ```

- Install the Helm package manager:

  ```console
  $ snap install helm --classic
  ```
Install Cert-Manager
Cert-Manager is a Kubernetes operator that manages and issues TLS/SSL certificates within a cluster from trusted authorities such as Let's Encrypt. K8ssandra uses cert-manager to automate certificate management within a Cassandra cluster, including creating the Java keystores and truststores needed from the certificates. Follow the steps in this section to install the cert-manager resources required by the K8ssandra Operator.
Using Helm, add the Cert-Manager Helm repository to your local repositories.
```console
$ helm repo add jetstack https://charts.jetstack.io
```
Update the local Helm charts index.
```console
$ helm repo update
```
Install Cert-Manager to your VKE cluster.
```console
$ helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --set installCRDs=true
```
When successful, verify that all Cert-Manager resources are available in the cluster.

```console
$ kubectl get all -n cert-manager
```
Output:
```
NAME                                           READY   STATUS    RESTARTS   AGE
pod/cert-manager-5f68c9c6dd-stmp6              1/1     Running   0          35h
pod/cert-manager-cainjector-57d6fc9f7d-gwqr5   1/1     Running   0          35h
pod/cert-manager-webhook-5b7ffbdc98-sq8kg      1/1     Running   0          35h

NAME                           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/cert-manager           ClusterIP   10.102.38.47   <none>        9402/TCP   35h
service/cert-manager-webhook   ClusterIP   10.97.255.91   <none>        443/TCP    35h

NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cert-manager              1/1     1            1           35h
deployment.apps/cert-manager-cainjector   1/1     1            1           35h
deployment.apps/cert-manager-webhook      1/1     1            1           35h

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/cert-manager-5f68c9c6dd              1         1         1       35h
replicaset.apps/cert-manager-cainjector-57d6fc9f7d   1         1         1       35h
replicaset.apps/cert-manager-webhook-5b7ffbdc98      1         1         1       35h
```
Install the K8ssandra Operator
Add the K8ssandra operator repository to your Helm sources.
```console
$ helm repo add k8ssandra https://helm.k8ssandra.io/stable
```
Install the K8ssandra operator in your cluster.
```console
$ helm install k8ssandra-operator k8ssandra/k8ssandra-operator -n k8ssandra-operator --create-namespace
```
Wait for at least 3 minutes, then view the cluster deployments to verify that the K8ssandra operator is available.

```console
$ kubectl -n k8ssandra-operator get deployment
```
Your output should look like the one below:
```
NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
k8ssandra-operator                 0/1     1            1           10s
k8ssandra-operator-cass-operator   0/1     1            1           10s
```
Verify that the K8ssandra operator pods are ready and running.

```console
$ kubectl get pods -n k8ssandra-operator
```
Your output should look like the one below:
```
NAME                                               READY   STATUS              RESTARTS   AGE
k8ssandra-operator-765bcf99bf-7jfmj                0/1     ContainerCreating   0          11s
k8ssandra-operator-cass-operator-b9cc84556-hb6jv   0/1     Running             0          11s
```
Set Up a Multi-node Apache Cassandra Cluster on VKE
Using a text editor such as `nano`, create a new manifest file `cluster.yaml`.

```console
$ nano cluster.yaml
```
Add the following contents to the file.
```yaml
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo
spec:
  cassandra:
    serverVersion: "4.0.1"
    datacenters:
      - metadata:
          name: dc1
        size: 3
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: vultr-block-storage
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
        config:
          jvmOptions:
            heapSize: 512M
        stargate:
          size: 1
          heapSize: 256M
```
Save and close the file.
The above configuration file defines the Cassandra cluster with the following values:

- Cassandra version: `4.0.1`
- Three Cassandra nodes in the `dc1` datacenter.
- The `vultr-block-storage` storage class with a 10 Gi volume per PVC.
- A 512 MB JVM heap per Cassandra node.
- A single Stargate node with a 256 MB JVM heap.
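As a quick sanity check, the aggregate capacity this manifest requests follows directly from those values. A minimal Python sketch, with the sizes copied by hand from `cluster.yaml`:

```python
# Aggregate resource math for the cluster.yaml above (values copied by hand).
cassandra_nodes = 3       # datacenters[0].size
storage_per_pvc_gi = 10   # storage: 10Gi per PVC
cassandra_heap_mb = 512   # jvmOptions.heapSize
stargate_nodes = 1        # stargate.size
stargate_heap_mb = 256    # stargate.heapSize

total_storage_gi = cassandra_nodes * storage_per_pvc_gi
total_heap_mb = cassandra_nodes * cassandra_heap_mb + stargate_nodes * stargate_heap_mb

print(total_storage_gi)  # 30 Gi of Vultr Block Storage across three PVCs
print(total_heap_mb)     # 1792 MB of JVM heap across all pods
```

Note that the heap is only part of each pod's memory footprint; the JVM also uses off-heap memory, which is why the prerequisites call for at least 4 GB RAM per worker node.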
Apply the deployment to your cluster.

```console
$ kubectl apply -n k8ssandra-operator -f cluster.yaml
```
Wait for at least 15 minutes, then view the cluster pods.

```console
$ kubectl get pods -n k8ssandra-operator --watch
```
Verify that all pods are ready and running, similar to the output below:

```
NAME                                               READY   STATUS    RESTARTS   AGE
demo-dc1-default-sts-0                             0/2     Pending   0          3s
demo-dc1-default-sts-1                             0/2     Pending   0          3s
demo-dc1-default-sts-2                             0/2     Pending   0          3s
k8ssandra-operator-765bcf99bf-7jfmj                1/1     Running   0          6m21s
k8ssandra-operator-cass-operator-b9cc84556-hb6jv   1/1     Running   0          6m21s
```
When all Cassandra database pods are ready, the Stargate pod creation starts. Stargate provides a data gateway with REST, GraphQL, and Document APIs in front of the Cassandra database. The name of the Stargate pod should be similar to `demo-dc1-default-stargate-deployment-597b876d8f-559pt`.
Verify the Linked Vultr Block Storage PVCs For Cassandra Cluster Persistence
During cluster deployment, the K8ssandra operator creates a Kubernetes StatefulSet, and the StatefulSet controller in turn creates the Cassandra cluster pods. The StatefulSet is the key to data persistence: each Cassandra pod receives its own PersistentVolumeClaim (PVC) that survives pod restarts. Follow the steps in this section to verify the PVCs that back the Cassandra cluster's persistence.
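Because the cluster pods belong to a StatefulSet, their names are deterministic: the StatefulSet name plus an ordinal from `0` to `replicas - 1`. A small Python sketch reproducing the pod names you see with `kubectl get pods`:

```python
def statefulset_pod_names(sts_name: str, replicas: int) -> list[str]:
    """Kubernetes names StatefulSet pods <name>-0 .. <name>-<replicas-1>."""
    return [f"{sts_name}-{i}" for i in range(replicas)]

print(statefulset_pod_names("demo-dc1-default-sts", 3))
# ['demo-dc1-default-sts-0', 'demo-dc1-default-sts-1', 'demo-dc1-default-sts-2']
```

This stable identity is what lets each Cassandra node reattach to the same PVC, and therefore the same data, after a restart or reschedule.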
Verify that the StatefulSets are available and ready in your cluster.

```console
$ kubectl get statefulset -n k8ssandra-operator
```
Output:
```
NAME                   READY   AGE
demo-dc1-default-sts   3/3     15m
```
Verify the available storage class.

```console
$ kubectl get sc vultr-block-storage
```
Output:
```
NAME                  PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
vultr-block-storage   block.csi.vultr.com   Delete          Immediate           true                   36m
```
Vultr offers both HDD and NVMe block storage technologies. The Vultr Container Storage Interface (CSI) connects your VKE cluster to Vultr Block Storage and provisions the high-performance NVMe class; it is deployed automatically by the managed control plane in the VKE cluster. To verify the Vultr Block Storage volumes attached to your VKE cluster, view all cluster PVCs.
```console
$ kubectl get pvc --all-namespaces
```
In addition, open the Vultr Customer Portal and navigate to the Linked Resources tab in your VKE cluster control panel.
You can also verify the linked volumes by navigating to your Vultr Account Block Storage Page.
Create a Kubernetes Service to Access the Cassandra Cluster
Create a new service resource file `service.yaml`.

```console
$ nano service.yaml
```
Add the following contents to the file.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
    - port: 9042
  selector:
    statefulset.kubernetes.io/pod-name: demo-dc1-default-sts-1
```
Save and close the file.
The above configuration defines a Kubernetes service of the `LoadBalancer` type that exposes the Cassandra cluster on port `9042`.

Apply the service to your cluster.
```console
$ kubectl apply -n k8ssandra-operator -f service.yaml
```
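The service's selector behaves like any Kubernetes label selector: traffic goes only to pods whose labels include every selector key/value pair. Kubernetes adds a `statefulset.kubernetes.io/pod-name` label to each StatefulSet pod, so this service pins traffic to the single pod `demo-dc1-default-sts-1`. A sketch of the matching logic in Python:

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """A pod matches when every selector key/value pair appears in its labels."""
    return all(pod_labels.get(k) == v for k, v in selector.items())

# The selector from service.yaml above
selector = {"statefulset.kubernetes.io/pod-name": "demo-dc1-default-sts-1"}

# Simplified labels for the three Cassandra pods
pods = {
    f"demo-dc1-default-sts-{i}": {
        "statefulset.kubernetes.io/pod-name": f"demo-dc1-default-sts-{i}",
        "app": "cassandra",
    }
    for i in range(3)
}

matched = [name for name, labels in pods.items() if selector_matches(selector, labels)]
print(matched)  # ['demo-dc1-default-sts-1']
```

Pinning to one pod keeps this demo simple, and any Cassandra node can coordinate queries for the whole cluster; a production setup would typically select all Cassandra pods so traffic is spread across nodes.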
Wait for at least 5 minutes for the cluster Load Balancer resource to deploy, then view the Cassandra service.

```console
$ kubectl get svc/cassandra -n k8ssandra-operator
```
Verify the IP address in the EXTERNAL-IP column to use for accessing the cluster. A `<pending>` value means the Load Balancer is still provisioning; run the command again after a few minutes.

```
NAME        TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
cassandra   LoadBalancer   10.110.178.86   <pending>     9042:32313/TCP   3s
```
Test the Apache Cassandra Cluster
`cqlsh` is a command-line interface for connecting to a Cassandra cluster. Follow the steps in this section to execute CQL (Cassandra Query Language) statements and perform database operations such as creating, modifying, and querying data.
Export your Cassandra service Load Balancer IP to the `CASS_IP` variable.

```console
$ CASS_IP=$(kubectl get svc cassandra -n k8ssandra-operator -o jsonpath="{.status.loadBalancer.ingress[*].ip}")
```
View the variable value.
```console
$ echo $CASS_IP
```
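The `jsonpath` expression walks the service object's `status.loadBalancer.ingress` list and prints each entry's `ip`. The equivalent extraction in plain Python, against a hypothetical abbreviated service object of the shape `kubectl get svc cassandra -o json` returns (the `192.0.2.10` address is a placeholder):

```python
import json

# Hypothetical, abbreviated service object as kubectl would return it with -o json.
svc_json = '''
{
  "kind": "Service",
  "metadata": {"name": "cassandra"},
  "status": {"loadBalancer": {"ingress": [{"ip": "192.0.2.10"}]}}
}
'''

svc = json.loads(svc_json)
# Equivalent of jsonpath {.status.loadBalancer.ingress[*].ip}
ips = [entry["ip"] for entry in svc["status"]["loadBalancer"]["ingress"]]
print(" ".join(ips))  # 192.0.2.10
```

Until the Load Balancer is provisioned, the `ingress` list is empty and the expression yields nothing, which is why `echo $CASS_IP` can print an empty line at first.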
Export the cluster access username to the `CASS_USERNAME` variable.

```console
$ CASS_USERNAME=$(kubectl get secret demo-superuser -n k8ssandra-operator -o=jsonpath='{.data.username}' | base64 --decode)
```
View the username value.
```console
$ echo $CASS_USERNAME
```
Export the cluster password to the `CASS_PASSWORD` variable.

```console
$ CASS_PASSWORD=$(kubectl get secret demo-superuser -n k8ssandra-operator -o=jsonpath='{.data.password}' | base64 --decode)
```
View the password value.
```console
$ echo $CASS_PASSWORD
```
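The `base64 --decode` step is needed because Kubernetes stores the values in a Secret's `data` field base64-encoded. The round trip, shown with Python's standard library and a made-up password (the real one lives in the `demo-superuser` secret):

```python
import base64

# A made-up password for illustration only.
plaintext = "s3cr3t-pass"

# What `kubectl get secret ... -o jsonpath='{.data.password}'` would show
encoded = base64.b64encode(plaintext.encode()).decode()
print(encoded)

# What `base64 --decode` recovers
decoded = base64.b64decode(encoded).decode()
print(decoded)  # s3cr3t-pass
```

Note that base64 is an encoding, not encryption; anyone with read access to the secret can recover the password, so restrict access to the `k8ssandra-operator` namespace accordingly.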
Using `cqlsh`, log in to the Cassandra cluster using your variable values.

```console
$ cqlsh -u $CASS_USERNAME -p $CASS_PASSWORD $CASS_IP 9042
```
Create a new keyspace `demo` in the Cassandra database.

```sql
> CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
```
Create a new table `users` in the `demo` keyspace.

```sql
> CREATE TABLE demo.users (id text primary key, name text, country text);
```
Add data to the `users` table.

```sql
> INSERT INTO demo.users (id, name, country) values ('42', 'John Doe', 'UK');
> INSERT INTO demo.users (id, name, country) values ('43', 'Joe Smith', 'US');
```
Query the table data to view the stored values.
```sql
> SELECT * FROM demo.users;
```
Output:
```
 id | country | name
----+---------+-----------
 43 | US      | Joe Smith
 42 | UK      | John Doe

(2 rows)
```
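Note the row order: `43` comes back before `42` because Cassandra returns partitions in token order, the order of the hashed partition key, not in insertion or key order. A rough illustration of the idea, using MD5 purely as a stand-in hash (Cassandra's default Murmur3Partitioner behaves analogously but produces different tokens):

```python
import hashlib

def token(partition_key: str) -> int:
    """Stand-in token function; real Cassandra uses Murmur3, not MD5."""
    return int.from_bytes(hashlib.md5(partition_key.encode()).digest()[:8], "big")

ids = ["42", "43"]
# Rows come back ordered by token, which generally differs from key order.
print(sorted(ids, key=token))
```

Within a single partition, clustering columns control row order; across partitions, ordering follows tokens, which is why unfiltered `SELECT` results look shuffled.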
Conclusion
You have deployed Apache Cassandra in a Vultr Kubernetes Engine (VKE) cluster using the open-source K8ssandra Operator. In addition, you set up Vultr Block Storage for data persistence and accessed the Cassandra cluster using the `cqlsh` CLI tool. For more information and Cassandra operations, visit the official Cassandra documentation.