How to Deploy a Multi-Node Apache Cassandra Database Cluster on Ubuntu 22.04

Introduction

A multi-node Apache Cassandra database cluster contains two or more database nodes (servers), optionally spread across different cloud locations. This setup prevents a Single Point of Failure (SPOF) when some nodes in the cluster fail. For instance, consider a scenario where you're running a cluster with three nodes. If one node fails due to network, power, or software problems, the rest of the nodes in the cluster can continue accepting read and write requests from the connected clients.

In a Cassandra multi-node setup, there is no leader election, and all participating nodes perform the same functions. Depending on your target level of data consistency, you must declare a replication strategy when creating a keyspace in the cluster. The strategy determines how Cassandra distributes replicas across participating nodes and how many replicas it keeps for the keyspace.

In this guide, deploy a multi-node Apache Cassandra database cluster on three Ubuntu 22.04 servers, and test communication between the cluster nodes by adding sample data to a shared keyspace.

Prerequisites

Before you begin, in a single Vultr location:

Deploy three Ubuntu 22.04 servers with at least 16 GB RAM for production.
Add the servers to the same Vultr Virtual Private Cloud (VPC).

A VPC is a logical VLAN connection between Vultr servers that allows Apache Cassandra nodes to communicate in real time. Enable VPC peering to configure servers deployed in a different location to sync with the Cassandra cluster.

In this article, the following example IP addresses represent the respective server VPC addresses.
- Server 1: 192.0.2.3
- Server 2: 192.0.2.4
- Server 3: 192.0.2.5
To view your server VPC address, run the following command in your SSH session.
```
  $ ip a
```

On each server:

Use SSH to access the server in a separate terminal session.
Create a non-root sudo user.
Switch to the sudo user account.
```
  # su example-user
```
Install the Apache Cassandra database server.

Configure Server Identity

In this section, configure the hostname of every server for fast identification in the Apache Cassandra cluster. For this article, set the servers' hostnames as below:

Server 1: node-1
Server 2: node-2
Server 3: node-3

On each of the servers, set a new hostname identity as described in the steps below:

Using a text editor such as Nano, edit the /etc/hostname file.
```
 $ sudo nano /etc/hostname
```
Replace the default hostname (localhost) with the respective hostname. For example, on Server 1:
```
 node-1
```
Save and close the file.
Open the /etc/hosts file.
```
 $ sudo nano /etc/hosts
```
Add the following line at the end of the file. Replace node-1 with the hostname of the server you are editing.
```
 127.0.0.1 node-1
```
Save and close the file.
Reboot the server to apply changes.
```
 $ sudo reboot
```

Set Up Firewall Rules

To let the nodes communicate and exchange gossip messages, set up firewall rules that accept Cassandra requests from the respective VPC server addresses. Cassandra uses port 7000 for inter-node communication and port 9042 for CQL client connections. By default, Uncomplicated Firewall (UFW) is active on Vultr Ubuntu servers. Set up UFW rules on each server as described below.

Server 1:

Allow connections to port 7000 and 9042 from Server 2 to Server 1.

 $ sudo ufw allow from 192.0.2.4 to 192.0.2.3 proto tcp port 7000,9042

Allow connections from Server 3 to Server 1.

 $ sudo ufw allow from 192.0.2.5 to 192.0.2.3 proto tcp port 7000,9042

Reload firewall rules to save changes.
```
 $ sudo ufw reload
```

Server 2:

Allow connections to port 7000 and 9042 from Server 1 to Server 2.

 $ sudo ufw allow from 192.0.2.3 to 192.0.2.4 proto tcp port 7000,9042

Allow connections from Server 3 to Server 2.

 $ sudo ufw allow from 192.0.2.5 to 192.0.2.4 proto tcp port 7000,9042

Reload firewall rules.
```
 $ sudo ufw reload
```

Server 3:

Allow connections to port 7000 and 9042 from Server 1 to Server 3.

 $ sudo ufw allow from 192.0.2.3 to 192.0.2.5 proto tcp port 7000,9042

Allow connections from Server 2 to Server 3.

 $ sudo ufw allow from 192.0.2.4 to 192.0.2.5 proto tcp port 7000,9042

Reload firewall rules to apply changes.
```
 $ sudo ufw reload
```
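The per-server rules above follow one pattern: each node allows TCP ports 7000 and 9042 from every other node's VPC address. As a minimal sketch, the hypothetical helper below (print_ufw_rules is an illustrative name, not part of UFW) prints the commands for one node so that you can review them before running them. It only echoes the commands and doesn't change the firewall.

```shell
#!/bin/sh
# Print (not run) the ufw commands that one node needs so that every peer
# can reach its inter-node port (7000) and CQL client port (9042).
# print_ufw_rules is a hypothetical helper name used for illustration.
# Usage: print_ufw_rules <this-node-ip> <peer-ip>...
print_ufw_rules() {
    self="$1"
    shift
    for peer in "$@"; do
        echo "sudo ufw allow from $peer to $self proto tcp port 7000,9042"
    done
    echo "sudo ufw reload"
}

# Example for Server 1, using the sample addresses from this article.
print_ufw_rules 192.0.2.3 192.0.2.4 192.0.2.5
```

To generate the rules for Server 2 or Server 3, pass that server's address first, followed by the addresses of the other two nodes.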

Configure the Apache Cassandra Cluster

In this section, set up the Apache Cassandra cluster information on each server to allow all nodes to communicate together. To set up the cluster information, make changes to the main configuration file /etc/cassandra/cassandra.yaml on each server as described below.

Stop the Apache Cassandra service.
```
 $ sudo systemctl stop cassandra
```
Clear existing Cassandra data directories to refresh the node state.
```
 $ sudo rm -rf /var/lib/cassandra/*
```
Open the /etc/cassandra/cassandra.yaml file.
```
 $ sudo nano /etc/cassandra/cassandra.yaml
```
The Cassandra configuration file contains many directives. When using the Nano text editor, press Ctrl + W to search for a directive by name.
Find cluster_name:, and change the value from Test Cluster to AppDbCluster.
```
 ...
 cluster_name: 'AppDbCluster'
 ...
```
The above directive sets the cluster name to AppDbCluster. Verify that all servers in the cluster share the same name for a successful connection.
Find the seed_provider: directive, and enter the addresses of the seed nodes in the format <ip>,<ip> as below.
```
 seed_provider:
       ...
       # Ex: "<ip1>,<ip2>,<ip3>"
       - seeds: "192.0.2.3:7000, 192.0.2.4:7000"
```
The above directive sets the seed nodes that a starting node contacts to discover the cluster. Seed nodes are regular cluster nodes that serve as initial contact points for the gossip protocol. To keep the cluster discoverable when one seed is down, set at least two seed addresses.
Find the listen_address: directive, and set the value to your node's VPC Address.
```
 listen_address: 192.0.2.3
```
The above directive sets the IP address used to communicate with other nodes in the cluster. Cassandra nodes use the inter-nodal gossip protocol that implements a peer-to-peer architecture to exchange state information in the cluster.
Find rpc_address:, and set it to your node's VPC address.
```
 rpc_address: 192.0.2.3
```
The above directive sets the client connection address. Cassandra doesn't accept client connections on any other interface or address.
At the end of the file, add the auto_bootstrap: directive, and set it to false.
```
 auto_bootstrap: false
```
The above directive allows a node to join the Cassandra cluster without streaming data from other nodes, which suits a new, empty cluster. When adding a node to an existing cluster that already holds data, set the value to true.

Save and close the Cassandra configuration file.
Start the Cassandra database server.
```
 $ sudo systemctl start cassandra
```
The Cassandra cluster may take up to 30 seconds to start.

Verify that the Cassandra server is running.

 $ sudo systemctl status cassandra

Output:

 cassandra.service - LSB: distributed storage system for structured data
      Loaded: loaded (/etc/init.d/cassandra; generated)
      Active: active (running) since Tue 2023-08-01 20:33:27 UTC; 2min 19s ago

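The manual edits to cassandra.yaml in this section can also be scripted. The following is a minimal sketch, not a definitive implementation. The function name configure_node is hypothetical, and the sketch assumes that the file follows the default layout in which cluster_name:, listen_address:, and rpc_address: start at the beginning of a line and the seed list sits on a `- seeds:` line. It rewrites the file you pass to it, so run it on a copy first.

```shell
#!/bin/sh
# Apply the cluster settings from this section to a cassandra.yaml file.
# configure_node is a hypothetical helper; it assumes the default file layout
# (top-level directives at column 0, seeds on a "- seeds:" line).
# Usage: configure_node <yaml-file> <node-ip> <seed-list>
configure_node() (
    file="$1"
    node_ip="$2"
    seeds="$3"

    # Rewrite the four directives; commented-out lines are left untouched
    # because every pattern is anchored to the start of the line.
    sed \
        -e "s/^cluster_name:.*/cluster_name: 'AppDbCluster'/" \
        -e "s/^\( *- seeds:\).*/\1 \"$seeds\"/" \
        -e "s/^listen_address:.*/listen_address: $node_ip/" \
        -e "s/^rpc_address:.*/rpc_address: $node_ip/" \
        "$file" > "$file.new" || exit 1
    mv "$file.new" "$file" || exit 1

    # Append auto_bootstrap only if the file does not set it already.
    grep -q '^auto_bootstrap:' "$file" || echo 'auto_bootstrap: false' >> "$file"
)

# Intended use on Server 1 (work on a copy, then review and install it):
#   cp /etc/cassandra/cassandra.yaml /tmp/cassandra.yaml
#   configure_node /tmp/cassandra.yaml 192.0.2.3 "192.0.2.3:7000, 192.0.2.4:7000"
#   sudo cp /tmp/cassandra.yaml /etc/cassandra/cassandra.yaml
```

On Server 2 and Server 3, pass that server's own VPC address as the second argument and keep the same seed list.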
Manage the Apache Cassandra Cluster

nodetool is a command-line utility that ships with the Apache Cassandra database server to view and manage cluster information. On each of the servers, use the tool to verify the cluster status.

Verify the database cluster status.

 $ sudo nodetool status

The above command displays the IP addresses, state, load, and node IDs in the cluster as below.

 Datacenter: datacenter1
 =======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address      Load        Tokens  Owns (effective)  Host ID                               Rack
 UN  192.0.2.3   104.34 KiB  16      64.7%             e29827ec-88b7-460c-98b4-88af145ec6ef  rack1
 UN  192.0.2.4   70.21 KiB   16      76.0%             a8742739-2439-4716-a6d9-4689a8fc7ad1  rack1
 UN  192.0.2.5   70.21 KiB   16      59.3%             f37874ad-28ac-49ec-b90a-322fa3b3d26c  rack1
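The two-letter code in the first column of the status output combines the node status (U for up, D for down) with its state (N for normal, and so on). As a minimal sketch, the hypothetical helpers below (count_up_nodes and list_down_nodes are illustrative names, not nodetool features) read nodetool status output from standard input and summarize it, which can be handy in a health check.

```shell
#!/bin/sh
# Helpers that summarize `nodetool status` output read from standard input.
# Both function names are hypothetical and used only for illustration.

# Count nodes in the UN (Up/Normal) state.
count_up_nodes() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Print the address of every node whose status is Down (DN, DL, DJ, DM).
list_down_nodes() {
    awk '$1 ~ /^D[NLJM]$/ { print $2 }'
}

# Intended use on a cluster node (requires nodetool):
#   sudo nodetool status | count_up_nodes
#   sudo nodetool status | list_down_nodes
```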

On node-3, run the following command to view the IP list of remote seed nodes.
```
 $ sudo nodetool getseeds
```
Output:
```
 Current list of seed node IPs, excluding the current node's IP: /192.0.2.3:7000 /192.0.2.4:7000
```
As displayed in the above output, the active node's IP address isn't listed if the node is part of the seeds.

On node-1, log in to the Cassandra cluster.

 $ cqlsh 192.0.2.3 9042 -u cassandra -p cassandra

Output:

 Connected to AppDbCluster at 192.0.2.3:9042
 [cqlsh 6.1.0 | Cassandra 4.1.3 | CQL spec 3.4.6 | Native protocol v5]
 Use HELP for help.
 cassandra@cqlsh>

To redistribute the default system_auth keyspace in the cluster, change the replication_factor to 3.
```
 cqlsh> ALTER KEYSPACE system_auth 
        WITH replication = {
            'class': 'NetworkTopologyStrategy', 
            'replication_factor' : 3
        };
```
The above command keeps three replicas of the system_auth keyspace data, one on each node in the cluster. When the replication factor is 1, each piece of authentication data lives on a single node, and logins can fail when that node is unavailable.

You should get a warning that recommends a full cluster repair to redistribute data across all cluster nodes.
```
 Warnings :
 When increasing the replication factor you need to run a full (-full) repair to distribute the data.
```
Exit the cluster.
```
 cqlsh> QUIT;
```
Run the following cluster repair command to load the new replication strategy.
```
 $ sudo nodetool repair --full system_auth
```
Output:
```
 ... Repair completed successfully
```

Add Sample Data to the Cassandra Cluster

On node-1, connect to the Cassandra cluster:

 $ cqlsh 192.0.2.3 9042 -u cassandra -p cassandra

Create a sample keyspace with a replication_factor of 3.

 cqlsh> CREATE KEYSPACE my_keyspace
            WITH REPLICATION = { 
                'class': 'NetworkTopologyStrategy', 
                'replication_factor' : 3
            };

Switch to the new keyspace.
```
 cqlsh> USE my_keyspace;
```

Create a sample customers table with three columns.

 cqlsh:my_keyspace> CREATE TABLE customers (
                        customer_id UUID PRIMARY KEY,                     
                        first_name TEXT,
                        last_name TEXT
                    );

Insert data to the customers table.

 cqlsh:my_keyspace> INSERT INTO customers (customer_id, first_name, last_name) VALUES (UUID(), 'JOHN', 'DOE');
                    INSERT INTO customers (customer_id, first_name, last_name) VALUES (UUID(), 'MARY', 'SMITH');
                    INSERT INTO customers (customer_id, first_name, last_name) VALUES (UUID(), 'ANN', 'JOB');
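The statements above can also be kept in a file and loaded in one step with the cqlsh -f option instead of typing them at the prompt. The following is a minimal sketch. The function name write_sample_cql and the file name sample.cql are hypothetical, and IF NOT EXISTS is added here (it isn't in the statements above) so that the schema statements don't fail if you load the file again.

```shell
#!/bin/sh
# Write the sample keyspace, table, and rows from this section to a CQL file.
# write_sample_cql is a hypothetical helper name used for illustration.
# Usage: write_sample_cql <output-file>
write_sample_cql() {
    cat > "$1" <<'EOF'
CREATE KEYSPACE IF NOT EXISTS my_keyspace
    WITH REPLICATION = {
        'class': 'NetworkTopologyStrategy',
        'replication_factor': 3
    };

USE my_keyspace;

CREATE TABLE IF NOT EXISTS customers (
    customer_id UUID PRIMARY KEY,
    first_name TEXT,
    last_name TEXT
);

INSERT INTO customers (customer_id, first_name, last_name) VALUES (UUID(), 'JOHN', 'DOE');
INSERT INTO customers (customer_id, first_name, last_name) VALUES (UUID(), 'MARY', 'SMITH');
INSERT INTO customers (customer_id, first_name, last_name) VALUES (UUID(), 'ANN', 'JOB');
EOF
}

# Intended use on node-1 (requires a running cluster):
#   write_sample_cql sample.cql
#   cqlsh 192.0.2.3 9042 -u cassandra -p cassandra -f sample.cql
```

Because each INSERT generates a new UUID, loading the file more than once adds duplicate customers rather than overwriting the existing rows.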

On node-3, connect to the Cassandra cluster.

 $ cqlsh 192.0.2.5 9042 -u cassandra -p cassandra

Switch to the new my_keyspace keyspace.
```
 cqlsh> USE my_keyspace;
```

View the customers table data.

 cqlsh:my_keyspace> SELECT
                        customer_id,                     
                        first_name,
                        last_name
                    FROM customers;

Output:

  customer_id                          | first_name | last_name
 --------------------------------------+------------+-----------
  5c8f596a-20e0-43de-95fb-60c5039898fb |       MARY |     SMITH
  39ff0f60-18a0-4e3f-b32c-e31c175ad5ad |       JOHN |       DOE
  284c7b9e-c39e-47a2-aad9-2fd409af0bb9 |        ANN |       JOB

 (3 rows)

As displayed in the above output, the data entered on node-1 replicates to node-2 and node-3 in real time.

Add a new row to the customers table.

 cqlsh:my_keyspace> INSERT INTO customers (customer_id, first_name, last_name) VALUES (UUID(), 'PETER', 'JOBS');

Exit the cluster.
```
 cqlsh:my_keyspace> QUIT;
```

On node-1, run a SELECT statement and verify that the cluster replicates the last record you added on node-3.

 cqlsh:my_keyspace> SELECT
                        customer_id,                     
                        first_name,
                        last_name
                    FROM customers;

Output:

  customer_id                          | first_name | last_name
 --------------------------------------+------------+-----------
  3c4d78e5-f2d5-4f3b-8651-fac121e51551 |       JOHN |       DOE
  d3cb1d3a-51e1-4989-8c67-22f372fd9bd1 |      PETER |      JOBS
  c6511b52-b250-46cb-a4a3-d2f08686f782 |        ANN |       JOB
  736a1084-9ecb-4103-b22b-e1e9f09e40c8 |       MARY |     SMITH

 (4 rows)

Exit the cluster.
```
 cqlsh:my_keyspace> QUIT;
```

Set the Consistency Level

In this section, use the QUORUM consistency level to require a majority of replicas to respond before the Cassandra cluster acknowledges a read or write operation. By enforcing a consistency level, you trade some availability for data accuracy when nodes fail. The cluster uses the following formula to calculate the quorum.

(SUM_OF_ALL_REPLICATION_FACTORS / 2) + 1 ROUNDED DOWN TO THE NEAREST WHOLE NUMBER/INTEGER

When expanded, the above formula evaluates to 2:

(3 / 2) + 1  = 1.5 + 1 = 2.5, then round down 2.5 to the nearest integer = 2

As shown in the above evaluation, any QUORUM query you perform on the sample my_keyspace keyspace requires 2 replicas to respond. Because the keyspace keeps 3 replicas, it tolerates only 1 replica down. Similarly, in a cluster of 10 nodes, a keyspace with a replication_factor of 10 requires 6 replicas because (10 / 2) + 1 rounded down to the nearest integer is 6, so it tolerates 4 nodes down.
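The quorum arithmetic is easy to check in the shell. The following is a minimal sketch in which the function names quorum and tolerated_failures are hypothetical. Shell arithmetic uses integer division, so dividing first and then adding 1 gives the same result as rounding down at the end of the formula.

```shell
#!/bin/sh
# quorum <replication_factor>
# Number of replicas that must respond at the QUORUM consistency level.
# Integer division discards the fraction, which matches the rounding-down
# step in the formula. Both function names are hypothetical.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# tolerated_failures <replication_factor>
# Number of replicas that can be down while QUORUM is still reachable.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for rf in 3 10; do
    echo "rf=$rf quorum=$(quorum "$rf") tolerated=$(tolerated_failures "$rf")"
done
# Prints:
#   rf=3 quorum=2 tolerated=1
#   rf=10 quorum=6 tolerated=4
```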

To test the above consistency level logic, follow the steps below:

On node-1, stop the Apache Cassandra service.
```
 $ sudo systemctl stop cassandra
```
The above command sets the total available nodes to two with node-2 and node-3 actively running.

On node-2, run the Cassandra nodetool to verify that one node is down and two nodes are running.

 $ sudo nodetool status

Output:

 Datacenter: datacenter1
 =======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
 DN  192.0.2.3  95.85 KiB  16      100.0%            e29827ec-88b7-460c-98b4-88af145ec6ef  rack1
 UN  192.0.2.4  97.57 KiB  16      100.0%            a8742739-2439-4716-a6d9-4689a8fc7ad1  rack1
 UN  192.0.2.5  97.57 KiB  16      100.0%            f37874ad-28ac-49ec-b90a-322fa3b3d26c  rack1

The above output confirms that node-1 (192.0.2.3) is down with the DN status, and the rest of the nodes are running.

On node-2, log in to the Cassandra Cluster.

 $ cqlsh 192.0.2.4 9042 -u cassandra -p cassandra

Set the consistency level to QUORUM.

 cqlsh> CONSISTENCY QUORUM;

Output:

 Consistency level set to QUORUM.

Switch to the my_keyspace keyspace.
```
 cqlsh> USE my_keyspace;
```

View the customers table data.

 cqlsh:my_keyspace> SELECT
                        customer_id,                     
                        first_name,
                        last_name
                    FROM customers;

Output:

  customer_id                          | first_name | last_name
 --------------------------------------+------------+-----------
  3c4d78e5-f2d5-4f3b-8651-fac121e51551 |       JOHN |       DOE
  d3cb1d3a-51e1-4989-8c67-22f372fd9bd1 |      PETER |      JOBS
  c6511b52-b250-46cb-a4a3-d2f08686f782 |        ANN |       JOB
  736a1084-9ecb-4103-b22b-e1e9f09e40c8 |       MARY |     SMITH

 (4 rows)

The above output displays the table data because a majority of replicas (2 out of 3) respond, and there is a quorum.

Exit the cluster.
```
 cqlsh:my_keyspace> QUIT;
```
On node-3, stop the Apache Cassandra service.
```
 $ sudo systemctl stop cassandra
```

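On node-2, log in to the cluster again, set the consistency level to QUORUM, and switch to the my_keyspace keyspace. The CONSISTENCY setting applies only to the current cqlsh session, so it resets when you exit. Then, try viewing the customers table data.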

 cqlsh:my_keyspace> SELECT
                        customer_id,                     
                        first_name,
                        last_name
                    FROM customers;

Output:

 NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 192.0.2.4:9042 datacenter1>: Unavailable('Error from server: code=1000 [Unavailable exception] message="Cannot achieve consistency level QUORUM" info={\'consistency\': \'QUORUM\', \'required_replicas\': 2, \'alive_replicas\': 1}')})

As displayed in the output, the query fails because there is no quorum. The QUORUM consistency level requires 2 healthy replicas, but only 1 is alive.

Troubleshooting

If you encounter any errors while setting up a multi-node Apache Cassandra cluster, follow the troubleshooting steps below. To view detailed error messages, check the /var/log/cassandra/system.log file.
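To narrow down a problem quickly, you can filter the log for warnings and errors. The following is a minimal sketch. The function name recent_problems is hypothetical, and the pattern assumes that each log line starts with its level (such as ERROR or WARN), which is how Cassandra's default logging configuration formats system.log.

```shell
#!/bin/sh
# Show the most recent ERROR and WARN lines from a Cassandra log file.
# recent_problems is a hypothetical helper; it assumes that log lines start
# with the level name, as in the default logging configuration.
# Usage: recent_problems [log-file] [max-lines]
recent_problems() {
    grep -E '^(ERROR|WARN) ' "${1:-/var/log/cassandra/system.log}" \
        | tail -n "${2:-20}"
}

# Intended use on a cluster node:
#   recent_problems                                    # last 20 problems
#   recent_problems /var/log/cassandra/system.log 5    # last 5 problems
```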

Dead Node Fix

After a server restart, a node may fail to respond. To apply a dead node fix, edit the cassandra-env.sh file as below.
```
 $ sudo nano /etc/cassandra/cassandra-env.sh
```
Add the following line at the end of the file. Replace 192.0.2.3 with your actual node address.
```
 JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=192.0.2.3"
```

Clear the cluster data directories.

 $ sudo rm -rf /var/lib/cassandra/data/system/*

Check the cluster status to verify that your node is up and running.
```
 $ sudo nodetool status
```

nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection refused (Connection refused)'.

Verify that the Apache Cassandra database server is running.
```
 $ sudo service cassandra status
```
Open the /etc/cassandra/cassandra.yaml, and verify that the rpc_address: directive points to your node address to accept client connections.
```
 rpc_address: 192.0.2.3
```

Connection error: ('Unable to connect to any servers', {'192.0.2.3:9042': ConnectionRefusedError(111, "Tried connecting to [('192.0.2.3', 9042)]. Last error: Connection refused")})

Verify the cluster status.
```
 $ sudo nodetool status
```
Verify that Cassandra is running.
```
 $ sudo service cassandra status
```
Verify that the listen_address: and rpc_address: directives point to your node address in your configuration file.
```
 $ sudo nano /etc/cassandra/cassandra.yaml
```

Conclusion

In this guide, you have deployed a multi-node Apache Cassandra cluster on three Ubuntu 22.04 servers. Creating a multi-node Apache Cassandra cluster ensures high availability and improves the reliability of applications that use the cluster. For more information, visit the Apache Cassandra documentation.
