
If you're looking for a powerful open-source distributed database system but don't want to spend hours installing and configuring it, Cassandra is a good fit.
Apache Cassandra is designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Rather than living on one machine, data is partitioned by key and spread over several servers.
Prerequisites
- Deploy a new Vultr Rocky Linux instance
- Log in to the server with SSH
- Update your Vultr Rocky Linux 8 server
- Create a non-root user with sudo access
Installing Java OpenJDK
Cassandra runs on the Java Virtual Machine, so you must install Java before anything else. OpenJDK is a free and open-source implementation of the Java Platform.
Run the dnf install command to install the java-1.8.0-openjdk package. As of this writing, this package installs OpenJDK 1.8.0_322. The installation might take some time to complete.

$ sudo dnf install java-1.8.0-openjdk -y

Once the installation completes, verify the installed version using the java -version command.

$ java -version
openjdk version "1.8.0_322"
OpenJDK Runtime Environment (build 1.8.0_322-b06)
OpenJDK 64-Bit Server VM (build 25.322-b06, mixed mode)
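
The version check above can also be scripted. The following is a minimal sketch, not part of the original procedure: the function names java_major_version and is_supported_java are hypothetical, and the accepted list (8 and 11) is an assumption based on the Java releases Cassandra 4.0 is documented to run on.

```shell
#!/bin/sh
# Print the Java major version parsed from `java -version` style text on stdin.
# Handles the legacy "1.8.0_322" scheme (major = 8) and the newer "11.0.14" scheme.
java_major_version() {
    sed -n 's/.* version "\([^"]*\)".*/\1/p' \
        | head -n 1 \
        | awk -F'[._]' '{ if ($1 == 1) print $2; else print $1 }'
}

# Hypothetical helper: succeed only for the major versions this sketch
# assumes are acceptable for Cassandra 4.0 (an assumption, adjust as needed).
is_supported_java() {
    case "$1" in
        8|11) return 0 ;;
        *)    return 1 ;;
    esac
}

# Usage (java prints its version on stderr, hence 2>&1):
#   major=$(java -version 2>&1 | java_major_version)
#   is_supported_java "$major" || echo "Unsupported Java major version: $major" >&2
```

Parsing the text rather than calling java inside the function keeps the helper easy to try against saved output.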
Installing Python
Cassandra itself is written in Java, but cqlsh, its command-line interface, is written in Python. You need Python installed to run cqlsh.
Run the dnf install command to install the python36 package. As of this writing, this package installs Python 3.6.8.

$ sudo dnf install python36 -y

Once the installation completes, verify the installed version using the python3 --version command.

$ python3 --version
Python 3.6.8

Run the alternatives --config command below to select the default Python interpreter. This setup needs Python 3 or later, so pick the newest Python 3 entry in the list: type its number and press Enter. In this demo, it's option 2.

$ sudo alternatives --config python
1 /usr/libexec/no-python
2 /usr/bin/python3
Installing Apache Cassandra
Now you have the required components installed in your system, and you are ready to install Apache Cassandra.
The base Rocky Linux repository does not have a Cassandra package, so you need to add its repository to your system first.
Create a new file named cassandra.repo under the /etc/yum.repos.d directory using the nano text editor.

$ sudo nano /etc/yum.repos.d/cassandra.repo

Populate the cassandra.repo file with the following contents. The baseurl specifies where the RPMs are located (https://downloads.apache.org/cassandra/redhat/40x/). The 40x in the URL selects the 4.0.x release series; as of this writing, that installs Apache Cassandra 4.0.3. To install a different series, choose the matching directory in the official repository.

[cassandra]
name=Apache Cassandra
baseurl=https://downloads.apache.org/cassandra/redhat/40x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS

Save and exit the file by pressing Ctrl+O, Enter, and Ctrl+X. Run the dnf update command to refresh your system's package index with the newly added repository.

$ sudo dnf update -y

Run the dnf repolist cassandra command to check that the new repo is set up properly. The output shows the Cassandra repo as enabled.

$ sudo dnf repolist cassandra
repo id      repo name          status
cassandra    Apache Cassandra   enabled

Finally, install the cassandra package using the dnf install command.

$ sudo dnf install cassandra -y

Start the Cassandra service.

$ sudo service cassandra start
Reloading systemd: [ OK ]
Starting cassandra (via systemctl): [ OK ]

Enable the Cassandra service to start on system reboot.

$ sudo systemctl enable cassandra
cassandra.service is not a native service, redirecting to systemd-sysv-install.
Executing: /usr/lib/systemd/systemd-sysv-install enable cassandra

Check the status of the Cassandra service.

$ sudo service cassandra status
cassandra.service - LSB: distributed storage system for structured data
Loaded: loaded (/etc/rc.d/init.d/cassandra; generated)
Active: active (running) since Mon 2022-02-21 23:35:06 UTC; 4min 44s ago
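
If you prefer not to edit the repository file interactively, it can be generated from a script. This is a minimal sketch under stated assumptions: write_cassandra_repo is a hypothetical helper name, and the file contents are copied from the repository definition used in this guide.

```shell
#!/bin/sh
# Write the Cassandra repository definition from this guide to the path in $1.
# The quoted 'EOF' delimiter prevents any shell expansion inside the body.
write_cassandra_repo() {
    cat > "$1" <<'EOF'
[cassandra]
name=Apache Cassandra
baseurl=https://downloads.apache.org/cassandra/redhat/40x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
EOF
}

# Usage: generate to a scratch file, review it, then install it as root.
#   write_cassandra_repo /tmp/cassandra.repo
#   sudo install -m 0644 /tmp/cassandra.repo /etc/yum.repos.d/cassandra.repo
```

Writing to a scratch path first lets you inspect the file before it affects dnf.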
Securing Cassandra
Securing your Cassandra cluster is as important as installing it. If you run several Cassandra nodes on the same network, restrict access from the start so that only trusted hosts can reach your database.
Run the firewall-cmd command below to create a new firewall zone named cassandra-cluster. A dedicated zone keeps the Cassandra rules from conflicting with other services on the system. The --permanent flag writes the change to the permanent configuration so that it survives reloads and reboots. The --new-zone flag creates the zone.

$ sudo firewall-cmd --permanent --new-zone cassandra-cluster
success

Reload the firewalld service.

$ sudo firewall-cmd --reload

Add your server network CIDR to the new zone so that your client and server can communicate with each other. Replace your-CIDR-here/24 with your own network range.

$ sudo firewall-cmd --zone=cassandra-cluster --add-source=your-CIDR-here/24 --permanent

Run the following commands to allow access to the default Cassandra ports in the new cassandra-cluster zone. Port 7000 carries inter-node traffic and port 9042 serves CQL clients.

$ sudo firewall-cmd --zone=cassandra-cluster --add-port=7000/tcp --permanent
$ sudo firewall-cmd --zone=cassandra-cluster --add-port=9042/tcp --permanent

Finally, reload the firewalld rules. At this point, your Cassandra cluster is secured and can only be accessed from your-CIDR-here/24.

$ sudo firewall-cmd --reload
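
When the same rules have to be applied on several nodes, it can help to generate the firewall commands above for review before running them. The following is a minimal sketch: print_cassandra_firewall_rules is a hypothetical helper, and 192.0.2.0/24 in the usage note is a documentation-only placeholder range, not a value from this guide.

```shell
#!/bin/sh
# Print (do not run) the firewall-cmd calls from this section.
# $1 = zone name, $2 = source CIDR allowed to reach the node.
print_cassandra_firewall_rules() {
    zone=$1
    cidr=$2
    echo "firewall-cmd --permanent --new-zone=$zone"
    echo "firewall-cmd --reload"
    echo "firewall-cmd --permanent --zone=$zone --add-source=$cidr"
    # 7000 = inter-node traffic, 9042 = CQL clients (Cassandra default ports).
    for port in 7000 9042; do
        echo "firewall-cmd --permanent --zone=$zone --add-port=$port/tcp"
    done
    echo "firewall-cmd --reload"
}

# Usage: review the output first, then pipe it to a root shell if it looks right.
#   print_cassandra_firewall_rules cassandra-cluster 192.0.2.0/24
#   print_cassandra_firewall_rules cassandra-cluster 192.0.2.0/24 | sudo sh
```

Because the function only prints text, running it has no effect on the firewall until you explicitly execute its output.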
Testing Cassandra
Now that you have a new Cassandra cluster, you can test if it's up and running properly.
Run the nodetool status command to check your Cassandra cluster status. This command lists every node in the cluster with its data center name, status and state, IP address, load (the amount of data stored on the node), token count, ownership percentage, host ID, and rack.

$ sudo nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  97.22 KiB  16      100.0%            5607b149-a79e-4e3e-8d98-5b4a26ff698f  rack1

- U indicates that the node is up. You can see which nodes are up or down by the corresponding U or D in the first column.
- N indicates that the node is in the normal state.
- The Address shows the IP address of the node.
- The Host ID is a unique identifier for each node.
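
The status codes described above lend themselves to a simple automated health check. This is a minimal sketch, assuming the output layout shown in this guide: count_nodes_not_un is a hypothetical helper that reads nodetool status text and counts the nodes whose code is anything other than UN.

```shell
#!/bin/sh
# Count nodes in `nodetool status` output (read from stdin) whose two-letter
# status/state code is not UN (Up, Normal). Header lines are ignored because
# their first field never matches the [UD][NLJM] pattern.
count_nodes_not_un() {
    awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { n++ } END { print n + 0 }'
}

# Usage:
#   bad=$(sudo nodetool status | count_nodes_not_un)
#   [ "$bad" -eq 0 ] || echo "$bad node(s) are not Up/Normal" >&2
```

The function reads from standard input, so it can be tried against saved output without a running cluster.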
You can also use the cqlsh command to interact with your Cassandra cluster.

$ cqlsh
Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.3 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cqlsh>

For example, to change the default cluster name (Test Cluster), run an UPDATE statement at the cqlsh prompt. Replace Vultr Cluster with your desired value.

cqlsh> UPDATE system.local SET cluster_name = 'Vultr Cluster' WHERE KEY = 'local';

This changes only the name stored in the system.local table. Before you next restart Cassandra, set cluster_name in cassandra.yaml to the same value; if the two differ, the node refuses to start.

Quit the cqlsh shell.

cqlsh> quit
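
The rename can also be run non-interactively by passing the statement to cqlsh with its -e option. The following is a minimal sketch: build_rename_cql is a hypothetical helper that only builds the statement text, doubling any single quotes in the name as CQL string literals require.

```shell
#!/bin/sh
# Print the CQL statement that sets the stored cluster name to "$1".
# Single quotes in the name are doubled, which is how CQL escapes them.
build_rename_cql() {
    escaped=$(printf '%s' "$1" | sed "s/'/''/g")
    printf "UPDATE system.local SET cluster_name = '%s' WHERE key = 'local';\n" "$escaped"
}

# Usage: inspect the statement, then hand it to cqlsh.
#   build_rename_cql 'Vultr Cluster'
#   cqlsh -e "$(build_rename_cql 'Vultr Cluster')"
```

Keeping the statement builder separate from the cqlsh call makes it possible to check the generated CQL before it touches the cluster.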
Next time you use the cqlsh command, it will show the new cluster name (Vultr Cluster). This output confirms that you have successfully installed Cassandra on your system.

$ cqlsh
Connected to Vultr Cluster at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.3 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cqlsh>

More Information
To learn more about Cassandra, please visit its official website.