NoSQL Databases Explained

Updated on June 23, 2024

Introduction

The relational database management system (RDBMS) model dates to the 1970s. Since then, it has let developers store data in tables made up of rows and columns. Beyond storing information, RDBMS products are secure, convenient, and affordable. However, the RDBMS architecture has the following limitations:

  • Schema rigidity: After you define a schema in an RDBMS, you can't add new columns on the fly without affecting the entire table. Modern information systems need to store structured, semi-structured, and unstructured data.

  • Structural limitations: The hard limits on field lengths, columns per table, and row size in traditional databases don't match modern IT needs.

  • Poor interoperability: Most RDBMS applications require special integration libraries or drivers. For faster integration, developers prefer database systems with a built-in HTTP Application Programming Interface (API) and modern data access layers.

  • Scalability issues: The RDBMS model dates back to a time when data was small, orderly, and neat. Today's massive data volumes slow down RDBMS applications.

  • Poor performance with distributed applications: The RDBMS architecture performs poorly with distributed online databases. Most RDBMS applications were designed to run on a single computer, before the birth of the internet.

Developers continue to build NoSQL (not only SQL) databases to address the above issues. This guide discusses the different types of NoSQL databases, their use-cases, and their benefits.

1. Types of NoSQL Databases

Four main types of NoSQL databases exist:

  1. Key-value databases.

  2. Document databases.

  3. Graph databases.

  4. Wide-column databases.

1.1. Key-value Databases

Key-value databases use an associative array model to write and read data. An associative array maps each unique identifier (key) to a value. Programming languages call this structure a map or a dictionary. The following is a sample of session access tokens stored in a key-value database:

    +------------+-----------------------------------+
    |    key     |             value                 |
    +------------+-----------------------------------+
    |user_100731 | 8527f5ecd4f311ec9d640242ac120002  |
    |user_100732 | 8527f7ccd4f311ec9d640242ac120002  |
    |user_100733 | 8527f8d0d4f311ec9d640242ac120002  |
    |user_100734 | 8527f9b6d4f311ec9d640242ac120002  |
    +------------+-----------------------------------+
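
As a minimal sketch of the associative array model, the following Python snippet uses a plain dictionary as a stand-in for a key-value store. The `put` and `get` helpers are hypothetical names chosen for illustration, not the API of any particular database.

```python
from typing import Dict, Optional

# A plain dict stands in for the key-value store (illustration only).
session_tokens: Dict[str, str] = {
    "user_100731": "8527f5ecd4f311ec9d640242ac120002",
    "user_100732": "8527f7ccd4f311ec9d640242ac120002",
}


def put(key: str, value: str) -> None:
    """Write or overwrite the value stored under a key."""
    session_tokens[key] = value


def get(key: str) -> Optional[str]:
    """Return the value for a key, or None when the key is absent."""
    return session_tokens.get(key)


put("user_100733", "8527f8d0d4f311ec9d640242ac120002")
token = get("user_100731")      # lookup by key, no scan over other entries
missing = get("user_999999")    # unknown keys yield None
```

Every read and write goes through the key, which is what keeps lookups fast regardless of how many entries the store holds.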

Depending on the product, key-value databases can store values of types such as:

  • Strings

  • Lists

  • Geospatial indexes

  • Sets

  • Hashes

Key-value databases partition data by key, a model that supports horizontal scaling very well. This scalability suits applications where many end users interact with the system simultaneously.
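
As a rough illustration of how partitioning spreads keys across machines, the following Python sketch hashes each key and uses the result to pick a node. The node names and the `node_for_key` helper are hypothetical; real key-value databases use their own, usually more elaborate, placement schemes.

```python
import hashlib

# Hypothetical node names, used only for this illustration.
NODES = ["node-a", "node-b", "node-c"]


def node_for_key(key: str, nodes: list = NODES) -> str:
    """Pick a node by hashing the key; the same key always maps to the same node."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a node index.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Each key is routed independently, so adding users adds no cross-node coordination.
placement = {
    key: node_for_key(key)
    for key in ("user_100731", "user_100732", "user_100733")
}
```

A plain modulo like this remaps most keys whenever the number of nodes changes, which is why production systems often prefer schemes such as consistent hashing.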

Use-cases for key-value databases:

  • Session management: When users log in to a system, an application generates session data like an access token, profile information, preferred themes, and recommendations. The application must rapidly fetch the session data in real-time. An in-memory key-value database uses a key to store each user's data.

  • Shopping cart management: Some websites receive millions of orders during peak seasons. Distributed key-value databases use vertical and horizontal scaling to handle massive data from millions of users.

  • Caching: A cache is a high-speed data store for frequently accessed data like combo box values, product catalogs, and more. In-memory key-value databases keep this data in the computer's RAM to speed up program execution.
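
The session and caching use-cases above both rely on entries that expire after a while. The following Python sketch shows one way to model that idea with an in-memory dictionary. The `TTLCache` class and its injectable `clock` parameter are assumptions made for illustration; they do not mirror the interface of a specific key-value database.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory cache whose entries expire after a fixed time to live."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        """Store a value together with the time at which it expires."""
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # drop the stale entry on access
            return None
        return value


sessions = TTLCache(ttl_seconds=1800)
sessions.set("user_100731", {"theme": "dark"})
profile = sessions.get("user_100731")
```

When `get` returns None, the application would fall back to the slower primary data source and then call `set` to refill the cache.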

Popular open-source key-value databases:

1.2. Document Databases

Document-based databases use the JavaScript Object Notation (JSON) model to store data. Each record is a self-contained JSON document, a structure that handles big data effectively. The following is a sample JSON document for a product:

    {
      "_id": "e75a0c9cd4f211ec9d640242ac120002",
      "product_name": "4G WIRELESS ROUTER",
      "retail_price": 88.55
    }

One advantage of document-based databases is interoperability. Developers can consume the JSON data from document databases without third-party libraries. This compatibility shortens development time and makes maintenance easier. The JSON data format is also flexible, which lets the data model evolve with the application's needs when storing unstructured data.
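
As a small illustration of that interoperability, the following Python sketch parses a product document with the standard library's `json` module, adds a field, and serializes it again. The `in_stock` attribute is a hypothetical addition used only to show the flexible format.

```python
import json

# The product document as it might arrive from a document database.
raw = (
    '{"_id": "e75a0c9cd4f211ec9d640242ac120002", '
    '"product_name": "4G WIRELESS ROUTER", '
    '"retail_price": 88.55}'
)

product = json.loads(raw)      # JSON object becomes a Python dict
product["in_stock"] = True     # hypothetical new attribute, no schema change needed

serialized = json.dumps(product, sort_keys=True)  # back to a JSON string
```

No mapping layer sits between the stored document and the in-memory object: the dictionary keys are the document's field names.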

Use-cases for document-based databases:

  • Content Management Systems (CMS): In a blog, the document-based database represents each article as a single document. Sometimes, you may need to add a new attribute to an article, for instance, the author's location. The document database allows you to make that change only to the affected document without affecting the structure of the entire database. Adding a new column in a relational database affects the complete table.

  • Product catalogs: In an e-commerce database, each product contains different options. For instance, a shirt may be available in yellow, blue, and green colors. The same shirt might have short-sleeve and long-sleeve variants, and the different options might come at different prices. Representing such information in a relational database involves multiple tables. However, a document database stores the information in a single document using nested attributes.

  • User profiles: Personal profile information varies from one user to the next, and the flexible schema of a document-based database accommodates that variation more easily than a relational table. The same flexibility makes a document-based database a good choice in healthcare applications where patients' data vary.
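
The CMS and product catalog use-cases above can be sketched with plain Python dictionaries standing in for documents. All titles, locations, prices, and the `price_of` helper below are hypothetical values chosen for illustration.

```python
# Two articles in the same collection; documents need not share all attributes.
articles = [
    {"_id": 1, "title": "First post", "body": "Text of the first post"},
    {"_id": 2, "title": "Second post", "body": "Text of the second post"},
]

# Add the author's location to one document only; the other is untouched.
articles[1]["author_location"] = "Miami"

# One product document holds every variant as a nested attribute.
shirt = {
    "product_name": "Shirt",
    "variants": [
        {"color": "yellow", "sleeve": "short", "price": 15.00},
        {"color": "blue", "sleeve": "short", "price": 15.00},
        {"color": "blue", "sleeve": "long", "price": 18.50},
        {"color": "green", "sleeve": "long", "price": 19.00},
    ],
}


def price_of(product: dict, color: str, sleeve: str):
    """Return the price of the matching variant, or None if it does not exist."""
    for variant in product["variants"]:
        if variant["color"] == color and variant["sleeve"] == sleeve:
            return variant["price"]
    return None


blue_long_price = price_of(shirt, "blue", "long")
```

In a relational design, the variants would typically live in a separate table joined to the product; here a single read returns the product with all of its options.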

Popular open-source document databases:

1.3. Graph Databases

Graph databases use nodes and relationships to store data instead of tables, documents, or key-value pairs. Graph databases emphasize the relationships between interconnected data entities. Because those relationships are stored with the data, developers can query them directly and generate complex reports with little effort.

In a graph database, nodes (the equivalent of rows or records) contain properties that store data. Edges then connect the nodes. The following illustration represents sample data in a graph database:

    +------------+-------------+-------------------+
    |    nodes   |    edges    | nodes             |
    +------------+-------------+-------------------+
    |            | reports_to  | marketing_manager |
    |            |             |                   |
    |  john_doe  |             |                   |
    |            |  works_in   | miami_branch      |
    |            |             |                   |
    +------------+-------------+-------------------+
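
The same nodes and edges can be sketched in Python as a list of (source, label, target) triples. The `outgoing` index and the `related` helper are hypothetical names for illustration and do not represent the query language of any graph database.

```python
from collections import defaultdict

# Each edge is a (source node, relationship label, target node) triple.
edges = [
    ("john_doe", "reports_to", "marketing_manager"),
    ("john_doe", "works_in", "miami_branch"),
]

# Index edges by source node so a traversal starts from the node itself.
outgoing = defaultdict(list)
for source, label, target in edges:
    outgoing[source].append((label, target))


def related(node: str, label: str) -> list:
    """Return the nodes reachable from `node` over edges with the given label."""
    return [target for edge_label, target in outgoing.get(node, []) if edge_label == label]


manager = related("john_doe", "reports_to")
branch = related("john_doe", "works_in")
```

Following a relationship is a direct lookup from the node, rather than a join computed across tables at query time.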

Use-cases for graph databases:

  • Social networks: Many features of social applications map naturally onto graph databases. In a social network application, most queries revolve around finding friends, friends of friends, liked pages, nested comments, and more. Nodes and edges represent these relationships better than traditional database joins.

  • Fraud detection: Graph databases can help detect and stop fraud in real time. A graph database uncovers fraud patterns by analyzing relationships between different transactions. This type of functionality is slow and difficult to implement using relational databases.

  • Product recommendation: A product recommendation application uses data from previous customers to predict choices for new customers. The recommendation engine must be fast, scalable, and accurate. Graph databases suit video streaming, social media, and online shopping platforms that recommend new products to users.
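
To make the "friends of friends" query concrete, here is a minimal Python sketch over an adjacency map. The user names and the `friends_of_friends` helper are hypothetical; a graph database would express the same traversal in its own query language.

```python
# Undirected friendships stored as an adjacency map (illustrative data).
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}


def friends_of_friends(person: str) -> set:
    """Return people exactly two hops away: friends of friends who are not
    already direct friends and are not the person themselves."""
    direct = friends.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= friends.get(friend, set())
    return two_hops - direct - {person}


suggestions = friends_of_friends("alice")
```

Each extra hop is another loop over neighbors, whereas a relational schema would usually need one more self-join on the friendship table per hop.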

Popular open-source graph databases:

  • Neo4j.

  • HyperGraphDB.

  • Apache TinkerPop.

1.4. Wide-column Databases

Wide-column databases are fast and highly scalable data stores that use flexible columns to write and query information. The column names and row format in a wide-column table can vary for each record. This flexibility allows developers to update the schema of a single row without affecting the entire table. Here is an illustration of wide-column data:

    +------------+-----------+----------+
    |            | COLUMN A  | COLUMN B |
    |  ROW 1     +-----------+----------+
    |            |   VALUE   |  VALUE   |
    +------------+-----------+----------+

    +------------+-----------+
    |            | COLUMN A  |
    |  ROW 2     +-----------+
    |            |   VALUE   |
    +------------+-----------+

    +------------+-----------+----------+----------+
    |            | COLUMN A  | COLUMN B | COLUMN C |
    |  ROW 3     +-----------+----------+----------+
    |            |   VALUE   |  VALUE   |  VALUE   |
    +------------+-----------+----------+----------+
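
The illustration can be approximated in Python as a dictionary of rows, where each row is its own dictionary of columns. The row keys, column names, and the `read` helper are hypothetical and only model the idea; they do not reflect how a wide-column database lays out data on disk.

```python
# Each row key maps to its own set of columns; rows need not share columns.
table = {
    "row_1": {"column_a": "value", "column_b": "value"},
    "row_2": {"column_a": "value"},
    "row_3": {"column_a": "value", "column_b": "value", "column_c": "value"},
}

# Add a new column to a single row; the other rows are unaffected.
table["row_2"]["column_d"] = "value"


def read(row_key: str, column: str):
    """Return the value of a column in a row, or None if either is absent."""
    return table.get(row_key, {}).get(column)


present = read("row_2", "column_d")
absent = read("row_1", "column_d")
```

A column that a row does not use simply does not exist for that row, so sparse data costs nothing in unused cells.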

Use-cases for wide-column databases:

  • Data logging: Applications generate error logs and access logs with varying attributes. Wide-column databases store such data effectively.

  • Managing sensor data for Internet of Things (IoT) devices: IoT devices require flexible, scalable, and distributed database systems to handle large amounts of unstructured data in real time.

  • Storing user preferences: Despite the name, wide-column databases organize data by row rather than by column, and each row can carry its own set of columns. The model allows developers to store different users' preferences when the schema of each user varies.

Popular open-source wide-column databases:

  • Apache Cassandra.

  • ScyllaDB.

  • Apache HBase.

2. Benefits of NoSQL Databases

Here is a summary of the benefits of NoSQL databases:

  1. Vertical and horizontal scalability.

  2. Flexible database schema.

  3. Ability to handle large volumes of data at high speed.

  4. Support for structured, semi-structured, and unstructured data.

  5. Developer-friendly through modern APIs.

Conclusion

Traditional relational databases are not fading away any time soon. However, the NoSQL database model addresses many of the challenges that relational databases have presented over the years. Many industries, such as e-commerce, IoT, and social media, have adopted NoSQL databases for everyday use. Hybrid database solutions are also becoming common, where companies connect relational and NoSQL databases to speed up applications.