How to Import CSV Data to Vultr Managed Databases for PostgreSQL
Introduction
A Comma Separated Values (CSV) file is a plain-text file that stores data in tabular format. Usually, CSV files share data between spreadsheet applications and database systems. CSV files separate field values and records using commas and new line characters. When implementing PostgreSQL databases, you can easily import data from CSV files using the \COPY
command. The \COPY
command is available from the psql
package. This guide shows you how to copy CSV data from Ubuntu 20.04 server to a managed PostgreSQL database cluster.
Prerequisites
To proceed with this guide:
Log in to your Vultr account. Navigate to Products, then Databases. Click the name of the PostgreSQL database cluster under the Managed Database Name. Navigate to the Overview tab and locate Connection Details. This guide uses the following sample connection details:
- username:
vultradmin
- password:
EXAMPLE_POSTGRESQL_PASSWORD
- host:
SAMPLE_POSTGRESQL_DB_HOST_STRING.vultrdb.com
- port:
16751
- username:
1. Create a Sample CSV File
In this guide, you need to create a sample CSV file. When working in a production environment, you may generate the CSV file automatically from spreadsheet applications or other database systems. Follow the steps below to create the sample file:
SSH to your server and open a new
products.csv
file on a text editor.$ nano products.csv
Enter the following information into the
products.csv
file.product_id,product_name,retail_price,available,created_on 1,20L FOUNTAIN MINERAL WATER,5.25,Y,2022-12-12 2,QUARTZ WALL CLOCK,30.20,N,2022-12-10 3,RECHARGEABLE LED LAMP,14.40,Y,2022-12-08 4,TYPE C 30W CHARGER,29.95,Y,2022-12-10 5,DOUBLE SIDED TAPE,3.45,N,2022-12-12
Save and close the
products.csv
file.
After setting up the CSV file, proceed to the next step to copy data into a PostgreSQL database cluster.
2. Copying Data into a PostgreSQL Database Cluster
To import data into a managed PostgreSQL database cluster from CSV, you need to follow the steps below:
Log in to a managed PostgreSQL database cluster.
Create a database.
Create a table structure on the target PostgreSQL database that matches the data types of the CSV file fields.
Use the PostgreSQL server
\COPY
command to load CSV data to the table.
2.1. Log in to the Managed PostgreSQL Database Cluster
Install the
postgresql-client
package. This package allows you to use the PostgreSQLpsql
command from the Linux server.$ sudo apt update $ sudo apt install -y postgresql-client
Run the following
psql
command to log in to the managed PostgreSQL cluster. Replace the-h
(host),-p
(port), and-U
(username) values with the correct credentials.$ psql -h SAMPLE_POSTGRESQL_DB_HOST_STRING.vultrdb.com -p 16751 -U vultradmin defaultdb
Enter the password for the managed database cluster and press Enter to proceed.
Output.
defaultdb=>
Create a sample
xyz_db
database.defaultdb=> CREATE DATABASE xyz_db;
Output.
CREATE DATABASE
Connect to the new
xyz_db
database.defaultdb=> \c xyz_db;
Output.
psql ... ... You are now connected to database "xyz_db" as user "vultradmin".
After creating a database, initialize the table structure in the next step.
2.2. Create a Table Structure
After setting up and connecting to the target database, follow the steps below to create a table:
Match the
products.csv
fields' values with the correct PostgreSQL data types as illustrated in the following list.+---------------+---------------------+ | Field Name | Data Type | +---------------+---------------------+ | product_id | INTEGER | | product_name | VARCHAR(50) | | retail_price | FLOAT | | available | CHAR(1) | | created_on | DATE | +---------------+---------------------+
Use the PostgreSQL
CREATE TABLE
command to set up aproducts
table.xyz_db=> CREATE TABLE products ( product_id INTEGER PRIMARY KEY, product_name VARCHAR(50), retail_price FLOAT, available CHAR(1), created_on DATE );
Output.
CREATE TABLE
Run the following describe statement command against the
products
table to ensure you've got the correct schema.xyz_db=> \d products;
Output.
Column | Type | Collation | Nullable | Default --------------+-----------------------+-----------+----------+--------- product_id | integer | | not null | product_name | character varying(50) | | | retail_price | double precision | | | available | character(1) | | | created_on | date | | | Indexes: "products_pkey" PRIMARY KEY, btree (product_id)
You have now set up a target database and a table for the data. Follow the next step to load data into the database.
2.3. Use the PostgreSQL \COPY
Command to Load Data
The PostgreSQL server \COPY
command follows the syntax below:
=> \COPY SAMPLE_TABLE (SAMPLE_COMMA_SEPARATED_COLUMNS)
FROM 'SAMPLE_ABSOULTE_FILE_PATH'
WITH (SAMPLE_OPTIONS);
The \COPY
syntax explained:
The command starts with a
\COPY
keyword.The
SAMPLE_TABLE
is the name of the target table where you want to place the data (For example,products
).The
(SAMPLE_COMMA_SEPARATED_COLUMNS)
is an optional parameter that defines the order of the columns in the target table.The
FROM 'SAMPLE_ABSOULTE_FILE_PATH'
statement defines the absolute path where you have placed your CSV file (For example,~/products.csv
).The
WITH (SAMPLE_OPTIONS)
keyword defines the options that the PostgreSQL server should consider when executing the\COPY
command as illustrated below:FORMAT
: This is the input data format (For example,FORMAT CSV
).DELIMITER
: You should define the character separating the field values after theDELIMITER
keyword (For example,DELIMITER ','
).HEADER
: This directive tells the PostgreSQL server whether the raw input CSV file contains a header. If you set this value to true (HEADER true
), PostgreSQL discards the first line when loading data. However, if your CSV file doesn't contain any header, set this value tofalse
using theHEADER false
statement to load everything in the file.
After understanding the PostgreSQL \COPY
syntax, follow the steps below to load and verify the CSV data.
Run the following command to copy data from the '~/products.csv' file into the
products
table.xyz_db=> \COPY products (product_id, product_name, retail_price, available, created_on) FROM '~/products.csv' WITH (FORMAT CSV, DELIMITER ',', HEADER true);
Output.
COPY 5
Query the
products
table to ensure the data is in place.xyz_db=> SELECT product_id product_name, retail_price, available, created_on FROM products;
Output.
product_name | retail_price | available | created_on --------------+--------------+-----------+------------ 1 | 5.25 | Y | 2022-12-12 2 | 30.2 | N | 2022-12-10 3 | 14.4 | Y | 2022-12-08 4 | 29.95 | Y | 2022-12-10 5 | 3.45 | N | 2022-12-12 (5 rows)
The above output confirms that you've successfully loaded data from your CSV file into the managed PostgreSQL database cluster.
Conclusion
This guide illustrates the process of copying data from a CSV file into a managed PostgreSQL database cluster via Ubuntu 20.04 server. If you've several CSV files, repeat the above process to load all files into the respective tables.
Check out the following guides to learn more about Vultr's managed databases: