AI-powered Search with pgvector and Vultr Managed Database for PostgreSQL
Introduction
Among the most challenging aspects of relational databases is the ability to work with vector data at scale. Luckily, the PostgreSQL database server supports a pgvector
extension that allows you to efficiently store and query data over Machine Learning (ML) generated embeddings.
ML Embeddings contain arrays of floating point numbers that represent objects such as images, text, video, and audio. These numerical representations express objects in a high-dimensional vector space making search similarities searches possible. Below are sample real-life applications of embeddings:
- Online retail industry: A similarity search recommends related products to customers when they add items to a shopping cart
- Audio and video streaming services: An ML embedding search helps customers find top picks based on other visiting-customer preferences
- Digital image analysis: Similarity searches help in studying images at pixel level for classification
- Web-based chatbot applications: Embedding models simulate human-like conversations to answer instant questions
- Finance industry: An embedding model detects and blocks fraud based on transaction patterns
This guide implements the PostgreSQL pgvector
extension to run an AI-powered search application that answers customer Frequently Asked Questions (FAQs) using Python on a Ubuntu 22.04 server. You are to use sample data from the Vultr FAQ section to simulate common queries.
Prerequisites
Before your begin:
- Deploy a Ubuntu 22.04 management server on Vultr
- Deploy a Vultr Managed Database for PostgreSQL cluster
- Using SSH, access the server
- Create a standard user with sudo privileges.
- Create a free OpenAI account and create a secret key
This guide uses the OpenAI API to generate real embeddings to test the
pgvector
extensions. A free OpenAI account offers three API requests per minute and works well for this guide. In a production environment, add a payment method to your account and increase the limit
Activate the pgvector
PostgreSQL Extension
Update the server packages
$ sudo apt update
Install the Python
pip
package manager$ sudo apt install -y python3-pip
Using
pip
, install the Python PostgreSQL driver for Python and the OpenAI modules$ pip install psycopg2-binary
Install the OpenAI modules
$ pip install openai numpy
Install the
postgresql-client
package$ sudo apt install -y postgresql-client
The above command installs the PostgreSQL
psql
CLI tool used to access your managed database.Using
psql
, log in to your Vultr Managed Database for PostgreSQL$ psql postgres://vultradmin:example-password@prod-db.vultrdb.com:16751/defaultdb
The above command connects to your Vultr Managed Database for PostgreSQL using a connection sting. Replace the following details with your actual Vultr database credentials:
username
: vultradminpassword
: example-passwordhost
: prod-db.vultrdb.comport
: 16751
Create a new sample
company_db
database=> CREATE DATABASE company_db;
Switch to the database
=> \c company_db;
Output:
You are now connected to database "company_db" as user "vultradmin".
Enable the
pgvector
extension on each database that requires the extensioncompany_db=> CREATE EXTENSION vector;
When successful, your output should look like the one below:
CREATE EXTENSION
Query the
pg_type
table to verify the availability of a newVECTOR
data typecompany_db=> SELECT typname FROM pg_type WHERE typname = 'vector';
Output:
typname --------- vector (1 row)
As displayed in the above output, the new
vector
data type is ready for use.Create a
resource_base
tablecompany_db=> CREATE TABLE resource_base ( resource_id BIGSERIAL PRIMARY KEY, resource_description TEXT, embedding VECTOR(1536) );
The above command creates a table with the following columns:
resource_id
is aPRIMARY KEY
that uniquely identifies records and uses theBIGSERIAL
data type.resource_description
is a text-based column that stores answers to questions that customers are likely to ask in the FAQ question.embedding
uses theVECTOR
data type with1536
dimensions to store embeddings for theresource_description
values. Later in this guide, you are to generate embeddings using the OpenAI API.
Exit the PostgreSQL console
company_db=> \q
Simulate the Project's Logic Flow
Before developing the Python application, below is an overview of how the AI-powered search logic works:
The application accepts Linux
curl
commands containing sample POST requests to populate theresource_base
table. For example, for the queryWhat payment methods do you accept?
, the tool expects the following resource:$ curl -X POST http://localhost:8080/ -H "Content-Type: application/json" -d '{"resource_description": "We accept Visa, Mastercard, American Express, Discover, JCB, BitPay (BTC, BCH, ETH, DOGE, PAX, BUSD, LTC, USDC, GUSD), Alipay, UnionPay, and PayPal. Depending on your geographic region, Paypal supports additional payment networks, including Giropay and bank transfers."}'
* When populating the table, the application queries the OpenAI API to get embeddings for the
resource_description
you're adding to the database.The application adds the
resource_description
and vector data representing the resource in the database using the following SQL command:insert into resource_base (resource_description, embedding) values (%s, %s)'
The application accepts a
curl
GET with a target FAQ the customer searches for in the database. Then, the application uses the PostgreSQLpgvector
operators to perform a similarity search and returns the most relevant answer in JSON format
Create a Database Gateway
Based on the project's logic flow, create a separate database gateway file for the PostgreSQL database as described in the steps below.
Create a new
project
directory$ mkdir project
Switch to the directory
$ cd project
Using a text editor such as
Nano
, create a newpostgresql_gateway.py
file$ nano postgresql_gateway.py
Add the following contents to the file. Replace all
db_...
values with your actual Vultr Managed Database for PostgreSQL detailsimport psycopg2 class PostgresqlGateway: def __init__(self): db_host = 'prod-db.vultrdb.com' db_port = 16751 db_name = 'company_db' db_user = 'vultradmin' db_pass = 'example-password' self.db_conn = psycopg2.connect(host = db_host, database = db_name, user = db_user, password = db_pass, port = db_port) def insert_resource(self, resource_description, embedding): db_cursor = self.db_conn.cursor() query_string = 'insert into resource_base (resource_description, embedding) values (%s, %s)' db_cursor.execute(query_string, (resource_description, str(embedding))) self.db_conn.commit() return {'message': "Success"} def get_resources(self, embedding): db_cursor = self.db_conn.cursor() query_string = 'select resource_id, resource_description from resource_base ORDER BY embedding <=> (%s::vector(1536)) LIMIT 1;' db_cursor.execute(query_string, (embedding,)) rows = db_cursor.fetchall() return list(rows)
Save and close the file
The above Python code performs the following logic:
import psycopg2
loads the Python driver that connects the Python application to the Vultr Managed Database for PostgreSQL- The
class PostgresqlGateway:
section establishes a new module with the following methods:def __init__(self):
: A constructor method that runs when you create an instance of the class. This method connects to the PostgreSQL database you created earlier.def insert_resource(self, resource_description, embedding):
: Accepts theresource_description
value and anembedding
value from the OpenAI API, then, it uses the PostgreSQL database connection to add a new entry to the database table using theinsert into resource_base (resource_description, embedding) values (%s, %s)
query.def get_resources(self, embedding):
Accepts anembedding
generated from an HTTP GET query to search related records in the database table using theselect resource_id, resource_description from resource_base ORDER BY embedding <=> (%s::vector(1536)) LIMIT 1;
query. The query uses theLIMIT
clause to return a single row. In production, change this value to return more rows depending on your use case
- The application similarity query uses the cosine distance
<=>
vector operator. The operator is suitable for finding similar documents and performing natural language searches. Other common operators you can use when designing different types of applications include:- Euclidean distance
<->
- Negative inner product
<->
- Euclidean distance
Create an Embeddings Generator
In this section, set up an application that creates embeddings during the following operations:
- When populating the
resource_base
table - When passing clients' queries to the
resource_base
table for querying purposes
Create a central module to generate the embeddings instead of rewriting the logic on each file as described in the steps below.
Create a new
embeddings_generator.py
file$ nano embeddings_generator.py
Add the following contents to the file. Replace the
openai.api_key
value with your actual OpenAPI keyimport openai class EmbeddingsGenerator: def create_embedding(self, user_query): try: openai.organization = "" openai.api_key = "YOUR-OPEN-API-KEY" open_ai_model_id = "text-embedding-ada-002" embedding = openai.Embedding.create(input = user_query, model = open_ai_model_id)['data'][0]['embedding'] self.embedding = embedding self.resp_error = "" except openai.error.RateLimitError as error: self.resp_error = {'error': str(error)}
Save and close the file
The above application uses the OpenAI
text-embedding-ada-002
model to generate the embeddings. The model is suitable for text similarity searches. It accepts text inputs and converts them to numerical representations (embeddings).Below is how the above Python module works:
- The
import openai
declaration loads the OpenAI module functions into the project class EmbeddingsGenerator:
establishes one class with a single method- The
create_embedding(self, user_query):
inputs raw text (user_query
) and uses the OpenAI API to generate embeddings (vector data). Later, the sample application uses the embeddings to perform similarity searches
- The
Create an Index File
To run the Python application, create an entry point to the application as described in the steps below
Create a new
index.py
file$ nano index.py
Add the following contents to the file
import http.server from http import HTTPStatus import socketserver from urllib.parse import urlparse, parse_qs import json import postgresql_gateway import embeddings_generator class HttpHandler(http.server.SimpleHTTPRequestHandler): def do_POST(self): self.send_response(HTTPStatus.OK) self.send_header('Content-type', 'application/json') self.end_headers() content_length = int(self.headers['Content-Length']) http_post_data = json.loads(self.rfile.read(content_length)) resource_description = http_post_data['resource_description'] eg = embeddings_generator.EmbeddingsGenerator() req = eg.create_embedding(resource_description) if eg.resp_error == "": embedding = eg.embedding pg = postgresql_gateway.PostgresqlGateway() resp = pg.insert_resource(resource_description, embedding) else: resp = eg.resp_error self.wfile.write(bytes(json.dumps(resp , indent = 2) + "\r\n", "utf8")) def do_GET(self): self.send_response(HTTPStatus.OK) self.send_header('Content-type', 'application/json') self.end_headers() parsed_url = urlparse(self.path) params = parse_qs(parsed_url.query) query = params['query'][0] eg = embeddings_generator.EmbeddingsGenerator() req = eg.create_embedding(query) if eg.resp_error == "": embedding = eg.embedding pg = postgresql_gateway.PostgresqlGateway() resp = pg.get_resources(embedding) else: resp = eg.resp_error self.wfile.write(bytes(json.dumps(resp , indent = 2) + "\r\n", "utf8")) socketserver.TCPServer.allow_reuse_address = True httpd = socketserver.TCPServer(('', 8080), HttpHandler) print("The HTTP server is running at port 8080...") try: httpd.serve_forever() except KeyboardInterrupt: httpd.server_close() print("The HTTP server has stopped running.")
Save and close the file
Below is how the application works:
- The
import...
section imports the HTTP functionalities to create a web server using the declared inbuilt Python modules. Additionally, you import the custompostgresql_gateway
andembeddings_generator
modules you coded earlier in the project class HttpHandler(http.server.SimpleHTTPRequestHandler):
is a handler class for the HTTP server running on port8080
- The HTTP handler function runs the following two methods:
do_POST(self):
processes POST requests that contain a resource you want to add to theresource_base
table. This method retrieves an embedding from the OpenAI generator and passes it to the PostgreSQL database serverdo_GET(self):
runs the GET method to retrieve a user's query from an HTTP request. Then, it gets an embedding of the query from the OpenAI API and passes it to the PostgreSQL database to perform a similarity search
- The
Test the Application
Start the application
$ python3 index.py
In a new terminal window, establish another
SSH
connection to your server$ ssh example_user@SERVER-IP
Populate the
resource_base
table using the followingcurl
POST commands$ curl -X POST http://localhost:8080/ -H "Content-Type: application/json" -d '{"resource_description": "Yes, instances in a stopped state continue to reserve dedicated system resources (RAM, SSD storage, IP aliases, vCPU) and therefore incur charges until you destroy the instance. If you wish to no longer accumulate charges for a virtual machine, please use the DESTROY button in the customer portal."}' $ curl -X POST http://localhost:8080/ -H "Content-Type: application/json" -d '{"resource_description": "We have not charged your card. What you have observed is a temporary authorization in order to validate the card provided. The hold will automatically expire based on your banks policy, generally within a few days."}' $ curl -X POST http://localhost:8080/ -H "Content-Type: application/json" -d '{"resource_description": "At Vultr, we attempt to keep cloud pricing as consistent as possible globally. However, due to variation in regional costs (such as networking, hardware, labor, taxes, and fees), certain global data center locations may have different per hour and per month pricing for Vultr cloud services. Current data center location pricing is always available in the Control Panel."}' $ curl -X POST http://localhost:8080/ -H "Content-Type: application/json" -d '{"resource_description": "All servers on your account are billed hourly up to the monthly rate cap. The hourly rate is determined by dividing the monthly rate by 672 hours (28 days). If your server is online for more than 672 hours in a calendar month, you will only be billed the monthly rate. Accumulated charges are invoiced to your account on the 1st of every month."}' $ curl -X POST http://localhost:8080/ -H "Content-Type: application/json" -d '{"resource_description": "We accept Visa, Mastercard, American Express, Discover, JCB, BitPay (BTC, BCH, ETH, DOGE, PAX, BUSD, LTC, USDC, GUSD), Alipay, UnionPay, and PayPal. Depending on your geographic region, Paypal supports additional payment networks, including Giropay and bank transfers."}' $ curl -X POST http://localhost:8080/ -H "Content-Type: application/json" -d '{"resource_description": "Vultr.com is required to collect tax in several countries around the world. This tax is generally referred to as VAT (Value Added Tax), consumption tax, or sales tax. Prices listed on our website do not include tax. Tax will be added to your invoice as a separate line item."}'
This guide uses data samples from the Vultr FAQ section. When using a free OpenAI developer account, you must send one query every
20
seconds to avoid the rate-limiting errorOutput:
... { "message": "Success" }
Send
HTTP
GET queries to perform a similarity search on the PostgreSQL server. For example:Do you charge for stopped instances?
$ curl -G http://localhost:8080/ --data-urlencode "query=Do you charge for stopped instances?"
Output:
[ [ 1, "Yes, instances in a stopped state continue to reserve dedicated system resources (RAM, SSD storage, IP aliases, vCPU) and therefore incur charges until you destroy the instance. If you wish to no longer accumulate charges for a virtual machine, please use the DESTROY button in the customer portal." ] ]
I linked my credit card but I see a small charge on my card! What gives?
$ curl -G http://localhost:8080/ --data-urlencode "query=I linked my credit card but I see a small charge on my card! What gives?"
Output:
[ [ 2, "We have not charged your card. What you have observed is a temporary authorization in order to validate the card provided. The hold will automatically expire based on your banks policy, generally within a few days." ] ]
Full pricing list?
$ curl -G http://localhost:8080/ --data-urlencode "query=Full pricing list?"
Output:
[ [ 3, "At Vultr, we attempt to keep cloud pricing as consistent as possible globally. However, due to variation in regional costs (such as networking, hardware, labor, taxes, and fees), certain global data center locations may have different per hour and per month pricing for Vultr cloud services. Current data center location pricing is always available in the Control Panel." ] ]
Is pricing the same in all data center locations?
$ curl -G http://localhost:8080/ --data-urlencode "query=Is pricing the same in all data center locations?"
Output.
[ [ 3, "At Vultr, we attempt to keep cloud pricing as consistent as possible globally. However, due to variation in regional costs (such as networking, hardware, labor, taxes, and fees), certain global data center locations may have different per hour and per month pricing for Vultr cloud services. Current data center location pricing is always available in the Control Panel." ] ]
What payment methods do you accept?
$ curl -G http://localhost:8080/ --data-urlencode "query=What payment methods do you accept?"
Output:
[ [ 5, "We accept Visa, Mastercard, American Express, Discover, JCB, BitPay (BTC, BCH, ETH, DOGE, PAX, BUSD, LTC, USDC, GUSD), Alipay, UnionPay, and PayPal. Depending on your geographic region, Paypal supports additional payment networks, including Giropay and bank transfers." ] ]
Based on the above results, the application returns the most relevant answer for each query to the user
Check the Embeddings
After using the curl
commands to send HTTP POST requests, the application populates the resource_base
table with AI-generated embeddings. Access the company_db
database to verify the embeddings as described in this section.
Access the PostgreSQL database server. Append the
company_db
database to your connection string to access the database directly$ psql postgres://vultradmin:example-password@prod-db.vultrdb.com:16751/company_db
Query the
resource_base
table using the SQL command below. Apply the PostgreSQLSUBSTRING()
andRIGHT()
functions to return only the first and last few characters from theresource_description
andembedding
columnscompany_db=> SELECT resource_id, CONCAT(SUBSTRING(resource_description, 0, 15), '...', RIGHT(resource_description, 15)) as resource_description, CONCAT(SUBSTRING(embedding::VARCHAR, 0, 30), '...', RIGHT(embedding::VARCHAR, 15)) as embedding FROM resource_base;
Your output should look like the one below:
resource_id | resource_description | embedding -------------+----------------------------------+------------------------------------------------- 1 | Yes, instances...ustomer portal. | [0.00080605154,-0.04708257,0....7,0.0064484123] 2 | We have not ch...hin a few days. | [-0.023812525,-0.011136047,0....,-0.0033343954] 3 | At Vultr, we a... Control Panel. | [-0.0018340687,-0.028246619,0...24,0.015378715] 4 | All servers on...of every month. | [-0.003029692,-0.016905943,0....,0.00010902301] 5 | We accept Visa...bank transfers. | [0.009505584,0.0031462372,0.0...-0.00023975992] 6 | Vultr.com is r...rate line item. | [-0.00049098214,-0.039759535,...5,0.0070797624] (6 rows)
Create an index on the
resource_base
table. This is necessary when scaling your application and have more records in the table. The lists parameter in theivfflat
index sets the number of clusters that PostgreSQL creates when building the index. PostgreSQL uses the index clusters in its algorithms to find the relation between vectors. Apply the following formula when setting the list value:For a table with less than one million rows use:
lists = rows / 1000
For tables with more than one million rows use:
lists = squareroot(rows)
Verify that you have a minimum of ten clusters. When records in the sample application are still few, use the minimum value of
10
e_commerce=> CREATE INDEX ON resource_base USING ivfflat (embedding vector_cosine_ops) WITH (lists = 10);
Conclusion
In this guide, you implemented the PostgreSQL pgvector
extension that generates and queries data over ML-generated embeddings. You created a sample database that stores a company knowledge base using vector data, and used the PostgreSQL cosine distance operator <=>
to query data to display the most relevant results. For more information, visit the PgVector extension repository.
Next Steps
To implement more PostgreSQL use cases on your database, visit the following resources: