How to Create a Deep Learning REST API With Word2Vec and Flask

Updated on September 28, 2018

Traditional development workflows are hard to sustain once complex machine learning models are running in production. Training a deep learning model on a laptop or local machine can be slow, so we typically use cloud machines with more powerful hardware to both train and run our models. This is good practice because the heavy computation sits behind a service, and clients simply make HTTP requests as necessary. In this tutorial, we will make a pre-trained deep learning model named Word2Vec available to other services by building a REST API from the ground up.


Prerequisites

  • An Ubuntu 16.04 server instance with at least 4GB RAM; a 4GB instance is sufficient for testing and development
  • Understanding of how to use the Linux operating system to create/navigate/edit folders and files
  • A sudo user

What Are Word Embeddings?

Word embeddings are a relatively recent development in natural language processing and deep learning that has driven rapid progress in both fields. A word embedding is a vector that corresponds to a single word, positioned so that relationships between vectors reflect relationships between the meanings of the words. This shows up in regularities such as king - queen ≈ boy - girl: the offset between the vectors for king and queen is roughly the same as the offset between boy and girl. Word vectors are used to build everything from recommendation engines to chatbots that work with English text.
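The analogy above can be illustrated with a toy example. The two-dimensional vectors below are invented purely for illustration (real Word2Vec vectors are learned from text and have hundreds of dimensions), and the helper name subtract is an assumption of this sketch rather than part of any library.

```python
# Toy 2-D "embeddings", hand-written for illustration only.
# Axis 0 loosely stands for "royalty", axis 1 loosely stands for "gender".
toy_vectors = {
    "king":  [0.9, 0.8],
    "queen": [0.9, 0.2],
    "boy":   [0.1, 0.8],
    "girl":  [0.1, 0.2],
}

def subtract(a, b):
    """Element-wise difference of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]

royal_offset = subtract(toy_vectors["king"], toy_vectors["queen"])
child_offset = subtract(toy_vectors["boy"], toy_vectors["girl"])

# Both offsets point along the same "gender" axis.
print(royal_offset)
print(child_offset)
print(royal_offset == child_offset)
```

In real embeddings the two offsets are only approximately equal, which is why analogies are usually evaluated with a nearest-neighbour search rather than an exact comparison.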

Word embeddings are not random; they are generated by training a neural network. A popular and powerful implementation is Word2Vec, from Google, which is trained by predicting the words that appear near a given word in text. For example, for the word "cat", the network may learn to predict words such as "kitten" and "feline". Because words that occur in similar contexts end up with similar vectors, this intuition allows us to place words in a vector space.
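To make the idea of "words that appear next to other words" more concrete, here is a minimal sketch of how (word, neighbour) training pairs could be collected from a sentence. The function name context_pairs and the window size are assumptions for illustration; this is not Google's training code, only the general idea of a sliding context window.

```python
def context_pairs(tokens, window=2):
    """Return (center, context) pairs for every word within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

sentence = "the cat sat on the mat".split()

# With window=1, each word is paired with its immediate neighbours.
for pair in context_pairs(sentence, window=1):
    print(pair)
```

A network trained to predict the second element of each pair from the first ends up assigning similar vectors to words that share similar neighbours.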

Training embeddings from scratch is expensive, however, so in practice we tend to use models pre-trained by large organizations such as Google in order to prototype quickly and simplify deployment. In this tutorial we will download and use Google’s pre-trained Word2Vec word embeddings. We can do this by running the following command in our working directory.


Installing the Flask and Magnitude Packages

The word embedding model we downloaded is in the .magnitude format. This format is backed by SQLite, so the model can be queried efficiently without first loading every vector into memory, which makes it well suited to production servers. Since we need to be able to read the .magnitude format, we’ll install the pymagnitude package. We’ll also install flask to later serve the predictions made by the model.

pip3 install pymagnitude flask

We’ll also record our dependencies with the following command. This creates a file named requirements.txt that lists the installed Python libraries so we can re-install them at a later time.

pip3 freeze > requirements.txt

Querying the Word2Vec Model

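To begin, we’ll create a file to handle opening and querying the word embeddings. The server code in the next section imports from a module called model, so this file should be named model.py.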


Next, we’ll add the following lines to import Magnitude and load the downloaded model.

from pymagnitude import Magnitude
vectors = Magnitude('GoogleNews-vectors-negative300.magnitude')

We can play around with the pymagnitude package and the deep learning model by using the query method, providing an argument for a word.

cat_vector = vectors.query('cat')

For the core of our API, we will define a function that returns how similar in meaning two words are. This kind of score is the backbone of many deep learning solutions such as recommendation engines (e.g. showing content that uses similar words).

We can explore this idea with the similarity and most_similar methods of the model.

print(vectors.similarity("cat", "dog"))
print(vectors.most_similar("cat", topn=100))

We implement the similarity calculator as follows. This function will be called by the Flask API in the next section. Note that it returns a real-valued score: a cosine-style similarity lies between -1 and 1, and higher values mean the two words are closer in meaning.

def similarity(word1, word2):
	return vectors.similarity(word1, word2)
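As a rough illustration of what a similarity score between two vectors measures, cosine similarity can be computed by hand. This sketch assumes the model's score is cosine-based; the function name cosine_similarity and the tiny example vectors are made up for illustration and are not part of pymagnitude.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))   # same direction
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))   # orthogonal
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

The score depends only on the direction of the vectors, not their length, which is why the first pair scores as identical even though one vector is twice as long.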

Creating a REST API

We’ll create our server in a new Python file with the following contents. We import Flask and request to handle our server capabilities, and we import the similarity function from the model module we wrote earlier.

from flask import Flask, request
from model import similarity

app = Flask(__name__)

@app.route("/", methods=['GET'])
def welcome():
    return "Welcome to our Machine Learning REST API!"

@app.route("/similarity", methods=['GET'])
def similarity_route():
    word1 = request.args.get("word1")
    word2 = request.args.get("word2")
    return str(similarity(word1, word2))

if __name__ == "__main__":
    # Port 8000 matches the address used in the next section.
    app.run(port=8000, debug=True)

Our server is rather bare bones, but can easily be extended by creating more routes using the @app.route decorator.
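As one possible extension, the similarity route could validate its query parameters and return JSON instead of a bare string. The sketch below is self-contained, so it uses a stand-in similarity function rather than the real model; the error message and the JSON field names are assumptions, not part of the original tutorial.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def similarity(word1, word2):
    # Stand-in for the real model function so this sketch runs on its own.
    return 1.0 if word1 == word2 else 0.5

@app.route("/similarity", methods=["GET"])
def similarity_route():
    word1 = request.args.get("word1")
    word2 = request.args.get("word2")
    # Reject requests that omit either word instead of passing None to the model.
    if not word1 or not word2:
        return jsonify(error="word1 and word2 are required"), 400
    # float() converts NumPy scalar types into a plain JSON-serializable number.
    score = float(similarity(word1, word2))
    return jsonify(word1=word1, word2=word2, similarity=score)
```

Returning a 400 status for missing parameters lets clients tell a bad request apart from a server failure.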

Making API Calls

We can run our Flask server by running the following commands to activate our virtual environment, install our packages, and run its associated Python file.

source venv/bin/activate
pip3 install -r requirements.txt

Our server will be available at localhost:8000. We can query our API at localhost:8000/similarity?word1=cat&word2=dog and view the response either in our browser or through any other HTTP client.
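The same request can also be made from another Python service. The following is a minimal sketch using only the standard library; it assumes the server is running on localhost:8000 and returns the score as plain text, as in the route above. The function names build_similarity_url and fetch_similarity are hypothetical helpers introduced here.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # address used in this tutorial

def build_similarity_url(word1, word2, base_url=BASE_URL):
    """Build the /similarity URL, escaping the words for use in a query string."""
    query = urllib.parse.urlencode({"word1": word1, "word2": word2})
    return f"{base_url}/similarity?{query}"

def fetch_similarity(word1, word2):
    """Call the running API and parse the plain-text score it returns."""
    with urllib.request.urlopen(build_similarity_url(word1, word2)) as response:
        return float(response.read().decode("utf-8"))

print(build_similarity_url("cat", "dog"))
```

With the server running, fetch_similarity("cat", "dog") would return the score as a float. Building the query with urlencode keeps words containing spaces or special characters from breaking the URL.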