How to Use Vultr's Anaconda Marketplace App

Updated on January 12, 2023

Introduction

Anaconda is an open-source software distribution for the Python and R programming languages. It simplifies managing Python packages and their dependencies, and it bundles many tools for data science and machine learning. The Anaconda repository hosts over 8,000 open-source packages covering a broad range of data science and machine learning tasks.

This guide covers the deployment, security, and use of an Anaconda instance at Vultr. It explains how to deploy the instance securely, manage virtual environments, install packages from the Anaconda command line, and run Jupyter for data science and machine learning tasks.

You should have a working knowledge of Python to follow this guide.

Deploy the Instance

Deploy a Cloud GPU server with the Vultr Marketplace Anaconda App. The deployment may take a few minutes to complete, and you can follow the initialization process on the instance console.

After the deployment, log in to the server via SSH, then update the system and conda, and reboot the machine to apply the updates.

# apt-get update && apt-get dist-upgrade -y
# conda init
# conda update conda
# reboot

Secure the Instance

If you don't have an SSH key pair installed yet, create one and install it on your instance. You need your SSH key pair installed because you cannot log in via SSH with your password after making the next set of changes.

  1. Disable password logins over SSH. Open /etc/ssh/sshd_config with your favorite editor, then uncomment the following line and make sure its value is no:

     PasswordAuthentication no
  2. Restart the SSH server.

     # systemctl restart ssh
  3. It is an excellent policy to close all ports on the instance firewall and open only the required ports. Here's an example that closes all ports except SSH.

     # ufw reset
     # ufw enable
     # ufw default allow outgoing
     # ufw default deny incoming
     # ufw allow ssh/tcp

    The last command makes the firewall accept connections on port 22 from any host. If you have a static IP, you can change the last line to only allow connections from that specific IP.

     # ufw allow proto tcp from YOUR_IP to INSTANCE_IP port 22
  4. The ufw firewall supports connection rate limiting, which is useful for protecting against brute-force login attacks. When a limit rule is used, ufw will normally allow the connection but will deny connections if an IP address attempts to initiate 6 or more connections within 30 seconds.

     # ufw limit ssh/tcp
  5. Verify the firewall rules using the command below.

     # ufw status verbose
     Status: active
     Logging: on (low)
     Default: deny (incoming), allow (outgoing), disabled (routed)
     New profiles: skip
    
     To                         Action      From
     --                         ------      ----
     22/tcp                     LIMIT IN    Anywhere
     22/tcp (v6)                LIMIT IN    Anywhere (v6)
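
The rate-limiting rule from step 4 can be pictured as a sliding window per IP address. The following Python sketch is a simplified model of that rule as described above (6 or more attempts within 30 seconds are denied). It is not how ufw is implemented, and the class name ConnectionLimiter is hypothetical.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 30  # length of the sliding window, from the rule described above
MAX_HITS = 6         # the attempt that brings the count to 6 is denied


class ConnectionLimiter:
    """Simplified model of a per-IP connection rate limit (hypothetical class)."""

    def __init__(self, window=WINDOW_SECONDS, max_hits=MAX_HITS):
        self.window = window
        self.max_hits = max_hits
        self.attempts = defaultdict(deque)  # ip -> timestamps of recent attempts

    def allow(self, ip, now):
        """Record an attempt at time `now` (seconds); return True if it is allowed."""
        hits = self.attempts[ip]
        # Forget attempts that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        hits.append(now)
        return len(hits) < self.max_hits


limiter = ConnectionLimiter()
# Six attempts from one example address, one second apart.
results = [limiter.allow("203.0.113.5", second) for second in range(6)]
print(results)  # [True, True, True, True, True, False]
```

In this model, denied attempts still count toward the window, so a client that keeps retrying stays blocked until it pauses long enough for its earlier attempts to expire.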

How to Use Anaconda

How to Manage Virtual Environments

Use a separate virtual environment for each Python project so that its packages and Python version do not conflict with other projects. Here are some common commands.

  • Create a virtual environment with Anaconda:

      # conda create --name myvenv
  • Create the virtual environment specifying the Python version:

      # conda create --name myvenv python=3.10
  • Clone an existing environment:

      # conda create --name clonevenv --clone myvenv
  • Activate a virtual environment:

      # conda activate myvenv
  • Deactivate an environment:

      # conda deactivate
  • Remove a virtual environment:

      # conda env remove -n myvenv
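
To confirm from inside Python which environment a script is running in, a small check like the one below may help. It is a minimal sketch: the function name describe_environment is hypothetical, and it relies on the CONDA_DEFAULT_ENV variable that conda sets when an environment is activated.

```python
import os
import sys


def describe_environment():
    """Return basic facts about the interpreter and the active conda environment."""
    return {
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Directory of the Python installation that is running this script.
        "prefix": sys.prefix,
        "python": ".".join(str(part) for part in sys.version_info[:3]),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

After running conda activate myvenv, the conda_env entry should read myvenv and the prefix should point inside that environment's directory.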

Installing Packages with Anaconda

  • Install a package:

      # conda install pandas
  • View the installed packages:

      # conda list
  • Remove an installed package:

      # conda remove pandas

The commands above run in the active virtual environment. To run them in another virtual environment, add the -n parameter:

# conda install -n anothervenv pandas
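
To check from Python whether a package is present in the current environment, the standard library's importlib.metadata can be used. This is a minimal sketch; the helper name installed_version is hypothetical, and the package names are only examples taken from this guide.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name in ("pandas", "scikit-learn"):
    print(name, "->", installed_version(name) or "not installed")
```

Run it with the interpreter of the environment you want to inspect, because the result reflects only the environment that executes the script.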

Usage Example

Here's an example that shows how to use the Anaconda Marketplace app to run the K-Nearest Neighbor (KNN) algorithm.

  1. Create an environment for the example.

     # conda create --name knn_example
     # conda activate knn_example
  2. Install the scikit-learn library.

     # conda install scikit-learn
  3. Create the file knn_example.py with the following code:

     from sklearn import datasets
     from sklearn.model_selection import train_test_split
     from sklearn.neighbors import KNeighborsClassifier
     from sklearn import metrics
    
     # Load data
     iris = datasets.load_iris()
     print("Features:", iris.feature_names)
     print("Labels:", iris.target_names)
     print("Data size:", iris.data.shape)
     print("Examples:\n", iris.data[0:3])
    
     # Split train and test instances. 70% training, and 30% test
     X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3)
    
     # Model fit
     knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
    
     # Predict
     y_pred = knn.predict(X_test)
    
     # Result
     print("Accuracy:", metrics.accuracy_score(y_test, y_pred))

    This code loads the iris flower dataset and prints some information about it. The dataset is split randomly into training and test subsets. The training subset is used to fit the KNN classifier, and the test subset is used to validate the model by comparing its predictions with the real labels. Because the split is random, the accuracy varies slightly from run to run.

  4. Run the code with:

     # python knn_example.py
     Features: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
     Labels: ['setosa' 'versicolor' 'virginica']
     Data size: (150, 4)
     Examples:
      [[5.1 3.5 1.4 0.2]
      [4.9 3.  1.4 0.2]
      [4.7 3.2 1.3 0.2]]
     Accuracy: 0.9555555555555556
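
To make the classifier's behavior more concrete, here is a minimal pure-Python sketch of the KNN idea: find the k training samples closest to a query point and take a majority vote over their labels. It is an illustration only, not scikit-learn's implementation. The function name knn_predict is hypothetical, and the six sample points are made-up values loosely resembling petal measurements.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=5):
    """Predict a label for `query` by majority vote among its k nearest neighbors."""
    # Pair every training point with its label and sort by Euclidean distance.
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up (petal length, petal width) samples, for illustration only.
points = [(1.4, 0.2), (1.3, 0.2), (1.5, 0.2), (4.7, 1.4), (4.5, 1.5), (4.9, 1.5)]
labels = ["setosa", "setosa", "setosa", "versicolor", "versicolor", "versicolor"]

print(knn_predict(points, labels, (1.6, 0.3), k=3))  # setosa
print(knn_predict(points, labels, (4.6, 1.4), k=3))  # versicolor
```

An odd k with two classes avoids tied votes. The scikit-learn example above uses n_neighbors=5 for the same kind of vote over the full dataset.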

How to Secure Jupyter Notebook

With SSH Tunneling

Anaconda comes with many packages already installed in the base environment, and one of those packages is Jupyter. By default, Jupyter runs a webserver only accessible from localhost. To be able to use it from a remote machine, you can use SSH port forwarding.

  1. Connect to the Anaconda instance using SSH and port forwarding. Jupyter uses port 8888 by default.

     $ ssh -L 8888:127.0.0.1:8888 root@INSTANCE_IP
  2. After you connect to the Anaconda instance, start Jupyter Notebook:

     # jupyter nbclassic --allow-root --no-browser
     ...
         To access the notebook, open this file in a browser:
             file:///root/.local/share/jupyter/runtime/nbserver-17248-open.html
         Or copy and paste one of these URLs:
             http://localhost:8888/?token=864fa232b1c37127616370df9e6bf1f867658c240ac9d97f
          or http://127.0.0.1:8888/?token=864fa232b1c37127616370df9e6bf1f867658c240ac9d97f
  3. Copy the URL and paste it into the browser on your local machine.

The connection is secure because traffic to port 8888 travels through the encrypted SSH tunnel, and Jupyter itself still listens only on localhost.
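
If the browser cannot load the page, a quick check on your local machine can show whether the forwarded port is reachable at all. The sketch below assumes the tunnel from step 1 on port 8888; the function name port_is_open is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


print("Tunnel reachable:", port_is_open("127.0.0.1", 8888))
```

A result of False usually means the SSH session with the -L option is not running. Jupyter not having started yet is more likely to show up as a failed page load than as a closed port.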

With HTTPS and Password

You can also make Jupyter available publicly using HTTPS with password protection.

  1. Generate the Jupyter configuration file:

     # jupyter server --generate-config
  2. Define a secure password.

     # jupyter server password

    You can run this command anytime you need to change the password.

  3. Generate a self-signed SSL certificate with OpenSSL to secure the HTTPS connections.

     # openssl req -x509 -newkey rsa:4096 -keyout /etc/ssl/private/server.key -out /etc/ssl/server.crt -days 365 -nodes

    Or, if you have a domain, use a certificate authority like Let's Encrypt to issue the certificate.

  4. Edit the following lines in the Jupyter configuration file /root/.jupyter/jupyter_server_config.py:

     c.ServerApp.allow_password_change = False
     c.ServerApp.allow_root = True
     c.ServerApp.certfile = u'/etc/ssl/server.crt'
     c.ServerApp.keyfile = u'/etc/ssl/private/server.key'
     c.ServerApp.ip = '0.0.0.0'
     c.ServerApp.open_browser = False
     c.ServerApp.port = 8888
  5. Open the Jupyter port on the firewall.

     # ufw allow 8888/tcp
  6. Start the Jupyter server. No parameters are needed because they are already set in the configuration file.

     # jupyter nbclassic

Anyone on the internet can reach your server at https://INSTANCE_IP:8888/, but access is protected by your password. A self-signed SSL certificate triggers a warning in your browser when you open the Jupyter web server. The warning is expected for a self-signed certificate, and you can proceed past it.
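
Before proceeding past the browser warning, it can be worth confirming that the certificate the browser shows is the one you generated. The following is a minimal sketch that prints the SHA-256 fingerprint of the certificate file, which you can compare with the fingerprint in the browser's certificate details. The function name sha256_fingerprint is hypothetical, and the path is the one used in step 3.

```python
import hashlib
import ssl
from pathlib import Path

CERT_PATH = Path("/etc/ssl/server.crt")  # certificate path from step 3


def sha256_fingerprint(pem_text):
    """Return the SHA-256 fingerprint of a PEM certificate as colon-separated hex."""
    # Convert the PEM text to raw DER bytes, which is what fingerprints are computed over.
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    digest = hashlib.sha256(der_bytes).digest()
    return ":".join(f"{byte:02X}" for byte in digest)


if CERT_PATH.exists():
    print(sha256_fingerprint(CERT_PATH.read_text()))
else:
    print(f"{CERT_PATH} not found; run this on the Anaconda instance")
```

The sketch assumes the file contains a single certificate, as produced by the openssl command in step 3.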

JupyterLab is also available by appending /lab to the URL, as in these examples, depending on which method you use.

  • http://127.0.0.1:8888/lab?token=864fa232b1c37127616370df9e6bf1f867658c240ac9d97f
  • https://INSTANCE_IP:8888/lab

Conclusion

This guide covered deploying and securing an Anaconda instance, managing Python virtual environments, and installing and uninstalling packages using Anaconda. It also showed two usage examples: KNN classification with scikit-learn and secure access to Jupyter Notebook.
