Generating Videos with HuggingFace ModelScope Text2Video Diffusion Model on Vultr Cloud GPU

Updated on July 25, 2024

Introduction

ModelScope is an open-source platform that offers a range of machine learning models and datasets for use in AI applications. The ModelScope text-to-video model generates short videos from text prompts and lets you customize the generation parameters. The model has around 1.7 billion parameters, is trained on public datasets, and works with the open-source WebUI web interface. WebUI is primarily used with Stable Diffusion models and also supports the ModelScope text-to-video model to generate videos from prompts.

This article explains how to generate videos with the HuggingFace ModelScope text2video diffusion model on a Vultr GPU server. You will set up the model environment and generate videos from text prompts, existing images, or videos. Additionally, you will upscale the generated videos to improve their resolution and quality to match your needs.

Prerequisites

Before you begin:

Install WebUI and ModelScope

  1. Install the Git LFS (Large File Storage) dependency package.

    console
    $ sudo apt install git-lfs -y
    
  2. Initialize Git LFS to use the WebUI and ModelScope repositories.

    console
    $ git lfs install
    
  3. Install the FastAPI framework using the Python Pip package manager.

    console
    $ pip3 install fastapi
    
  4. Switch to your user home directory.

    console
    $ cd
    
  5. Clone the WebUI repository using Git.

    console
    $ git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
    
  6. Navigate to the new WebUI directory.

    console
    $ cd stable-diffusion-webui
    
  7. Set the WebUI version to 1.7.0, which is compatible with ModelScope.

    console
    $ git reset --hard cf2772f
    

    Output:

    HEAD is now at cf2772fa Merge branch 'release_candidate'

    ModelScope is not compatible with some WebUI versions. Test later versions after completing this article, or keep version 1.7.0 to avoid compatibility errors.

  8. Switch to the models directory to set up ModelScope files.

    console
    $ cd models
    
  9. Create a new directory named ModelScope.

    console
    $ mkdir ModelScope
    
  10. Switch to the directory.

    console
    $ cd ModelScope
    
  11. Create another directory named t2v.

    console
    $ mkdir t2v
    
  12. Switch to the t2v directory.

    console
    $ cd t2v
    
  13. Download the latest ModelScope text-to-video model.

    console
    $ git clone https://huggingface.co/ali-vilab/modelscope-damo-text-to-video-synthesis .
    

    The above command takes between 5 and 15 minutes to complete, depending on your network speed.
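The directory setup in steps 8 through 12 can also be condensed into one command with mkdir -p, which creates nested directories in a single call. A sketch, run from the stable-diffusion-webui directory:

```shell
# Equivalent of steps 8-12: create the nested model directory in one call,
# then switch into it before cloning the model as in step 13.
mkdir -p models/ModelScope/t2v
cd models/ModelScope/t2v
pwd
```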

Configure Nginx as a Reverse Proxy to Access WebUI

WebUI is accessible on the localhost port 7860 and only accepts local connection requests. To enable external access to the WebUI interface, configure the Nginx web server as a reverse proxy to forward all requests using your domain name to the backend WebUI port. Follow the steps below to create a new Nginx host configuration and forward external requests to the WebUI port.
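The proxy is needed because of how WebUI binds its port: a server bound to 127.0.0.1 is reachable only from the machine itself, while Nginx listens on the public interface and relays traffic to the loopback address. A minimal Python sketch of the loopback-only binding (illustrative only, not part of the setup):

```python
import socket

# Bind a throwaway socket to the loopback interface, the same way WebUI
# binds 127.0.0.1:7860. Port 0 lets the OS pick any free port.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(("127.0.0.1", 0))
address, port = sock.getsockname()

# The socket is reachable only via the loopback address, so remote hosts
# cannot connect to it directly; a reverse proxy must relay for them.
print(address)  # 127.0.0.1
sock.close()
```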

  1. Create a new Nginx host configuration file webui.conf using a text editor such as Nano.

    console
    $ sudo nano /etc/nginx/sites-available/webui.conf
    
  2. Add the following configurations to the file. Replace webui.example.com with your actual domain name.

    nginx
    upstream webui {
        server 127.0.0.1:7860;
    }
    
    server {
        listen 80;
        listen [::]:80;
        server_name webui.example.com;
    
        proxy_set_header Host $host;
        proxy_http_version 1.1;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Sec-WebSocket-Extensions $http_sec_websocket_extensions;
        proxy_set_header Sec-WebSocket-Key $http_sec_websocket_key;
        proxy_set_header Sec-WebSocket-Version $http_sec_websocket_version;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Upgrade";
    
        location / {
            proxy_pass http://webui;
        }
    }
    

    Save and close the file.

    The above Nginx configuration forwards all incoming HTTP connections on port 80 using your domain name webui.example.com to the backend WebUI port 7860.

  3. Enable the new WebUI Nginx configuration file.

    console
    $ sudo ln -s /etc/nginx/sites-available/webui.conf /etc/nginx/sites-enabled/
    
  4. Test the Nginx configuration for errors.

    console
    $ sudo nginx -t
    

    Output:

    nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
    nginx: configuration file /etc/nginx/nginx.conf test is successful
  5. Restart Nginx to apply the configuration changes.

    console
    $ sudo systemctl restart nginx
    
  6. Allow the HTTP port 80 through the default UFW firewall to enable HTTP connections on the server.

    console
    $ sudo ufw allow 80/tcp
    
  7. Reload the UFW firewall rules to apply changes.

    console
    $ sudo ufw reload
    

Secure WebUI with Trusted Let's Encrypt SSL Certificates

WebUI is accessible on the HTTP port 80 handled by the Nginx reverse proxy connection. Generate trusted Let's Encrypt SSL certificates to secure access to the WebUI interface and encrypt all traffic while generating videos with ModelScope. Follow the steps below to install the Certbot Let's Encrypt client and generate trusted SSL certificates on your WebUI domain webui.example.com.

  1. Install Certbot using the Snap package manager.

    console
    $ sudo snap install certbot --classic
    
  2. Create a symbolic link to enable the system-wide Certbot command.

    console
    $ sudo ln -s /snap/bin/certbot /usr/bin/certbot
    
  3. Request a new Let's Encrypt SSL certificate using the WebUI domain in your Nginx configuration. Replace webui.example.com and user@example.com with your actual details.

    console
    $ sudo certbot --nginx -d webui.example.com -m user@example.com --agree-tos
    

    Output:

    Successfully received certificate.
    Certificate is saved at: /etc/letsencrypt/live/webui.example.com/fullchain.pem
    Key is saved at:         /etc/letsencrypt/live/webui.example.com/privkey.pem
    This certificate expires on 2024-08-04.
    These files will be updated when the certificate renews.
    Certbot has set up a scheduled task to automatically renew this certificate in the background.
    
    Deploying certificate
    Successfully deployed certificate for webui.example.com to /etc/nginx/sites-enabled/webui.conf
    Congratulations! You have successfully enabled HTTPS on https://webui.example.com
  4. Test the Certbot automatic renewal process and verify that the SSL certificate auto-renews upon expiry.

    console
    $ sudo certbot renew --dry-run
    

    Output:

    Account registered.
    Simulating renewal of an existing certificate for webui.example.com
    
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    Congratulations, all simulated renewals succeeded: 
      /etc/letsencrypt/live/webui.example.com/fullchain.pem (success)
    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  5. Allow the HTTPS port 443 through the UFW firewall to enable secure access to your server.

    console
    $ sudo ufw allow 443/tcp
    
  6. Reload UFW to apply the firewall changes.

    console
    $ sudo ufw reload
    

Create a New System Service for WebUI

WebUI runs on the localhost port 7860 when the main application file launch.py executes. To control the WebUI application process on your server, set up a new system service to run and manage the application, and write the process output, errors, and warnings to a standalone log file as described in the following steps.

  1. Create a new system service file sdwebui.service.

    console
    $ sudo nano /etc/systemd/system/sdwebui.service
    
  2. Add the following configurations to the file. Replace exampleuser with your actual user and /home/exampleuser/stable-diffusion-webui with your actual WebUI application files path.

    systemd
    [Unit]
    Description=Stable Diffusion AUTOMATIC1111 Web UI service
    After=network.target
    StartLimitIntervalSec=0
    
    [Service]
    Type=simple
    Restart=always
    RestartSec=1
    User=exampleuser
    ExecStart=/usr/bin/env python3 /home/exampleuser/stable-diffusion-webui/launch.py
    StandardOutput=append:/var/log/sdwebui.log
    StandardError=append:/var/log/sdwebui.log
    WorkingDirectory=/home/exampleuser/stable-diffusion-webui
    
    [Install]
    WantedBy=multi-user.target
    

    Save and close the file.

    The above configuration runs WebUI as a system service that automatically restarts when the application fails and logs all output to the /var/log/sdwebui.log file.

  3. Reload the systemd daemon to apply the system service.

    console
    $ sudo systemctl daemon-reload
    
  4. Enable the WebUI service to start at boot time.

    console
    $ sudo systemctl enable sdwebui
    

    Output:

    Created symlink /etc/systemd/system/multi-user.target.wants/sdwebui.service → /etc/systemd/system/sdwebui.service.
  5. Start the WebUI service.

    console
    $ sudo systemctl start sdwebui
    
  6. View the WebUI service status and verify that the application is running.

    console
    $ sudo systemctl status sdwebui
    

    Output:

    ● sdwebui.service - Stable Diffusion AUTOMATIC1111 Web UI service
         Loaded: loaded (/etc/systemd/system/sdwebui.service; disabled; vendor preset: enabled)
         Active: active (running) since Sun 2024-04-28 18:07:19 UTC; 14s ago
       Main PID: 29672 (python3)
          Tasks: 3 (limit: 77017)
         Memory: 4.3G
            CPU: 13.483s
         CGroup: /system.slice/sdwebui.service
                 ├─29672 python3 /home/user/stable-diffusion-webui/launch.py
                 ├─29675 /bin/sh -c "\"/usr/bin/python3\" -m pip install torch==2.0.1 torchvision==0.15.2 --extra-index-url https://download.pytorch.org/whl/cu118"
                 └─29676 /usr/bin/python3 -m pip install torch==2.0.1 torchvision==0.15.2 --extra-index-url https://download.pytorch.org/whl/cu118
  7. Wait at least 15 minutes for the first-run setup and model downloads to complete, then view the application logs to verify the WebUI application status.

    console
    $ tail -f /var/log/sdwebui.log
    

    Your output should be similar to the one below.

    Downloading: "https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors" to /home/exampleuser/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
    
    100%|██████████| 3.97G/3.97G [00:51<00:00, 83.2MB/s]
    Calculating sha256 for /home/exampleuser/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors: Running on local URL:  http://127.0.0.1:7860
    
    To create a public link, set `share=True` in `launch()`.
    Startup time: 341.6s (prepare environment: 282.4s, import torch: 3.1s, import gradio: 0.7s, setup paths: 0.8s, initialize shared: 0.2s, other imports: 0.8s, setup codeformer: 0.1s, list SD models: 52.1s, load scripts: 0.6s, create ui: 0.7s, gradio launch: 0.1s).
    6ce0161689b3853acaa03779ec93eafe75a02f4ced659bee03f50797806fa2fa
    Loading weights [6ce0161689] from /home/exampleuser/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
    Creating model from config: /home/exampleuser/stable-diffusion-webui/configs/v1-inference.yaml
    /home/exampleuser/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
      warnings.warn(
    Applying attention optimization: Doggettx... done.
    Model loaded in 18.8s (calculate hash: 10.7s, load weights from disk: 0.2s, create model: 4.6s, apply weights to model: 3.0s, calculate empty prompt: 0.1s).

    Press Ctrl + C to stop the WebUI logs output.

  8. Update the tqdm module required by the WebUI interface to run models.

    console
    $ pip3 install --upgrade tqdm
    

Access WebUI and Generate Videos using ModelScope

  1. Access your WebUI domain using a web browser such as Chrome.

    https://webui.example.com

    Verify that the WebUI interface displays in your browser session.

    Access the WebUI Interface

Install ModelScope Extensions

  1. Click Extensions on the main WebUI navigation menu.

  2. Navigate to the Available tab and click Load from: to load all available extensions from the WebUI extensions index.

  3. Enter modelscope in the search field to search the available extensions.

  4. Click Install next to the text2video search result to install the ModelScope extension. Wait for the extension installation process to complete and verify that it's installed in your project directory, similar to the following output.

    Installed into /home/exampleuser/stable-diffusion-webui/extensions/sd-webui-text2video. Use Installed tab to restart.
  5. Clear the modelscope keyword from the search field and enter video in extras tab to find additional extensions.

  6. Click Install next to the Video in Extras tab result to install the new extension.

    Install the Video In Extras Tab Extension for Stable Diffusion Web UI

  7. Wait for the extension installation to complete and click Reload UI at the bottom of the web page to reload WebUI.

    Reload the Stable Diffusion Web UI

    Wait at least 2 minutes for WebUI to restart and load all installed extensions on your server.

Generate Videos From Text Prompts using ModelScope

  1. Navigate to the txt2video tab within the WebUI interface.

  2. Verify that ModelScope is selected as the Model type.

  3. Enter your desired text prompt in the Prompt field to define your generated video. For example, an astronaut driving on the moon in a photorealistic way.

    Generate a Video Using the Stable Diffusion Web UI

  4. Click Generate and wait for at least 3 minutes for the ModelScope generation process to complete.

  5. Preview the generated video and verify that it matches your needs. To modify the result, change your text prompt and negative prompt, then generate a new video with ModelScope that matches your goals.

    Preview a Video Using the Stable Diffusion Web UI

Generate Videos From Images using Inpainting and ModelScope

  1. Scroll and expand the img2vid option below the txt2video customization section.

  2. Click the Drop File Here field to browse and upload a base image from your computer to the WebUI.

  3. Click the inpainting frames slider to set your target number of video frames to generate. For example, 24.

    Configure the Prompt for Generating a Video From an Image Using the Stable Diffusion Web UI

  4. Enter a text prompt in the Prompt field to define your generated video based on your input image. For example, upload an image of a car and use a prompt such as the car is moving on a road.

  5. Click Generate to create your video with ModelScope.

  6. Wait for the video generation process to complete and preview the generated video or modify your input prompts.

    Generate a Video From an Image Using the Stable Diffusion Web UI
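The inpainting frames value controls the clip length: the number of frames divided by the output frame rate gives the duration in seconds. A quick sketch, where the 8 fps rate is an assumed example value; check the FPS option in your generation settings:

```python
# Clip-length arithmetic: duration = frames / fps.
# fps here is an assumed example value, not a ModelScope constant.
frames = 24  # inpainting frames chosen in the WebUI slider
fps = 8      # assumed output frame rate
duration_seconds = frames / fps
print(duration_seconds)  # 3.0
```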

Generate Videos from Existing Videos using Vid2vid

  1. Navigate to the vid2vid tab within the model options next to txt2vid.

  2. Click the Drop File Here field to browse your local computer files and upload a base video file. If the video file is already available on your server, enter its path in the Input video path field instead.

  3. Enter a text prompt in the Prompt field to define your generated video and verify any customization values such as the final video resolution.

  4. Click Generate to start the video generation process using ModelScope.

    Generate a Video From an Existing Video Using the Stable Diffusion Web UI

Upscale Generated Videos

The resolution of generated videos matches your target width and height customization options during the ModelScope generation process. Upscaling the generated videos enhances the original resolution to improve the final video quality. Follow the steps below to upscale your generated videos using the WebUI interface.
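As a rule of thumb, an upscaler multiplies each dimension by its factor, so the ESRGAN_4x option used below quadruples the width and height and grows the pixel count sixteen-fold. A quick sketch with illustrative input dimensions:

```python
# Each dimension is multiplied by the upscaler factor, so the total pixel
# count grows by factor ** 2. The 256x256 input size is illustrative.
width, height, factor = 256, 256, 4
up_width, up_height = width * factor, height * factor
print(up_width, up_height)  # 1024 1024
print((up_width * up_height) // (width * height))  # 16
```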

  1. Access your server SSH session and switch to your user home directory.

    console
    $ cd
    
  2. Create a new upscaled-videos directory to store your upscaled video files.

    console
    $ mkdir upscaled-videos
    
  3. Switch to the WebUI text2video outputs directory. Replace /home/user/stable-diffusion-webui/outputs/img2img-images/text2video/ with your actual outputs directory path.

    console
    $ cd /home/user/stable-diffusion-webui/outputs/img2img-images/text2video/
    
  4. List files in the directory to verify the generated naming structure.

    console
    $ ls
    

    Your output should look like the one below.

    20240310182107

    Based on the above output, WebUI writes generated videos to separate directories named after the generation time. For example, 20240310182107 translates to 2024-03-10 at 18:21:07. Copy the directory name to view its contents and locate the video files to upscale.

  5. Switch to your target output video directory. For example, 20240310182107.

    console
    $ cd 20240310182107
    
  6. Print the absolute path of the generated video file vid.mp4 to use when upscaling the video.

    console
    $ readlink -f vid.mp4
    

    Output:

    /home/user/stable-diffusion-webui/outputs/img2img-images/text2video/20240310182107/vid.mp4
  7. Access your WebUI interface in a new web browser session.

    https://webui.example.com
  8. Click Extras on the main WebUI navigation menu.

  9. Navigate to the Video tab next to Batch from Directory.

  10. Paste your generated video absolute path in the Input video field.

  11. Enter your upscaled videos directory path in the Output directory field. For example, /home/user/upscaled-videos/.

  12. Click the Upscaler 1 dropdown and select a variant such as ESRGAN_4x from the list of model options.

  13. Click Generate to start upscaling your target video.

    Upscale a Video Using the Stable Diffusion Web UI

  14. Wait at least 5 minutes, depending on the input file size, for the upscaling process to complete. Then, switch to your terminal session to set up a new URL path to preview the upscaled video.

  15. Navigate to your output upscaled-videos directory.

    console
    $ cd /home/user/upscaled-videos
    
  16. List the directory files and verify that a new out_ directory is available.

    console
    $ ls
    

    Your output should look like the one below.

    out_1710096504
  17. Switch to the new upscaled output directory to view the enhanced video files. For example, out_1710096504.

    console
    $ cd out_1710096504
    
  18. List files in the directory and verify that a new .mp4 video file is available.

    console
    $ ls
    

    Output:

    1443.png 2100.png 3002.png output_vid.mp4_1710096504.mp4

    Copy the full upscaled video filename that ends with the .mp4 extension, such as output_vid.mp4_1710096504.mp4 above, to rename and store it in your web server root files directory.

  19. Copy the upscaled video file to your web files directory /var/www/html as upscaled_video.mp4. Replace output_vid.mp4_1710096504.mp4 with your actual filename.

    console
    $ sudo cp output_vid.mp4_1710096504.mp4 /var/www/html/upscaled_video.mp4
    
  20. Open the WebUI Nginx configuration file.

    console
    $ sudo nano /etc/nginx/sites-enabled/webui.conf
    
  21. Add the following configuration within your listen 443 ssl server block.

    nginx
    location /video_preview {
         alias /var/www/html;
         try_files $uri $uri/ =404;
    }
    

    Save and close the file.

    The above configuration serves all /video_preview path requests from the /var/www/html directory.

  22. Test the Nginx configuration for errors.

    console
    $ sudo nginx -t
    

    Output:

    nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
    nginx: configuration file /etc/nginx/nginx.conf test is successful
  23. Restart Nginx to apply the configuration changes.

    console
    $ sudo systemctl restart nginx
    
  24. Load your upscaled video in a new web browser window using your WebUI domain with the /video_preview path.

    https://webui.example.com/video_preview/upscaled_video.mp4

    Verify that the upscaled video displays in your web browser session.
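As a side note, the timestamped directory names WebUI uses for generated videos (for example, 20240310182107 earlier in this section) are plain YYYYMMDDHHMMSS stamps, so you can decode them with standard tools. For example, in Python:

```python
from datetime import datetime

# WebUI names each output directory after the generation time in
# YYYYMMDDHHMMSS form; strptime decodes it back into a date and time.
name = "20240310182107"
generated_at = datetime.strptime(name, "%Y%m%d%H%M%S")
print(generated_at)  # 2024-03-10 18:21:07
```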

Troubleshooting

Depending on your production server setup, you may encounter errors while running WebUI or generating videos with ModelScope. Follow the troubleshooting steps below to fix common issues in the video generation pipeline.

  • WebUI does not generate videos with ModelScope: Verify that the model files are available in the specified path.
  • Unable to Access WebUI: View the application logs using the command sudo tail -f /var/log/sdwebui.log to identify WebUI or ModelScope application errors.
  • HTTP Error 502: The Nginx reverse proxy is unable to connect to the WebUI application. Verify that WebUI is running using the sudo systemctl status sdwebui command.
  • WebUI service startup failure: Verify that WorkingDirectory, ExecStart and User parameters are correctly set in your sdwebui.service file.
  • ModelScope generates distorted videos: Increase the CFG scale and number of generation steps.
  • ModuleNotFoundError: No module named 'tqdm.auto': Run pip3 install --upgrade tqdm to update the tqdm package using Pip.
  • torch.cuda.OutOfMemoryError: CUDA out of memory.: The server doesn't have enough GPU memory resources to generate videos. Upgrade your server or clear the GPU memory to generate videos with ModelScope.

Conclusion

You have deployed ModelScope and generated videos using the WebUI web interface. The ModelScope txt2video model lets you generate videos from text prompts, existing images, and videos using your Vultr Cloud GPU server resources. For more information and configuration options, visit the ModelScope Hugging Face page.

Note
The ModelScope text-to-video model is distributed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license. You can use the model for research purposes, but commercial use is not permitted. In addition, the license requires you to give appropriate credit to the model creators.