Getting Started with Pusa V1: Complete Setup Guide

Learn how to install and configure Pusa V1 on your system, from downloading the model to running your first video generation.

July 20, 2025 · By Pusa V1 Team

Pusa V1 is an open-source video generation model built on Alibaba's Wan 2.1, offering 5x faster processing with improved quality. This guide walks you through the complete setup process to get Pusa V1 running on your system.

System Requirements

Before installing Pusa V1, make sure your system meets the following requirements (a quick pre-flight check in Python follows the list):

  • CUDA 12.4: Required for optimal performance
  • GPU: NVIDIA GPU with sufficient VRAM (8GB+ recommended)
  • Python: Python 3.8 or higher
  • Storage: At least 10GB free space for model weights
  • RAM: 16GB+ system RAM recommended
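
You can sanity-check most of these requirements from Python before going any further. A minimal pre-flight sketch, assuming PyTorch is already installed:

import shutil
import torch

# Confirm the NVIDIA driver tooling is on the PATH
print("nvidia-smi found:", shutil.which("nvidia-smi") is not None)

# Confirm PyTorch can see a GPU and which CUDA build it ships with
print("CUDA available:", torch.cuda.is_available())
print("PyTorch CUDA build:", torch.version.cuda)

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GB")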

Installation Steps

Step 1: Clone the Repository

Start by cloning the official Pusa V1 repository from GitHub:

git clone https://github.com/Yaofang-Liu/Pusa-VidGen.git
cd Pusa-VidGen

Step 2: Install Dependencies

Install the required Python packages:

pip install -r requirements.txt

Step 3: Download Model Weights

Download the Pusa V1 model weights from HuggingFace:

git lfs install
git clone https://huggingface.co/RaphaelLiu/PusaV1
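
If you prefer not to use Git LFS, the same weights can be fetched with the huggingface_hub Python library instead. A minimal sketch:

from huggingface_hub import snapshot_download

# Download the full PusaV1 weights repository into a local folder
snapshot_download(repo_id="RaphaelLiu/PusaV1", local_dir="./PusaV1")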

Configuration

Environment Setup

Create a Python virtual environment for cleaner dependency management. Ideally, create and activate it before installing dependencies in Step 2, so packages stay isolated from your system Python:

python -m venv pusav1_env
source pusav1_env/bin/activate # On Windows: pusav1_env\Scripts\activate

Model Configuration

The model supports several generation modes for different use cases (summarized as code after the list):

  • Text-to-Video: Generate videos from text descriptions
  • Image-to-Video: Convert static images to video sequences
  • Video Extension: Extend existing video clips
  • Start-End Frames: Generate videos between two images
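
For quick reference, the snippet below restates these modes as a simple mapping from mode to required inputs. The names are illustrative only and are not Pusa V1's actual API:

# Illustrative only: these names do not come from Pusa V1's API
GENERATION_MODES = {
    "text_to_video": ["text prompt"],
    "image_to_video": ["start image", "text prompt"],
    "video_extension": ["source video clip", "text prompt"],
    "start_end_frames": ["start image", "end image", "text prompt"],
}

for mode, inputs in GENERATION_MODES.items():
    print(f"{mode}: requires {', '.join(inputs)}")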

Running Your First Generation

Text-to-Video Example

Here's a simple example to generate a video from text:

python generate_video.py --prompt "A cat walking in a garden" --output_path ./output/

Image-to-Video Example

Generate a video from a starting image:

python generate_video.py --image_path ./input/start_image.jpg --prompt "The image comes to life" --output_path ./output/
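
If you want to script several generations in a row, one option is to wrap the CLI in Python. A minimal sketch, reusing only the flags shown in the examples above:

import subprocess

prompts = [
    "A cat walking in a garden",
    "Waves crashing on a rocky shore at sunset",
]

# Run generate_video.py once per prompt, writing each result to its own folder
for i, prompt in enumerate(prompts):
    subprocess.run(
        ["python", "generate_video.py",
         "--prompt", prompt,
         "--output_path", f"./output/run_{i}/"],
        check=True,  # stop the batch if a generation fails
    )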

Performance Optimization

GPU Memory Management

Pusa V1 is optimized for efficiency out of the box, but you can reduce memory pressure further (a sketch of the inference pattern follows this list):

  • Use gradient checkpointing for memory efficiency (this applies when fine-tuning; it has no effect at inference time)
  • Adjust the batch size to fit your GPU memory
  • Enable mixed precision inference for faster, lighter generation
  • Use video resolution settings appropriate to your task
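
The sketch below shows the general PyTorch pattern for memory-friendly inference. Here pipe is a stand-in for whatever generation callable Pusa V1 exposes, not its actual API:

import torch

def run_generation(pipe, prompt: str):
    # pipe is a stand-in for the model's generation callable (assumption)
    with torch.inference_mode():  # skip autograd bookkeeping entirely
        # Run matmuls/convolutions in fp16 to roughly halve activation VRAM
        with torch.autocast(device_type="cuda", dtype=torch.float16):
            return pipe(prompt)

# Between runs, hand cached allocator blocks back to the driver if VRAM is tight
torch.cuda.empty_cache()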

Quality vs Speed Trade-offs

Pusa V1 exposes several parameters that trade quality against generation speed (illustrative presets follow the list):

  • Inference Steps: Fewer steps = faster generation, more steps = higher quality
  • Resolution: Lower resolution = faster processing
  • Frame Rate: Adjust based on your needs
  • Duration: Shorter videos generate faster

Troubleshooting

Common Issues

CUDA Version Mismatch

Ensure CUDA 12.4 is installed. Note that nvidia-smi reports the highest CUDA version your driver supports; run nvcc --version to check the toolkit that is actually installed.

Out of Memory Errors

Reduce batch size or video resolution if you encounter GPU memory issues.
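
In code, the same advice can be automated by catching PyTorch's OOM exception and retrying at a smaller size. A minimal sketch; pipe and its width/height parameters are stand-ins, not Pusa V1's actual API:

import torch

def generate_with_fallback(pipe, prompt, sizes=((1280, 720), (832, 480), (480, 272))):
    # Try from largest to smallest, backing off whenever the GPU runs out of memory
    for width, height in sizes:
        try:
            return pipe(prompt, width=width, height=height)
        except torch.cuda.OutOfMemoryError:
            torch.cuda.empty_cache()  # release cached blocks before retrying
    raise RuntimeError("Out of GPU memory even at the smallest resolution")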

Next Steps

Now that you have Pusa V1 installed and running, explore these resources to learn more:

  • Check out the GitHub repository for detailed documentation
  • Visit the HuggingFace model page for examples and discussions
  • Try the online demo to see Pusa V1 in action
  • Read our other blog posts for advanced techniques and tips

Pro Tip

Pusa V1 is 5x faster than the base Wan 2.1 model while maintaining high quality. This makes it well suited to rapid prototyping and iterative video generation workflows.