Wan AI is Alibaba Cloud's AI video generation platform, designed to lower the barrier to creative video work. It transforms text prompts or static images into high-quality, cinematic videos with real-time lip synchronization and natural motion.
**Wan2.1 and Wan2.5 Models**: Versions of Wan AI that support both text-to-video and image-to-video generation
**Realistic Movements and Facial Expressions**: Produces natural body movements and facial expressions, along with professional-looking camera work
**Multi-language Support**: Supports both Chinese and English text input for video creation
**High Resolutions**: Outputs video at 480p and 720p, and at up to 1080p with Wan2.5
**Audio-Driven Video Generation**: Converts a static image and an audio track into a video with real-time lip sync
**One-Pass Audio-Video Sync**: Automatically synchronizes voiceover with lip movements without manual effort
**Support for Full-Body and Half-Body Characters**: Flexible character formats for diverse use cases
**Open Source Availability**: The Wan2.1 series models are open-sourced under the Apache 2.0 license for commercial and research use
**Scalable and Enterprise Ready**: Provides APIs, detailed analytics, and robust infrastructure for business use
**Cost Efficient**: Wan2.5 offers a streamlined model that reduces creator costs while supporting multiple output formats and longer videos (up to 10 seconds)
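
The limits stated in the feature list above (supported resolutions, prompt languages, and the Wan2.5 clip length) can be captured in a small pre-flight check that runs before a generation job is submitted. The sketch below is illustrative only: `ClipRequest`, `validate`, and the model and language codes are hypothetical names, not part of any Wan AI API, and it assumes that 1080p output is available on Wan2.5 only.

```python
from dataclasses import dataclass

# Limits copied from the feature list; all names here are hypothetical.
SUPPORTED_RESOLUTIONS = {"480p", "720p", "1080p"}
SUPPORTED_LANGUAGES = {"zh", "en"}  # Chinese and English prompts
WAN25_MAX_DURATION_S = 10           # "up to 10 seconds" on Wan2.5


@dataclass
class ClipRequest:
    """Hypothetical description of a generation job (not a Wan AI type)."""
    model: str            # assumed codes: "wan2.1" or "wan2.5"
    resolution: str       # e.g. "720p"
    prompt_language: str  # assumed codes: "zh" or "en"
    duration_s: float


def validate(req: ClipRequest) -> list[str]:
    """Return a list of problems; an empty list means no stated limit is violated."""
    problems = []
    if req.resolution not in SUPPORTED_RESOLUTIONS:
        problems.append(f"unsupported resolution: {req.resolution}")
    # Assumption: the feature list ties 1080p output to Wan2.5.
    if req.resolution == "1080p" and req.model != "wan2.5":
        problems.append("1080p output is listed for Wan2.5 only")
    if req.prompt_language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported prompt language: {req.prompt_language}")
    # No duration limit is stated for Wan2.1, so only Wan2.5 is checked.
    if req.model == "wan2.5" and req.duration_s > WAN25_MAX_DURATION_S:
        problems.append("Wan2.5 clips are listed as up to 10 seconds")
    return problems


if __name__ == "__main__":
    print(validate(ClipRequest("wan2.1", "1080p", "en", 5)))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with a request at once.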
Technical Highlights
Built on the Tongyi Wanxiang foundation model, with AdaIN and cross-attention control
Model sizes ranging from 1.3 billion to 14 billion parameters
Supports multi-format output at a 24 fps frame rate
Runs optimally on GPUs with 24GB+ VRAM and 32GB+ RAM
License: Apache 2.0 (Wan2.1 series), which permits commercial use
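
To put the parameter counts and frame rate above in context, the following back-of-the-envelope sketch estimates how much memory the weights alone occupy at common numeric precisions, and how many frames a clip contains at 24 fps. The function names are illustrative, and the estimate ignores activations, the text encoder, the VAE, and framework overhead, so real memory use will be higher.

```python
# Rough sizing arithmetic; not an official Wan AI tool.

# Bytes per parameter for common numeric formats.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str = "fp16") -> float:
    """Memory needed to hold the model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


def frame_count(duration_s: float, fps: int = 24) -> int:
    """Number of frames in a clip of the given duration."""
    return round(duration_s * fps)


if __name__ == "__main__":
    for label, params in [("1.3B", 1.3e9), ("14B", 14e9)]:
        for dtype in BYTES_PER_PARAM:
            gib = weight_memory_gib(params, dtype)
            print(f"{label} @ {dtype}: {gib:.1f} GiB")
    print("frames in a 10 s clip:", frame_count(10))
```

By this arithmetic, a 10-second clip at 24 fps is 240 frames. The 14B weights take roughly 26 GiB at 16-bit precision, which suggests that fitting the largest model on a 24 GB GPU would depend on lower precision or offloading part of the model to system RAM. This is an inference from the numbers, not a measured result.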
Use Cases
Marketing teams for fast and polished video demos
Enterprises for multilingual, lip-synced video localization
Content creators and storytellers for immersive narratives
Corporate training with clear, engaging video materials