Stable Audio is a generative AI platform for music, sound effects, and audio augmentation. It enables users to create up to 3 minutes of high-quality audio for commercial use, leveraging text-to-audio, audio-to-audio, and vocal transformation capabilities. The tool targets musicians and creators who want to turn ideas into reproducible audio content, experiment with style transfers, and produce professional-sounding compositions and soundscapes without requiring advanced technical setup.
Key Capabilities
- Text-to-audio: describe what you want in text and generate music, sound effects, or soundscapes.
- Audio-to-audio: blend or transfer styles by adding existing audio into the generation process.
- Vocal transformation: transform vocals into music or sound effects.
- Up to 3 minutes of audio per generation, suitable for commercial use.
- 44.1 kHz stereo output suitable for professional projects.
- Industry-leading audio diffusion models for high-quality results.
- User guidance and resources to help you craft effective prompts and workflows.
- Beta features and ongoing improvements (e.g., vocal input and additional types of input prompt).
How to Use Stable Audio
- Sign up or log in to access the music generation tools.
- Choose a mode: Text-to-audio, Audio-to-audio, or Vocal transformation.
- Enter a prompt or supply audio: provide a text prompt (e.g., "Soulful Boom Bap Hip Hop instrumental") or upload existing audio for style transfer.
- Generate a track up to 3 minutes long.
- Refine via prompts or additional audio inputs until satisfied.
- Download the final audio for use in commercial projects.
Note: Generated music may be used in commercial projects. The platform emphasizes high-quality output and flexible creative experimentation.
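After downloading a track, a quick local check can confirm that the file matches the stated output format. The sketch below is only an illustration and assumes the download is an uncompressed WAV file; the function names `describe_wav` and `matches_spec` and the file name `track.wav` are hypothetical and not part of Stable Audio. It uses only Python's standard `wave` module.

```python
import wave

def describe_wav(path):
    """Read basic format details from a WAV file (assumes PCM WAV)."""
    with wave.open(path, "rb") as wf:
        sample_rate = wf.getframerate()
        channels = wf.getnchannels()
        frames = wf.getnframes()
    return {
        "sample_rate": sample_rate,
        "channels": channels,
        "seconds": frames / sample_rate,
    }

def matches_spec(info, max_seconds=180.0):
    """Check against the figures stated above: 44.1 kHz, stereo, up to 3 minutes."""
    return (
        info["sample_rate"] == 44100
        and info["channels"] == 2
        and info["seconds"] <= max_seconds
    )

# Example usage with a hypothetical file name:
# info = describe_wav("track.wav")
# print(info, matches_spec(info))
```

If the download is in a compressed format such as MP3, the `wave` module cannot read it, and a different decoder would be needed.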
How It Works
- Leveraging cutting-edge audio diffusion models, Stable Audio converts text prompts and audio inputs into high-quality musical pieces and sound effects.
- The system supports text-to-audio, audio-to-audio, and vocal transformations to enable a wide range of creative workflows.
- Outputs are provided in stereo at 44.1 kHz, ready for professional usage.
- The platform focuses on ease of use for musicians and creators, enabling rapid iteration and experimentation.
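To put the output format in concrete terms, the arithmetic below estimates how many sample frames a maximum-length track contains and how large it would be as uncompressed PCM. The sample rate, channel count, and 3-minute limit come from the text above; the 16-bit sample depth is an assumption, since the bit depth is not stated here.

```python
SAMPLE_RATE_HZ = 44_100   # stated output rate
CHANNELS = 2              # stereo
MAX_SECONDS = 3 * 60      # 3-minute limit per generation
BYTES_PER_SAMPLE = 2      # ASSUMPTION: 16-bit PCM; bit depth is not stated

def pcm_size_bytes(seconds, sample_rate=SAMPLE_RATE_HZ,
                   channels=CHANNELS, bytes_per_sample=BYTES_PER_SAMPLE):
    """Size of raw PCM audio data, excluding any file header."""
    frames = int(seconds * sample_rate)
    return frames * channels * bytes_per_sample

frames = MAX_SECONDS * SAMPLE_RATE_HZ
size = pcm_size_bytes(MAX_SECONDS)

print(frames)  # 7938000 frames per channel
print(size)    # 31752000 bytes, roughly 31.8 MB under the 16-bit assumption
```

A different bit depth scales the size proportionally; for example, 24-bit samples would be 1.5 times larger.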
Safety and Compliance
- Stable Audio outputs are licensed for use in commercial and personal projects.
- Review Stability AI's terms for attribution requirements and permitted uses.
Core Features
- Text-to-audio: generate music, sound effects, and soundscapes from descriptive prompts
- Audio-to-audio: apply style transfers and variations by inputting existing audio
- Vocal transformation: modify or transform vocal tracks within the generation pipeline
- Output length of up to 3 minutes per generation
- Commercial-use ready audio at 44.1 kHz stereo
- Access to the latest audio diffusion models for high-quality results
- Guided user resources (user guide, prompts, and tutorials)