
Stable Audio Product Information

Stable Audio is a generative AI platform for music, sound effects, and audio augmentation. It generates up to 3 minutes of high-quality audio per track, cleared for commercial use, through text-to-audio, audio-to-audio, and vocal transformation modes. The tool targets musicians and creators who want to turn ideas into usable audio, experiment with style transfer, and produce professional-sounding compositions and soundscapes without an advanced technical setup.


Key Capabilities

  • Text-to-audio: describe what you want in text and generate music, sound effects, or soundscapes.
  • Audio-to-audio: blend or transfer styles by adding existing audio into the generation process.
  • Vocal transformation: transform vocals into music or sound effects.
  • Up to 3 minutes of audio per generation, suitable for commercial use.
  • 44.1 kHz stereo output suitable for professional projects.
  • Industry-leading audio diffusion models for high-quality results.
  • User guidance and resources to help you craft effective prompts and workflows.
  • Beta features and ongoing improvements (e.g., vocal input and additional prompt types).
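
To put the 3-minute, 44.1 kHz stereo figures in practical terms, the short sketch below estimates how large a full-length track would be as uncompressed PCM. The sample rate, channel count, and length cap come from the list above; the 16-bit sample size is an assumption, since the bit depth is not stated, and the function name is illustrative only.

```python
SAMPLE_RATE_HZ = 44_100   # from the capability list
CHANNELS = 2              # stereo, from the capability list
MAX_SECONDS = 3 * 60      # 3-minute cap, from the capability list
BYTES_PER_SAMPLE = 2      # ASSUMPTION: 16-bit PCM; the bit depth is not stated


def pcm_size_bytes(seconds, sample_rate=SAMPLE_RATE_HZ,
                   channels=CHANNELS, bytes_per_sample=BYTES_PER_SAMPLE):
    """Estimate the size of uncompressed PCM audio, ignoring file headers."""
    frames = int(seconds * sample_rate)  # one frame = one sample per channel
    return frames * channels * bytes_per_sample


frames = MAX_SECONDS * SAMPLE_RATE_HZ
size = pcm_size_bytes(MAX_SECONDS)
print(f"{frames:,} frames per channel")
print(f"{size / 1_000_000:.2f} MB uncompressed")
```

Under these assumptions a full 3-minute track is roughly 32 MB before compression, which is worth keeping in mind when archiving many iterations of the same idea.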

How to Use Stable Audio

  1. Sign up / Log in to access the music generation tools.
  2. Choose a mode: Text-to-audio, Audio-to-audio, or Vocal transformation.
  3. Describe or input audio: Provide a text prompt (e.g., "Soulful Boom Bap Hip Hop instrumental") or feed existing audio for style transfer.
  4. Generate a track up to 3 minutes long.
  5. Refine via prompts or additional audio inputs until satisfied.
  6. Download the final audio for use in commercial projects.

Note: Generated music can be used in commercial projects. The platform emphasizes high-quality output and flexible creative experimentation.


How It Works

  • Leveraging cutting-edge audio diffusion models, Stable Audio converts text prompts and audio inputs into high-quality musical pieces and sound effects.
  • The system supports text-to-audio, audio-to-audio, and vocal transformations to enable a wide range of creative workflows.
  • Outputs are provided in stereo at 44.1 kHz, ready for professional usage.
  • The platform focuses on ease of use for musicians and creators, enabling rapid iteration and experimentation.
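
As a quick sanity check before dropping a downloaded track into a project, its format can be compared against the stated 44.1 kHz stereo, 3-minute-maximum output. The sketch below uses only Python's standard `wave` module and assumes the download is a PCM WAV file, which the description above does not confirm. The helper names are hypothetical, and a generated test tone stands in for a real download.

```python
import math
import os
import struct
import tempfile
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def meets_spec(path, rate=44_100, channels=2, max_seconds=180.0):
    """Check a file against the stated output format and length cap."""
    got_rate, got_channels, duration = describe_wav(path)
    return got_rate == rate and got_channels == channels and duration <= max_seconds


def write_test_tone(path, seconds=1.0, rate=44_100):
    """Write a stereo 16-bit 440 Hz sine wave as a stand-in for a download."""
    frames = bytearray()
    for i in range(int(seconds * rate)):
        sample = int(0.3 * 32767 * math.sin(2 * math.pi * 440 * i / rate))
        frames += struct.pack("<hh", sample, sample)  # left, right
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(bytes(frames))


demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
write_test_tone(demo_path)
print(describe_wav(demo_path))
print(meets_spec(demo_path))
```

To check a real file, replace `demo_path` with the path of the downloaded track. Compressed formats such as MP3 would need a different reader, since `wave` handles only WAV.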

Safety and Compliance

  • Stable Audio outputs are intended for both commercial and personal projects, subject to the platform's licensing terms.
  • Users should review Stability AI's terms for attribution requirements and permitted uses.

Core Features

  • Text-to-audio: generate music, sound effects, and soundscapes from descriptive prompts
  • Audio-to-audio: apply style transfers and variations by inputting existing audio
  • Vocal transformation: modify or transform vocal tracks within the generation pipeline
  • Output length of up to 3 minutes per generation
  • Commercial-use ready audio at 44.1 kHz stereo
  • Access to the latest audio diffusion models for high-quality results
  • Guided user resources (user guide, prompts, and tutorials)