Stability.AI Releases ‘Stable Audio 2.0’: Now Generates 3 Minutes of Output

Stability.AI stable audio 2.0 model
Photo Credit: Stability.AI

Stability AI has released a new version of its Stable Audio model that allows users to input their own audio samples. With Stable Audio 2.0, users can then transform an uploaded sample using text prompts.

The first version of this generative AI model was released in September 2023, but could only generate up to 90 seconds of output. The new version of Stable Audio allows output of up to three minutes, enough for a full-length track. While the new model can accept user uploads, all uploaded audio must be copyright-free.

Stability AI says this version of the model is better able to create something that sounds like an actual song, with a clearly defined intro, progression, and outro. Other new features include the ability to adjust prompt strength (how closely the AI should follow the prompt) and how much of the uploaded audio will be modified.

While the new generative AI model certainly adds the ability to generate three-minute clips, it’s hard to say they’re worth anything.

Companies like Google and Meta have dabbled in audio generation that can generally follow a text-based prompt and arrive at something that sounds vaguely similar to the input words. Think of it as listening, drunk, through a glass pressed to the door of a nightclub that is playing the prompt you’ve suggested.

It’s the lack of deliberate sound creation that makes AI-generated music sound more like a cacophony than a written song.

Stability AI trained Stable Audio on data from AudioSparx, which features a library of more than 800,000 audio files. AudioSparx artists were allowed to opt out of having their works included in Stability’s training data, an issue that has landed Stability in hot water in the past.

Stability AI has argued in favor of ‘fair use’ of copyrighted material to power its generative AI models. That position caused its VP of Audio, Ed Newton-Rex, to leave the company, telling MBW, “I don’t agree with the company’s opinion that training generative AI models on copyrighted works is ‘fair use.’”