Stability AI Unveils Stable Audio 3.0: Six-Minute Tracks, Open Weights
James Ding May 21, 2026 19:19
Stability AI's Stable Audio 3.0 offers six-minute audio generation, open-weight models, and fully licensed training data, setting a new bar for generative music AI.
Stability AI has launched Stable Audio 3.0, a generative music model family. Announced on May 20, 2026, the release includes four models (Small SFX, Small, Medium, and Large), with the Medium and Large models capable of generating tracks of six minutes or more. This marks a significant leap from earlier iterations like Stable Audio Open, which was limited to 47-second clips.
What makes Stable Audio 3.0 stand out? First, all models are trained on fully licensed and Creative Commons data, an approach meant to avoid the copyright disputes that have plagued rival platforms. Second, Stability AI has released open weights for the Small and Medium models, allowing developers and artists to run them on consumer-grade hardware. The Large model, meanwhile, is available via API or enterprise hosting.
Technically, the models are built on a new semantic-acoustic autoencoder architecture, dubbed SAME (Semantically-Aligned Music Autoencoder). This innovation supports variable-length generation with per-second granularity, enabling users to precisely tailor audio output. For instance, the 3.0 Small model can generate full compositions of up to two minutes directly on-device, a first for offline music generation. The Medium and Large models extend this capability to over six minutes.
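To make the per-second granularity and the per-model length limits concrete, here is a minimal Python sketch of how a client might check a requested duration before submitting a job. The helper name, the model keys, and the table are hypothetical and not part of any Stability AI SDK. The 360-second ceiling for Medium and Large is an assumption taken from the six-minute headline figure; since those models are said to go over six minutes, the real limit may be higher.

```python
# Hypothetical helper; not part of any Stability AI SDK.
# Limits are read off the figures quoted in this article, in whole seconds.
MAX_SECONDS = {
    "small": 120,   # "up to two minutes" on-device
    "medium": 360,  # assumed from the six-minute headline figure
    "large": 360,   # assumed from the six-minute headline figure
}


def validate_duration(model: str, seconds: int) -> int:
    """Return `seconds` if it is a valid whole-second request for `model`."""
    if model not in MAX_SECONDS:
        raise KeyError(f"unknown model: {model!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(seconds, bool) or not isinstance(seconds, int):
        raise TypeError("duration must be a whole number of seconds")
    limit = MAX_SECONDS[model]
    if not 1 <= seconds <= limit:
        raise ValueError(f"{model} supports 1..{limit} s, got {seconds}")
    return seconds
```

Under these assumptions, a 95-second request passes for the Small model, while a 121-second request is rejected and would need Medium or Large.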
Another notable feature is audio inpainting, which allows users to edit, extend, or refine specific segments of a track without starting from scratch. Additionally, support for LoRA (Low-Rank Adaptation) training enables model fine-tuning on custom datasets, further enhancing creative control.
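For readers unfamiliar with LoRA, the idea is to leave a pretrained weight matrix W frozen and learn a small low-rank update, so the effective weight becomes W + (alpha / r) * B * A, where r is the adapter rank. The pure-Python sketch below illustrates that generic technique only. The announcement does not say which layers Stable Audio 3.0 adapts or which hyperparameters it uses, so every name and number here is illustrative.

```python
# Generic LoRA illustration; not Stability AI's training code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_merge(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A.

    w: d_out x d_in frozen weights
    b: d_out x r adapter matrix
    a: r x d_in adapter matrix (r is the adapter rank)
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Number of adapter parameters, versus d_out * d_in for full fine-tuning."""
    return r * (d_in + d_out)


# Tiny rank-1 example on a 2x2 identity weight matrix.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]
a = [[3.0, 4.0]]
merged = lora_merge(w, a, b, alpha=1.0)  # [[4.0, 4.0], [6.0, 9.0]]
```

The appeal is the parameter count: for a 1024 x 1024 layer, a rank-8 adapter trains 16,384 values instead of 1,048,576, which is part of why this kind of fine-tuning is feasible on modest hardware.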
"Stable Audio 3.0 represents a significant step forward in artist-centric AI," said Stability AI in its announcement. The company emphasized that its focus on licensed data and user ownership of outputs—guaranteed under the Stability AI Community License—sets a new standard for responsible generative AI development. Artists and organizations with over $1 million in annual revenue can opt for an enterprise license, which includes legal indemnification and white-glove support for customization.
The launch of Stable Audio 3.0 comes two years after the release of Stable Audio 2.0 in 2024, which was limited to tracks of up to three minutes. The new Medium and Large models not only double that generation length but also offer improved melodic coherence and structural integrity, addressing a common criticism of earlier generative music systems.
For developers and audio professionals, the open weights for 3.0 Small and Medium are now available on Hugging Face, while the Large model can be accessed via the Stability AI API. The models are also being integrated into platforms like ComfyUI, expanding their accessibility.
Stability AI has hinted at future developments, including new products designed for musicians and expanded partnerships with industry giants like Universal Music Group and Warner Music Group. Interested users can join the waitlist for early access to these tools.
With Stable Audio 3.0, Stability AI is making a clear play to dominate the generative audio space, offering longer tracks, open-weight flexibility, and a commitment to ethical AI practices. As the market for creative AI continues to grow, tools like this are poised to reshape how music is composed, edited, and consumed.