Versions

Overview of available model variants

Model	Release	Strengths	Weaknesses	Status
Stable Audio 3.0 Large (Recommended)	2026-05-20	Music generation up to 6:20 minutes; 2.7B parameters; trained entirely on licensed data (AudioSparx, Freesound CC); maintains musical structure and melody across long tracks	Only available via API, fal.ai, or enterprise license (no open weights)	Current
Stable Audio 3.0 Medium	2026-05-20	Open weights on Hugging Face; 1.4B parameters; full compositions up to 6:20 minutes; licensed training data	Lower quality than Large variant	Current
Stable Audio 3.0 Small / Small SFX	2026-05-20	Very compact (459M parameters); open weights; dedicated sound-effects variant (Small SFX)	Shorter clips than Medium/Large	Current
Stable Diffusion 3.5 Large (Recommended)	2024-10	Highest image quality in the SD family; 8B parameters; open weights (Hugging Face); industry-leading prompt adherence	High VRAM requirement (~24 GB)	Current
Stable Diffusion 3.5 Medium	2024-10	Good quality/speed balance; 2.6B parameters; lower resource requirements	Less detail than Large variant	Current
Stable Diffusion 3.5 Large Turbo	2024-10	Fast inference; few steps for good results; ideal for real-time applications	Slight quality trade-off vs. Large	Current
Stable Video Diffusion 2.0	2025	Text-to-video generation; image-to-video animation	Short clip lengths	Current
Stable Audio 2.0	2024-04	Music and audio generation; stereo output	Superseded by Stable Audio 3.0 (May 2026); maximum ~3 minutes per track	Deprecated
StableLM 2 1.6B	2024	Very compact; open source	No longer actively developed; limited capabilities	Deprecated
StableLM Zephyr 3B	2024	Instruction following; compact	No longer actively developed	Deprecated
Stable Diffusion 3.0	2024	Good image quality	Superseded by SD 3.5	Deprecated
SDXL	2023	Large ecosystem; many community models	Older generation; weaker prompt adherence	Deprecated

Technical Details

API, features and capabilities

API & Availability

Availability: Public

Latency: ~2-10 s per image (GPU-dependent)

Throughput: Hardware-dependent
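
To turn the latency range into a rough capacity figure, the following minimal sketch converts seconds per image into images per hour. It assumes one image is generated at a time per GPU, with no batching or queueing overhead; the function name `images_per_hour` is hypothetical and only serves this illustration.

```python
def images_per_hour(seconds_per_image: float, gpus: int = 1) -> float:
    """Sequential throughput, assuming one image at a time per GPU (no batching)."""
    if seconds_per_image <= 0:
        raise ValueError("seconds_per_image must be positive")
    return gpus * 3600 / seconds_per_image


# Latency bounds quoted above: roughly 2 s (fast GPU) to 10 s (slow GPU) per image.
best_case = images_per_hour(2.0)
worst_case = images_per_hour(10.0)
print(f"{worst_case:.0f} to {best_case:.0f} images per hour per GPU")
# prints "360 to 1800 images per hour per GPU"
```

Real throughput depends on resolution, step count, and batching, so these numbers are only an order-of-magnitude guide.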

Features & Capabilities

Vision File Upload

Training & Knowledge

Knowledge Cutoff: 2024

Fine-Tuning: Available (LoRA, DreamBooth, Textual Inversion, full fine-tuning)

Language Support

Best Quality: English

Supported: Primarily English

Image generation works best with English prompts; basic multilingual support via T5 encoder

Hosting & Compliance

GDPR-compliant hosting options and licensing

GDPR-Compliant Hosting Options

License & Hosting

License: Stability AI Community License (open weights; commercial use permitted, with an enterprise license required above $1M in annual revenue)

Security Filters: Built-in safety filters, customizable

On-Premise

As AI consultants based in Rosenheim, Germany, we recommend Stability AI for enterprises seeking high-quality image, video, and audio generation with open weights and full data control. With Stable Diffusion 3.5 and the new Stable Audio 3.0 (May 2026), Stability AI offers one of the most powerful open-weights generation ecosystems on the market.

From StableLM to Stable Diffusion 3.5

Stability AI went through a turbulent period in 2023/2024 (founder resignation, funding difficulties) but has since stabilized. Since June 2024, CEO Prem Akkaraju (formerly of Weta Digital) has led the company, backed by fresh capital (~$80M). In November 2025, Stability AI largely prevailed in the copyright lawsuit brought by Getty Images before the High Court of England and Wales. The company now focuses on its core competency: generative models for image, video, and audio. The StableLM language models are no longer actively developed.

Why Stability AI for Enterprises?

Open Weights: Models freely available on Hugging Face
EU Hosting: Via AWS Bedrock in Frankfurt or self-hosting
Broad Ecosystem: Thousands of community extensions and LoRA models
Multimodal: Image, video, and audio from a single provider
GDPR-Compliant: Full self-hosting possible

Stable Diffusion 3.5 – The Flagship

Stable Diffusion 3.5 (October 2024) represents a significant quality leap over previous versions. The architecture is a Multimodal Diffusion Transformer (MMDiT) that conditions on three text encoders (two CLIP models and T5).

Three Variants for Different Requirements

Variant	Parameters	VRAM	Strength
SD 3.5 Large	8B	~24 GB	Highest quality
SD 3.5 Medium	2.6B	~10 GB	Balanced profile
SD 3.5 Large Turbo	8B	~24 GB	Fast inference

SD 3.5 Large delivers industry-leading prompt adherence and image quality. The Turbo variant is suited for applications with real-time requirements, while Medium offers a good compromise between quality and resource demands.
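
The selection logic implied by the table can be sketched as a small helper. This is a minimal illustration, not an official sizing tool: the parameter counts and VRAM figures are copied from the table above, while the names `Variant`, `VARIANTS`, and `pick_variant` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    name: str
    params_b: float  # parameters in billions, from the table above
    vram_gb: int     # approximate VRAM requirement, from the table above
    fast: bool       # True for the few-step Turbo variant


VARIANTS = [
    Variant("SD 3.5 Large", 8.0, 24, False),
    Variant("SD 3.5 Medium", 2.6, 10, False),
    Variant("SD 3.5 Large Turbo", 8.0, 24, True),
]


def pick_variant(available_vram_gb: float, need_real_time: bool = False) -> Optional[Variant]:
    """Return a variant that fits into the given VRAM, or None if nothing fits."""
    candidates = [v for v in VARIANTS if v.vram_gb <= available_vram_gb]
    if not candidates:
        return None
    if need_real_time:
        fast = [v for v in candidates if v.fast]
        if fast:
            return fast[0]
    # Otherwise prefer the largest non-Turbo model that fits.
    non_fast = [v for v in candidates if not v.fast]
    return max(non_fast or candidates, key=lambda v: v.params_b)


print(pick_variant(24).name)                       # SD 3.5 Large
print(pick_variant(24, need_real_time=True).name)  # SD 3.5 Large Turbo
print(pick_variant(12).name)                       # SD 3.5 Medium
```

If real-time output is requested but only Medium fits, the sketch falls back to Medium rather than returning nothing; a stricter policy could return `None` in that case.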

Beyond Images: Video and Audio

Stable Video Diffusion 2.0

Stable Video Diffusion 2.0 (2025) enables the generation of short video clips from text prompts or single images. The technology is suitable for product animations, social media content, and creative prototypes.

Stable Audio 3.0 – New Audio Generation (May 2026)

On May 20, 2026, Stability AI released Stable Audio 3.0 – a family of four models (Small SFX and Small at 459M each, Medium at 1.4B, and Large at 2.7B parameters). Medium and Large can generate compositions up to 6 minutes and 20 seconds long while maintaining musical structure and melody – more than double the length of the predecessor Stable Audio 2.0.
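
The "more than double" claim can be checked with simple arithmetic. The sketch below assumes the predecessor limit of roughly 3 minutes per track listed for Stable Audio 2.0 in the version table; `to_seconds` is a hypothetical helper.

```python
def to_seconds(minutes: int, seconds: int = 0) -> int:
    """Convert a minutes:seconds duration into seconds."""
    return minutes * 60 + seconds


audio3_max = to_seconds(6, 20)  # Stable Audio 3.0 Medium/Large: 6:20
audio2_max = to_seconds(3)      # Stable Audio 2.0: approximately 3:00
ratio = audio3_max / audio2_max
print(f"{audio3_max} s vs {audio2_max} s -> {ratio:.2f}x")
# prints "380 s vs 180 s -> 2.11x"
```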

Small SFX, Small, and Medium are available as open weights on Hugging Face and can be self-hosted in a GDPR-compliant manner. The Large variant is exclusively accessible through the Stability AI API, partner fal.ai, or an enterprise license. Important for businesses: companies with more than $1M in annual revenue require an enterprise license for commercial use. All models were trained on fully licensed data (AudioSparx library and Creative Commons recordings from Freesound) – a clear compliance advantage over competing audio models.
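
The access and licensing rules in this paragraph can be written down as a short decision helper. This is a minimal sketch that encodes only what is stated above; the identifiers (`access_paths`, `needs_enterprise_license`, the variant keys) are hypothetical, and the authoritative terms are those of the license text itself.

```python
from typing import List

OPEN_WEIGHT_VARIANTS = {"small-sfx", "small", "medium"}
API_ONLY_VARIANTS = {"large"}
REVENUE_THRESHOLD_USD = 1_000_000  # threshold quoted in the text above


def access_paths(variant: str) -> List[str]:
    """List the access routes described above for a Stable Audio 3.0 variant."""
    key = variant.lower()
    if key in OPEN_WEIGHT_VARIANTS:
        return ["self-hosting (open weights on Hugging Face)"]
    if key in API_ONLY_VARIANTS:
        return ["Stability AI API", "fal.ai", "enterprise license"]
    raise ValueError(f"unknown variant: {variant!r}")


def needs_enterprise_license(annual_revenue_usd: float, commercial_use: bool = True) -> bool:
    """True when commercial use meets revenue above the stated $1M threshold."""
    return commercial_use and annual_revenue_usd > REVENUE_THRESHOLD_USD


print(access_paths("Medium"))
print(needs_enterprise_license(2_500_000))  # True
print(needs_enterprise_license(400_000))    # False
```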

StableLM – Language Model Status

The StableLM language models (StableLM 2 1.6B, StableLM Zephyr 3B) have reached deprecated status. They are no longer actively maintained and are not recommended for production applications. For language models, we recommend more capable alternatives such as Meta Llama or Microsoft Phi.

GDPR-Compliant Deployment in the EU

Stability AI offers several options for European enterprises:

AWS Bedrock: Stable Diffusion 3.x available in Frankfurt (eu-central-1)
Self-Hosting: Download open weights from Hugging Face and run on your own infrastructure
Azure AI: Limited availability

Through open model weights, enterprises retain full control over their data – a decisive advantage for GDPR compliance.
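
For the self-hosting route, a minimal sketch using the Hugging Face `diffusers` library might look as follows. It assumes a CUDA GPU with enough VRAM and that `torch` and `diffusers` are installed. The repository IDs and sampler settings follow the public model cards as best recalled and should be verified before use; `MODEL_REPOS`, `DEFAULT_SETTINGS`, `generation_kwargs`, and `generate_image` are hypothetical names.

```python
# Repository IDs and default sampler settings: assumptions based on the public
# Hugging Face model cards; verify them before relying on this sketch.
MODEL_REPOS = {
    "large": "stabilityai/stable-diffusion-3.5-large",
    "medium": "stabilityai/stable-diffusion-3.5-medium",
    "large-turbo": "stabilityai/stable-diffusion-3.5-large-turbo",
}
DEFAULT_SETTINGS = {
    "large": {"num_inference_steps": 28, "guidance_scale": 3.5},
    "medium": {"num_inference_steps": 40, "guidance_scale": 4.5},
    "large-turbo": {"num_inference_steps": 4, "guidance_scale": 0.0},
}


def generation_kwargs(variant: str, prompt: str) -> dict:
    """Build the keyword arguments passed to the pipeline call."""
    if variant not in MODEL_REPOS:
        raise ValueError(f"unknown variant: {variant!r}")
    return {"prompt": prompt, **DEFAULT_SETTINGS[variant]}


def generate_image(variant: str, prompt: str, out_path: str = "out.png") -> str:
    """Generate one image locally and save it; prompts never leave the host."""
    # Heavy imports stay local so this module loads without the GPU stack.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        MODEL_REPOS[variant], torch_dtype=torch.bfloat16
    )
    pipe = pipe.to("cuda")
    image = pipe(**generation_kwargs(variant, prompt)).images[0]
    image.save(out_path)
    return out_path


if __name__ == "__main__":
    generate_image("medium", "product photo of a ceramic mug on a wooden table")
```

The first run downloads the weights, which typically requires accepting the license on Hugging Face and an access token. Once the weights are cached, setting the environment variable `HF_HUB_OFFLINE=1` keeps the process from contacting the Hub at all.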

Integration with CompanyGPT

Stability AI models can be integrated into CompanyGPT as a self-hosted option for image generation. This suits marketing teams that want to create visual assets internally and in compliance with data protection regulations.

Our Recommendation

Stable Diffusion 3.5 Large is the top choice for enterprises that need high-quality image generation with full data control. For resource-constrained environments, SD 3.5 Medium offers a compelling alternative.

Those who additionally need video or audio generation will find a growing ecosystem with Stable Video Diffusion 2.0 and the new Stable Audio 3.0 (May 2026) – including open-weights options for GDPR-compliant self-hosting. For pure language model applications, however, we recommend alternatives like Meta Llama or Mistral.

Cost estimation for this model

For up-to-date token pricing, model variants and EU availability, see our sister project ai-prices.eu. It helps you compare and estimate the operational cost of leading AI models for your specific use case.


ai-prices.eu is a project by innFactory AI Consulting GmbH and provides transparent cost estimates for leading AI models.
