
Meta Llama

Meta Llama and Muse Spark – Open-source LLMs for self-hosting. Muse Spark replaces the Llama family (April 2026). Note: Llama 4 is not available in the EU due to licensing. AI consulting from Germany for Llama 3.x deployment.

License Llama 4 Community License (EU use excluded), Llama 3.x Community License (EU permitted)
GDPR Hosting Available
Context 10M tokens (Llama 4 Scout), 1M (Llama 4 Maverick), 128k (Llama 3.x)
Modality Text, Image, Video → Text

Versions

Overview of available model variants

| Model | Release | EU | Strengths | Weaknesses | Status |
|---|---|---|---|---|---|
| Muse Spark | April 2026 | Unclear | Successor to the Llama family; developed by Meta Superintelligence Labs | Very few details available; EU availability and licensing unclear | Current |
| Llama 4 Maverick | April 2025 | No | 400B parameters (MoE, 17B active); 1M token context window; natively multimodal (text, image, video); outperforms GPT-4 on several benchmarks | Not available in the EU due to licensing; multi-GPU required (200–400 GB VRAM) | Current |
| Llama 4 Scout | April 2025 | No | 10M token context window (industry-unique); 109B parameters (MoE, 17B active); runs on a single H100 80GB (INT4); natively multimodal | Not available in the EU due to licensing | Current |
| Llama 3.3 70B (Recommended) | December 2024 | Yes | Proven and widely deployed; good performance/resource balance | No multimodal input | Current |
| Llama 3.2 (1B/3B/11B/90B) | September 2024 | Yes | Wide size range; compact variants for edge | Older generation | Current |
| Llama 3.1 (405B/70B/8B) | July 2024 | Yes | 405B variant with top performance; 128k context window | High resource requirements (405B) | Current |

Use Cases

Typical applications for this model family

Data-sensitive applications
High-volume without API costs
Offline scenarios
Custom models / Fine-tuning
Embedded AI
Edge deployment
On-premise solutions

Technical Details

API, features and capabilities

API & Availability
Availability Public
Latency (TTFT) Depends on hosting
Throughput Depends on hardware (tokens/sec)
Features & Capabilities
Tool Use Function Calling Structured Output Vision File Upload
Training & Knowledge
Knowledge Cutoff 2024-08
Fine-Tuning Available (LoRA, QLoRA, Full Fine-Tuning, PEFT)
Language Support
Best Quality English, German, French, Spanish
Supported 50+ languages
Best quality in English, good quality in Western European languages
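The feature list above (tool use, function calling, structured output) maps onto the OpenAI-compatible API that common self-hosted Llama servers such as vLLM or Ollama expose. A minimal sketch of a function-calling request payload; the model name, endpoint, and `get_order_status` tool are illustrative placeholders, not taken from this page:

```python
import json

def build_tool_call_request(user_message: str) -> str:
    """Build an OpenAI-compatible chat payload with one tool definition.
    The tool and model name here are hypothetical examples."""
    payload = {
        "model": "llama-3.3-70b-instruct",  # placeholder model identifier
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_order_status",  # illustrative tool
                    "description": "Look up an order by its ID.",
                    "parameters": {
                        "type": "object",
                        "properties": {"order_id": {"type": "string"}},
                        "required": ["order_id"],
                    },
                },
            }
        ],
        "tool_choice": "auto",
    }
    return json.dumps(payload)

body = build_tool_call_request("Where is order 4711?")
```

The same payload shape works for structured output: constrain the response via a JSON schema instead of a tool definition, depending on what your serving stack supports.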

Hosting & Compliance

GDPR-compliant hosting options and licensing

GDPR-Compliant Hosting Options
Self-Hosted
Own infrastructure
Full data control – recommended for sensitive data
AWS
Frankfurt (eu-central-1)
Amazon Bedrock / SageMaker
Azure
West Europe
Azure AI / ML
Google Cloud
Frankfurt (europe-west3)
Vertex AI
License & Hosting
License Llama 4 Community License (EU use excluded), Llama 3.x Community License (EU permitted)
Security Filters Customizable
On-Premise Edge-capable

innFactory AI Consulting based in Rosenheim, Germany supports enterprises across the DACH region with GDPR-compliant self-hosting of Meta Llama. With open weights you have full control – no data leaves your infrastructure.

Muse Spark: Successor to Llama (April 2026)

On April 9, 2026, Meta announced Muse Spark through its new Meta Superintelligence Labs division as the successor to the Llama family. Details on architecture, parameter count, and licensing remain scarce.

Note: EU availability and licensing for Muse Spark are not yet known. We recommend monitoring Muse Spark for now and continuing to use Llama 3.3 70B or Mistral for production EU applications.

Important Notice: EU Licensing Restriction for Llama 4

The Llama 4 Community License explicitly prohibits use and distribution within the EU. Companies domiciled or with their main place of business in the EU may not use or host Llama 4 Scout or Maverick. For EU enterprises, we recommend Llama 3.3 70B or alternatively Mistral as a European open-source alternative.

Llama 4: Technical Excellence – Without EU Access

Llama 4 Maverick (400B MoE)

  • 128 experts, 17B active parameters per token
  • 1M token context window
  • Natively multimodal (text, image, video)
  • Outperforms GPT-4 on reasoning and coding benchmarks

Llama 4 Scout (109B MoE)

  • 16 experts, 17B active parameters
  • 10M token context window – industry-unique
  • Runs on a single H100 80GB (INT4)
  • Ideal for massive document analysis and codebase parsing
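Whether a codebase or document set actually fits into Scout's 10M-token window can be estimated up front. A common rule of thumb for English text and code is roughly 4 characters per token; this is a heuristic, not Llama's tokenizer:

```python
def fits_in_context(texts, context_tokens=10_000_000, chars_per_token=4):
    """Rough check: does the concatenated input fit in the context window?
    Uses the ~4 chars/token heuristic; a real tokenizer is more accurate."""
    est_tokens = sum(len(t) for t in texts) // chars_per_token
    return est_tokens, est_tokens <= context_tokens

# ~10 MB of text estimates to ~2.5M tokens, comfortably inside 10M.
tokens, ok = fits_in_context(["x" * 1_000_000] * 10)
```

For production use, count with the model's actual tokenizer before relying on the window size, since code and non-English text can tokenize much less efficiently.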

Llama 4 Behemoth (announced)

  • ~2 trillion parameters, 288B active
  • Announced but never released. Superseded by Muse Spark.
  • Positioned as a “teacher model” for other Llama models

Llama 3.x: Recommended for EU Enterprises

The Llama 3.x series is not subject to EU restrictions and remains the recommended choice for European enterprises:

Key Strengths

  • Full Control: Model runs in your infrastructure
  • No API Costs: Only hardware/cloud costs
  • Customizable: Fine-tuning on your own data possible
  • GDPR-Friendly: No data leaves your company
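Fine-tuning on your own data typically uses LoRA (listed above under Fine-Tuning): the frozen weight matrix W is augmented with a trainable low-rank update, W' = W + (alpha/r) · B · A, so only the small matrices A and B are trained. A pure-Python toy of the merge step; real deployments use a library such as PEFT, and the matrices here are illustrative:

```python
def matmul(a, b):
    """Naive matrix multiply for small illustration matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_merge(W, A, B, alpha=16, r=2):
    """Merge a LoRA adapter: W' = W + (alpha / r) * B @ A.
    B is (d_out x r), A is (r x d_in); only A and B were trained."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# 2x2 base weight with a rank-1 adapter, purely for illustration.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [0.0]]          # 2 x r (r = 1)
A = [[0.0, 0.5]]            # r x 2
W_new = lora_merge(W, A, B, alpha=2, r=1)
```

Because only A and B are trained, adapter checkpoints stay small (megabytes instead of hundreds of gigabytes), which is what makes fine-tuning a 70B model practical on modest hardware with QLoRA.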

Hardware Requirements

| Model | VRAM | Recommended GPU |
|---|---|---|
| Llama 4 Scout | 80+ GB | H100 / A100 |
| Llama 4 Maverick | 400+ GB | Multi-H100 |
| Llama 3.3 70B | 40+ GB | A100 80GB |
| Llama 3.2 11B | 24 GB | RTX 4090 |
| Llama 3.2 3B | 8 GB | RTX 4070 |
| Llama 3.2 1B | 4 GB | Smartphone |
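The VRAM figures above follow roughly from parameter count × bytes per parameter, plus runtime overhead for the KV cache and activations. A back-of-the-envelope estimator; the ~20% overhead factor is an assumption and varies with batch size and context length:

```python
def estimate_vram_gb(params_billion, bits_per_param=4, overhead=1.2):
    """Rough VRAM estimate: weights at the given quantization width,
    plus ~20% (assumed) for KV cache and activations.
    1B parameters at 8 bits is approximately 1 GB of weights."""
    weight_gb = params_billion * bits_per_param / 8
    return weight_gb * overhead

scout = estimate_vram_gb(109, bits_per_param=4)    # INT4: under one H100 80GB
llama33 = estimate_vram_gb(70, bits_per_param=4)   # in line with the 40+ GB row
```

The same arithmetic shows why Maverick needs multi-GPU even quantized: 400B parameters at 4 bits is already ~200 GB of weights before any overhead.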

Integration with CompanyGPT

CompanyGPT supports Llama 3.x models and enables completely self-hosted operation without external dependencies.

Our Recommendation

For EU enterprises, Llama 3.3 70B is the best choice from the Llama family – proven, EU-compatible, and with a strong performance profile. For edge applications, the compact Llama 3.2 (1B/3B) models are ideal. If you’re looking for a powerful European open-source alternative, we recommend Mistral as a Llama 4 replacement for the DACH region.

Consultation for this model?

We help you select and integrate the right AI model for your use case.