
Meta Llama

Meta Llama and Muse Spark – Open-source LLMs for self-hosting. Muse Spark replaces the Llama family (April 2026). Note: Llama 4 is not available in the EU due to licensing. AI consulting from Germany for Llama 3.x deployment.

License Llama 4 Community License (EU use excluded), Llama 3.x Community License (EU permitted)
GDPR Hosting Available
Context 10M tokens (Llama 4 Scout), 1M (Llama 4 Maverick), 128k (Llama 3.x)
Modality Text, Image, Video → Text

Versions

Overview of available model variants

| Model | Release | EU | Strengths | Weaknesses | Status |
|---|---|---|---|---|---|
| Muse Spark | April 2026 | Unclear | Successor to the Llama family; developed by Meta Superintelligence Labs | Very few details available; EU availability and licensing unclear | Current |
| Llama 4 Maverick | April 2025 | No | 400B parameters (MoE, 17B active); 1M token context window; natively multimodal (text, image, video); outperforms GPT-4 on several benchmarks | Not available in the EU due to licensing; multi-GPU required (200–400 GB VRAM) | Current |
| Llama 4 Scout | April 2025 | No | 10M token context window (industry-unique); 109B parameters (MoE, 17B active); runs on a single H100 80GB (INT4); natively multimodal | Not available in the EU due to licensing | Current |
| Llama 3.3 70B (Recommended) | December 2024 | Yes | Proven and widely deployed; good performance/resource balance | No multimodal input | Current |
| Llama 3.2 (1B/3B/11B/90B) | September 2024 | Yes | Wide size range; compact variants for edge | Older generation | Current |
| Llama 3.1 (405B/70B/8B) | July 2024 | Yes | 405B variant with top performance; 128k context window | High resource requirements (405B) | Current |

Use Cases

Typical applications for this model family

Data-sensitive applications
High-volume without API costs
Offline scenarios
Custom models / Fine-tuning
Embedded AI
Edge deployment
On-premise solutions

Technical Details

API, features and capabilities

API & Availability
Availability Public
Latency (TTFT) Depends on hosting
Throughput Depends on hardware (tokens/sec)
Features & Capabilities
Tool Use Function Calling Structured Output Vision File Upload
Training & Knowledge
Knowledge Cutoff 2024-08
Fine-Tuning Available (LoRA, QLoRA, Full Fine-Tuning, PEFT)
Language Support
Best Quality English, German, French, Spanish
Supported 50+ languages
Best quality in English, good quality in Western European languages
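The feature list above (tool use, function calling, structured output) maps onto the OpenAI-compatible API that common self-hosted Llama servers such as vLLM or Ollama expose. A minimal sketch of a function-calling request payload; the model name, endpoint, and `get_order_status` tool are illustrative placeholders, not taken from this page:

```python
import json

def build_tool_call_request(user_message: str) -> str:
    """Build an OpenAI-compatible chat payload with one tool definition.
    The tool and model name here are hypothetical examples."""
    payload = {
        "model": "llama-3.3-70b-instruct",  # placeholder model identifier
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_order_status",  # illustrative tool
                    "description": "Look up an order by its ID.",
                    "parameters": {
                        "type": "object",
                        "properties": {"order_id": {"type": "string"}},
                        "required": ["order_id"],
                    },
                },
            }
        ],
        "tool_choice": "auto",
    }
    return json.dumps(payload)

body = build_tool_call_request("Where is order 4711?")
```

The same payload shape works for structured output: constrain the response via a JSON schema instead of a tool definition, depending on what your serving stack supports.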

Hosting & Compliance

GDPR-compliant hosting options and licensing

GDPR-Compliant Hosting Options
Self-Hosted
Own infrastructure
Full data control – recommended for sensitive data
AWS
Frankfurt (eu-central-1)
Amazon Bedrock / SageMaker
Azure
West Europe
Azure AI / ML
Google Cloud
Frankfurt (europe-west3)
Vertex AI
License & Hosting
License Llama 4 Community License (EU use excluded), Llama 3.x Community License (EU permitted)
Security Filters Customizable
On-Premise Edge-capable

innFactory AI Consulting based in Rosenheim, Germany supports enterprises across the DACH region with GDPR-compliant self-hosting of Meta Llama. With open weights you have full control – no data leaves your infrastructure.

Muse Spark: Successor to Llama (April 2026)

On April 9, 2026, Meta announced Muse Spark through its new Meta Superintelligence Labs division as the successor to the Llama family. Details on architecture, parameter count, and licensing remain scarce.

Note: EU availability and licensing for Muse Spark are not yet known. We recommend monitoring Muse Spark for now and continuing to use Llama 3.3 70B or Mistral for production EU applications.

Important Notice: EU Licensing Restriction for Llama 4

The Llama 4 Community License explicitly prohibits use and distribution within the EU. Companies domiciled or with their main place of business in the EU may not use or host Llama 4 Scout or Maverick. For EU enterprises, we recommend Llama 3.3 70B or alternatively Mistral as a European open-source alternative.

Llama 4: Technical Excellence – Without EU Access

Llama 4 Maverick (400B MoE)

  • 128 experts, 17B active parameters per token
  • 1M token context window
  • Natively multimodal (text, image, video)
  • Outperforms GPT-4 on reasoning and coding benchmarks

Llama 4 Scout (109B MoE)

  • 16 experts, 17B active parameters
  • 10M token context window – industry-unique
  • Runs on a single H100 80GB (INT4)
  • Ideal for massive document analysis and codebase parsing
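Whether a codebase or document set actually fits into Scout's 10M-token window can be estimated up front. A common rule of thumb for English text and code is roughly 4 characters per token; this is a heuristic, not Llama's tokenizer:

```python
def fits_in_context(texts, context_tokens=10_000_000, chars_per_token=4):
    """Rough check: does the concatenated input fit in the context window?
    Uses the ~4 chars/token heuristic; a real tokenizer is more accurate."""
    est_tokens = sum(len(t) for t in texts) // chars_per_token
    return est_tokens, est_tokens <= context_tokens

# ~10 MB of text estimates to ~2.5M tokens, comfortably inside 10M.
tokens, ok = fits_in_context(["x" * 1_000_000] * 10)
```

For production use, count with the model's actual tokenizer before relying on the window size, since code and non-English text can tokenize much less efficiently.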

Llama 4 Behemoth (announced)

  • ~2 trillion parameters, 288B active
  • Announced but never released. Superseded by Muse Spark.
  • Positioned as a “teacher model” for other Llama models

Llama 3.x: Recommended for EU Enterprises

The Llama 3.x series is not subject to EU restrictions and remains the recommended choice for European enterprises:

Key Strengths

  • Full Control: Model runs in your infrastructure
  • No API Costs: Only hardware/cloud costs
  • Customizable: Fine-tuning on your own data possible
  • GDPR-Friendly: No data leaves your company
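Fine-tuning on your own data typically uses LoRA (listed above under Fine-Tuning): the frozen weight matrix W is augmented with a trainable low-rank update, W' = W + (alpha/r) · B · A, so only the small matrices A and B are trained. A pure-Python toy of the merge step; real deployments use a library such as PEFT, and the matrices here are illustrative:

```python
def matmul(a, b):
    """Naive matrix multiply for small illustration matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_merge(W, A, B, alpha=16, r=2):
    """Merge a LoRA adapter: W' = W + (alpha / r) * B @ A.
    B is (d_out x r), A is (r x d_in); only A and B were trained."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# 2x2 base weight with a rank-1 adapter, purely for illustration.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [0.0]]          # 2 x r (r = 1)
A = [[0.0, 0.5]]            # r x 2
W_new = lora_merge(W, A, B, alpha=2, r=1)
```

Because only A and B are trained, adapter checkpoints stay small (megabytes instead of hundreds of gigabytes), which is what makes fine-tuning a 70B model practical on modest hardware with QLoRA.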

Hardware Requirements

| Model | VRAM | Recommended GPU |
|---|---|---|
| Llama 4 Scout | 80+ GB | H100 / A100 |
| Llama 4 Maverick | 400+ GB | Multi-H100 |
| Llama 3.3 70B | 40+ GB | A100 80GB |
| Llama 3.2 11B | 24 GB | RTX 4090 |
| Llama 3.2 3B | 8 GB | RTX 4070 |
| Llama 3.2 1B | 4 GB | Smartphone |
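The VRAM figures above follow roughly from parameter count × bytes per parameter, plus runtime overhead for the KV cache and activations. A back-of-the-envelope estimator; the ~20% overhead factor is an assumption and varies with batch size and context length:

```python
def estimate_vram_gb(params_billion, bits_per_param=4, overhead=1.2):
    """Rough VRAM estimate: weights at the given quantization width,
    plus ~20% (assumed) for KV cache and activations.
    1B parameters at 8 bits is approximately 1 GB of weights."""
    weight_gb = params_billion * bits_per_param / 8
    return weight_gb * overhead

scout = estimate_vram_gb(109, bits_per_param=4)    # INT4: under one H100 80GB
llama33 = estimate_vram_gb(70, bits_per_param=4)   # in line with the 40+ GB row
```

The same arithmetic shows why Maverick needs multi-GPU even quantized: 400B parameters at 4 bits is already ~200 GB of weights before any overhead.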

Integration with CompanyGPT

CompanyGPT supports Llama 3.x models and enables completely self-hosted operation without external dependencies.

Our Recommendation

For EU enterprises, Llama 3.3 70B is the best choice from the Llama family – proven, EU-compatible, and with a strong performance profile. For edge applications, the compact Llama 3.2 (1B/3B) models are ideal. If you’re looking for a powerful European open-source alternative, we recommend Mistral as a Llama 4 replacement for the DACH region.

Consultation for this model?

We help you select and integrate the right AI model for your use case.