LLM, MiniMax AI, China

MiniMax-M3

MiniMax-M3 is an open-weight frontier model with a 1M-token context window, native multimodality and the MSA architecture for agentic coding. As an AI consultancy based in Rosenheim, we support GDPR-compliant self-hosting in EU data centres.

License: MIT
GDPR Hosting: Available
Context: 1M tokens
Modality: Text, Image, Video → Text

Versions

Overview of available model variants

MiniMax-M3 (Recommended)
Release: 2026-06-01
Status: Current
Strengths:
  • 1M-token context window
  • Native multimodality (image and video understanding)
  • MSA architecture (MiniMax Sparse Attention): up to 20x lower per-token compute at 1M context
  • 9x faster prefill, 15x faster decoding vs. predecessor
  • 59.0% on SWE-Bench Pro (beats GPT-5.5 and Gemini 3.1 Pro)
  • Open weights announced (Hugging Face, GitHub)
Weaknesses:
  • No native EU cloud availability
  • Official API runs on Chinese infrastructure
  • Open weights not yet final at launch (release within ~10 days announced)

MiniMax-M2.7
Release: 2026-05-19
Status: Deprecated
Strengths:
  • Sparse Mixture-of-Experts (MoE) architecture
  • Optimised for agent teams and complex coding workflows
  • Top performance on SWE-Pro, Terminal Bench 2 and MLE-Bench Lite
  • Native multi-agent collaboration
  • Open source (MIT licence)
Weaknesses:
  • Superseded by M3
  • No native EU cloud availability
  • Official API runs on Chinese infrastructure

MiniMax-M2
Release: October 2025
Status: Deprecated
Strengths:
  • First generation of the agentic coding model
  • Open source via Hugging Face and GitHub
Weaknesses:
  • Superseded by M2.7 and M3

Use Cases

Typical applications for this model

Agentic coding & software engineering
Multi-agent workflows
SRE incident response
Complex productivity tasks
Document processing (Word, Excel, PowerPoint)
ML engineering & experimentation
Tool/function calling

Technical Details

API, features and capabilities

API & Availability
Availability: Public
Features & Capabilities
Tool Use, Function Calling, Structured Output, Vision, Reasoning Mode, Code Execution, Web Browsing, File Upload
Training & Knowledge
Knowledge Cutoff: Early 2026
Fine-Tuning: Available (full fine-tuning, LoRA)
Language Support
Best Quality: English, Chinese
Supported: Multilingual

Hosting & Compliance

GDPR-compliant hosting options and licensing

GDPR-Compliant Hosting Options
Self-Hosted: EU (region of your choice)
Deployment on your own infrastructure in EU data centres is possible.
License & Hosting
License: MIT
Security Filters: Configurable
On-Premise

Benchmarks

Performance comparison with standardized tests

  • SWE-Bench Pro (M3): 59.0% (2026-06)
  • SWE-Pro (M2.7): 56.22% (2026)
  • MLE-Bench Lite, medal rate (M2.7): 66.6% (2026)
  • VIBE-Pro (M2.7): 55.6 (2026)
  • Terminal Bench 2 (M2.7): 57.0% (2026)
  • Toolathon (M2.7): 46.3 (2026)

As an AI consultancy from Rosenheim, we support companies across the DACH region (Germany, Austria, Switzerland) with GDPR-compliant integration of open-weight models such as MiniMax-M3. Through self-hosting in EU data centres, the model can be deployed for agentic workflows and long-context tasks in compliance with data protection regulations.

MiniMax-M3: Frontier Coding with 1M Context

MiniMax unveiled MiniMax-M3 on 1 June 2026 – according to the lab, the first open-weight model to combine frontier coding, a 1-million-token context window and native multimodality (image and video understanding) in a single model.

MSA Architecture (MiniMax Sparse Attention)

According to MiniMax, the new attention architecture is the key to M3's efficiency on long contexts:

  • Up to 20x lower per-token compute at 1M context compared to the predecessor
  • More than 9x faster prefill, more than 15x faster decoding
  • Significantly lower inference cost for long-context workloads

Native Multimodality

M3 understands text, images and video in a unified model – ideal for agentic coding with UI screenshots, document processing and multimodal research.

Coding Performance

  • 59.0% on SWE-Bench Pro – beats GPT-5.5 and Gemini 3.1 Pro, narrowly behind Claude Opus 4.7
  • Optimised for complex, multi-step software engineering tasks
  • Well-suited for autonomous agent harnesses

API Pricing

  • approx. $0.60 / 1M input tokens and $2.40 / 1M output tokens
  • Launch promotion: 50% off ($0.30 / $1.20)
  • Substantially cheaper than Claude Opus 4.7 or GPT-5.5 at comparable coding performance
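
To make the list prices above concrete, the following is a minimal sketch of a monthly API cost estimate. It uses only the approximate prices quoted in this section; the `Workload` class, the function name and the example token volumes are assumptions for illustration, not part of any MiniMax SDK.

```python
from dataclasses import dataclass

# Approximate list prices quoted above, in USD per 1M tokens.
INPUT_USD_PER_M = 0.60
OUTPUT_USD_PER_M = 2.40
LAUNCH_DISCOUNT = 0.50  # 50% launch promotion


@dataclass(frozen=True)
class Workload:
    """Hypothetical monthly token volume; replace with your own measurements."""
    input_tokens: int
    output_tokens: int


def monthly_api_cost(w: Workload, discount: float = 0.0) -> float:
    """Return the monthly API cost in USD, rounded to cents."""
    cost = (w.input_tokens / 1_000_000) * INPUT_USD_PER_M
    cost += (w.output_tokens / 1_000_000) * OUTPUT_USD_PER_M
    return round(cost * (1.0 - discount), 2)


if __name__ == "__main__":
    # Assumed example: 2B input tokens and 200M output tokens per month.
    workload = Workload(input_tokens=2_000_000_000, output_tokens=200_000_000)
    print("List price:", monthly_api_cost(workload))
    print("Launch promotion:", monthly_api_cost(workload, LAUNCH_DISCOUNT))
```

For the assumed workload this comes to roughly $1,680 per month at list price and $840 with the launch promotion. Agentic coding workloads tend to be input-heavy, so the input price usually dominates the bill.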

Open Weights

MiniMax has announced that weights will be released on Hugging Face and GitHub within approximately 10 days of launch – enabling self-hosting on your own EU infrastructure.

Previous Generation: MiniMax-M2.7

MiniMax-M2.7 (released May 2026) remains relevant for pure coding-agent workflows without multimodality and with smaller context requirements. It offers:

  • Sparse Mixture-of-Experts (MoE) architecture
  • 56.22% on SWE-Pro, 57.0% on Terminal Bench 2, 66.6% on MLE-Bench Lite
  • MIT licence, fully open source

For new projects, however, we recommend going directly with MiniMax-M3 thanks to its 1M context, multimodality and higher coding performance.

EU Deployment Options

Self-Hosting in EU Data Centres

For GDPR compliance, we offer support with:

  • Deployment on AWS EU regions (Frankfurt, Ireland)
  • Azure EU regions (West Europe, Germany)
  • Google Cloud EU regions (Frankfurt, Belgium)
  • Private cloud or on-premise in your own data centre

Hardware Requirements

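MiniMax-M2.7 and, once its weights are published, MiniMax-M3 can be deployed at several precision levels:

  • BF16: 16-bit weights without further quantisation, for research and benchmarks
  • FP8: Recommended for production deployment
  • INT4/INT8: Lower-precision quantisation for limited hardware resources, at some cost in accuracy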
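
As a rough first step in hardware sizing, the memory needed for the weights alone follows directly from the bytes per parameter at each precision. The sketch below shows that arithmetic. The parameter count is a purely hypothetical placeholder, since this page does not state the size of M3 or M2.7, and the function name is ours.

```python
# Bytes needed to store one weight at each precision listed above.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "INT8": 1.0, "INT4": 0.5}


def weight_memory_gib(num_params: float, precision: str) -> float:
    """Approximate memory for the weights alone, in GiB.

    Excludes KV cache, activations, quantisation metadata and runtime
    overhead, which can be substantial for long-context workloads.
    """
    return num_params * BYTES_PER_PARAM[precision] / 2**30


if __name__ == "__main__":
    # HYPOTHETICAL parameter count, for illustration only;
    # substitute the figure from the published model card.
    assumed_params = 200e9
    for precision in BYTES_PER_PARAM:
        gib = weight_memory_gib(assumed_params, precision)
        print(f"{precision}: {gib:.0f} GiB")
```

The result is a lower bound. For 1M-context workloads the KV cache can require a large share of additional memory, so real deployments need headroom beyond this figure.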

Alternative API Access

For rapid prototyping without your own infrastructure:

  • MiniMax API (platform.minimaxi.com): Official access from MiniMax AI
  • Third-party providers: Together.ai, Fireworks AI, OpenRouter
  • Model weights: Hugging Face (MiniMaxAI) and GitHub

Note: The official API runs on Chinese infrastructure. Without a data processing agreement, using it is not GDPR-compliant; for GDPR-compliant use, we recommend self-hosting in the EU.

Local or Cloud? Calculate the Cost

For agentic coding workloads, a precise cost calculation between API usage and self-hosting often pays off. The local-vs-cloud AI inference calculator from ai-prices.eu lets you compare hardware, electricity and operating costs against cloud API pricing and determine the break-even point for your workload.
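
The break-even idea can be sketched in a few lines. This is not the ai-prices.eu calculator, only a simplified illustration of the underlying arithmetic: self-hosting pays off once the per-token saving against the API covers the fixed monthly cost. The default prices are the approximate list prices from the API Pricing section; the fixed cost, marginal cost and input share are assumptions you would replace with your own figures.

```python
def blended_usd_per_m(input_share: float,
                      input_price: float = 0.60,
                      output_price: float = 2.40) -> float:
    """Blended API price per 1M tokens for a given share of input tokens."""
    return input_share * input_price + (1.0 - input_share) * output_price


def break_even_tokens_per_month(fixed_monthly_usd: float,
                                api_usd_per_m: float,
                                self_host_usd_per_m: float = 0.0) -> float:
    """Monthly token volume above which self-hosting is cheaper than the API.

    fixed_monthly_usd: hardware amortisation, hosting and operations (assumed).
    self_host_usd_per_m: marginal cost per 1M tokens, e.g. electricity (assumed).
    Returns float('inf') if self-hosting never saves money per token.
    """
    saving_per_m = api_usd_per_m - self_host_usd_per_m
    if saving_per_m <= 0:
        return float("inf")
    return fixed_monthly_usd / saving_per_m * 1_000_000


if __name__ == "__main__":
    # Assumed figures: 90% input tokens, $12,000 fixed cost per month,
    # $0.08 marginal cost per 1M self-hosted tokens.
    api_price = blended_usd_per_m(input_share=0.9)
    tokens = break_even_tokens_per_month(12_000, api_price, 0.08)
    print(f"Break-even at about {tokens / 1e9:.1f}B tokens per month")
```

With these assumed figures the break-even lies at roughly 17B tokens per month. The sketch ignores whether the hardware can actually serve that volume, so a throughput check belongs in any real calculation.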

Integration & Support

Our Recommendation

Self-hosting in EU data centres is the best option for GDPR-compliant usage of MiniMax-M3. We support you with:

  • Infrastructure planning and hardware sizing (especially for 1M-context workloads)
  • Deployment, quantisation and optimisation
  • Integration into existing agent frameworks
  • Compliant usage within CompanyGPT
  • Fine-tuning for specific use cases

For companies seeking a leading open-weight model with frontier coding, 1M context and native multimodality, MiniMax-M3 is an excellent choice – provided the appropriate infrastructure is available.

Cost estimation for this model

For up-to-date token pricing, model variants and EU availability, see our sister project ai-prices.eu. It helps you compare and estimate the operational cost of leading AI models for your specific use case.


ai-prices.eu is a project by innFactory AI Consulting GmbH and provides transparent cost estimates for leading AI models.

Consultation for this model?

We help you select and integrate the right AI model for your use case.