Versions

Overview of available model variants

Model	Release	Strengths	Weaknesses	Status
MAI-Thinking-1 Recommended	2 June 2026	First fully in-house Microsoft reasoning model (no OpenAI distillation) Sparse Mixture-of-Experts with 35B active parameters 256k context window 97.0% AIME 2025, 94.5% AIME 2026 Matches Claude Opus 4.6 on SWE-Bench Pro Trained on commercially licensed data	New – limited enterprise track record Not open source	Current
MAI-Code-1-Flash Recommended	2 June 2026	5B parameter coding model – fast and cost-efficient Beats Claude Haiku 4.5 on all four tested coding benchmarks 16-point lead on SWE-Bench Pro (51.2% vs. 35.2%) Integrated directly in GitHub Copilot (Free, Pro, Pro+, Max)	Specialized for coding – not a general-purpose model Not open source	Current
Phi-4-reasoning-vision-15B Recommended	4 March 2026	15B parameters with vision and reasoning Decides autonomously when deep reasoning is needed 84.8 AI2D, 83.3 ChartQA, 75.2 MathVista, 88.2 ScreenSpot v2 MIT license, on Microsoft Foundry, HuggingFace and GitHub	Higher resource requirements than Phi-4 16k context window	Current
Phi-4-reasoning (14B)	April 2025	Pure reasoning model (no vision) MIT license	No vision support	Current
Phi-4-mini	February 2025	3.8B parameters – edge-capable MIT license	Lower capacity	Current
Phi-4-multimodal	February 2025	Multimodal inputs (text, image, audio) MIT license	—	Current
Phi-4	December 2024	14B parameters - very efficient Strong reasoning capabilities	Smaller than frontier models	Current
Phi-3.5-MoE	2024	Mixture-of-Experts 42B parameters (6.6B active)	—	Current
Phi-3.5-mini	2024	3.8B parameters Runs on smartphones	Limited capacity	Current
Phi-3.5-vision	2024	Multimodal Image understanding	—	Current
Phi-3-medium	2024	14B parameters Balance of size and performance	—	Current

Technical Details

API, features and capabilities

API & Availability

Availability Public

Latency (TTFT) ~100ms (local)

Throughput Hardware-dependent Tokens/Sec

Features & Capabilities

Tool Use Function Calling Structured Output Vision Reasoning Mode File Upload

Training & Knowledge

Knowledge Cutoff February 2026 (Phi-4-reasoning-vision), 2024-10 (Phi-4)

Fine-Tuning Available (LoRA, QLoRA, Full Fine-Tuning)

Language Support

Best Quality English, German

Supported 20+ languages

Optimized for English, usable quality in German

Hosting & Compliance

GDPR-compliant hosting options and licensing

GDPR-Compliant Hosting Options

License & Hosting

License MIT

Security Filters Customizable

Enterprise Support Yes

SLA Available Yes

On-Premise Edge-capable

As AI consultants based in Rosenheim, Germany, we recommend Microsoft Phi for enterprises in the DACH region (Germany, Austria, Switzerland) that want to run powerful AI on resource-constrained hardware. Since the release of Phi-4 in December 2024, the Phi models have established themselves as reliable solutions for edge and on-premise deployments, offering impressive performance with minimal requirements. With the MAI family (Build 2026), Microsoft now extends its in-house model portfolio with fully proprietary reasoning and coding models.

New: MAI-Thinking-1 and MAI-Code-1-Flash (June 2026)

At Build 2026 (June 2, 2026), Microsoft unveiled its first fully in-house foundation models – trained without OpenAI distillation and exclusively on commercially licensed data. From a DACH perspective, two are especially relevant:

MAI-Thinking-1: Sparse Mixture-of-Experts with 35B active parameters and a 256k context window. Scores 97.0% on AIME 2025 and 94.5% on AIME 2026, and matches Claude Opus 4.6 on SWE-Bench Pro. Available via Microsoft Foundry / Azure in West Europe and Germany.
MAI-Code-1-Flash: 5B-parameter coding model that beats Claude Haiku 4.5 on all four tested coding benchmarks (51.2% vs. 35.2% on SWE-Bench Pro). Rolling out in GitHub Copilot (Free, Pro, Pro+, Max) for VS Code from June 2, 2026.

Both models are part of a seven-model MAI family (including MAI-Image-2.5, MAI-Transcribe-1.5 covering 43 languages, MAI-Voice-2 covering 15+ languages) and significantly reduce Microsoft’s dependence on OpenAI. They are, however, not open source – for full data control, Phi remains the better choice.

New: Phi-4-reasoning-vision (March 2026)

Microsoft has released Phi-4-reasoning-vision-15B, a new model that combines vision and reasoning in a 15B-parameter model. The model can autonomously decide when deeper reasoning is required — a key capability for efficient local deployments.

15B Parameters: More capacity than Phi-4 (14B)
Vision + Reasoning: Understands images and can draw complex conclusions
MIT License: Full commercial use permitted
Self-Hosting: Runs on RTX 4090 or M2 Mac

Small Language Models (SLMs)

Microsoft Phi is Microsoft’s family of “Small Language Models” - compact but powerful models specifically optimized for efficiency.

Why Phi for Enterprises?

Compact: 3.8B to 14B parameters
Efficient: Runs on consumer hardware
MIT License: Full commercial use permitted
Self-Hosting: Full data control
Edge-Ready: Smartphones, IoT, embedded

Key Strengths

Efficiency

Phi models achieve impressive performance at minimal size:

Model	Parameters	Comparable Performance to
Phi-4	14B	GPT-4 (partially)
Phi-3.5-MoE	42B (16 active)	Llama 3 70B
Phi-3.5-mini	3.8B	Llama 3 8B

Local Deployment

Phi models can be operated completely locally:

Ollama: ollama run phi4
LM Studio: Simple GUI
vLLM: Production deployment
ONNX: Optimized inference

Hardware Requirements

Model	RAM/VRAM	Recommended Hardware
Phi-4	16 GB	RTX 4070 / M2 Mac
Phi-3.5-MoE	24 GB	RTX 4090
Phi-3.5-mini	4 GB	Laptop / Smartphone
Phi-3.5-vision	8 GB	RTX 3060

Reasoning Strength

Phi-4 shows particularly strong reasoning capabilities:

Mathematics and logic
Coding tasks
Structured analysis
Chain-of-thought

Comparison to Other SLMs

Feature	Phi-4	Llama 3.2 3B	Gemma 2 2B
Parameters	14B	3B	2B
Reasoning	Strong	Medium	Medium
Coding	Strong	Good	Good
Vision	Yes (3.5)	No	No
License	MIT	Community	Apache 2.0

Integration with CompanyGPT

Microsoft Phi can be integrated in CompanyGPT as a self-hosted option - ideal for enterprises that want to operate AI without cloud dependency.

Our Recommendation

For reasoning-heavy cloud workloads on Azure / Foundry, MAI-Thinking-1 has been our new top pick in the Microsoft stack since June 2026. For coding assistants in GitHub Copilot, MAI-Code-1-Flash offers the best price-performance ratio.

For local and edge deployments with vision and reasoning requirements, we still recommend Microsoft Phi-4-reasoning-vision-15B. For pure text tasks, Phi-4 remains an excellent choice. For smartphones and IoT, Phi-3.5-mini is ideal, for simpler multimodal applications Phi-3.5-vision.

For applications requiring maximum quality where cloud hosting is acceptable, we recommend OpenAI GPT or Anthropic Claude instead.

Cost estimation for this model

For up-to-date token pricing, model variants and EU availability, see our sister project ai-prices.eu. It helps you compare and estimate the operational cost of leading AI models for your specific use case.

Compare prices on ai-prices.eu

ai-prices.eu is a project by innFactory AI Consulting GmbH and provides transparent cost estimates for leading AI models.

Microsoft Phi & MAI

Versions

Use Cases

Technical Details

Hosting & Compliance

Benchmarks

New: MAI-Thinking-1 and MAI-Code-1-Flash (June 2026)

New: Phi-4-reasoning-vision (March 2026)

Small Language Models (SLMs)

Why Phi for Enterprises?

Key Strengths

Efficiency

Local Deployment

Hardware Requirements

Reasoning Strength

Comparison to Other SLMs

Integration with CompanyGPT

Our Recommendation

Cost estimation for this model

Similar Models

SOOFI (Soofi S)

Tencent Hunyuan (Hy3)

NVIDIA Nemotron

Consultation for this model?