9:00 – 17:00 +49 8031 3508270 LUITPOLDSTR. 9, 83022 ROSENHEIM
LLM Microsoft USA

Microsoft Phi & MAI

Microsoft MAI-Thinking-1, MAI-Code-1-Flash, Phi-4-reasoning-vision and Phi-4 - in-house Microsoft models plus compact open-source LLMs for edge and local deployments. GDPR-compliant via Azure / self-hosting. AI consulting from Germany.

License: MIT
GDPR Hosting: Available
Context: 16k tokens (Phi-4), up to 128k tokens (Phi-3 variants)
Modality: Text, Image → Text

Versions

Overview of available model variants

Model: MAI-Thinking-1 (Recommended)
Release: 2 June 2026
Strengths: First fully in-house Microsoft reasoning model (no OpenAI distillation); Sparse Mixture-of-Experts with 35B active parameters; 256k context window; 97.0% AIME 2025, 94.5% AIME 2026; matches Claude Opus 4.6 on SWE-Bench Pro; trained on commercially licensed data
Weaknesses: New, with a limited enterprise track record; not open source
Status: Current

Model: MAI-Code-1-Flash (Recommended)
Release: 2 June 2026
Strengths: 5B-parameter coding model, fast and cost-efficient; beats Claude Haiku 4.5 on all four tested coding benchmarks; 16-point lead on SWE-Bench Pro (51.2% vs. 35.2%); integrated directly in GitHub Copilot (Free, Pro, Pro+, Max)
Weaknesses: Specialized for coding, not a general-purpose model; not open source
Status: Current

Model: Phi-4-reasoning-vision-15B (Recommended)
Release: 4 March 2026
Strengths: 15B parameters with vision and reasoning; decides autonomously when deep reasoning is needed; 84.8 AI2D, 83.3 ChartQA, 75.2 MathVista, 88.2 ScreenSpot v2; MIT license; available on Microsoft Foundry, Hugging Face and GitHub
Weaknesses: Higher resource requirements than Phi-4; 16k context window
Status: Current

Model: Phi-4-reasoning (14B)
Release: April 2025
Strengths: Pure reasoning model (no vision); MIT license
Weaknesses: No vision support
Status: Current

Model: Phi-4-mini
Release: February 2025
Strengths: 3.8B parameters, edge-capable; MIT license
Weaknesses: Lower capacity
Status: Current

Model: Phi-4-multimodal
Release: February 2025
Strengths: Multimodal inputs (text, image, audio); MIT license
Status: Current

Model: Phi-4
Release: December 2024
Strengths: 14B parameters, very efficient; strong reasoning capabilities
Weaknesses: Smaller than frontier models
Status: Current

Model: Phi-3.5-MoE
Release: 2024
Strengths: Mixture-of-Experts; 42B parameters (6.6B active)
Status: Current

Model: Phi-3.5-mini
Release: 2024
Strengths: 3.8B parameters; runs on smartphones
Weaknesses: Limited capacity
Status: Current

Model: Phi-3.5-vision
Release: 2024
Strengths: Multimodal; image understanding
Status: Current

Model: Phi-3-medium
Release: 2024
Strengths: 14B parameters; balance of size and performance
Status: Current

Use Cases

Typical applications for this model

Edge AI & IoT
Mobile Applications
Offline Scenarios
Embedded Systems
Resource-Constrained Environments
Coding Assistants
Local LLM Deployments

Technical Details

API, features and capabilities

API & Availability
Availability: Public
Latency (TTFT): ~100 ms (local)
Throughput: hardware-dependent (tokens/sec)
Features & Capabilities
Tool Use, Function Calling, Structured Output, Vision, Reasoning Mode, File Upload
Training & Knowledge
Knowledge Cutoff: February 2026 (Phi-4-reasoning-vision), October 2024 (Phi-4)
Fine-Tuning: Available (LoRA, QLoRA, Full Fine-Tuning)
Language Support
Best Quality: English, German
Supported: 20+ languages
Optimized for English, usable quality in German
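
The fine-tuning options listed above (LoRA, QLoRA, full fine-tuning) differ mainly in how many weights are trained. As a rough illustration, the sketch below counts the trainable parameters that LoRA adds for a single weight matrix. The layer size and rank are hypothetical values chosen for the example, not figures published for any Phi model.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters LoRA adds for one d_out x d_in weight matrix.

    LoRA freezes the original matrix W and learns a low-rank update B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in).
    """
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, for illustration only.
d = 4096
full = d * d                                  # weights updated by full fine-tuning
lora = lora_trainable_params(d, d, rank=16)   # weights updated by LoRA
print(f"full: {full:,}  LoRA: {lora:,}  ratio: {lora / full:.2%}")
```

QLoRA applies the same low-rank update on top of a quantized base model, so the trainable-parameter count stays the same while the frozen weights take less memory.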

Hosting & Compliance

GDPR-compliant hosting options and licensing

GDPR-Compliant Hosting Options
Self-Hosted: Own infrastructure (recommended, full data control)
Azure: West Europe / Germany (Azure AI Model Catalog)
Ollama: Local (easy local deployment)
License & Hosting
License: MIT
Security Filters: Customizable
Enterprise Support: Yes
SLA Available: Yes
On-Premise: Edge-capable

Benchmarks

Performance comparison with standardized tests

MMLU: 84.8%
HumanEval: 82.6%
GSM8K: 91.1%
MATH: 55.5%

As AI consultants based in Rosenheim, Germany, we recommend Microsoft Phi for enterprises in the DACH region (Germany, Austria, Switzerland) that want to run powerful AI on resource-constrained hardware. Since the release of Phi-4 in December 2024, the Phi models have established themselves as reliable solutions for edge and on-premise deployments, offering strong performance on consumer-grade hardware. With the MAI family (Build 2026), Microsoft now extends its in-house model portfolio with fully proprietary reasoning and coding models.

New: MAI-Thinking-1 and MAI-Code-1-Flash (June 2026)

At Build 2026 (June 2, 2026), Microsoft unveiled its first fully in-house foundation models – trained without OpenAI distillation and exclusively on commercially licensed data. From a DACH perspective, two are especially relevant:

  • MAI-Thinking-1: Sparse Mixture-of-Experts with 35B active parameters and a 256k context window. Scores 97.0% on AIME 2025 and 94.5% on AIME 2026, and matches Claude Opus 4.6 on SWE-Bench Pro. Available via Microsoft Foundry / Azure in West Europe and Germany.
  • MAI-Code-1-Flash: 5B-parameter coding model that beats Claude Haiku 4.5 on all four tested coding benchmarks (51.2% vs. 35.2% on SWE-Bench Pro). Rolling out in GitHub Copilot (Free, Pro, Pro+, Max) for VS Code from June 2, 2026.

Both models are part of a seven-model MAI family (including MAI-Image-2.5, MAI-Transcribe-1.5 covering 43 languages, MAI-Voice-2 covering 15+ languages) and significantly reduce Microsoft’s dependence on OpenAI. They are, however, not open source – for full data control, Phi remains the better choice.

New: Phi-4-reasoning-vision (March 2026)

Microsoft has released Phi-4-reasoning-vision-15B, which combines vision and reasoning in a single 15B-parameter model. The model can autonomously decide when deeper reasoning is required, a key capability for efficient local deployments.

  • 15B Parameters: More capacity than Phi-4 (14B)
  • Vision + Reasoning: Understands images and can draw complex conclusions
  • MIT License: Full commercial use permitted
  • Self-Hosting: Runs on RTX 4090 or M2 Mac

Small Language Models (SLMs)

Microsoft Phi is Microsoft’s family of “Small Language Models” - compact but powerful models specifically optimized for efficiency.

Why Phi for Enterprises?

  • Compact: 3.8B to 15B parameters for the dense models (Phi-3.5-MoE: 42B total, 6.6B active)
  • Efficient: Runs on consumer hardware
  • MIT License: Full commercial use permitted
  • Self-Hosting: Full data control
  • Edge-Ready: Smartphones, IoT, embedded

Key Strengths

Efficiency

Phi models achieve impressive performance at minimal size:

Model | Parameters | Comparable Performance to
Phi-4 | 14B | GPT-4 (partially)
Phi-3.5-MoE | 42B (6.6B active) | Llama 3 70B
Phi-3.5-mini | 3.8B | Llama 3 8B

Local Deployment

Phi models can be operated completely locally:

  • Ollama: ollama run phi4
  • LM Studio: Simple GUI
  • vLLM: Production deployment
  • ONNX: Optimized inference
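
Once a model is running under Ollama, it can also be called from application code. The following is a minimal sketch, assuming Ollama is listening on its default local port (11434) and that the `phi4` model has already been pulled. The helper names `build_generate_request` and `generate` are our own and not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama runs locally on its default port and the "phi4"
# model is available (for example after `ollama run phi4`).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "phi4") -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "phi4", timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

With the server running, `generate("Summarize the GDPR in two sentences.")` returns the model's answer as a string. Because the request never leaves the machine, this setup fits the offline and full-data-control scenarios described on this page.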

Hardware Requirements

Model | RAM/VRAM | Recommended Hardware
Phi-4 | 16 GB | RTX 4070 / M2 Mac
Phi-3.5-MoE | 24 GB | RTX 4090
Phi-3.5-mini | 4 GB | Laptop / Smartphone
Phi-3.5-vision | 8 GB | RTX 3060
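
For a rough sanity check of figures like those above, weight memory can be estimated from the parameter count and the numeric precision. The sketch below is a back-of-the-envelope calculation under our own assumptions (16-, 8- and 4-bit weights, 1 GB = 1e9 bytes). It is not how the table was derived, and it ignores the KV cache, activations and runtime overhead, which come on top.

```python
def weight_memory_gb(total_params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    return total_params_billion * bytes_per_weight


# Parameter counts as listed on this page. For a Mixture-of-Experts model
# the total count matters for memory, because all experts must be loaded,
# even though only a subset is active for each token.
MODELS = {
    "Phi-4": 14.0,
    "Phi-3.5-MoE": 42.0,
    "Phi-3.5-mini": 3.8,
}

for name, params in MODELS.items():
    row = ", ".join(
        f"{bits}-bit: {weight_memory_gb(params, bits):.1f} GB" for bits in (16, 8, 4)
    )
    print(f"{name}: {row}")
```

Actual requirements depend on the runtime, the quantization format and the context length in use, so treat the result as a lower bound when sizing hardware.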

Reasoning Strength

Phi-4 shows particularly strong reasoning capabilities:

  • Mathematics and logic
  • Coding tasks
  • Structured analysis
  • Chain-of-thought

Comparison to Other SLMs

Feature | Phi-4 | Llama 3.2 3B | Gemma 2 2B
Parameters | 14B | 3B | 2B
Reasoning | Strong | Medium | Medium
Coding | Strong | Good | Good
Vision | Yes (via Phi-3.5-vision) | No | No
License | MIT | Llama 3.2 Community License | Gemma Terms of Use

Integration with CompanyGPT

Microsoft Phi can be integrated in CompanyGPT as a self-hosted option - ideal for enterprises that want to operate AI without cloud dependency.

Our Recommendation

For reasoning-heavy cloud workloads on Azure / Foundry, MAI-Thinking-1 has been our new top pick in the Microsoft stack since June 2026. For coding assistants in GitHub Copilot, MAI-Code-1-Flash offers the best price-performance ratio.

For local and edge deployments with vision and reasoning requirements, we still recommend Microsoft Phi-4-reasoning-vision-15B. For pure text tasks, Phi-4 remains an excellent choice. For smartphones and IoT, Phi-3.5-mini is ideal; for simpler multimodal applications, Phi-3.5-vision.

For applications requiring maximum quality where cloud hosting is acceptable, we recommend OpenAI GPT or Anthropic Claude instead.
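
The recommendations above can be read as a simple decision rule. The sketch below encodes them in a hypothetical helper. The `Workload` fields, the order of the checks and the fallback for cloud workloads are our own illustrative assumptions, so it is a starting point for discussion rather than a definitive selection tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical workload description; all field names are illustrative."""
    local: bool                    # True for self-hosted or edge deployments
    mobile_or_iot: bool = False    # smartphone, IoT or embedded target
    vision: str = "none"           # "none", "simple" or "reasoning"
    copilot_coding: bool = False   # coding assistant inside GitHub Copilot
    max_quality: bool = False      # maximum quality, cloud hosting acceptable


def recommend(w: Workload) -> str:
    """Map a workload to the model named in the recommendation above."""
    if w.local:
        # Assumption: device constraints take priority over vision needs.
        if w.mobile_or_iot:
            return "Phi-3.5-mini"
        if w.vision == "reasoning":
            return "Phi-4-reasoning-vision-15B"
        if w.vision == "simple":
            return "Phi-3.5-vision"
        return "Phi-4"
    if w.max_quality:
        return "OpenAI GPT or Anthropic Claude"
    if w.copilot_coding:
        return "MAI-Code-1-Flash"
    # Assumption: remaining cloud workloads are treated as reasoning-heavy.
    return "MAI-Thinking-1"


print(recommend(Workload(local=True, vision="reasoning")))
# prints: Phi-4-reasoning-vision-15B
```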

Cost estimation for this model

For up-to-date token pricing, model variants and EU availability, see our sister project ai-prices.eu. It helps you compare and estimate the operational cost of leading AI models for your specific use case.


ai-prices.eu is a project by innFactory AI Consulting GmbH and provides transparent cost estimates for leading AI models.

Consultation for this model?

We help you select and integrate the right AI model for your use case.