AI Gateway - Cost Control & Governance for Agentic AI
The innFactory AI Gateway is the central enterprise security and governance layer for artificial intelligence - a single, OpenAI-compatible API for all providers. Budget teams, users and AI agents, attribute every token to a cost center and run true AI FinOps: blazing fast, GDPR-compliant and made in Germany.

Run standalone or seamlessly integrated into CompanyGPT | Pricing basis: ai-prices.eu | Available from August 2026
What is the AI Gateway?
Development teams and AI agents use many providers in parallel today - each with its own key, billing and tracking. The result: fragmented costs, budget overruns and no governance. The innFactory AI Gateway sits as a central proxy between your tools and all AI providers. One URL, one key, full control - and every token is transparently attributed.
The problem: ungoverned AI sprawl in larger enterprises
The more teams, tools and agents use AI, the faster you lose oversight, control and budget.
Key sprawl & shadow AI
API keys live scattered across .env files, scripts and CI pipelines. Nobody knows who uses which provider – and every leaked key is a security and cost risk.
Costs with no attribution
At month's end a lump-sum bill arrives from OpenAI, Anthropic & Google. Which team, project or customer caused it? No one can answer cleanly.
No governance, compliance risk
Which models may who use? Is GDPR respected? Is there an audit log? Without a central layer this stays open – a real risk under the EU AI Act.
Agents scale the costs
Autonomous AI agents fire thousands of requests. Without hard per-agent budgets, a single loop can burn an entire monthly budget overnight.
That's exactly what the AI Gateway is for – a central layer that every AI request flows through.
How your data flows through the gateway
The gateway is your central enterprise security and governance layer for AI: users, agents and coding tools speak one OpenAI-compatible API. The gateway authenticates, budgets, applies guardrails, logs — and routes to the right model.
Why your own AI Gateway?
Cost control, governance and speed - without vendor lock-in
AI FinOps
Every request is captured, priced and attributed to a cost center. Finally know which team, user and agent drives which AI cost.
Rust performance
Built in Rust - the same language AWS uses for many of its high-load services: a single binary, P95 overhead under 50 ms and over 700 requests per second per instance. Speed that won't slow your agents down.
Hierarchical budgets
Set limits at organization, team, user, key and agent level. When a budget is exhausted, the gateway stops automatically - no nasty surprises.
Multi-provider, one API
One OpenAI-compatible interface for 9+ providers. Switch models via alias without reconfiguring your tools or agents.
Enterprise security layer
The gateway is the central security layer between tools and LLMs: policy-based guardrails inspect requests and responses, role-based access via Microsoft Entra ID, provider keys secured in Azure Key Vault and a complete audit log for every activity.
Sovereign & GDPR-compliant
Run in your cloud or in German sovereign data centers (STACKIT). Your prompts and data stay under your control - EU AI Act included.

Transparent pricing - in Euro, always up to date
Models are auto-fetched from your providers. Token prices are based on our European pricing catalog ai-prices.eu and can be refreshed with one click or overridden individually at any time.
- Input, cache and output prices per million tokens - visible per model
- Real costs in Euro, not estimated in US dollars
- Price updates directly from provider APIs and ai-prices.eu
- Custom markups and internal transfer prices per cost center
AI FinOps: Cost Centers & Code Profiles
Token FinOps in practice: attribute AI costs to where they arise - down to the individual customer

Every token lands on the right cost center
Create cost centers for departments, projects or individual customers - e.g. companygpt, media-innfactory26 or Customer A. Every API key bills against exactly one cost center, and an optional monthly or weekly cap stops further usage automatically.
- Pool or per-user mode per cost center
- Monthly & weekly caps with live spend display
- Per-model caps and sub-limits for fine-grained control
- Clean re-billing to customers - traceable down to the token
Cost centers, not guesswork
Create cost centers for departments, projects or customers. Every token is attributed automatically - the basis for clean internal allocation and re-billing to customers.
Code profiles for developers
Developers work with code profiles that book directly onto a cost center. Development tokens land automatically on the right customer project - AI FinOps without manual bookkeeping.
User scoping - unlike LiteLLM
Users authenticate via Microsoft Entra ID and are scoped granularly: allowed models, budgets and rate limits per person. No more shared catch-all key for the whole team.
Budgets for AI agents
In the agentic enterprise, not only humans consume tokens, but autonomous agents too. Give every agent its own budget, models and limits - secure and traceable.
Token budgets that cascade
Limits apply top-down - the gateway blocks before it gets expensive
One gateway in front of all AI desktop tools
Put OpenCode, Claude Code, Codex, Cowork and VS Code behind the gateway - centrally controlled, budgeted and assigned to a cost center. Ready-made setups, configured in minutes.
OpenCode
Terminal coding agent. Bundle providers, models, aliases and agents into one opencode.json.
Claude Code
Anthropic's agentic CLI. Map gateway models onto Claude's role slots (main, background, opus, sonnet, haiku).
Codex
OpenAI Codex CLI. Pick one Responses-API-capable model to drive the agent.
Cowork
Cowork desktop agent. Auto-discover models, choose the auth scheme and ship macOS / Windows config.
VS Code · Continue
Continue.dev extension. Map models onto chat, edit, apply and autocomplete roles.
VS Code · Copilot
GitHub Copilot custom (BYOK) models. Choose the API type per model (chat / responses / messages).
Plus MCP server registry, agent-to-agent protocol (A2A) and passthrough targets for custom endpoints.

Built-in Playground - test instantly
Try any model, alias and router right inside the gateway - with a concrete API key, so permissions, budget and cost center apply exactly as in production.
- Chat, image and TTS mode in one interface
- Set system prompt, temperature and max tokens live
- Export as
cURLand drop it straight into your app
Why not just LiteLLM or OpenRouter?
Many tools handle routing and model access. What matters in the enterprise is the control, governance and FinOps layer on top - exactly where the AI Gateway comes in.
| What matters in the enterprise | LiteLLM / OpenRouter & co. | innFactory AI Gateway |
|---|---|---|
| One control point in front of all AI tools - Claude Code, Codex, Cowork, VS Code, OpenCode & your own apps run through the gateway | wire up per tool yourself | ✓ ready-made setups, 1 endpoint |
| Bill costs per customer, project & cost center cleanly and re-bill them | not designed for it | ✓ cost centers + code profiles |
| Budget & limit per team, user, API key and AI agent | coarse / partial | ✓ cascading, hard stop limit |
| Access & identity per employee (SSO, role-based) | limited | ✓ Entra ID / Keycloak / Cognito |
| Audit log, governance & EU AI Act conformity | your own effort | ✓ complete, GDPR, sovereign in DE |
| Stable under thousands of agent requests - without becoming a bottleneck | depends on setup | ✓ < 50 ms overhead, 700+ req/s |
| Operations, support & accountability | self-hosted / DIY | ✓ managed from Germany + rollout |
Deployment & Integration
Standalone or natively in CompanyGPT - run as a container in any cloud
Standalone product
Use the AI Gateway as the central hub for all AI access of your engineering teams, CI/CD pipelines and agents - entirely without CompanyGPT. One URL, one key, full FinOps transparency.
Natively integrated into CompanyGPT
The gateway is natively integrated into our CompanyGPT: customers who use both work with it seamlessly - the same budgets, cost centers and guardrails for chat, agents and addons.
Runs anywhere - simply as a container
You're not locked to a vendor: the gateway runs as a Docker container in your own environment.
Container (Docker)
A single, lean Docker image - quickly rolled out, easily scaled and runnable anywhere containers run.
Database
Stores configuration and usage data in Azure Cosmos DB or MongoDB - you choose what fits your cloud.
Identity / SSO
Connects to your identity provider: Keycloak, Microsoft Entra ID or AWS Cognito - role-based and SSO-ready.
Any cloud
Runs on AWS, STACKIT, Microsoft Azure and Google Cloud - in the region of your choice, also GDPR-compliant and sovereign in Germany.
We help with the rollout in your cloud
Together with innFactory GmbH - an official Microsoft Cloud Solution Provider (CSP) - we set up the gateway in your environment. No cloud account of your own? We provide the matching Azure subscription for your company - including billing, support and optional maintenance.
Frequently Asked Questions
Everything important about the AI Gateway, availability and AI FinOps
What is an AI gateway?
An AI gateway is a central proxy between your applications, tools and AI agents and the AI providers (OpenAI, Anthropic, Google and others). It bundles all AI access behind a single, OpenAI-compatible API and handles authentication, routing, budgets, guardrails and cost accounting - the central enterprise security and FinOps layer for AI in the company.
When does a company need an AI gateway?
As soon as more than one team, several tools or autonomous AI agents use AI. Without a central layer you then lack cost control, attribution and governance. An AI gateway prevents shadow AI, budget overruns and compliance risks - and makes AI costs traceable per team, user, agent and customer.
Is the AI Gateway an enterprise security layer?
Yes. Every AI request goes through the gateway, where it is authenticated (Microsoft Entra ID), checked by guardrails, authorized role-based and logged completely. Provider keys are kept securely in Azure Key Vault, data stays in the EU. This makes the gateway the central security and governance layer between your tools and the LLMs.
When will the AI Gateway be available?
Launch is planned for August 2026. Before that, we grant a limited number of beta accesses to selected partners. Join the beta list to be among the first and help shape the product.
What does the AI Gateway cost?
Pricing will be announced at launch in August 2026. During the beta phase we work closely with pilot customers. We're happy to discuss your scenario in a no-obligation demo.
Do I need CompanyGPT to use the gateway?
No. The AI Gateway is a standalone product for all AI access of your tools, pipelines and agents. If you already use CompanyGPT, it integrates seamlessly as the shared cost-control layer.
How is it different from LiteLLM, OpenRouter & Portkey?
LiteLLM (open-source proxy), OpenRouter (large model catalog) and Portkey (control plane) are strong tools for routing and access. Our gateway adds the governance and FinOps layer on top: real user scoping via Entra ID, hierarchical budgets, cost centers & code profiles, pricing in Euro (ai-prices.eu) and a Rust-built, fast engine - as a managed service from Germany, natively integrated with CompanyGPT.
Which providers and models are supported?
More than 9 provider adapters: OpenAI, Azure OpenAI, Anthropic Claude, Google Gemini, Mistral, AWS Bedrock, STACKIT, Ollama and custom OpenAI-compatible endpoints. Models are auto-fetched, prices come from ai-prices.eu.
How fast is the gateway really?
Thanks to the Rust implementation, the additional latency overhead is below 50 ms at P95, budget checks below 10 ms. A single instance handles over 700 requests per second - your agents barely notice the gateway.
Where is my data processed?
The gateway runs in your cloud or in sovereign German data centers (STACKIT). Prompts and data stay under your control, GDPR and EU AI Act compliant. API keys are stored only as a hash, provider credentials reside in Azure Key Vault.
What does AI FinOps mean in practice?
AI FinOps means managing AI costs like cloud costs: measure, attribute, budget and optimize. With cost centers, code profiles and cascading budgets, you attribute every token to a team, user, agent or customer.
AI Gateway for Your Enterprise
What you get:
- One OpenAI-compatible API for 9+ AI providers
- Built in Rust - P95 overhead below 50 ms
- AI FinOps: cost centers, code profiles, token attribution
- Cascading budgets for org, team, user, key & agent
- User scoping via Microsoft Entra ID (unlike LiteLLM)
- Guardrails, audit log, MCP registry & A2A protocol
- Pricing in Euro based on ai-prices.eu
- Standalone or integrated into CompanyGPT
Secure Beta Access
August 2026
Limited beta slots for pilot customers. Pricing at launch.
Help shape the product.
Join the beta listReady for AI Cost Control?
Schedule a conversation and learn how the AI Gateway makes your AI costs transparent and controllable.
