Skip to main content
9 – 17 UHR +49 8031 3508270 LUITPOLDSTR. 9, 83022 ROSENHEIM
DE / EN
AI Gateway AI FinOps Enterprise Security Made in Germany

AI Gateway - Cost Control & Governance for Agentic AI

The innFactory AI Gateway is the central enterprise security and governance layer for artificial intelligence - a single, OpenAI-compatible API for all providers. Budget teams, users and AI agents, attribute every token to a cost center and run true AI FinOps: blazing fast, GDPR-compliant and made in Germany.

AI Gateway dashboard - token usage, provider and customer cost and margin at a glance
Beta

Available from August 2026

The innFactory AI Gateway is in final development. Secure early beta access now and help shape the product. Pricing will be announced at launch.

Join the beta list
Technology
Rust
Maximum speed, minimal overhead
Latency overhead
< 50 ms
P95 per request - barely measurable
AI providers
9+
OpenAI, Azure, Anthropic, Gemini, Mistral, Bedrock, STACKIT & mehr

Run standalone or seamlessly integrated into CompanyGPT | Pricing basis: ai-prices.eu | Available from August 2026

What is the AI Gateway?

Development teams and AI agents use many providers in parallel today - each with its own key, billing and tracking. The result: fragmented costs, budget overruns and no governance. The innFactory AI Gateway sits as a central proxy between your tools and all AI providers. One URL, one key, full control - and every token is transparently attributed.

The problem: ungoverned AI sprawl in larger enterprises

The more teams, tools and agents use AI, the faster you lose oversight, control and budget.

30–50%
typical budget overrun without central control
3+
AI provider credentials per developer – scattered & unprotected
~80%
of model costs are avoidable with the right model choice

Key sprawl & shadow AI

API keys live scattered across .env files, scripts and CI pipelines. Nobody knows who uses which provider – and every leaked key is a security and cost risk.

Costs with no attribution

At month's end a lump-sum bill arrives from OpenAI, Anthropic & Google. Which team, project or customer caused it? No one can answer cleanly.

No governance, compliance risk

Which models may who use? Is GDPR respected? Is there an audit log? Without a central layer this stays open – a real risk under the EU AI Act.

Agents scale the costs

Autonomous AI agents fire thousands of requests. Without hard per-agent budgets, a single loop can burn an entire monthly budget overnight.

That's exactly what the AI Gateway is for – a central layer that every AI request flows through.

How your data flows through the gateway

The gateway is your central enterprise security and governance layer for AI: users, agents and coding tools speak one OpenAI-compatible API. The gateway authenticates, budgets, applies guardrails, logs — and routes to the right model.

AI Gateway architecture diagramUsers, AI agents, API keys and coding tools flow through the central AI Gateway to the AI providers.Users, agents & toolsAI providers & LLMsAI GatewayRouting · budgetsGuardrails · auditUsers & teamsAI agentsAPI-KeysCoding toolsAzure OpenAIAnthropic ClaudeGoogle GeminiMistralAWS BedrockSTACKITOllama & more
Smart routing & aliases Budgets & rate limits Guardrails & audit log Cost centers & metering

Why your own AI Gateway?

Cost control, governance and speed - without vendor lock-in

AI FinOps

Every request is captured, priced and attributed to a cost center. Finally know which team, user and agent drives which AI cost.

Rust performance

Built in Rust - the same language AWS uses for many of its high-load services: a single binary, P95 overhead under 50 ms and over 700 requests per second per instance. Speed that won't slow your agents down.

Hierarchical budgets

Set limits at organization, team, user, key and agent level. When a budget is exhausted, the gateway stops automatically - no nasty surprises.

Multi-provider, one API

One OpenAI-compatible interface for 9+ providers. Switch models via alias without reconfiguring your tools or agents.

Enterprise security layer

The gateway is the central security layer between tools and LLMs: policy-based guardrails inspect requests and responses, role-based access via Microsoft Entra ID, provider keys secured in Azure Key Vault and a complete audit log for every activity.

Sovereign & GDPR-compliant

Run in your cloud or in German sovereign data centers (STACKIT). Your prompts and data stay under your control - EU AI Act included.

AI Gateway - model overview with transparent token pricing in Euro

Transparent pricing - in Euro, always up to date

Models are auto-fetched from your providers. Token prices are based on our European pricing catalog ai-prices.eu and can be refreshed with one click or overridden individually at any time.

  • Input, cache and output prices per million tokens - visible per model
  • Real costs in Euro, not estimated in US dollars
  • Price updates directly from provider APIs and ai-prices.eu
  • Custom markups and internal transfer prices per cost center
Explore ai-prices.eu

AI FinOps: Cost Centers & Code Profiles

Token FinOps in practice: attribute AI costs to where they arise - down to the individual customer

AI Gateway cost centers - spend per customer and project with monthly and weekly caps

Every token lands on the right cost center

Create cost centers for departments, projects or individual customers - e.g. companygpt, media-innfactory26 or Customer A. Every API key bills against exactly one cost center, and an optional monthly or weekly cap stops further usage automatically.

  • Pool or per-user mode per cost center
  • Monthly & weekly caps with live spend display
  • Per-model caps and sub-limits for fine-grained control
  • Clean re-billing to customers - traceable down to the token

Cost centers, not guesswork

Create cost centers for departments, projects or customers. Every token is attributed automatically - the basis for clean internal allocation and re-billing to customers.

Code profiles for developers

Developers work with code profiles that book directly onto a cost center. Development tokens land automatically on the right customer project - AI FinOps without manual bookkeeping.

User scoping - unlike LiteLLM

Users authenticate via Microsoft Entra ID and are scoped granularly: allowed models, budgets and rate limits per person. No more shared catch-all key for the whole team.

Budgets for AI agents

In the agentic enterprise, not only humans consume tokens, but autonomous agents too. Give every agent its own budget, models and limits - secure and traceable.

Token budgets that cascade

Limits apply top-down - the gateway blocks before it gets expensive

Organization Monthly total budget - stops everything when reached
Team / cost center Own limits per department, project or customer
User Scoped per person - models, budget, rate limits
Key & agent Own budget per API key, automation or AI agent

One gateway in front of all AI desktop tools

Put OpenCode, Claude Code, Codex, Cowork and VS Code behind the gateway - centrally controlled, budgeted and assigned to a cost center. Ready-made setups, configured in minutes.

OpenCode

Terminal coding agent. Bundle providers, models, aliases and agents into one opencode.json.

Claude Code

Anthropic's agentic CLI. Map gateway models onto Claude's role slots (main, background, opus, sonnet, haiku).

Codex

OpenAI Codex CLI. Pick one Responses-API-capable model to drive the agent.

Cowork

Cowork desktop agent. Auto-discover models, choose the auth scheme and ship macOS / Windows config.

VS Code · Continue

Continue.dev extension. Map models onto chat, edit, apply and autocomplete roles.

VS Code · Copilot

GitHub Copilot custom (BYOK) models. Choose the API type per model (chat / responses / messages).

Plus MCP server registry, agent-to-agent protocol (A2A) and passthrough targets for custom endpoints.

AI Gateway Playground - test a model, alias or router with one API key

Built-in Playground - test instantly

Try any model, alias and router right inside the gateway - with a concrete API key, so permissions, budget and cost center apply exactly as in production.

  • Chat, image and TTS mode in one interface
  • Set system prompt, temperature and max tokens live
  • Export as cURL and drop it straight into your app

Why not just LiteLLM or OpenRouter?

Many tools handle routing and model access. What matters in the enterprise is the control, governance and FinOps layer on top - exactly where the AI Gateway comes in.

What matters in the enterpriseLiteLLM / OpenRouter & co.innFactory AI Gateway
One control point in front of all AI tools - Claude Code, Codex, Cowork, VS Code, OpenCode & your own apps run through the gatewaywire up per tool yourself✓ ready-made setups, 1 endpoint
Bill costs per customer, project & cost center cleanly and re-bill themnot designed for it✓ cost centers + code profiles
Budget & limit per team, user, API key and AI agentcoarse / partial✓ cascading, hard stop limit
Access & identity per employee (SSO, role-based)limited✓ Entra ID / Keycloak / Cognito
Audit log, governance & EU AI Act conformityyour own effort✓ complete, GDPR, sovereign in DE
Stable under thousands of agent requests - without becoming a bottleneckdepends on setup✓ < 50 ms overhead, 700+ req/s
Operations, support & accountabilityself-hosted / DIY✓ managed from Germany + rollout

Deployment & Integration

Standalone or natively in CompanyGPT - run as a container in any cloud

Standalone product

Use the AI Gateway as the central hub for all AI access of your engineering teams, CI/CD pipelines and agents - entirely without CompanyGPT. One URL, one key, full FinOps transparency.

Natively integrated into CompanyGPT

The gateway is natively integrated into our CompanyGPT: customers who use both work with it seamlessly - the same budgets, cost centers and guardrails for chat, agents and addons.

Runs anywhere - simply as a container

You're not locked to a vendor: the gateway runs as a Docker container in your own environment.

Container (Docker)

A single, lean Docker image - quickly rolled out, easily scaled and runnable anywhere containers run.

Database

Stores configuration and usage data in Azure Cosmos DB or MongoDB - you choose what fits your cloud.

Identity / SSO

Connects to your identity provider: Keycloak, Microsoft Entra ID or AWS Cognito - role-based and SSO-ready.

Any cloud

Runs on AWS, STACKIT, Microsoft Azure and Google Cloud - in the region of your choice, also GDPR-compliant and sovereign in Germany.

We help with the rollout in your cloud

Together with innFactory GmbH - an official Microsoft Cloud Solution Provider (CSP) - we set up the gateway in your environment. No cloud account of your own? We provide the matching Azure subscription for your company - including billing, support and optional maintenance.

Frequently Asked Questions

Everything important about the AI Gateway, availability and AI FinOps

What is an AI gateway?

An AI gateway is a central proxy between your applications, tools and AI agents and the AI providers (OpenAI, Anthropic, Google and others). It bundles all AI access behind a single, OpenAI-compatible API and handles authentication, routing, budgets, guardrails and cost accounting - the central enterprise security and FinOps layer for AI in the company.

When does a company need an AI gateway?

As soon as more than one team, several tools or autonomous AI agents use AI. Without a central layer you then lack cost control, attribution and governance. An AI gateway prevents shadow AI, budget overruns and compliance risks - and makes AI costs traceable per team, user, agent and customer.

Is the AI Gateway an enterprise security layer?

Yes. Every AI request goes through the gateway, where it is authenticated (Microsoft Entra ID), checked by guardrails, authorized role-based and logged completely. Provider keys are kept securely in Azure Key Vault, data stays in the EU. This makes the gateway the central security and governance layer between your tools and the LLMs.

When will the AI Gateway be available?

Launch is planned for August 2026. Before that, we grant a limited number of beta accesses to selected partners. Join the beta list to be among the first and help shape the product.

What does the AI Gateway cost?

Pricing will be announced at launch in August 2026. During the beta phase we work closely with pilot customers. We're happy to discuss your scenario in a no-obligation demo.

Do I need CompanyGPT to use the gateway?

No. The AI Gateway is a standalone product for all AI access of your tools, pipelines and agents. If you already use CompanyGPT, it integrates seamlessly as the shared cost-control layer.

How is it different from LiteLLM, OpenRouter & Portkey?

LiteLLM (open-source proxy), OpenRouter (large model catalog) and Portkey (control plane) are strong tools for routing and access. Our gateway adds the governance and FinOps layer on top: real user scoping via Entra ID, hierarchical budgets, cost centers & code profiles, pricing in Euro (ai-prices.eu) and a Rust-built, fast engine - as a managed service from Germany, natively integrated with CompanyGPT.

Which providers and models are supported?

More than 9 provider adapters: OpenAI, Azure OpenAI, Anthropic Claude, Google Gemini, Mistral, AWS Bedrock, STACKIT, Ollama and custom OpenAI-compatible endpoints. Models are auto-fetched, prices come from ai-prices.eu.

How fast is the gateway really?

Thanks to the Rust implementation, the additional latency overhead is below 50 ms at P95, budget checks below 10 ms. A single instance handles over 700 requests per second - your agents barely notice the gateway.

Where is my data processed?

The gateway runs in your cloud or in sovereign German data centers (STACKIT). Prompts and data stay under your control, GDPR and EU AI Act compliant. API keys are stored only as a hash, provider credentials reside in Azure Key Vault.

What does AI FinOps mean in practice?

AI FinOps means managing AI costs like cloud costs: measure, attribute, budget and optimize. With cost centers, code profiles and cascading budgets, you attribute every token to a team, user, agent or customer.

AI Gateway for Your Enterprise

What you get:

  • One OpenAI-compatible API for 9+ AI providers
  • Built in Rust - P95 overhead below 50 ms
  • AI FinOps: cost centers, code profiles, token attribution
  • Cascading budgets for org, team, user, key & agent
  • User scoping via Microsoft Entra ID (unlike LiteLLM)
  • Guardrails, audit log, MCP registry & A2A protocol
  • Pricing in Euro based on ai-prices.eu
  • Standalone or integrated into CompanyGPT

Secure Beta Access

Available from
August 2026

Limited beta slots for pilot customers. Pricing at launch.

Help shape the product.

Join the beta list

Ready for AI Cost Control?

Schedule a conversation and learn how the AI Gateway makes your AI costs transparent and controllable.