Anyone bringing an AI agent into their company in 2026 eventually hits the same wall: the model is good, the tools work – but the knowledge the agent needs is scattered across Confluence, SharePoint, Notion, code comments, BigQuery schemas, and one senior colleague’s head. This is precisely the problem addressed by the Open Knowledge Format (OKF), which Google Cloud published as an open specification on 12 June 2026. In this article we explain what OKF is, the concrete value it delivers, and how to adopt it in your company.
The Problem OKF Solves: Context Assembly
Before an AI agent can do a task, it has to assemble context – the relevant knowledge from all the systems where it lives. Google Cloud calls this the context-assembly problem, and it is the invisible bulk of the work behind every agent project. The internal knowledge a model needs is fragmented across incompatible systems:
- metadata catalogs with proprietary APIs
- wikis, third-party systems, and shared drives
- code comments, docstrings, notebook cells
- tribal knowledge that is documented nowhere
The authors at Google Cloud put the consequence well: “Every agent builder is solving the same context-assembly problem from scratch, every catalog vendor is reinventing the same data models, and the knowledge itself is locked behind whichever surface created it.” What is missing is simply a shared format in which knowledge can be written once and read by any agent.
What Is the Open Knowledge Format?
OKF is a vendor-neutral, open standard that represents metadata, context, and curated knowledge so that both humans and AI agents can read it. The central design decision: OKF is a format, not another service. There is no account, no SDK, no platform to lock you in.
Version 0.1 was published by Sam McVeety and Amir Hormati (Tech Leads for Data Analytics and BigQuery at Google Cloud) on the Google Cloud Blog. The specification, three sample bundles, and two reference implementations are openly available on GitHub.
The principle fits in three phrases:
- Just Markdown – readable in any editor, rendered on GitHub, indexable by any search tool.
- Just files – shippable as a tarball, hostable in any git repository, mountable on any filesystem.
- Just YAML frontmatter – for the small set of structured fields that need to be queryable.
No binary format, no database, no lock-in. If you have ever written a Markdown file, you can read and write OKF.
How an OKF File Is Structured
Each concept – a table, a metric, a process, a policy – is exactly one Markdown file with two parts: a YAML frontmatter block and a free-form Markdown body. The only required field is type. Everything else is recommended but optional:
---
type: BigQuery Table
title: Orders
description: One row per completed customer order.
resource: https://console.cloud.google.com/bigquery?p=acme&d=sales&t=orders
tags: [sales, revenue]
timestamp: 2026-05-28T14:30:00Z
---
# Schema
...
# Examples
...
# Citations
...
The recommended fields are title, description, resource (a URI identifying the underlying asset), tags, and timestamp (ISO 8601). The spec explicitly allows any additional fields – consumers must preserve unknown fields and must not reject a bundle just because an optional field is missing or a type is unknown. This tolerance is deliberate: it makes OKF extensible without needing a central registry.
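To make the tolerance rule concrete, here is a minimal sketch of a consumer in Python. It is not one of the reference implementations: the function names are our own, the `owner` field in the usage example is a made-up extra field, and the parser only understands flat `key: value` lines and inline lists. A real consumer would use a full YAML parser.

```python
from typing import Any


def split_frontmatter(text: str) -> tuple[dict[str, Any], str]:
    """Split an OKF document into (frontmatter fields, Markdown body).

    Deliberately minimal: handles only flat `key: value` lines and inline
    lists such as `[a, b]`. Nested YAML is out of scope for this sketch.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("document does not start with a frontmatter block")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter block is not closed") from None

    fields: dict[str, Any] = {}
    for line in lines[1:end]:
        # Split on the first colon only, so URLs and timestamps stay intact.
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue  # tolerate lines this sketch cannot interpret
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            fields[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            fields[key.strip()] = value
    return fields, "\n".join(lines[end + 1:])


def load_concept(text: str) -> tuple[dict[str, Any], str]:
    """Enforce the one required field and keep every other field as-is."""
    fields, body = split_frontmatter(text)
    if not fields.get("type"):
        raise ValueError("missing required field: type")
    # No whitelist: unknown fields and unknown types pass through untouched.
    return fields, body
```

A usage example, with `owner` standing in for any field the spec does not name:

```python
doc = """---
type: BigQuery Table
title: Orders
tags: [sales, revenue]
owner: data-platform
---
# Schema
...
"""
fields, body = load_concept(doc)
print(fields["owner"])  # the unknown field survives the round trip
```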
A bundle is simply a directory of these files. Two filenames are reserved: index.md lists a folder’s contents (for progressive disclosure, so an agent can explore the hierarchy without opening every file), and log.md keeps the chronological change history.
sales/
├── index.md
├── datasets/
│   ├── index.md
│   └── orders_db.md
├── tables/
│   ├── index.md
│   ├── orders.md
│   └── customers.md
└── metrics/
    ├── index.md
    └── weekly_active_users.md
Concepts link to each other with ordinary Markdown links. From the flat directory emerges a knowledge graph richer than the folder structure – the relationship itself lives in the prose, not in a rigid schema.
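How such a graph can be recovered from a bundle is easy to sketch. The following Python example is our own illustration, not part of the OKF tooling. It assumes that links between concepts are relative paths ending in `.md`, treats index.md and log.md as reserved files rather than concepts, and uses a simple regular expression instead of a full Markdown parser.

```python
import re
from pathlib import Path

RESERVED = {"index.md", "log.md"}  # reserved filenames, not concepts
# Matches the target of an inline Markdown link: [text](target)
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def concept_files(bundle: Path) -> list[Path]:
    """All Markdown files in the bundle except the reserved ones."""
    return sorted(p for p in bundle.rglob("*.md") if p.name not in RESERVED)


def link_graph(bundle: Path) -> dict[str, set[str]]:
    """Map each concept file to the bundle-internal files it links to."""
    root = bundle.resolve()
    graph: dict[str, set[str]] = {}
    for path in concept_files(root):
        targets: set[str] = set()
        for href in LINK.findall(path.read_text(encoding="utf-8")):
            href = href.split("#", 1)[0]  # drop in-page anchors
            if "://" in href or not href.endswith(".md"):
                continue  # skip external URLs and non-Markdown targets
            target = (path.parent / href).resolve()
            # Keep only links that point at an existing file inside the bundle.
            if target.is_file() and root in target.parents:
                targets.add(target.relative_to(root).as_posix())
        graph[path.relative_to(root).as_posix()] = targets
    return graph
```

Calling `link_graph(Path("sales"))` on the bundle above would return one entry per concept file, each with the set of concepts it references. The edges come from the prose, so a metric that mentions two tables is connected to both, regardless of which folder it sits in.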
The Value OKF Delivers
The real value of OKF is not a single feature but the consequences of “just Markdown, just files”:
Portable. Knowledge survives the move between systems, teams, and tools. If you switch from one catalog vendor to the next, your OKF bundles simply come along. No export-import drama, no proprietary format.
Version-controlled. OKF lives in git – alongside the code it describes. Every change to knowledge is a commit: traceable, reviewable, revertible. Knowledge management finally gets the same discipline as software development.
Dual-readable. The same file is read by a human in an editor and by an agent as context – with no translation layer. What your team maintains is what the agent sees. What the agent uses, your team can review.
Interoperable. A bundle written by team A can be consumed by agent B without anyone building an integration. Producer and consumer are cleanly separated; the tooling on both sides is swappable.
No vendor lock-in. OKF is tied to no cloud provider, no database, no model, and no agent framework. That is the decisive difference from proprietary knowledge platforms – and for us at innFactory, a central argument.
The Google blog post quotes Andrej Karpathy on why language models in particular can maintain this kind of file-based knowledge: “LLMs don’t get bored, don’t forget to update a cross-reference, and can touch 15 files in one pass. The bookkeeping that causes humans to abandon personal wikis is exactly what LLMs are good at.”
OKF in Context: MCP, A2A, and agenticweb.md
OKF replaces none of the existing agent protocols – it complements them. Anyone who has read our Agentic AI whitepaper knows the map: the Model Context Protocol (MCP) governs how an agent accesses tools and data sources. A2A governs how agents communicate with each other. OKF answers the third question: what does the agent actually know – and in what format does that knowledge live?
Put differently: MCP is the socket, OKF is the content that flows through it. An MCP server can expose an OKF bundle as a knowledge source; an agent pulls from it exactly the context it needs for its task.
For us this is no surprise but confirmation of a trend we addressed early. With agenticweb.md we already described Markdown as the natural format in which websites hand their knowledge to AI agents. OKF carries the same idea inward – to knowledge inside the organization. Markdown is establishing itself as the shared language between humans and agents, out on the web as much as inside the company.
How to Adopt OKF in Your Company
OKF is deliberately low-barrier. Google Cloud ships the specification not as a dry document but with working tools:
- Enrichment Agent – a reference agent that walks a BigQuery dataset, drafts OKF documents for tables and views, and enriches them via an LLM pass with descriptions, schemas, join paths, and citations. This produces an initial body of knowledge automatically, so you do not have to write every file by hand.
- Static HTML Visualizer – turns any OKF bundle into an interactive graph view. A single self-contained HTML file, no backend, no data leaves the browser.
- Three sample bundles (GA4 e-commerce, Stack Overflow, Bitcoin public datasets) you can use to study the structure directly.
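To give a feel for the drafting step, here is a small Python sketch that renders one OKF document from a table description. It is not the Enrichment Agent and makes no BigQuery or LLM calls. The function name, its arguments, and the layout of the schema table are our own assumptions; in practice the inputs would come from your catalog or warehouse metadata.

```python
import json
from datetime import datetime, timezone


def draft_table_concept(title, description, resource, columns, tags=(), now=None):
    """Render a draft OKF document for one table as a string.

    `columns` is a list of (name, type, description) tuples. The argument
    names and the "# Schema" table layout are assumptions of this sketch.
    """
    # Normalise to UTC so the trailing "Z" in the timestamp is accurate.
    # Pass a timezone-aware datetime if you supply `now` yourself.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # json.dumps yields double-quoted strings and [..] lists, which YAML
    # parsers also accept, so values containing ":" stay unambiguous.
    frontmatter = [
        "---",
        f"type: {json.dumps('BigQuery Table')}",
        f"title: {json.dumps(title)}",
        f"description: {json.dumps(description)}",
        f"resource: {json.dumps(resource)}",
        f"tags: {json.dumps(list(tags))}",
        f"timestamp: {now.strftime('%Y-%m-%dT%H:%M:%SZ')}",
        "---",
    ]
    rows = [f"| {name} | {col_type} | {text} |" for name, col_type, text in columns]
    body = ["# Schema", "", "| Column | Type | Description |", "| --- | --- | --- |", *rows]
    return "\n".join(frontmatter + body) + "\n"
```

A draft for the orders table could then be produced like this, with the column list being invented for illustration:

```python
print(draft_table_concept(
    title="Orders",
    description="One row per completed customer order.",
    resource="https://console.cloud.google.com/bigquery?p=acme&d=sales&t=orders",
    columns=[("order_id", "STRING", "Unique order identifier.")],
    tags=["sales", "revenue"],
))
```

The output is a starting point for review, not a finished concept: descriptions, join paths, and citations still need a human or an LLM pass.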
For a concrete start in your own company, we recommend this path:
- Start small. Pick a single, clearly bounded knowledge domain – your most important dataset, a core process, or a department’s central metrics.
- Generate or write the bundle. For data sources, use the enrichment approach; for processes and policies, write the concepts directly as Markdown. One concept per file, cleanly linked.
- Version it in git. Place the bundle next to the related code or in a dedicated knowledge repository. From now on, maintenance runs through pull requests and reviews.
- Connect it to the agent. Expose the bundle through an MCP server or directly as file context. Your agent – or your CompanyGPT instance with companyRAG – uses it as a curated, reviewed knowledge base instead of a raw document dump.
- Iterate. Extend the bundle domain by domain. Because the format is tolerant, it grows with you without forcing a full remodel.
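The "directly as file context" route from the steps above can be sketched in a few lines of Python. This is an illustration under our own assumptions, not a prescribed method: concepts are selected by tag, tags are written as an inline list (`tags: [a, b]`), and the character budget and the HTML comment that marks each source file are choices made for this example. The same function could sit behind an MCP server, which we leave out here.

```python
import re
from pathlib import Path

RESERVED = {"index.md", "log.md"}  # reserved filenames, not concepts
TAGS_LINE = re.compile(r"^tags:\s*\[(.*)\]\s*$", re.MULTILINE)


def tags_of(text: str) -> set[str]:
    """Read an inline `tags: [a, b]` list from the frontmatter, if present."""
    if not text.startswith("---"):
        return set()
    end = text.find("\n---", 3)  # closing frontmatter delimiter
    if end == -1:
        return set()
    match = TAGS_LINE.search(text[:end])
    if not match:
        return set()
    return {t.strip() for t in match.group(1).split(",") if t.strip()}


def assemble_context(bundle: Path, wanted: set[str], max_chars: int = 20_000) -> str:
    """Concatenate all concepts that carry at least one wanted tag."""
    parts: list[str] = []
    used = 0
    for path in sorted(bundle.rglob("*.md")):
        if path.name in RESERVED:
            continue
        text = path.read_text(encoding="utf-8")
        if not (tags_of(text) & wanted):
            continue
        # Label every chunk with its file so answers can be traced back.
        chunk = f"<!-- source: {path.relative_to(bundle).as_posix()} -->\n{text.strip()}\n"
        if used + len(chunk) > max_chars:
            break  # stop before exceeding the context budget
        parts.append(chunk)
        used += len(chunk)
    return "\n".join(parts)
```

For example, `assemble_context(Path("sales"), {"revenue"})` would return the text of every concept tagged revenue, ready to be placed in the agent's prompt. Selecting by tag is only one option; following index.md files or links between concepts would work just as well.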
OKF, Data Sovereignty, and GDPR
That OKF is a format and not a platform has a direct consequence for compliance and sovereignty: the format itself never forces your knowledge out of your control. The files live in your git, on your infrastructure, in your jurisdiction. There is no vendor who must read along, no US cloud account through which your internal definitions have to pass. This is exactly what makes OKF a good fit for a sovereign, GDPR-compliant AI stack like the one we build at innFactory.
If you keep knowledge version-controlled in git anyway and connect it to agents through a self-hosted platform, you retain full authority – and can prove at any time which knowledge an agent used and when. For regulated industries, this traceability is not a nice-to-have but a prerequisite.
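What such proof could look like is sketched below in Python. The record layout, the function name, and the `run_id` parameter are our own assumptions and not part of OKF. The idea is simply to store, for every agent run, a timestamp and a content hash of each concept file that was handed to the agent.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def usage_record(bundle: Path, used: list[str], run_id: str, now=None) -> str:
    """Return one JSON line recording which concept files a run received.

    `used` holds paths relative to the bundle root. `run_id` is whatever
    identifier your agent platform assigns to a run (hypothetical here).
    """
    # Pass a timezone-aware datetime if you supply `now` yourself.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    files = {
        # The SHA-256 of the file content pins the exact version that was used.
        rel: hashlib.sha256((bundle / rel).read_bytes()).hexdigest()
        for rel in sorted(used)
    }
    record = {
        "run_id": run_id,
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "files": files,
    }
    return json.dumps(record, sort_keys=True)
```

Appending each record to a log gives you an audit trail that can later be compared against the git history. If the bundle lives in git, storing the commit hash from `git rev-parse HEAD` alongside the record is a natural addition.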
Conclusion and Recommendation
The Open Knowledge Format is not a loud product launch but a quiet, smart infrastructure decision. It solves a problem every agent project has, with the simplest conceivable means: Markdown, files, a little YAML. The value is portability, version control, interoperability and – decisively for European companies – independence from any single vendor.
Our recommendation: do not wait for version 1.0. OKF v0.1 is deliberately minimal and precisely for that reason usable today. Pick a knowledge domain, generate a first bundle, put it in git – and see how much better your agent works with curated rather than raw knowledge. The effort is small, the learning large.
Want to put Agentic AI on a clean knowledge base – GDPR-compliant and without lock-in? innFactory’s AI strategy consulting helps you combine OKF, MCP, and a sovereign AI stack into a viable concept. Talk to us – in an initial call we identify where the context-assembly problem costs your company the most and how OKF helps there concretely.
Frequently Asked Questions About the Open Knowledge Format (OKF)
What is the Open Knowledge Format (OKF)? OKF is an open, vendor-neutral standard from Google Cloud (published 12 June 2026) that represents organizational knowledge as a directory of Markdown files with YAML frontmatter – readable by humans and AI agents alike.
Which field is required in an OKF file? Only the type field. Recommended but optional are title, description, resource, tags, and timestamp. Consumers must handle unknown fields and types tolerantly.
Does OKF replace the Model Context Protocol (MCP)? No. MCP governs an agent’s access to tools and data, while OKF describes the knowledge itself. They complement each other: an MCP server can expose an OKF bundle as a knowledge source.
Can OKF be used in a GDPR-compliant way? Yes. Because OKF is a pure file format with no vendor binding, your knowledge files stay in your git, on your infrastructure, and in your jurisdiction. This makes OKF a strong building block for a sovereign AI stack.
How do I get started with OKF? Pick a single knowledge domain, generate or write a bundle of Markdown concepts, version it in git, and connect it to your agent via MCP or as file context. Google Cloud provides an enrichment agent, a visualizer, and three sample bundles on GitHub as a starting point.
