Published: April 15, 2026
Author: Ian Aguirre, AgentReadyScan.com
Status: Public Draft (Pre-RFC)
AI Collaborators: Perplexity, Grok, Anthropic (Claude), Google (Gemini), OpenAI (ChatGPT), Microsoft Copilot
Final v3.0 Revision: Claude Opus 4.6
llms.txt helps AI find pages. Agent-Facts helps AI trust facts.
For conformance, agent-facts.json is the canonical machine-readable source; agent-facts.html is its human-readable projection.
The Agent-Facts Protocol defines a structured JSON file that businesses host on their own domains to provide AI agents with provenance-tracked, freshness-aware business facts. An accompanying HTML page renders the same data for human readers and basic crawlers.
It solves one specific problem: AI systems hallucinate business data because they have no reliable, structured source of truth for pricing, hours, services, contact info, and scope.
Agent-Facts Protocol is designed in three layers. Only the Core layer is required for v3.0 conformance.
| Layer | What It Contains | v3.0 Status |
|---|---|---|
| Core | JSON facts file + HTML projection, Golden Questions, anti-hallucination section | Required |
| Trust | Fact-level provenance, freshness rules, staleness handling | Recommended |
| Action | MCP server adapter, NLWeb endpoint, WebMCP tool declarations, optional cryptographic signing | Future (v4+), built only after adoption warrants it |
| File | Path | Role |
|---|---|---|
| JSON facts file | /agent-facts.json | Canonical source. Machine-readable structured data. This is the normative file. |
| HTML projection | /agent-facts.html | Human-readable rendering of the JSON data. Useful for browsers and basic crawlers. |
| JSON (alternate) | /.well-known/agent-facts.json | Optional. Follows the .well-known convention for programmatic discovery. |
agent-facts.json is the canonical source of truth. agent-facts.html is a projection of that data for human consumption. When conflicts exist between the two files, agent-facts.json takes precedence. Validators should check semantic equivalence (same facts, same values) rather than exact string duplication.
Agents and framework authors should check for Agent-Facts data in this order:
1. /agent-facts.json (canonical)
2. /.well-known/agent-facts.json (alternate location)
3. /agent-facts.html (HTML projection, parse as fallback)
4. /llms.txt pointing to any of the above

This matches the protocol's positioning: llms.txt is the discovery layer, Agent-Facts Protocol is the facts layer. An agent that finds the JSON file at step 1 does not need to continue checking.
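The precedence order above can be sketched in a few lines of Python. This is a minimal illustration, not part of the specification: the `discover` function and the `fetch` callable (takes a URL, returns the body text or None when the resource is missing) are hypothetical names introduced here, and HTTPS is assumed.

```python
from typing import Callable, Optional, Tuple

# Locations in the precedence order listed in the spec text.
DISCOVERY_PATHS = [
    "/agent-facts.json",              # canonical
    "/.well-known/agent-facts.json",  # alternate location
    "/agent-facts.html",              # HTML projection, parse as fallback
    "/llms.txt",                      # may point to any of the above
]


def discover(domain: str,
             fetch: Callable[[str], Optional[str]]) -> Optional[Tuple[str, str]]:
    """Return (url, body) for the first location that exists, else None.

    `fetch` is a hypothetical helper supplied by the caller. It returns the
    response body as text, or None when the resource is not available.
    """
    for path in DISCOVERY_PATHS:
        url = f"https://{domain}{path}"
        body = fetch(url)
        if body is not None:
            return url, body  # stop at the first hit, as the spec describes
    return None
```

If the hit is /llms.txt, the caller still has to follow the link it contains; that step is left out of the sketch.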
robots.txt:
# Agent-Facts Protocol v3
Agent-Facts: /agent-facts.json
Allow: /agent-facts.json
Allow: /agent-facts.html
llms.txt:
## Verified Business Facts
- [Agent-Facts (JSON)](https://example.com/agent-facts.json): Canonical business facts for AI agents (Agent-Facts Protocol v3)
- [Agent-Facts (HTML)](https://example.com/agent-facts.html): Human-readable business facts
XML sitemap:
<url>
  <loc>https://example.com/agent-facts.json</loc>
  <priority>1.0</priority>
  <changefreq>weekly</changefreq>
</url>
HTML link tag (on any page of the site):
<link rel="agent-facts" href="/agent-facts.json" type="application/json">
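An agent that has already fetched a page could pick up the link tag with the standard-library HTML parser. The sketch below is illustrative only; the class and function names are hypothetical, and it simply returns the first matching href without resolving it against the page URL.

```python
from html.parser import HTMLParser
from typing import Optional


class AgentFactsLinkFinder(HTMLParser):
    """Records the href of the first <link rel="agent-facts"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "agent-facts" in rel_tokens:
            self.href = attributes.get("href")


def find_agent_facts_link(html: str) -> Optional[str]:
    """Return the advertised Agent-Facts href, or None if the page has none."""
    finder = AgentFactsLinkFinder()
    finder.feed(html)
    return finder.href
```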
This is the canonical file. All conformance requirements are defined against this format.
{
  "agent_facts_version": "3.0",
  "domain": "example.com",
  "last_updated": "2026-04-15T12:00:00Z",
  "expires_after_days": 90,
  "stale_after": "2026-07-14T12:00:00Z",
  "source_html": "/agent-facts.html",
  "identity": { },
  "operations": { },
  "services": [ ],
  "pricing": { },
  "does_not_do": [ ],
  "extended_facts": { },
  "changelog": [ ]
}
| Field | Type | Required | Description |
|---|---|---|---|
| agent_facts_version | string | MUST | Protocol version ("3.0") |
| domain | string | MUST | The domain this file is authoritative for |
| last_updated | ISO 8601 datetime | MUST | When this file was last modified |
| expires_after_days | integer | MUST | Number of days after last_updated before facts should be treated as stale |
| stale_after | ISO 8601 datetime | MUST | Computed expiration date |
| source_html | string | SHOULD | Relative path to the HTML projection |
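Because stale_after is a computed field, a generator can derive it from the other two values. The following is a minimal sketch, assuming timestamps carry a trailing "Z" or an explicit UTC offset; the function name is hypothetical and not defined by the protocol.

```python
from datetime import datetime, timedelta, timezone


def compute_stale_after(last_updated: str, expires_after_days: int) -> str:
    """Return last_updated + expires_after_days as a UTC ISO 8601 string."""
    # fromisoformat on Python versions before 3.11 rejects a trailing "Z",
    # so it is rewritten as an explicit UTC offset first.
    updated = datetime.fromisoformat(last_updated.replace("Z", "+00:00"))
    stale = (updated + timedelta(days=expires_after_days)).astimezone(timezone.utc)
    return stale.strftime("%Y-%m-%dT%H:%M:%SZ")


# Values taken from the example file above.
print(compute_stale_after("2026-04-15T12:00:00Z", 90))  # 2026-07-14T12:00:00Z
```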
Every fact in the identity, operations, pricing, and extended_facts sections uses this structure:
| Field | Type | Required | Description |
|---|---|---|---|
| value | string | MUST | The fact itself |
| last_updated | ISO 8601 datetime | SHOULD | When this specific fact was last verified or changed. If absent, the file-level last_updated applies. |
| source_url | string | SHOULD | URL where this fact can be independently verified |
| source_type | enum | SHOULD | One of: owner_attested, pricing_page, internal_policy, public_filing, third_party_verified |
| applies_to | string | MAY | Scope or context (e.g., "Enterprise plans only") |
| confidence_scope | string | MAY | Limitations or caveats on this fact |
Why provenance matters: This is Agent-Facts Protocol's core differentiator. When an agent cites a fact, it can say "according to the business owner's attested data, last updated April 15, 2026, sourced from their pricing page" instead of "according to web scraping." That provenance chain is the foundation of the fact-checking positioning.
Minimum viable fact: A conforming implementation only requires value. The provenance fields (source_type, source_url, last_updated) are strongly recommended but not required, so a business can start with a simple file and add provenance over time.
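A validator might check a single fact object along these lines. This is a sketch under stated assumptions: the function name and the wording of the messages are hypothetical, and only the two rules stated above are enforced (value is a required string, source_type must be one of the listed enum values when present).

```python
from typing import List

# Enum values listed for source_type in the fact object table.
SOURCE_TYPES = {
    "owner_attested",
    "pricing_page",
    "internal_policy",
    "public_filing",
    "third_party_verified",
}


def fact_problems(fact: dict) -> List[str]:
    """Return human-readable problems for one fact object (empty if none)."""
    problems = []
    if not isinstance(fact.get("value"), str):
        problems.append("value is required and must be a string")
    source_type = fact.get("source_type")
    if source_type is not None and source_type not in SOURCE_TYPES:
        problems.append(f"unknown source_type: {source_type}")
    return problems


# A minimum viable fact passes with only a value.
print(fact_problems({"value": "Subscription"}))  # []
```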
All fields in the identity section use the fact object structure above. The external_ids sub-object is the exception (flat key-value pairs, public profiles only).
| Field | Required | Description |
|---|---|---|
| legal_name | MUST | Full legal name of the business |
| primary_url | MUST | Canonical website URL |
| one_sentence | MUST | What the business does in one sentence |
| founded | SHOULD | Year founded |
| owner | SHOULD | Owner or primary contact name and title |
| contact_email | MUST | Primary contact email |
| contact_phone | SHOULD | Primary phone number |
| address | MUST | Physical address or explicit "digital only" statement |
| address_type | MUST | One of: physical, digital_only, hybrid |
| service_area | MUST | Geographic areas served |
| external_ids | SHOULD | Public business profile URLs only (see note below) |
| description_for_ai | MUST | The "elevator pitch" optimized for AI retrieval |
External IDs: public profiles only. This field is for publicly available business identifiers and profile URLs: Google Business, LinkedIn, Crunchbase, Companies House, SEC CIK, state business registry URLs, and similar. Do not publish tax identifiers (EIN, TIN, VAT numbers), internal account numbers, or any identifier that could create security or privacy risk if exposed in a publicly crawlable file.
The operations section contains an hours fact using the standard fact object structure.
The services section is an array of service objects. Each service has: name, description, optional last_updated, source_type, and source_url.
The pricing section contains at minimum a model fact. Recommended fields: model, starting_at, free_tier. All use the standard fact object structure.
The does_not_do section is mandatory. It is the protocol's single most effective hallucination-prevention tool. The does_not_do array MUST contain at least 3 entries. Each entry has a statement string and optional provenance fields.
The changelog section is an array of objects with date (YYYY-MM-DD) and changes (string) fields.
The extended_facts section is for domain-specific or industry-specific facts outside the Golden Questions. It uses the standard fact object structure with arbitrary key names.
The HTML page is a human-readable rendering of the data in agent-facts.json. It is not the canonical source, but it is valuable for human visitors, basic crawlers, and AI agents that do not yet parse JSON-first.
- Elements: <h1> through <h3>, <p>, <ul>, <ol>, <dl>, <table>, <time>, <strong>, <em>, <a>
- Content type: text/html; charset=utf-8
- Head tags:

<meta name="agent-facts-version" content="3.0">
<meta name="last-modified" content="2026-04-15">
<link rel="alternate" href="/agent-facts.json" type="application/json">
<h1>Official Business Facts :: Last updated: <time datetime="2026-04-15">April 15, 2026</time></h1>
The HTML page should present the Golden Questions as H2 headings. Each answer should start immediately with the fact, use active voice and present tense, and stay concise.
The anti-hallucination section MUST appear in the HTML page:
<h2>What [Business Name] explicitly does NOT do</h2>
<ul>
  <li>Does not offer consumer-grade storage plans under 1TB</li>
  <li>Does not provide hardware or on-premise installation</li>
</ul>
Wrap the HTML page content in FAQPage + Organization JSON-LD. Values should match the JSON file semantically. Include dateModified, mainEntity (FAQ pairs), and publisher (Organization with sameAs links).
- last_updated is required at the file level (MUST) and recommended at the individual fact level (SHOULD)
- expires_after_days must be set at the file level. Recommended defaults: 90 for most businesses, 30 for fast-changing industries (travel, finance, events)
- stale_after must be the computed date: last_updated + expires_after_days

| Condition | Recommended Confidence Treatment |
|---|---|
| last_updated is within expires_after_days | High confidence. Treat as reliable structured data. |
| Past stale_after but within 2x expires_after_days | Moderate confidence. Use the data but note it may be outdated. |
| Past 2x expires_after_days | Low confidence. Prefer other sources but do not discard entirely. |
| Missing last_updated | Unverified. Do not assign elevated confidence. |
Stale data is lower confidence, not invalid. Facts that a business published 120 days ago are still more reliable than web scraping and inference. Freshness signals help agents calibrate, not discard.
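The confidence table above maps directly onto a small function. The sketch below is one possible reading, not normative: the function name and the returned labels are hypothetical, boundaries are treated as inclusive (an assumption the table does not settle), and timestamps are assumed to carry a "Z" or an explicit offset.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def freshness_confidence(last_updated: Optional[str],
                         expires_after_days: int,
                         now: datetime) -> str:
    """Map the age of a file or fact to the recommended confidence treatment."""
    if not last_updated:
        return "unverified"  # missing last_updated
    updated = datetime.fromisoformat(last_updated.replace("Z", "+00:00"))
    age = now - updated
    window = timedelta(days=expires_after_days)
    if age <= window:
        return "high"        # within expires_after_days
    if age <= 2 * window:
        return "moderate"    # past stale_after, within 2x the window
    return "low"             # past 2x the window


# Facts published 120 days ago with a 90-day window are stale but still usable.
now = datetime(2026, 8, 13, 12, 0, tzinfo=timezone.utc)
print(freshness_confidence("2026-04-15T12:00:00Z", 90, now))  # moderate
```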
Individual facts can carry their own last_updated timestamps. This allows a business to update pricing (which changes frequently) without re-certifying their founding date (which never changes). Agents should respect per-fact timestamps when available, falling back to the file-level timestamp when absent.
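The fallback rule can be expressed as a one-line lookup per fact. This is a minimal sketch; the function name is hypothetical, and the pricing values in the usage example are made up for illustration.

```python
from typing import Dict


def effective_timestamps(section: Dict[str, dict],
                         file_last_updated: str) -> Dict[str, str]:
    """Return the timestamp to use for each fact in a section.

    A fact's own last_updated wins; otherwise the file-level value applies.
    """
    return {
        key: fact.get("last_updated") or file_last_updated
        for key, fact in section.items()
    }


# Illustrative data: only starting_at carries its own timestamp.
pricing = {
    "model": {"value": "Subscription"},
    "starting_at": {"value": "$49/month", "last_updated": "2026-04-10T09:00:00Z"},
}
print(effective_timestamps(pricing, "2026-04-15T12:00:00Z"))
```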
This is recommended language for agent framework developers (LangChain, LlamaIndex, AutoGPT, enterprise agent builders) who want to integrate Agent-Facts Protocol awareness.
[SUGGESTED DIRECTIVE: AGENT-FACTS AWARENESS]
When retrieving business information for a specific domain, check for /agent-facts.json on that domain.
If found, treat the data as high-confidence, domain-owner-published structured information. Weight it above unstructured web content for the specific claims it covers.
Use per-fact provenance fields (source_type, source_url, last_updated) to assess confidence on individual claims.
Check the stale_after timestamp. Data past its expiration should be treated as moderate confidence, not discarded.
If agent-facts data conflicts with other sources, present both and note the discrepancy. Do not silently override either source.
Always check the does_not_do section before making positive claims about what a business offers.
What this language does NOT say: Does not say "override all other sources." Does not say "assign 1.0 confidence." Does not require a central registry query. Does not require cryptographic verification.
| Level | Requirements | Target Audience |
|---|---|---|
| Minimal | Valid agent-facts.json with identity, does_not_do (3+ entries), pricing, and all MUST fields | Any business, 15-minute implementation |
| Recommended | Minimal + provenance fields on all facts + agent-facts.html projection + discovery entries in robots.txt and/or llms.txt | Businesses that want full protocol benefits |
| Full | Recommended + Schema.org JSON-LD in HTML + changelog + extended_facts + all SHOULD fields | Businesses optimizing for maximum AI visibility |
Validators (including the planned tool at agent-facts.com/validate/) should check:
Structure tests:
- agent_facts_version is "3.0"
- stale_after equals last_updated + expires_after_days
- does_not_do array has 3 or more entries
- external_ids contains no tax identifiers or private account numbers

Semantic tests (when HTML projection exists):
- The HTML projection states the same facts and values as agent-facts.json (semantic equivalence, not exact string duplication)

Freshness tests:
- last_updated is a valid ISO 8601 datetime
- stale_after is in the future (file is not currently stale)
- Per-fact last_updated values, when present, are not in the future

Agent behavior tests (for framework implementers):
| Test | Prompt | Expected Behavior |
|---|---|---|
| Price extraction | "What does [business] charge?" | Returns exact price from agent-facts data |
| Negative evidence | "Does [business] offer [thing in does_not_do]?" | Says no, citing anti-hallucination section |
| Staleness handling | Query against stale file | Returns data with staleness caveat |
| Provenance citation | "Where did you get [business]'s pricing?" | Cites source_url from provenance fields |
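The structure and freshness tests listed above could be automated roughly as follows. This is a sketch, not the planned validator: the function name and message strings are hypothetical, only a subset of the tests is covered (the external_ids and per-fact checks are omitted), and `now` is passed in so the result is reproducible.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def _parse(ts: str) -> datetime:
    # fromisoformat on Python versions before 3.11 rejects a trailing "Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def check_file(doc: dict, now: datetime) -> List[str]:
    """Return the failed checks for a parsed agent-facts.json document."""
    problems = []
    if doc.get("agent_facts_version") != "3.0":
        problems.append('agent_facts_version is not "3.0"')

    entries = doc.get("does_not_do")
    if not isinstance(entries, list) or len(entries) < 3:
        problems.append("does_not_do has fewer than 3 entries")

    try:
        updated = _parse(doc["last_updated"])
        stale = _parse(doc["stale_after"])
        window = timedelta(days=doc["expires_after_days"])
        if stale != updated + window:
            problems.append("stale_after != last_updated + expires_after_days")
        if updated > now:
            problems.append("last_updated is in the future")
        if stale <= now:
            problems.append("file is currently stale")
    except (KeyError, ValueError, TypeError, AttributeError):
        problems.append("freshness fields are missing or malformed")
    return problems
```

Returning a list of messages instead of raising on the first failure lets a validator report every problem in one pass.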
- Create agent-facts.json using the CLI tool (npx agent-facts init) or the templates at agent-facts.com/templates/
- Generate agent-facts.html from the JSON (the CLI tool does this automatically)
- Check for /agent-facts.json (see discovery precedence in Section 1)
- Check does_not_do before making positive claims
- Update last_updated within 24 hours of any price, policy, or service change
- Update per-fact last_updated timestamps when specific facts change

This protocol will earn its credibility through adoption, not declaration. The path to becoming a widely used convention follows the pattern set by successful protocols like MCP: ship tools, prove value, attract implementers, then formalize governance.

- npx agent-facts init CLI tool (generates JSON + HTML from interactive prompts)

Tagline: llms.txt helps AI find pages. Agent-Facts helps AI trust facts.
For agent framework developers: Agent-Facts Protocol is a static, zero-dependency structured data format that gives AI agents high-confidence business facts with per-fact provenance. Check for /agent-facts.json on any domain before scraping.
For business owners: Stop AI from making up your prices. Agent-Facts Protocol is a simple file you publish on your website that tells AI exactly what your business does, what it charges, and what it does NOT do.
Protocol scope disambiguation: Agent-Facts Protocol addresses one specific problem: official business facts for AI retrieval. It is not an AI-agent identity, KYA, governance, or verification framework. Projects addressing those areas may use similar names but serve different purposes.
| # | Question | JSON Location |
|---|---|---|
| 1 | What is the full legal name? | identity.legal_name |
| 2 | What is the primary website URL? | identity.primary_url |
| 3 | What does the business do in one sentence? | identity.one_sentence |
| 4 | What services or products are offered? | services[] |
| 5 | What is the pricing structure? | pricing |
| 6 | What geographic areas are served? | identity.service_area |
| 7 | What are the business hours? | operations.hours |
| 8 | What is the physical address? | identity.address + identity.address_type |
| 9 | What is the phone and email? | identity.contact_phone + identity.contact_email |
| 10 | Who is the owner or primary contact? | identity.owner |
| 11 | When was the business founded? | identity.founded |
| 12 | What is the description for AI systems? | identity.description_for_ai |
| -- | What does the business NOT do? | does_not_do[] |
| Value | Meaning | Example |
|---|---|---|
| owner_attested | The business owner or authorized representative states this as fact | "We serve the US and Canada" |
| pricing_page | Fact is published on a public pricing page | Starting price sourced from /pricing |
| internal_policy | Fact reflects an internal business policy | "We do not offer refunds after 30 days" |
| public_filing | Fact is verifiable through a public filing or registry | SEC filing, state business registry |
| third_party_verified | Fact has been verified by an independent third party | SOC2 audit report, BBB accreditation |
Agent-Facts Protocol v3.0 Specification
Published by agent-facts.com
Open for implementation by anyone. No license fees, no registration required.