Agent-Facts Protocol v3.0 Specification

Published: April 15, 2026
Author: Ian Aguirre, AgentReadyScan.com
Status: Public Draft (Pre-RFC)
AI Collaborators: Perplexity, Grok, Anthropic (Claude), Google (Gemini), OpenAI (ChatGPT), Microsoft Copilot
Final v3.0 Revision: Claude Opus 4.6

llms.txt helps AI find pages. Agent-Facts helps AI trust facts.

For conformance, agent-facts.json is the canonical machine-readable source; agent-facts.html is its human-readable projection.

0. Purpose and Scope

The Agent-Facts Protocol defines a structured JSON file that businesses host on their own domains to provide AI agents with provenance-tracked, freshness-aware business facts. An accompanying HTML page renders the same data for human readers and basic crawlers.

It solves one specific problem: AI systems hallucinate business data because they have no reliable, structured source of truth for pricing, hours, services, contact info, and scope.

What Agent-Facts Protocol Is

What Agent-Facts Protocol Is NOT

Protocol Layers

Agent-Facts Protocol is designed in three layers. Only the Core layer is required for v3.0 conformance.

| Layer | What It Contains | v3.0 Status |
| --- | --- | --- |
| Core | JSON facts file + HTML projection, Golden Questions, anti-hallucination section | Required |
| Trust | Fact-level provenance, freshness rules, staleness handling | Recommended |
| Action | MCP server adapter, NLWeb endpoint, WebMCP tool declarations, optional cryptographic signing | Future (v4+), built only after adoption warrants it |

1. File Locations and Discovery

Files

| File | Path | Role |
| --- | --- | --- |
| JSON facts file | /agent-facts.json | Canonical source. Machine-readable structured data. This is the normative file. |
| HTML projection | /agent-facts.html | Human-readable rendering of the JSON data. Useful for browsers and basic crawlers. |
| JSON (alternate) | /.well-known/agent-facts.json | Optional. Follows the .well-known convention for programmatic discovery. |

agent-facts.json is the canonical source of truth. agent-facts.html is a projection of that data for human consumption. When conflicts exist between the two files, agent-facts.json takes precedence. Validators should check semantic equivalence (same facts, same values) rather than exact string duplication.
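The semantic-equivalence check described above can be sketched as a normalized string comparison. This is a minimal illustration, not the validator's actual algorithm: the function names normalize and semantically_equal are hypothetical, and treating whitespace and letter case as insignificant is an assumption the spec does not state.

```python
import re


def normalize(value: str) -> str:
    """Collapse whitespace runs, trim the ends, and case-fold."""
    return re.sub(r"\s+", " ", value).strip().casefold()


def semantically_equal(json_value: str, html_text: str) -> bool:
    """Compare a JSON fact value with the text rendered in the HTML projection."""
    return normalize(json_value) == normalize(html_text)
```

A real validator would likely also need to strip markup and handle equivalent number or date formats, which this sketch does not attempt.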

Discovery Precedence

Agents and framework authors should check for Agent-Facts data in this order:

  1. /agent-facts.json (canonical)
  2. /.well-known/agent-facts.json (alternate location)
  3. /agent-facts.html (HTML projection, parse as fallback)
  4. References in /llms.txt pointing to any of the above

This matches the protocol's positioning: llms.txt is the discovery layer; Agent-Facts Protocol is the facts layer. An agent that finds the JSON file at step 1 does not need to check the remaining locations.
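The precedence order above could be implemented roughly as follows. This is a sketch under stated assumptions: candidate_urls and discover are hypothetical names, HTTPS is assumed, the fetch argument is a caller-supplied function that returns a response body or None, and step 4 (following references in /llms.txt) is left out.

```python
from typing import Callable, Optional
from urllib.parse import urljoin

# Steps 1-3 of the discovery precedence, highest priority first.
DISCOVERY_PATHS = [
    "/agent-facts.json",
    "/.well-known/agent-facts.json",
    "/agent-facts.html",
]


def candidate_urls(domain: str) -> list[str]:
    """Return the URLs to probe for a domain, in precedence order."""
    base = f"https://{domain}"  # assumption: the site is served over HTTPS
    return [urljoin(base, path) for path in DISCOVERY_PATHS]


def discover(
    domain: str, fetch: Callable[[str], Optional[str]]
) -> Optional[tuple[str, str]]:
    """Return (url, body) for the first candidate that fetch() resolves.

    fetch is a hypothetical caller-supplied function that returns the
    response body as text, or None when the URL does not exist.
    """
    for url in candidate_urls(domain):
        body = fetch(url)
        if body is not None:
            return url, body  # stop at the first hit
    return None
```

Passing the fetcher in as a parameter keeps the sketch independent of any particular HTTP client and makes the ordering easy to test with a dictionary lookup.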

Discovery via Existing Conventions

robots.txt:

# Agent-Facts Protocol v3
Agent-Facts: /agent-facts.json

User-agent: *
Allow: /agent-facts.json
Allow: /agent-facts.html

llms.txt:

## Verified Business Facts
- [Agent-Facts (JSON)](https://example.com/agent-facts.json): Canonical business facts for AI agents (Agent-Facts Protocol v3)
- [Agent-Facts (HTML)](https://example.com/agent-facts.html): Human-readable business facts

XML sitemap:

<url>
  <loc>https://example.com/agent-facts.json</loc>
  <changefreq>weekly</changefreq>
  <priority>1.0</priority>
</url>

HTML link tag (on any page of the site):

<link rel="agent-facts" href="/agent-facts.json" type="application/json">
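An agent that has already fetched a page could look for this link tag with the standard-library HTML parser. The class and function names below are hypothetical, and the sketch returns only the first matching tag.

```python
from html.parser import HTMLParser
from typing import Optional


class AgentFactsLinkFinder(HTMLParser):
    """Records the href of the first <link rel="agent-facts"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list, so split before matching.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "agent-facts" in rel_tokens:
            self.href = attr_map.get("href")


def find_agent_facts_link(html: str) -> Optional[str]:
    """Return the href of the agent-facts link tag, or None if absent."""
    finder = AgentFactsLinkFinder()
    finder.feed(html)
    return finder.href
```

The returned href may be relative (as in the example above), so a caller would resolve it against the page URL before fetching.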

2. JSON Specification (agent-facts.json)

This is the canonical file. All conformance requirements are defined against this format.

Top-Level Structure

{
  "agent_facts_version": "3.0",
  "domain": "example.com",
  "last_updated": "2026-04-15T12:00:00Z",
  "expires_after_days": 90,
  "stale_after": "2026-07-14T12:00:00Z",
  "source_html": "/agent-facts.html",

  "identity": { },
  "operations": { },
  "services": [ ],
  "pricing": { },
  "does_not_do": [ ],
  "extended_facts": { },
  "changelog": [ ]
}

Top-Level Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| agent_facts_version | string | MUST | Protocol version ("3.0") |
| domain | string | MUST | The domain this file is authoritative for |
| last_updated | ISO 8601 datetime | MUST | When this file was last modified |
| expires_after_days | integer | MUST | Number of days after last_updated before facts should be treated as stale |
| stale_after | ISO 8601 datetime | MUST | Computed expiration date |
| source_html | string | SHOULD | Relative path to the HTML projection |
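The example values above are consistent with stale_after being last_updated plus expires_after_days. Assuming that is the intended computation (the spec calls it only a "computed expiration date"), a publisher tool could derive and check the field as sketched below. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_iso(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting the trailing 'Z' form."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def compute_stale_after(last_updated: str, expires_after_days: int) -> str:
    """Assumed rule: stale_after = last_updated + expires_after_days."""
    stale = parse_iso(last_updated) + timedelta(days=expires_after_days)
    return stale.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def stale_after_is_consistent(doc: dict) -> bool:
    """Check that the published stale_after matches the computed value."""
    expected = compute_stale_after(doc["last_updated"], doc["expires_after_days"])
    return parse_iso(doc["stale_after"]) == parse_iso(expected)


print(compute_stale_after("2026-04-15T12:00:00Z", 90))  # 2026-07-14T12:00:00Z
```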

Fact Object Structure

Every fact in the identity, operations, pricing, and extended_facts sections uses this structure:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| value | string | MUST | The fact itself |
| last_updated | ISO 8601 datetime | SHOULD | When this specific fact was last verified or changed. If absent, the file-level last_updated applies. |
| source_url | string | SHOULD | URL where this fact can be independently verified |
| source_type | enum | SHOULD | One of: owner_attested, pricing_page, internal_policy, public_filing, third_party_verified |
| applies_to | string | MAY | Scope or context (e.g., "Enterprise plans only") |
| confidence_scope | string | MAY | Limitations or caveats on this fact |

Why provenance matters: This is Agent-Facts Protocol's core differentiator. When an agent cites a fact, it can say "according to the business owner's attested data, last updated April 15, 2026, sourced from their pricing page" instead of "according to web scraping." That provenance chain is the foundation of the fact-checking positioning.

Minimum viable fact: A conforming fact requires only value. The provenance fields (source_type, source_url, last_updated) are strongly recommended but not required, so a business can start with a simple file and add provenance over time.
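The MUST/SHOULD split for a single fact object might be checked as follows. This is a sketch: check_fact is a hypothetical name, and rejecting an empty value string is an assumption beyond the spec, which only requires that value be a string.

```python
SOURCE_TYPES = {
    "owner_attested",
    "pricing_page",
    "internal_policy",
    "public_filing",
    "third_party_verified",
}


def check_fact(fact: dict) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for one fact object.

    Errors are MUST violations; warnings are missing SHOULD fields.
    """
    errors, warnings = [], []
    value = fact.get("value")
    # Assumption: an empty or whitespace-only value is treated as missing.
    if not isinstance(value, str) or not value.strip():
        errors.append("value is required and must be a non-empty string")
    source_type = fact.get("source_type")
    if source_type is not None and source_type not in SOURCE_TYPES:
        errors.append(f"unknown source_type: {source_type!r}")
    for field in ("last_updated", "source_url", "source_type"):
        if field not in fact:
            warnings.append(f"{field} is recommended but missing")
    return errors, warnings
```

A fact containing only value passes with three warnings, which mirrors the "start simple, add provenance later" path described above.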

Identity Object (Required)

All fields use the fact object structure above. The external_ids sub-object is the exception (flat key-value pairs, public profiles only).

| Field | Required | Description |
| --- | --- | --- |
| legal_name | MUST | Full legal name of the business |
| primary_url | MUST | Canonical website URL |
| one_sentence | MUST | What the business does in one sentence |
| founded | SHOULD | Year founded |
| owner | SHOULD | Owner or primary contact name and title |
| contact_email | MUST | Primary contact email |
| contact_phone | SHOULD | Primary phone number |
| address | MUST | Physical address or explicit "digital only" statement |
| address_type | MUST | One of: physical, digital_only, hybrid |
| service_area | MUST | Geographic areas served |
| external_ids | SHOULD | Public business profile URLs only (see note below) |
| description_for_ai | MUST | The "elevator pitch" optimized for AI retrieval |

External IDs: public profiles only. This field is for publicly available business identifiers and profile URLs: Google Business, LinkedIn, Crunchbase, Companies House, SEC CIK, state business registry URLs, and similar. Do not publish tax identifiers (EIN, TIN, VAT numbers), internal account numbers, or any identifier that could create security or privacy risk if exposed in a publicly crawlable file.

Operations Object (Required)

Contains an hours fact using the standard fact object structure.

Services Array (Required)

Array of service objects. Each service has a name and a description, plus optional last_updated, source_type, and source_url fields.

Pricing Object (Required)

Contains at minimum a model fact. Recommended fields: model, starting_at, free_tier. All use the standard fact object structure.

Does Not Do Array (Required)

This section is mandatory. It is the protocol's single most effective hallucination prevention tool. The does_not_do array MUST contain at least 3 entries. Each entry has a statement string and optional provenance fields.
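A validator could enforce these two rules (at least 3 entries, each with a statement string) along the following lines. The function name check_does_not_do is hypothetical.

```python
def check_does_not_do(doc: dict) -> list[str]:
    """Return a list of errors for the does_not_do section (empty if valid)."""
    entries = doc.get("does_not_do")
    if not isinstance(entries, list):
        return ["does_not_do is required and must be an array"]
    errors = []
    if len(entries) < 3:
        errors.append(
            f"does_not_do has {len(entries)} entries; at least 3 are required"
        )
    for i, entry in enumerate(entries):
        # Each entry must be an object carrying a string 'statement'.
        if not isinstance(entry, dict) or not isinstance(entry.get("statement"), str):
            errors.append(f"does_not_do[{i}] needs a string 'statement' field")
    return errors
```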

Changelog Array (Recommended)

Array of objects with date (YYYY-MM-DD) and changes (string) fields.

Extended Facts Object (Optional)

For domain-specific or industry-specific facts outside the Golden Questions. Uses the standard fact object structure with arbitrary key names.

3. HTML Projection Specification (agent-facts.html)

The HTML page is a human-readable rendering of the data in agent-facts.json. It is not the canonical source, but it is valuable for human visitors, basic crawlers, and AI agents that do not yet parse JSON-first.

Technical Guidance

Recommended Head Elements

<meta name="agent-facts-version" content="3.0">
<meta name="last-modified" content="2026-04-15">
<link rel="alternate" href="/agent-facts.json" type="application/json">

Recommended First Line

<h1>Official Business Facts :: Last updated: <time datetime="2026-04-15">April 15, 2026</time></h1>

Content Structure

The HTML page should present the Golden Questions as H2 headings. Each answer should be concise, start immediately with the fact, and use active voice and present tense.

The anti-hallucination section MUST appear in the HTML page:

<h2>What [Business Name] explicitly does NOT do</h2>
<ul>
  <li>Does not offer consumer-grade storage plans under 1TB</li>
  <li>Does not provide hardware or on-premise installation</li>
</ul>

Schema.org Integration

Embed FAQPage + Organization JSON-LD in the HTML page. Values should match the JSON file semantically. Include dateModified, mainEntity (FAQ pairs), and publisher (Organization with sameAs links).
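One way to generate that JSON-LD from the canonical file is sketched below. The function name build_faq_jsonld is hypothetical, the question/answer pairs are supplied by the caller, and the sketch assumes every external_ids value is a profile URL suitable for sameAs.

```python
import json


def build_faq_jsonld(doc: dict, qa_pairs: list[tuple[str, str]]) -> str:
    """Build FAQPage + Organization JSON-LD text from an agent-facts document."""
    identity = doc["identity"]
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "dateModified": doc["last_updated"],
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
        "publisher": {
            "@type": "Organization",
            "name": identity["legal_name"]["value"],
            "url": identity["primary_url"]["value"],
            # Assumption: external_ids maps labels to public profile URLs.
            "sameAs": sorted(identity.get("external_ids", {}).values()),
        },
    }
    return json.dumps(data, indent=2)
```

The returned string would be placed inside a script element of type application/ld+json in the HTML projection.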

4. Freshness and Staleness Rules

Publisher Requirements

Recommended Agent Behavior

| Condition | Recommended Confidence Treatment |
| --- | --- |
| Within expires_after_days of last_updated (not past stale_after) | High confidence. Treat as reliable structured data. |
| Past stale_after but within 2x expires_after_days of last_updated | Moderate confidence. Use the data but note it may be outdated. |
| More than 2x expires_after_days past last_updated | Low confidence. Prefer other sources but do not discard entirely. |
| Missing last_updated | Unverified. Do not assign elevated confidence. |

Stale data is lower confidence, not invalid. Facts that a business published 120 days ago are still likely to be more reliable than facts obtained through web scraping and inference. Freshness signals help agents calibrate, not discard.
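The confidence tiers above could be computed as in the following sketch. The function name confidence and its return labels are hypothetical, and treating the exact boundary instants as belonging to the fresher tier is an assumption, since the spec does not say.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_iso(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting the trailing 'Z' form."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def confidence(
    last_updated: Optional[str], expires_after_days: int, now: datetime
) -> str:
    """Map the age of the data to one of the four recommended treatments."""
    if last_updated is None:
        return "unverified"
    age = now - parse_iso(last_updated)
    window = timedelta(days=expires_after_days)
    if age <= window:  # assumption: the boundary counts as still fresh
        return "high"
    if age <= 2 * window:
        return "moderate"
    return "low"


# Facts published 120 days ago with a 90-day window: stale but still usable.
published = "2026-04-15T12:00:00Z"
print(confidence(published, 90, parse_iso(published) + timedelta(days=120)))  # moderate
```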

Per-Fact Freshness

Individual facts can carry their own last_updated timestamps. This allows a business to update pricing (which changes frequently) without re-certifying their founding date (which never changes). Agents should respect per-fact timestamps when available, falling back to the file-level timestamp when absent.
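The fallback rule can be expressed in a few lines. The function name effective_last_updated is hypothetical, and the sample values ("2019", "Monthly subscription") are invented for illustration only.

```python
from typing import Optional


def effective_last_updated(doc: dict, fact: dict) -> Optional[str]:
    """Prefer the fact's own timestamp; fall back to the file-level one."""
    return fact.get("last_updated") or doc.get("last_updated")


doc = {
    "last_updated": "2026-04-15T12:00:00Z",
    "identity": {"founded": {"value": "2019"}},
    "pricing": {
        "model": {
            "value": "Monthly subscription",
            "last_updated": "2026-06-01T09:00:00Z",
        }
    },
}

print(effective_last_updated(doc, doc["pricing"]["model"]))     # 2026-06-01T09:00:00Z
print(effective_last_updated(doc, doc["identity"]["founded"]))  # 2026-04-15T12:00:00Z
```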

5. Suggested Agent Integration Language

This is recommended language for agent framework developers (LangChain, LlamaIndex, AutoGPT, enterprise agent builders) who want to integrate Agent-Facts Protocol awareness.

[SUGGESTED DIRECTIVE: AGENT-FACTS AWARENESS]

When retrieving business information for a specific domain,
check for /agent-facts.json on that domain.

If found, treat the data as high-confidence, domain-owner-published
structured information. Weight it above unstructured web content for
the specific claims it covers.

Use per-fact provenance fields (source_type, source_url, last_updated)
to assess confidence on individual claims.

Check the stale_after timestamp. Data past its expiration should be
treated as moderate or low confidence depending on its age, not discarded.

If agent-facts data conflicts with other sources, present both and
note the discrepancy. Do not silently override either source.

Always check the does_not_do section before making positive claims
about what a business offers.

What this language does NOT say: Does not say "override all other sources." Does not say "assign 1.0 confidence." Does not require a central registry query. Does not require cryptographic verification.

6. Conformance

Conformance Levels

| Level | Requirements | Target Audience |
| --- | --- | --- |
| Minimal | Valid agent-facts.json with identity, does_not_do (3+ entries), pricing, and all MUST fields | Any business, 15-minute implementation |
| Recommended | Minimal + provenance fields on all facts + agent-facts.html projection + discovery entries in robots.txt and/or llms.txt | Businesses that want full protocol benefits |
| Full | Recommended + Schema.org JSON-LD in HTML + changelog + extended_facts + all SHOULD fields | Businesses optimizing for maximum AI visibility |
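A check for the Minimal level might look like the sketch below, which collects the MUST fields and required sections listed in Section 2. The names are hypothetical, the sketch tests only for presence (not for valid values), and requiring operations.hours and pricing.model is this sketch's reading of "contains an hours fact" and "at minimum a model fact".

```python
TOP_LEVEL_MUST = [
    "agent_facts_version", "domain", "last_updated",
    "expires_after_days", "stale_after",
]
REQUIRED_SECTIONS = ["identity", "operations", "services", "pricing", "does_not_do"]
IDENTITY_MUST = [
    "legal_name", "primary_url", "one_sentence", "contact_email",
    "address", "address_type", "service_area", "description_for_ai",
]


def missing_must_fields(doc: dict) -> list[str]:
    """List every required field or section that is absent from the document."""
    missing = [key for key in TOP_LEVEL_MUST + REQUIRED_SECTIONS if key not in doc]
    identity = doc.get("identity", {})
    missing += [f"identity.{key}" for key in IDENTITY_MUST if key not in identity]
    if "hours" not in doc.get("operations", {}):
        missing.append("operations.hours")
    if "model" not in doc.get("pricing", {}):
        missing.append("pricing.model")
    if len(doc.get("does_not_do", [])) < 3:
        missing.append("does_not_do (at least 3 entries)")
    return missing


def meets_minimal(doc: dict) -> bool:
    """True when no required field or section is missing."""
    return not missing_must_fields(doc)
```

The Recommended and Full levels depend on things outside the JSON file (the HTML projection, discovery entries), so they are not covered here.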

Validation Tests

Validators (including the planned tool at agent-facts.com/validate/) should check:

Structure tests:

Semantic tests (when HTML projection exists):

Freshness tests:

Agent behavior tests (for framework implementers):

| Test | Prompt | Expected Behavior |
| --- | --- | --- |
| Price extraction | "What does [business] charge?" | Returns exact price from agent-facts data |
| Negative evidence | "Does [business] offer [thing in does_not_do]?" | Says no, citing anti-hallucination section |
| Staleness handling | Query against stale file | Returns data with staleness caveat |
| Provenance citation | "Where did you get [business]'s pricing?" | Cites source_url from provenance fields |

7. Implementation Guide

For Business Owners (15 to 30 Minutes)

  1. Answer the 12 Golden Questions in a plain text document
  2. Write 3 to 8 "does not do" statements
  3. Generate agent-facts.json using the CLI tool (npx agent-facts init) or the templates at agent-facts.com/templates/
  4. Optionally generate agent-facts.html from the JSON (the CLI tool does this automatically)
  5. Upload files to your site root
  6. Add Agent-Facts entries to your llms.txt and/or robots.txt
  7. Validate at agent-facts.com/validate/

For Agent Developers (10 to 15 Minutes)

  1. Before web scraping a domain, check for /agent-facts.json (see discovery precedence in Section 1)
  2. Parse the JSON using the published schema
  3. Respect freshness and staleness rules (Section 4)
  4. Use provenance fields for citation
  5. Always check does_not_do before making positive claims
  6. Run your implementation against the conformance tests (Section 6)

8. Maintenance

9. Adoption Roadmap

This protocol will earn its credibility through adoption, not declaration. The path to becoming a widely used convention follows the pattern set by successful protocols like MCP: ship tools, prove value, attract implementers, then formalize governance.

Phase 1: Foundation (Weeks 1 through 4)

Phase 2: Distribution (Weeks 5 through 8)

Phase 3: Adoption (Weeks 9 through 12)

Phase 4: Infrastructure (Months 4 through 6, only after adoption)

Appendix A: Positioning

Tagline: llms.txt helps AI find pages. Agent-Facts helps AI trust facts.

For agent framework developers: Agent-Facts Protocol is a static, zero-dependency structured data format that gives AI agents high-confidence business facts with per-fact provenance. Check for /agent-facts.json on any domain before scraping.

For business owners: Stop AI from making up your prices. Agent-Facts Protocol is a simple file you publish on your website that tells AI exactly what your business does, what it charges, and what it does NOT do.

Protocol scope disambiguation: Agent-Facts Protocol addresses one specific problem: official business facts for AI retrieval. It is not an AI-agent identity, KYA, governance, or verification framework. Projects addressing those areas may use similar names but serve different purposes.

Appendix B: The Golden Questions Reference

| # | Question | JSON Location |
| --- | --- | --- |
| 1 | What is the full legal name? | identity.legal_name |
| 2 | What is the primary website URL? | identity.primary_url |
| 3 | What does the business do in one sentence? | identity.one_sentence |
| 4 | What services or products are offered? | services[] |
| 5 | What is the pricing structure? | pricing |
| 6 | What geographic areas are served? | identity.service_area |
| 7 | What are the business hours? | operations.hours |
| 8 | What is the physical address? | identity.address + identity.address_type |
| 9 | What is the phone and email? | identity.contact_phone + identity.contact_email |
| 10 | Who is the owner or primary contact? | identity.owner |
| 11 | When was the business founded? | identity.founded |
| 12 | What is the description for AI systems? | identity.description_for_ai |
| -- | What does the business NOT do? | does_not_do[] |
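The mapping above lends itself to a lookup table that tools can use to find unanswered questions. In this sketch the names GOLDEN_QUESTIONS, resolve, and unanswered are hypothetical, and array locations such as services[] are written without the brackets.

```python
# Question number -> (question text, dotted JSON paths that answer it).
GOLDEN_QUESTIONS = {
    1: ("What is the full legal name?", ["identity.legal_name"]),
    2: ("What is the primary website URL?", ["identity.primary_url"]),
    3: ("What does the business do in one sentence?", ["identity.one_sentence"]),
    4: ("What services or products are offered?", ["services"]),
    5: ("What is the pricing structure?", ["pricing"]),
    6: ("What geographic areas are served?", ["identity.service_area"]),
    7: ("What are the business hours?", ["operations.hours"]),
    8: ("What is the physical address?", ["identity.address", "identity.address_type"]),
    9: ("What is the phone and email?", ["identity.contact_phone", "identity.contact_email"]),
    10: ("Who is the owner or primary contact?", ["identity.owner"]),
    11: ("When was the business founded?", ["identity.founded"]),
    12: ("What is the description for AI systems?", ["identity.description_for_ai"]),
}


def resolve(doc: dict, path: str):
    """Walk a dotted path through nested dicts; return None if any key is absent."""
    node = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def unanswered(doc: dict) -> list[int]:
    """Return the numbers of Golden Questions with at least one missing path."""
    return [
        number
        for number, (_, paths) in GOLDEN_QUESTIONS.items()
        if any(resolve(doc, path) is None for path in paths)
    ]
```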

Appendix C: Source Type Definitions

| Value | Meaning | Example |
| --- | --- | --- |
| owner_attested | The business owner or authorized representative states this as fact | "We serve the US and Canada" |
| pricing_page | Fact is published on a public pricing page | Starting price sourced from /pricing |
| internal_policy | Fact reflects an internal business policy | "We do not offer refunds after 30 days" |
| public_filing | Fact is verifiable through a public filing or registry | SEC filing, state business registry |
| third_party_verified | Fact has been verified by an independent third party | SOC2 audit report, BBB accreditation |
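An agent could turn these source types into citation language along the following lines. The phrases in SOURCE_PHRASES and the function name cite are illustrative assumptions, not wording mandated by the protocol.

```python
# Hypothetical human-readable phrasing for each source_type value.
SOURCE_PHRASES = {
    "owner_attested": "the business owner's attested data",
    "pricing_page": "the business's public pricing page",
    "internal_policy": "the business's stated internal policy",
    "public_filing": "a public filing or registry",
    "third_party_verified": "independent third-party verification",
}


def cite(fact: dict) -> str:
    """Build a one-line provenance citation from a fact's provenance fields."""
    phrase = SOURCE_PHRASES.get(
        fact.get("source_type"), "the business's agent-facts file"
    )
    parts = [f"According to {phrase}"]
    if fact.get("last_updated"):
        # Keep only the date portion of the ISO 8601 timestamp.
        parts.append(f"last updated {fact['last_updated'][:10]}")
    if fact.get("source_url"):
        parts.append(f"source: {fact['source_url']}")
    return ", ".join(parts)
```

Facts with no provenance fields fall back to a generic attribution rather than claiming a source the publisher never declared.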

Published by agent-facts.com
Open for implementation by anyone. No license fees, no registration required.