Identity First Marketing

Making Your Website AI-Proof: llms.txt, schema.org and the 17 Entity Types LLMs Read

Ring 1, your own domain, is the only layer of AI findability where you can declare what is true instead of letting AI guess. Four pillars make a website AI-proof: llms.txt as the direct line to twenty-four-plus AI crawlers, schema.org with seventeen entity types, entity consistency across every surface, and EntityRank as the compound score the first three produce.

April 26, 2026 · 5 min read

Table of Contents

  1. The only ring you can declare instead of guess
  2. llms.txt: the direct line to AI crawlers
  3. Schema.org and the 17 entity types LLMs read
  4. Entity consistency: making your domain read as one entity
  5. EntityRank: the compounding result of the four pillars

The only ring you can declare instead of guess

Ring 1 is the only layer where AI does not have to guess. Four pillars turn a website into a verifiable entity definition: llms.txt, schema.org, entity consistency, and EntityRank as their compounding result.

There is exactly one ring in the model where AI does not have to guess at all. Ring 1, your own domain, is the layer where you can declare what is true about you in a language the model already speaks. Every other ring is interpretation. This one is statement.

The previous article in this cluster placed Ring 3 as a consequence and Ring 1 as the source. This article unpacks Ring 1 as a technical surface. Four pillars together turn a website from a marketing artifact into a verifiable entity definition: llms.txt as the direct line to AI crawlers, schema.org as the machine-readable grammar, entity consistency across pages as evidence, and EntityRank as the compounding result of all three.

The rest of the cluster lives downstream of this work. The next article, Entity of One, applies these pillars to a single person becoming an entity. The closing article, Entity Gap Check, measures the gap that opens whenever Ring 1 is incomplete. Both are coming.

Identity First Marketing treats Ring 1 as the only ring where the work is fully declarable. The other three rings interpret what Ring 1 makes available.

llms.txt: the direct line to AI crawlers

llms.txt is a markdown file at the root of your domain that tells twenty-four-plus AI crawlers which pages are authoritative. It is the 2026 equivalent of what robots.txt was in 1994. Adoption is low, so the early-mover window is open.

robots.txt arrived in 1994 to tell search crawlers which pages they could crawl. llms.txt was proposed in 2024 to tell language models which pages on your domain are authoritative, in their own language. Both files sit at the root of your domain. Both are five minutes of work. The second one is currently undeployed by most websites.

Twenty-four-plus AI crawlers actively read llms.txt or check for it: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, ChatGPT-User, and the rest. The file itself is a markdown document with one H1 (your name or organization), H2 sections for each authoritative cluster on your site (about, blog hub, services, signature frameworks), and one short description per link. No code. Five lines per section.

Two consequences flow from low adoption today. First, every llms.txt placed in the root sends a clean signal in a near-empty space, and AI engines are calibrating their citation logic against early adopters. Second, the file is a place to declare canon: the authoritative version of your About page, the canonical framework page, the content you want quoted. The early-mover window will not stay open long.
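To make the file structure described above concrete, here is a minimal Python sketch that renders an llms.txt document from a small data structure. The name, URLs, and descriptions are hypothetical placeholders on example.com. The one-line blockquote summary under the H1 follows the public llms.txt proposal rather than anything stated in this article, so treat it as an assumption.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    description: str


def build_llms_txt(name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render llms.txt: one H1, a short summary, and one H2 per content cluster."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            # One markdown link plus one short description per line.
            lines.append(f"- [{link.title}]({link.url}): {link.description}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site data; replace with your own authoritative pages.
llms_txt = build_llms_txt(
    name="Jane Example",
    summary="Leadership coach for B2B consultants.",
    sections={
        "About": [Link("About Jane", "https://example.com/about", "Canonical biography.")],
        "Blog": [Link("Blog hub", "https://example.com/blog", "All articles by topic.")],
    },
)
print(llms_txt)
```

The resulting text would be saved as llms.txt and uploaded to the root of the domain, next to robots.txt.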

Fact: Twenty-four-plus AI crawlers actively read or check for llms.txt in 2026, including GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot, and ChatGPT-User. Adoption among websites remains under one percent. (Wikipedia: Robots.txt)

Low adoption is currently the structural feature most expert websites can convert into a citation advantage. Placing llms.txt costs five minutes; while competitors hold off, it buys months of head start.

Schema.org and the 17 entity types LLMs read

Schema.org is the shared vocabulary used by Google, Microsoft, Yahoo, and Yandex. JSON-LD markup hands the model a complete entity definition without prose-parsing. Schema markup is associated with roughly three times higher AI citation rates.

Schema.org is a shared vocabulary co-built by Google, Microsoft, Yahoo, and Yandex. It defines how a page can declare what kind of thing it is, what it knows, what it links to, and how it relates to other things. The model reads JSON-LD, a small block of structured data typically placed in the head of your page, and pulls a complete entity definition from it without having to parse prose. Without schema, the model guesses. With schema, the model is handed a definition. That single difference is associated with roughly three times higher AI citation rates on the same content.

Seventeen types matter for an expert site: Person, Organization, Article, BlogPosting, FAQPage, HowTo, Review, Course, Event, Service, BreadcrumbList, VideoObject, ImageObject, AboutPage, ContactPage, Offer, and WebSite. You do not need all seventeen on day one. You need Person on the homepage and About page, Article on every blog post, FAQPage on the FAQ block, and BreadcrumbList on every page. Those four cover roughly ninety percent of what a model wants from a knowledge-led site.

The single most important field on Person is sameAs. It accepts an array of URLs that link your domain identity to your LinkedIn profile, your YouTube channel, your Wikidata entry, your podcast feed, and any other verified surface. AI engines treat sameAs as identity proof: one entity confirmed across multiple sources. This is the field where Ring 1 starts pulling Ring 2 into a single coherent picture.
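As an illustration of the Person type and its sameAs field, the sketch below builds a JSON-LD object in Python and wraps it in the script tag a page would carry. The person, the description, and every URL are hypothetical placeholders; the helper names are ours, not part of any library.

```python
import json


def person_jsonld(name: str, description: str, url: str, same_as: list[str]) -> dict:
    """Build a schema.org Person object as a plain dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "description": description,
        "url": url,
        # sameAs ties this identity to other profiles of the same person.
        "sameAs": same_as,
    }


def as_script_tag(data: dict) -> str:
    """Serialize JSON-LD for embedding in an HTML page."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so the payload cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical values for illustration only.
person = person_jsonld(
    name="Jane Example",
    description="Leadership coach for B2B consultants.",
    url="https://example.com/about",
    same_as=[
        "https://www.linkedin.com/in/jane-example",
        "https://www.youtube.com/@janeexample",
    ],
)
tag = as_script_tag(person)
print(tag)
```

Generating the block from one function, rather than hand-editing it per page, is one way to keep the description and the sameAs array identical on every Person instance.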

Fact: Schema.org markup is associated with roughly three times higher AI citation rates on the same content. JSON-LD hands the model a complete entity definition, so it does not have to parse prose to recover one. (Wikipedia: Schema.org)

Identity First Marketing implements four schema types as the baseline on every Ring 1 site (Person, Article, FAQPage, BreadcrumbList) and adds the rest as content surfaces appear.

Entity consistency: making your domain read as one entity

Schema markup tells the model what kind of thing a page is. Consistency tells the model that all pages describe the same thing. Markup without consistency reads as a labeled fragment.

Schema markup tells the model what kind of thing a page describes. Consistency tells the model that this page and every other page describe the same thing. Both are required. Markup without consistency reads as a well-labeled fragment.

Concrete tests for consistency:

  • Your full name written identically on every page, including punctuation.
  • One canonical short description, ideally under twenty words, repeated verbatim wherever a description field exists: meta description on the homepage, About page opening line, Person schema description, OpenGraph description, LinkedIn headline.
  • The same author byline on every blog post and podcast episode.
  • The same headshot, same crop, same alt text.
  • The same set of sameAs URLs on every Person schema instance.

Inconsistency between Ring 1 and Ring 2 is the most common failure. The website calls you a leadership coach for B2B consultants. LinkedIn calls you a strategic advisor. Your podcast calls you a host. From the model's point of view, those are three people who happen to share a name. Coherence across surfaces is what tells the model otherwise. Each consistent reference is a vote.

This is editorial work, not technical work. A developer cannot write your canonical sentence for you. The pillar before this one (schema) is set up once and then ignored. This pillar runs every time you publish anything on any surface. That is exactly why most websites lose it.
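Writing the canonical sentence is editorial work, but checking for drift can be partly automated. The sketch below assumes you have already collected the description text from each surface by hand or by export; the surface names and sentences are hypothetical. It treats the most common version as canonical and reports every surface that deviates from it.

```python
import unicodedata
from collections import Counter


def normalize(text: str) -> str:
    """Normalize Unicode form and whitespace only; wording and punctuation still count."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def find_drift(descriptions: dict[str, str]) -> dict[str, str]:
    """Return the surfaces whose description differs from the most common version."""
    if not descriptions:
        return {}
    normalized = {surface: normalize(text) for surface, text in descriptions.items()}
    canonical, _count = Counter(normalized.values()).most_common(1)[0]
    return {surface: text for surface, text in normalized.items() if text != canonical}


# Hypothetical descriptions gathered from four surfaces.
surfaces = {
    "meta_description": "Leadership coach for B2B consultants.",
    "about_opening": "Leadership coach for B2B consultants.",
    "person_schema": "Leadership coach for  B2B consultants.",  # stray double space
    "linkedin_headline": "Strategic advisor.",
}
drift = find_drift(surfaces)
print(drift)
```

The check is deliberately strict: only whitespace and Unicode form are forgiven, because the article's test is verbatim repetition, punctuation included.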

The most common reason a Ring 1 audit fails is not missing schema. It is small drift in canonical sentences and headshots across surfaces, accumulated over years.

EntityRank: the compounding result of the four pillars

EntityRank is the AI-era successor to PageRank. PageRank counted links. EntityRank counts confirmations: how many authoritative sources independently say the same thing about an entity. The other three pillars in this article feed EntityRank directly.

PageRank, developed in 1996, measured authority by counting links: how many authoritative sites link to this page. It became the operating principle of Google search for two decades. Language models do not work on links, but they need an analogous signal. EntityRank is the name for it.

EntityRank measures authority by counting confirmations: how many authoritative sources independently say the same thing about this entity. The math is roughly source quality multiplied by consistency across sources. A LinkedIn profile, a podcast bio, a Wikipedia paragraph, and a schema-marked About page that all say you are a leadership coach for B2B consultants compound. The same content with three out of four sources saying something subtly different cancels itself out. The model does not see one strong signal. It sees disagreement and lowers the score.

This is not a metric Identity First Marketing invented. It is the implicit weighting that LLMs already use. Naming it makes it operable. Once you can name what the model is doing, you can build for it.

The other three pillars in this article feed EntityRank directly. llms.txt makes you addressable. Schema markup makes you readable. Consistency makes you verifiable. The compounding of those three across a body of work is what the model rewards.

For a single expert this is a couple of afternoons of work. For a growing body of content across blog, podcast, video, and social, consistency becomes weekly structural labor that has to ship with every new piece. That is the work the Identity First Media platform automates: schema, llms.txt and entity consistency applied to every output, every week, without drift. Whether you build it yourself or run it through a system, the four pillars are the same.
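The phrase "source quality multiplied by consistency across sources" can be turned into a toy calculation. The sketch below is one possible reading, not a formula any language model is known to use: the quality weights, the claim-matching rule, and the function name are all assumptions made for illustration.

```python
from collections import defaultdict


def toy_entity_score(sources: list[tuple[str, float, str]]) -> float:
    """Toy score: quality behind the dominant claim, scaled by its share of all quality.

    Each source is (name, quality weight between 0 and 1, claim about the entity).
    """
    weight_by_claim: dict[str, float] = defaultdict(float)
    for _name, quality, claim in sources:
        # Claims only count as the same when the wording matches (case-insensitive).
        weight_by_claim[claim.strip().lower()] += quality
    total = sum(weight_by_claim.values())
    if total == 0:
        return 0.0
    dominant = max(weight_by_claim.values())
    return dominant * (dominant / total)


# Hypothetical sources with equal, made-up quality weights.
consistent = toy_entity_score([
    ("LinkedIn", 0.5, "Leadership coach for B2B consultants"),
    ("Podcast bio", 0.5, "Leadership coach for B2B consultants"),
    ("Wikipedia", 0.5, "Leadership coach for B2B consultants"),
    ("About page", 0.5, "Leadership coach for B2B consultants"),
])
drifting = toy_entity_score([
    ("LinkedIn", 0.5, "Strategic advisor"),
    ("Podcast bio", 0.5, "Podcast host"),
    ("Wikipedia", 0.5, "Executive coach"),
    ("About page", 0.5, "Leadership coach for B2B consultants"),
])
print(consistent, drifting)  # 2.0 0.125
```

With identical source quality, four agreeing sources score sixteen times higher than four that each say something different, which is the compounding and cancelling effect the section describes.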

Identity First Marketing names EntityRank to make a dynamic that LLMs already use operable. The Identity First Media platform implements the four pillars at scale across every published surface.

Frequently Asked Questions

What is llms.txt and is it different from robots.txt?

llms.txt is a markdown file placed at the root of your domain (yourdomain.com/llms.txt) that tells AI language models which pages on your site are authoritative and how the site is structured. robots.txt, introduced in 1994, tells search-engine crawlers which pages they may crawl. Both files coexist. robots.txt controls access. llms.txt declares canon for the AI engines that read it. Twenty-four-plus AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and others) actively check for it. Adoption is still low, which makes early implementation a citation advantage.
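Because robots.txt controls access, it is worth confirming that it does not block the AI crawlers you want reading your canon. The sketch below uses Python's standard urllib.robotparser on an inline sample file; the rules and URLs are hypothetical, and a real check would load your own robots.txt instead.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: GPTBot is kept out of /private/, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given crawler may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed(ROBOTS_TXT, "GPTBot", "https://example.com/blog/post"))
print(allowed(ROBOTS_TXT, "GPTBot", "https://example.com/private/page"))
print(allowed(ROBOTS_TXT, "ClaudeBot", "https://example.com/private/page"))
```

If a page listed in llms.txt comes back as disallowed for a crawler, the two files contradict each other, and the access rule is the one a compliant crawler would follow.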

Which schema.org types should I implement first?

Four types cover roughly ninety percent of what a language model wants from an expert-led website. Person Schema on the homepage and About page, with sameAs linking to your LinkedIn, YouTube, Wikidata, and podcast feed. Article schema on every blog post. FAQPage on any page that contains a FAQ block. BreadcrumbList on every internal page. The other thirteen types in the seventeen-type list (Organization, BlogPosting, HowTo, Review, Course, Event, Service, VideoObject, ImageObject, AboutPage, ContactPage, Offer, WebSite) are added as the relevant content appears.
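For readers who prefer to see the markup rather than a plugin form, here is a minimal Python sketch of two of the four baseline types, BreadcrumbList and FAQPage. The page names, URLs, and the question text are hypothetical, and the helper functions are illustrative rather than taken from any CMS or library.

```python
import json


def breadcrumb_jsonld(trail: list[tuple[str, str]]) -> dict:
    """Build a BreadcrumbList from (name, url) pairs ordered from the homepage down."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


def faq_jsonld(qa_pairs: list[tuple[str, str]]) -> dict:
    """Build a FAQPage from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


# Hypothetical page data.
breadcrumbs = breadcrumb_jsonld([
    ("Home", "https://example.com/"),
    ("Blog", "https://example.com/blog"),
])
faq = faq_jsonld([("What is llms.txt?", "A markdown file at the root of your domain.")])
print(json.dumps(breadcrumbs, indent=2))
print(json.dumps(faq, indent=2))
```

Each dictionary would be serialized and placed in its own script element of type application/ld+json on the page it describes.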

What does sameAs do in Person Schema?

sameAs is the field that links your domain identity to other verified profiles of the same person or organization. It accepts an array of URLs: LinkedIn, YouTube channel, Wikidata entry, podcast feed, X profile, GitHub, anywhere your name is independently verifiable. AI engines treat the array as identity proof: one entity confirmed across multiple sources. Without sameAs, your domain identity sits alone. With sameAs, every Ring 2 surface (own LinkedIn, own YouTube, own podcast) reinforces Ring 1.

What is EntityRank?

EntityRank is the implicit authority score that language models already assign to entities, based on consistency across multiple authoritative sources. It is the AI-era successor to PageRank. PageRank counted links: how many authoritative sites link to this page. EntityRank counts confirmations: how many authoritative sources independently say the same thing about this entity. Identity First Marketing introduces the term to make the dynamic operable. Once you can name what the model is doing, you can build for it.

Can I do this without a developer?

Most of it. llms.txt is a markdown file you can write and upload yourself. Schema markup is generated by plugins on every major CMS: Yoast or Rank Math on WordPress, native fields on Webflow and Framer, plugins on Squarespace and Wix. The plugin asks you for the type and the field values. The JSON-LD code is generated for you. Entity consistency is the editorial pillar that no plugin can do for you: writing one canonical short description and using it identically on every surface is a writing job, not a technical one.
