Making Your Website AI-Proof: llms.txt, schema.org and the 17 Entity Types LLMs Read

Ring 1, your own domain, is the only layer of AI findability where you can declare what is true instead of letting AI guess. Four pillars make a website AI-proof: llms.txt as the direct line to twenty-four-plus AI crawlers, schema.org with seventeen entity types, entity consistency across every surface, and EntityRank as the compound score the four pillars produce.

April 26, 2026 · 5 min read

Schema.org and the 17 entity types LLMs read

Schema.org is the shared vocabulary used by Google, Microsoft, Yahoo, and Yandex. JSON-LD markup hands the model a complete entity definition without prose-parsing. Schema markup triples AI citation rates.

Schema.org is a shared vocabulary co-built by Google, Microsoft, Yahoo, and Yandex. It defines how a page can declare what kind of thing it is, what it knows, what it links to, and how it relates to other things. The model reads JSON-LD, a small block of structured data usually placed in the head of your page, and pulls a complete entity definition from it without having to parse prose.
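As a minimal sketch of what such a block looks like, the Python below builds an Article definition and wraps it in the script tag that carries JSON-LD. The helper name `jsonld_script`, the headline, the author name, and the URL are placeholder assumptions for illustration, not values from any real site.

```python
import json

def jsonld_script(entity: dict) -> str:
    """Wrap a schema.org entity in the <script> tag used for JSON-LD."""
    body = json.dumps(entity, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script tag early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'

# Placeholder values, assumed for illustration only.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example headline",
    "datePublished": "2026-04-26",
    "author": {"@type": "Person", "name": "Jane Example"},
    "mainEntityOfPage": "https://example.com/blog/example-post",
}

print(jsonld_script(article))
```

The printed tag is what would be pasted into the page template. A CMS plugin typically produces an equivalent block from form fields.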

Without schema, the model guesses. With schema, the model is handed a definition. That single difference triples AI citation rates on the same content.

Seventeen types matter for an expert site: Person, Organization, Article, BlogPosting, FAQPage, HowTo, Review, Course, Event, Service, BreadcrumbList, VideoObject, ImageObject, AboutPage, ContactPage, Offer, and WebSite. You do not need all seventeen on day one. You need Person on the homepage and About page, Article on every blog post, FAQPage on the FAQ block, and BreadcrumbList on every page. Those four cover roughly ninety percent of what a model wants from a knowledge-led site.
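Person and Article are simple key-value definitions. FAQPage and BreadcrumbList have a nested shape that is easier to see in code. The sketch below builds both. The function names, questions, and URLs are hypothetical placeholders.

```python
import json

def faq_page(pairs: list[tuple[str, str]]) -> dict:
    """Build a FAQPage entity from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

def breadcrumb_list(trail: list[tuple[str, str]]) -> dict:
    """Build a BreadcrumbList from (name, url) pairs, ordered root first."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }

# Placeholder content, assumed for illustration only.
faq = faq_page([("What is llms.txt?", "A markdown file at the root of a domain.")])
crumbs = breadcrumb_list([
    ("Home", "https://example.com/"),
    ("Blog", "https://example.com/blog/"),
])
print(json.dumps([faq, crumbs], indent=2))
```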

The single most important field on Person is sameAs. It accepts an array of URLs that link your domain identity to your LinkedIn profile, your YouTube channel, your Wikidata entry, your podcast feed, and any other verified surface. AI engines treat sameAs as identity proof: one entity confirmed across multiple sources. This is the field where Ring 1 starts pulling Ring 2 into a single coherent picture.
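A Person definition with sameAs might look like the sketch below. Every name and URL is a placeholder, including the Wikidata identifier, and the small `invalid_same_as` check is a hypothetical helper that only confirms each entry is an absolute https URL. It does not verify that the profiles exist.

```python
import json
from urllib.parse import urlparse

# All names and URLs are placeholders, assumed for illustration only.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Example",
    "jobTitle": "Leadership coach",
    "description": "Leadership coach for B2B consultants.",
    "url": "https://example.com/",
    "sameAs": [
        "https://www.linkedin.com/in/jane-example",
        "https://www.youtube.com/@jane-example",
        "https://www.wikidata.org/wiki/Q0",  # placeholder ID
        "https://example.com/podcast/feed.xml",
    ],
}

def invalid_same_as(entity: dict) -> list[str]:
    """Return sameAs entries that are not absolute https URLs."""
    bad = []
    for link in entity.get("sameAs", []):
        parts = urlparse(link)
        if parts.scheme != "https" or not parts.netloc:
            bad.append(link)
    return bad

print(json.dumps(person, indent=2))
print("Invalid sameAs entries:", invalid_same_as(person))
```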

EntityRank: the compounding result of the four pillars

EntityRank is the AI-era successor to PageRank. PageRank counted links. EntityRank counts confirmations: how many authoritative sources independently say the same thing about an entity. The four pillars in this article feed EntityRank directly.

PageRank, developed in 1996 and first published in 1998, measured authority through links: how many authoritative pages link to this page. It became the operating principle of Google search for two decades. Language models do not work on links, but they need an analogous signal. EntityRank is the name for it.

EntityRank measures authority by counting confirmations: how many authoritative sources independently say the same thing about this entity. The math is roughly source quality multiplied by consistency across sources. A LinkedIn profile, a podcast bio, a Wikipedia paragraph, and a schema-marked About page that all say you are a leadership coach for B2B consultants compound. The same content with three out of four sources saying something subtly different cancels itself out. The model does not see one strong signal. It sees disagreement and lowers the score.
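The "quality multiplied by consistency" idea can be made concrete with a toy calculation. The sketch below is purely illustrative: the function `toy_entity_score`, the quality weights, and the exact-match notion of agreement are all assumptions, not a formula any language model is known to use.

```python
from collections import Counter

def toy_entity_score(sources: list[tuple[float, str]]) -> float:
    """Toy score: average source quality times the share of sources that agree.

    Each source is (quality between 0 and 1, claim text). Agreement is the
    fraction of sources whose claim matches the most common claim.
    """
    if not sources:
        return 0.0
    claims = [claim.strip().lower() for _, claim in sources]
    top_count = Counter(claims).most_common(1)[0][1]
    consistency = top_count / len(sources)
    quality = sum(q for q, _ in sources) / len(sources)
    return quality * consistency

# Hypothetical weights and descriptions, assumed for illustration only.
aligned = toy_entity_score([
    (0.8, "Leadership coach for B2B consultants"),  # LinkedIn
    (0.8, "Leadership coach for B2B consultants"),  # podcast bio
    (0.8, "Leadership coach for B2B consultants"),  # Wikipedia
    (0.8, "Leadership coach for B2B consultants"),  # About page
])
drifting = toy_entity_score([
    (0.8, "Leadership coach for B2B consultants"),
    (0.8, "Executive coach"),
    (0.8, "Business mentor for founders"),
    (0.8, "Keynote speaker on leadership"),
])
print(aligned, drifting)
```

With identical source quality in both cases, the only thing that lowers the second score is disagreement between the descriptions.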

This is not a metric Identity First Marketing invented. It is the implicit weighting that LLMs already use. Naming it makes it operable. Once you can name what the model is doing, you can build for it.

The other three pillars in this article all feed EntityRank directly. llms.txt makes you addressable. Schema markup makes you readable. Consistency makes you verifiable. The compounding of those three across a body of work is what the model rewards.

For a single expert this is a couple of afternoons of work. For a growing body of content across blog, podcast, video, and social, consistency becomes weekly structural labor that has to ship with every new piece. That is the work the Identity First Media platform automates: schema, llms.txt and entity consistency applied to every output, every week, without drift. Whether you build it yourself or run it through a system, the four pillars are the same.

Frequently Asked Questions

What is llms.txt and is it different from robots.txt?

llms.txt is a markdown file placed at the root of your domain (yourdomain.com/llms.txt) that tells AI language models which pages on your site are authoritative and how the site is structured. robots.txt, introduced in 1994, tells crawlers which paths they may fetch. Both files coexist. robots.txt controls access. llms.txt declares canon for the AI engines that read it. Twenty-four-plus AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and others) actively check for it. Adoption is still low, which makes early implementation a citation advantage.
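A minimal sketch of generating such a file follows, assuming the commonly used llms.txt layout of an H1 title, a blockquote summary, and sections of annotated links. The function name, site name, and URLs are hypothetical placeholders.

```python
def build_llms_txt(title: str, summary: str,
                   sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render llms.txt markdown from a title, a summary, and link sections.

    Each section maps a heading to (link name, url, short note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"

# Placeholder site content, assumed for illustration only.
llms_txt = build_llms_txt(
    title="Jane Example",
    summary="Leadership coach for B2B consultants.",
    sections={
        "Core pages": [
            ("About", "https://example.com/about", "Canonical bio and credentials"),
            ("Services", "https://example.com/services", "Coaching programs"),
        ],
        "Articles": [
            ("Example post", "https://example.com/blog/example-post", "Flagship article"),
        ],
    },
)
print(llms_txt)
```

The resulting text would be saved as llms.txt and uploaded to the domain root, next to robots.txt.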

Which schema.org types should I implement first?

Four types cover roughly ninety percent of what a language model wants from an expert-led website: Person schema on the homepage and About page, with sameAs linking to your LinkedIn, YouTube, Wikidata, and podcast feed; Article schema on every blog post; FAQPage on any page that contains a FAQ block; and BreadcrumbList on every internal page. Add the other thirteen types in the seventeen-type list (Organization, BlogPosting, HowTo, Review, Course, Event, Service, VideoObject, ImageObject, AboutPage, ContactPage, Offer, WebSite) as the relevant content appears.

What does sameAs do in Person schema?

sameAs is the field that links your domain identity to other verified profiles of the same person or organization. It accepts an array of URLs: LinkedIn, YouTube channel, Wikidata entry, podcast feed, X profile, GitHub, anywhere your name is independently verifiable. AI engines treat the array as identity proof: one entity confirmed across multiple sources. Without sameAs, your domain identity sits alone. With sameAs, every Ring 2 surface (your own LinkedIn, YouTube, and podcast) reinforces Ring 1.

What is EntityRank?

EntityRank is the implicit authority score that language models already assign to entities, based on consistency across multiple authoritative sources. It is the AI-era successor to PageRank. PageRank counted links: how many authoritative sites link to this page. EntityRank counts confirmations: how many authoritative sources independently say the same thing about this entity. Identity First Marketing introduces the term to make the dynamic operable. Once you can name what the model is doing, you can build for it.

Can I do this without a developer?

Most of it. llms.txt is a markdown file you can write and upload yourself. Schema markup is generated by plugins on every major CMS: Yoast or Rank Math on WordPress, native fields on Webflow and Framer, plugins on Squarespace and Wix. The plugin asks you for the type and the field values. The JSON-LD code is generated for you. Entity consistency is the editorial pillar that no plugin can do for you: writing one canonical short description and using it identically on every surface is a writing job, not a technical one.
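The editorial check itself can still be scripted. The sketch below compares the description used on each surface against one canonical sentence and lists the surfaces that have drifted. The surface names and texts are hypothetical, and the comparison ignores only whitespace differences.

```python
import re

def normalize(text: str) -> str:
    """Collapse runs of whitespace so only real wording differences count."""
    return re.sub(r"\s+", " ", text).strip()

def drifting_surfaces(canonical: str, surfaces: dict[str, str]) -> list[str]:
    """Return the names of surfaces whose description differs from canonical."""
    target = normalize(canonical)
    return sorted(
        name for name, text in surfaces.items() if normalize(text) != target
    )

# Placeholder descriptions, assumed for illustration only.
canonical = "Leadership coach for B2B consultants."
surfaces = {
    "about_page": "Leadership coach for B2B consultants.",
    "linkedin": "Leadership coach  for B2B consultants. ",
    "podcast_bio": "Executive coach and speaker.",
}
print(drifting_surfaces(canonical, surfaces))
```

Here only the podcast bio is flagged, because the LinkedIn text differs from the canonical sentence in spacing alone.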
