
From PageRank to EntityRank: 25 Years of How Machines Decided Who Matters
Over 25 years, search engines moved from counting links to recognizing entities. Experts who understand this arc are the ones AI systems cite in 2026.
What Was the Core Insight Behind PageRank?
PageRank ranked pages by counting inbound links the way academic citation indexes count references, treating each link as a vote of quality.
In 1998, two PhD students at Stanford wrote a paper describing a ranking algorithm they called PageRank. It counted links the way a citation index counts academic references, and it quietly rewired the internet.
The insight was clean: a page that many other pages link to is probably more valuable than a page nobody links to, the same way a scientific paper cited by many researchers is more important than one that sits unread. Each link was a vote. Links from high-authority pages counted more than links from low-authority ones.
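The mechanics are compact enough to sketch. The Python below is a toy illustration of the principle, assuming an invented three-page link graph and the commonly cited 0.85 damping factor; it is not the production system Google ran, only the shape of the idea: each page splits its current score across its outgoing links, and the scores are iterated until they settle.

```python
# Toy PageRank sketch: each page splits its vote across its outlinks,
# repeated until the scores stabilize. The graph and damping factor are
# illustrative assumptions, not Google's production setup.

def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}      # start with an even split

    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if not outlinks:
                continue                           # dangling pages simply keep quiet in this sketch
            share = rank[page] / len(outlinks)     # a page's vote is divided among its outlinks
            for target in outlinks:
                new_rank[target] += damping * share
        rank = new_rank
    return rank

# Toy web: A and B both link to C, so C ends up with the highest score.
toy_web = {"A": ["C"], "B": ["C"], "C": ["A"]}
print(pagerank(toy_web))
```

The point of the sketch is the feedback loop: C is valuable because A and B vouch for it, and a link from C is in turn worth more than a link from a page nobody vouches for.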
Before this, search engines were keyword-matching systems. AltaVista, Excite, Lycos, Yahoo: they indexed which words appeared on which pages and served results where the words matched what you typed. Quality was not the signal. Presence was.
For almost a decade, PageRank worked beautifully. Then it did not. The web was not built on academic honesty. It was built on incentives. Once marketers discovered that links were the currency, they started manufacturing them. Link farms. Private blog networks. Paid links disguised as editorial. Guestbook spam. The signal decayed, and Google spent years fighting back: Panda in 2011 went after thin content, and Penguin in 2012 went after manipulative links.
The link-counting era was ending. Something had to replace it.
What Did Google's Knowledge Graph Actually Change?
The Knowledge Graph shifted Google's internal model from a collection of documents to a map of real-world entities and the relationships between them.
In May 2012, Google announced the Knowledge Graph with a slogan that said everything: 'things, not strings.' The idea was simple and radical. Instead of treating the web as a pile of documents that contain words, treat it as a map of entities: real people, places, organizations, books, films, and the relationships between them.
When you searched for 'Leonardo da Vinci' in 2011, Google returned pages. When you searched for 'Leonardo da Vinci' in 2013, Google returned a person: his dates, his paintings, his contemporaries, sidebar and all. The document-centric web had become an entity-centric web, at least inside Google's internal representation.
For marketers, this was the first tremor.
Schema.org structured data, launched in 2011 by Google, Bing, and Yahoo, with Yandex joining later that year, became the way to tell machines explicitly what kind of thing a page was about. Organizations started appearing as Knowledge Panels. Local businesses, previously invisible, suddenly had rich cards with hours, ratings, and photos.
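What that markup looks like in practice is worth seeing once. The snippet below is a hedged illustration: it builds a schema.org LocalBusiness description as JSON-LD, the format search engines read out of a page's HTML. The business and its details are invented for the example; the property names (name, address, openingHours) are standard schema.org vocabulary.

```python
import json

# Illustrative schema.org markup for a fictional local business.
# The field names come from the schema.org vocabulary; the values are made up.
local_business = {
    "@context": "https://schema.org",
    "@type": "LocalBusiness",
    "name": "Riverside Bakery",
    "url": "https://example.com",
    "telephone": "+1-555-0100",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "12 Main Street",
        "addressLocality": "Springfield",
    },
    "openingHours": "Mo-Sa 07:00-18:00",
}

# Embedded in a page inside a <script type="application/ld+json"> tag.
print(json.dumps(local_business, indent=2))
```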
Most SEO agencies, however, kept optimizing for keywords. The world had shifted underneath them, and the incentives of the industry (monthly retainers, tactical reports, rank-tracking dashboards) kept the focus on the old game for years longer than was wise.
The semantic era arrived next and widened the gap further. In October 2019, Google rolled out BERT, a natural language model that interpreted queries the way a human would: not as a bag of keywords but as a question with context. 'Can you get medicine for someone at the pharmacy' became a query Google could parse with intent. Over the following four years, a page could rank for a query whose exact words it never contained, because the page's meaning matched the query's intent.
Keyword optimization, the backbone of SEO for two decades, had become dramatically less important. Topical depth began to matter more than surface matching. The industry mostly ignored this too.
How Is AI Search Fundamentally Different from Traditional Search?
AI search composes answers from an internal model of known entities rather than retrieving and ranking documents. If the model does not recognize you, you are absent from the answer.
In late 2022, ChatGPT launched. In early 2023, Bing integrated GPT-4 into its search experience. Perplexity, Google's Bard, and then Google AI Overviews followed. By 2024, a growing share of users were reading a composed answer at the top of their results, not scrolling a list of blue links.
The architecture underneath this is different in kind, not just degree.
Traditional search retrieves pages. It ranks them by relevance signals and hands the user a list. The user decides what to read. An AI-generated answer does not retrieve and rank. It composes. It draws from the model's internal representation of who and what exists in the world, assembles the answer, and sometimes attributes sources by name. Most sources never appear.
This is where the twenty-five-year arc lands.
Research into LLM citation behavior shows that 80% of sources cited by large language models do not rank in the top 100 of traditional Google search. AI is drawing from a different pool entirely. The sources that get cited are not the ones with the highest domain authority in the link-counting sense. They are the ones the model has built a coherent representation of.
The data on topical structure is equally sharp. Studies of AI citation patterns show that 86% of AI citations go to sites that build knowledge clusters around a single topic, not to scattered content archives. Sites with structured topic clusters receive 3.2 times more AI citations than disconnected content. Breadth, which was forgivable in a keyword era, is a direct disadvantage in this one.
The practical consequence: if you publish useful content but the model does not have a clear, corroborated picture of who you are and what you specifically know, you are invisible in the answer. Not penalized. Absent.
What Is EntityRank and Why Does It Describe What Is Happening Now?
EntityRank describes the implicit authority score AI systems assign to recognized entities based on consistent cross-source mentions, topical coherence, and structured identity signals.
The term EntityRank is not official. No engineer at Google or OpenAI uses it in their documentation. But it captures the underlying logic that all these systems share: a ranking, implicit or explicit, of entities by authority.
Where PageRank asked 'how many pages link to this page?', the modern system asks a more layered question: how many credible sources mention this entity in a consistent way, across media, and how well is this entity connected to related entities that are themselves credible?
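To make the contrast concrete, here is a deliberately toy Python sketch. Nothing in it corresponds to a real system; the signal names, weights, and numbers are invented to show one thing only: under this kind of logic, a few credible, consistently attributed, on-topic mentions outweigh a pile of weak, scattered ones.

```python
# Purely illustrative toy score in the spirit the article describes.
# Signals, weights, and numbers are invented, not any real ranking system.

def entity_score(mentions):
    """mentions: dicts with a source credibility (0-1), whether the entity's
    name was stated consistently, and whether the mention sits on its core topic."""
    score = 0.0
    for m in mentions:
        weight = m["source_credibility"]
        weight *= 1.5 if m["consistent_name"] else 0.5   # corroboration counts fully only if the identity resolves
        weight *= 1.5 if m["on_core_topic"] else 1.0     # mentions inside the topic territory count more
        score += weight
    return score

many_scattered = [{"source_credibility": 0.2, "consistent_name": False, "on_core_topic": False}] * 20
few_corroborated = [{"source_credibility": 0.9, "consistent_name": True, "on_core_topic": True}] * 4

print(entity_score(many_scattered))    # 20 weak, inconsistent mentions: 2.0
print(entity_score(few_corroborated))  # 4 strong, consistent, on-topic mentions: 8.1
```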
This is a fundamentally different game.
Follower counts do not feed it. Keyword density does not feed it. Monthly blog output, in the abstract, does not feed it. What feeds it is a specific set of conditions that most experts have never deliberately built:
Consistent naming across sites, platforms, podcasts, and articles. Not 'John Smith' here and 'J. Smith Consulting' there. One coherent identity that machines can resolve to a single entity.
Authoritative third-party mentions. A guest appearance on a credible podcast, an interview in a trade publication, a quoted opinion in an industry article. Outside confirmation is weighted heavily because it is harder to manufacture than self-published content.
A coherent topic territory. Fifteen connected pieces on one subject build a topical cluster that AI systems recognize and trust. Fifty pieces across five subjects, without a clear center, build nothing the model can anchor to.
Structured data. Schema markup that confirms who the entity is, what they do, and how they connect to verifiable identifiers; a minimal sketch follows this list.
Long-standing, stable identifiers. A domain. A LinkedIn profile. An active presence on platforms that AI systems index and trust.
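As a sketch of how the structured-data and identifier signals above fit together, here is an illustrative schema.org Person description in Python. The person and URLs are placeholders; the properties used (jobTitle, knowsAbout, sameAs) are standard schema.org fields for tying one consistent name to stable identifiers and a topic.

```python
import json

# Illustrative Person markup: one consistent name, stable identifiers, one topic.
# All names and URLs are placeholders.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Doe",
    "jobTitle": "Industrial Automation Consultant",
    "url": "https://example.com/about",
    "sameAs": [                                   # stable identifiers the entity resolves to
        "https://www.linkedin.com/in/janedoe-example",
        "https://example.com",
    ],
    "knowsAbout": ["industrial automation"],      # the topic territory
}

print(json.dumps(person, indent=2))
```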
This is a slower game than link-building ever was. It is also a fairer one, because it is harder to fake at scale.
What Should Experts and Founders Actually Do Differently in 2026?
Stop optimizing individual pages and start building a recognizable entity: a narrow topic territory, consistent naming, and third-party mentions that corroborate who you are.
Three conclusions follow from the full arc, and they are not incremental improvements to a familiar strategy. They are a different orientation.
First: stop treating content as a collection of standalone assets and start treating every piece as a vote for the entity your name represents. A blog post that ranks for a narrow keyword contributes almost nothing to your entity authority. A podcast interview on a credible show, a guest article in a respected publication, a quoted insight in a trade story: these contribute a great deal, because they are corroborations from sources the model already trusts.
The conversion data makes the stakes concrete. AI-referred sessions convert at roughly 14.2% versus 2.8% for Google organic traffic. Visitors arriving from an AI-generated answer are approximately five times more qualified than visitors arriving from a standard search result. The audience that AI systems send to you is small, specific, and ready to act. That audience is worth building for.
Second: invest in third-party proof before you invest in more owned content. One credible external mention is worth dozens of posts on your own channel, because the model uses outside confirmation to build its internal picture of who you are. If every signal about your expertise comes from your own site, the model has no corroboration.
Third: choose a narrow topic and occupy it. This is the hardest one for generalist consultants and multi-service agencies. Breadth served keyword strategies well, because more content meant more keyword coverage. In an entity model, breadth is dilution. A clear, specific, corroborated claim to one topic territory is more valuable than a broad, uncorroborated presence across many.
This does not mean publishing less. It means publishing with direction. Every piece pointing back to the same center, building the same cluster, reinforcing the same identity.
What Does the Full Arc from 1998 to 2026 Actually Mean?
The twenty-five-year march from PageRank to entity recognition is a single story: machines slowly learned to see the web the way humans always did, as a world of people and ideas with relationships between them.
The story from 1998 to 2026 is not a story of new tools replacing old ones. It is a story of search engines slowly, relentlessly learning to treat the web the way humans always treated reality: as a world of people, organizations, and ideas, with relationships between them, not as a database of strings.
PageRank approximated authority through citation. The Knowledge Graph approximated identity through entity recognition. BERT approximated understanding through language modeling. AI Overviews approximate synthesis through answer composition. Each step moved further from the document and closer to the entity.
What changed between 1998 and 2026 is that the machines caught up.
In the old game, visibility was a matter of optimization. You could reverse-engineer the signal, apply tactics, and move up the list. In the new game, visibility is a matter of identity. The machine asks who you are, what you know, and who else confirms it. There is no shortcut to that question. There is only the slow, consistent work of being a real thing that credible sources acknowledge.
The experts and founders who understand this arc are not scrambling for tactics. They are building something the machines can recognize. That is the advantage the next decade rewards.
Frequently Asked Questions
What was PageRank and does it still matter in 2026?
PageRank was a link-counting algorithm developed by Google's founders in 1998 that ranked pages by the number and quality of inbound links. It still contributes to traditional search ranking, but its influence has diminished significantly as Google's systems shifted toward entity recognition, semantic understanding, and AI-composed answers that draw from a different pool of sources entirely.
What is Google's Knowledge Graph?
Google's Knowledge Graph, launched in 2012, is an internal database of real-world entities: people, places, organizations, and the relationships between them. It allows Google to return information about a known entity directly, rather than just listing pages. Being recognized as an entity in the Knowledge Graph is one of the foundational signals that AI systems use when composing answers.
How is AI search different from traditional search?
Traditional search retrieves and ranks documents. AI search composes answers from the model's internal representation of known entities and their associations. A source that ranks in the top ten of Google may never appear in an AI-generated answer, while a source the model has a strong entity representation of may be cited even if it ranks nowhere near the top of traditional results.
What does 'entity' mean in the context of search and AI visibility?
An entity is a real-world thing that a machine can recognize and distinguish from other things: a person, an organization, a topic, a product. In search terms, being an entity means the system has enough consistent, corroborated signals about who you are and what you know to represent you internally. Without that, you are a collection of pages, not a recognized thing.
Is SEO dead in 2026?
Traditional keyword-focused SEO has lost most of its leverage. What replaced it is entity building: establishing a clear, consistent, corroborated identity around a specific topic territory. Technical fundamentals like page speed and structured data still matter. But the competitive advantage no longer comes from on-page optimization. It comes from how clearly and credibly the machines know who you are.