
The New Librarians: Getting Cited When the Machines Answer First

Library catalogue cards and source pages being selected into an AI answer citation interface

To get cited by ChatGPT, Perplexity and Google’s AI Overviews, your pages need three things: a direct answer in the first few sentences, structured data that lets a machine verify who is speaking, and a site that AI crawlers are explicitly permitted to read. That is the working core of AI search optimization. Everything else — the audits, the acronyms, the panic — is commentary on those three requirements.

The panic, though, deserves a moment of sympathy, because something genuinely has changed.

For twenty years, a search engine was a card catalogue. You asked it for plumbers in Austin or how to write a welcome email, and it handed you a drawer of index cards — ten blue links — and sent you off into the stacks to do your own reading. The whole discipline of SEO grew up around one ambition: get your card near the front of the drawer.

The new librarians don’t hand you the drawer. They have read the stacks already. You ask your question and they answer it themselves, at the desk, in complete sentences, synthesising a dozen books into a paragraph. Sometimes they name the books they drew from. Often they don’t. And a business that spent two decades learning to be findable now discovers that the game has quietly become one of being quotable.

Those are not the same skill. This essay is about the difference.

The library stopped lending books

When an assistant answers directly, the visit you used to win never happens. The searcher gets their answer at the desk and leaves: no click, no pageview, no pixel fired. What survives the synthesis is the citation: the moment the librarian says “according to this source.”

That inversion changes what “winning” means. In the card-catalogue era, position was everything and the content merely had to justify the click it had already earned. In the answering era, the machine has no reason to cite you unless your page did specific work in its reading: stated a fact cleanly, defined a term precisely, answered the exact question asked. Rankings and citations still overlap, but the overlap is loose and getting looser — a page can sit comfortably on the first page of Google and be invisible to every assistant, because nothing in it can be lifted out and quoted whole.

The uncomfortable truth underneath: most business websites were written to persuade a human who was already on the page. Almost nothing on them is liftable. The paragraphs warm up slowly, the claims arrive without evidence, the important sentence is buried four paragraphs deep beneath a throat-clearing introduction. A human skims past this. A machine, deciding in milliseconds which passages deserve to be quoted, simply chooses someone else’s cleaner sentence.

Diagram comparing findable classic search with quotable AI search citation flow

Why doesn’t AI cite your website?

Usually for one of four reasons, and none of them is a mystery.

The crawlers were never let in. The major assistants read the web through named crawlers such as GPTBot, ClaudeBot and PerplexityBot, and each is documented as checking robots.txt before entering. Google is a partial exception: Google-Extended is a robots.txt control token rather than a separate crawler, and it governs use of your content by Gemini rather than inclusion in Search, so AI Overviews depend on ordinary Googlebot access. A surprising number of businesses block these bots by default, or through a security plugin they forgot they installed, and then wonder why they don’t exist in AI answers. A librarian cannot quote a book that was never allowed into the library.
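A minimal sketch of that check, using Python's standard-library robots.txt parser. The sample file and the example.com URL are hypothetical, and the standard-library matcher is only an approximation of how each real crawler interprets the rules, so treat the result as a first pass rather than a verdict.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the text above; extend the list as needed.
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def audit_robots(robots_txt: str, url: str = "https://example.com/") -> dict[str, bool]:
    """Return {agent: allowed} for a robots.txt body, with no network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_AGENTS}


# Hypothetical robots.txt of the kind a forgotten plugin might have written.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for agent, allowed in audit_robots(SAMPLE).items():
        print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

To audit a live site, paste the contents of its /robots.txt into `audit_robots` in place of `SAMPLE`.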

There is no direct answer to lift. If someone asks “how much does local SEO cost?” and your pricing page opens with “In today’s competitive digital landscape…”, there is nothing for the machine to take. The page gestures at the topic without ever committing to a sentence that answers it.

The machine can’t verify who is speaking. Assistants prefer sources they can identify: a named organisation with an address, real authors with roles, dates that say when a claim was made. A page with no schema markup, no author and no date is an anonymous pamphlet. It might be right, but the librarian would rather quote the book with a spine.

The claims are unverifiable. “We’re the leading agency” is not a citable fact — it is a mood. Machines quote specifics: prices, definitions, processes, checklists, named findings. This is the same discipline we describe in The Proof Machine: a skeptical reader and a language model both want evidence they can weigh, and both walk past adjectives.

Four blockers that prevent AI systems from citing a website

What do AI engines actually quote?

Watch the citations in any Perplexity answer or AI Overview for a week and a pattern emerges. The quoted passages are almost always one of a small family of shapes:

  • The clean definition. One or two sentences that say what a thing is, without hedging.
  • The direct answer under a question heading. A heading that matches the query, followed immediately by the answer — not by context, the answer.
  • The specific fact. A price, a timeline, a requirement, a step — stated by a source with a reason to know it.
  • The list. Steps, criteria, checklists: pre-chunked knowledge the machine can reuse with the structure intact.
  • The primary claim. Something true of you that only you can state: your pricing, your process, your service area. On these facts, your site is not competing with the whole internet — it is the authoritative source, if it states them plainly.

Five passage shapes that AI engines can quote and reuse

That last shape matters most for small businesses, and it is why this discipline connects directly to local visibility. When someone asks an assistant “what does a social media agency in Austin charge?”, the machine wants exactly the kind of first-party fact a transparent pricing page provides. Publishing your prices, your process and your terms in plain language is no longer just a trust move for human visitors; it also makes you the citable source for questions about your own business. We wrote about the local version of this shift in The Living Profile: the profile that answers questions gets chosen; the one that merely exists gets skipped.

Writing for the reading machine

The good news is that writing for the new librarians turns out to be writing better, not writing differently.

Answer first, then elaborate. Put the direct answer in the first two or three sentences of the page and of every major section. Humans reward this too — nobody has ever complained that a page got to the point too quickly. The lyricism, the story, the persuasion: all of it still belongs, after the answer, not instead of it.

Phrase headings as questions when a question is what they answer. “Why doesn’t AI cite your website?” is a heading a machine can match against a query and a human can scan in half a second. Poetic headings still have their place — but a page with no question-shaped entry points offers the machine no handles to grip.

One idea per paragraph. Passages get lifted whole. A paragraph that braids three ideas together cannot be quoted without dragging the other two along, so it doesn’t get quoted at all.

Date your claims, name your authors. A dated page by a named person with a stated role is more citable than an undated page by nobody. This is also, not coincidentally, what Google’s own guidance on helpful content has asked for all along: content by someone, for someone, demonstrating experience.

Keep the FAQ real. Three or four genuine questions with answer-first responses give the machine exactly the question-answer pairs it is hunting for — and give the human who scrolled to the bottom one more chance to be convinced.

The credentials desk

Behind the reading room, every library has a credentials desk — the place where a source’s identity is checked before it is trusted. On the web, that desk reads structured data.

Schema.org markup is how a machine confirms the facts around your content: that the organisation has a name, an address, opening hours, real people, services with prices. None of it is visible to human visitors, and all of it is legible to the systems deciding whether you are a verifiable business or an anonymous pamphlet. If your site runs on a modern platform, most of this is a configuration exercise rather than a rebuild — it is part of what we wire into every website we build, because bolting it on later is always more expensive than including it from the start.
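As an illustration of what that markup looks like, here is a small sketch that builds a schema.org LocalBusiness object as JSON-LD and wraps it in the script element pages usually embed. The business details are invented for the example, and the function names are hypothetical; which properties your own site needs depends on your platform and business type.

```python
import json


def local_business_jsonld(name, url, phone, street, city, region, postal_code, hours):
    """Build a schema.org LocalBusiness description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "url": url,
        "telephone": phone,
        "openingHours": hours,  # e.g. "Mo-Fr 09:00-17:00"
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": region,
            "postalCode": postal_code,
        },
    }
    return json.dumps(data, indent=2)


def script_tag(jsonld: str) -> str:
    """Wrap JSON-LD in the script element that carries it inside a page."""
    return f'<script type="application/ld+json">\n{jsonld}\n</script>'


if __name__ == "__main__":
    # All values below are placeholders, not a real business.
    markup = local_business_jsonld(
        name="Example Studio",
        url="https://example.com",
        phone="+1-555-0100",
        street="100 Example Street",
        city="Austin",
        region="TX",
        postal_code="78701",
        hours="Mo-Fr 09:00-17:00",
    )
    print(script_tag(markup))
```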

Two smaller pieces complete the desk. First, robots.txt: audit it and explicitly allow the AI crawlers you want reading you, because the machines check permission before they check quality. Second, the emerging llms.txt convention: a short Markdown summary of your site written for language models. Adoption is uneven and no one should promise miracles from it, but it costs an hour and functions as a courtesy card for the new librarians: here is who we are, here is what we offer, here is where to look. Google, for its part, has published its own documentation on AI features in Search, and its consistent message is that there is no separate trick: pages that are indexable, helpful and verifiable are the pool AI answers draw from.
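For the llms.txt half of that, a rough sketch of a generator. It assumes the layout described in the llms.txt proposal as commonly summarised: a title line, a one-line summary in a blockquote, then headed sections of annotated links. The business name, URLs and the `build_llms_txt` helper are all hypothetical, and the proposal itself may change.

```python
def build_llms_txt(name: str, summary: str,
                   sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Assemble an llms.txt body: title, summary, then sections of links."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Trim the trailing blank line and end with a single newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Placeholder content: who we are, what we offer, where to look.
    text = build_llms_txt(
        name="Example Studio",
        summary="A hypothetical digital agency in Austin offering web, search and content services.",
        sections={
            "Services": [
                ("Pricing", "https://example.com/pricing", "current prices and terms"),
                ("Process", "https://example.com/process", "how a project runs"),
            ],
        },
    )
    print(text)
```

The output would be saved as llms.txt at the root of the site.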

How to apply this week

  1. Read your robots.txt. Confirm GPTBot, ClaudeBot, PerplexityBot and Google-Extended are allowed — not blocked by an old plugin or a default deny.
  2. Take your five most important pages and answer-first them. Move the direct answer into the opening sentences of each. Cut the throat-clearing.
  3. Turn one heading per page into the question it actually answers. Then make the first sentence beneath it the answer.
  4. Publish your first-party facts. Prices, process, timelines, service area — stated plainly, on pages you control, so machines quoting facts about you quote you.
  5. Check your structured data. Organisation details, author names with roles, dates on articles. If none exists, that is the week’s biggest win.
  6. Add a real FAQ to your most-asked-about page. Three questions customers genuinely ask, answered in two to four sentences each.
  7. Ask the assistants about yourself. Put your own business questions to ChatGPT and Perplexity and read what comes back. What they get wrong is your content roadmap.

Seven-step weekly AI citation sprint for improving machine-readable visibility
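Step 5 can be roughed out in code. The sketch below pulls JSON-LD blocks out of a page's HTML with the standard-library parser and reports whether an organisation, an author and a publication date are declared. The sample page is invented, the report keys are hypothetical, and the check is deliberately shallow: it ignores nested @graph structures and other markup formats such as microdata.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks rather than fail the whole audit


def _types(item: dict) -> set:
    """Normalise @type, which may be a single string or a list of strings."""
    value = item.get("@type", [])
    return set(value) if isinstance(value, list) else {value}


def structured_data_report(html: str) -> dict[str, bool]:
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()
    items = []
    for block in extractor.blocks:
        candidates = block if isinstance(block, list) else [block]
        items.extend(c for c in candidates if isinstance(c, dict))
    return {
        "has_organization": any(_types(i) & {"Organization", "LocalBusiness"} for i in items),
        "has_author": any("author" in i for i in items),
        "has_date": any("datePublished" in i for i in items),
    }


# Hypothetical page: an article with an author and a date, but no organisation.
SAMPLE_PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "author": {"@type": "Person", "name": "A. Writer"},
 "datePublished": "2025-01-15"}
</script>
</head><body><p>Hello</p></body></html>"""

if __name__ == "__main__":
    print(structured_data_report(SAMPLE_PAGE))
```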

Frequently asked questions

Is AI search optimization different from SEO?

It is an extension, not a replacement. The foundations are identical — crawlable pages, helpful content, verifiable entities — which is why sites with strong SEO tend to start with an advantage. The difference is emphasis: classic SEO optimises for the click, AI search optimisation optimises for the quote, and that shifts weight toward answer-first writing, question headings and machine-readable identity.

Does an llms.txt file actually help?

Treat it as cheap insurance rather than a lever. It is a proposed convention, not a ranking system, and adoption across AI platforms is still uneven. But it takes an hour, it cannot hurt, and it puts a clean, canonical summary of your business where a machine can find it — which is more than most competitors have done.

Should I block AI crawlers instead of inviting them?

For a publisher whose product is the content itself, blocking can be a defensible commercial position. For a business whose website exists to win customers, blocking AI crawlers means volunteering for invisibility in the fastest-growing answer channel. You are not protecting an asset; you are declining to be recommended.

How do I measure AI search visibility?

Directly and manually, for now: ask the major assistants the questions your customers ask — best options, costs, comparisons, “near me” queries — and record whether you appear, what is claimed about you, and who gets cited instead. Watch your analytics for referrers like ChatGPT and Perplexity. The measurement tooling is young; the habit of checking is what matters.
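The referrer half of that habit can be partly automated. This is a small sketch that tallies visits by assistant from a list of referrer URLs exported from analytics. The hostname table is an assumption, not an official list: check it against the referrers your own analytics actually records before relying on the counts.

```python
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlparse

# Assumed referrer hostnames for each assistant; verify against your own data.
AI_REFERRERS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
}


def classify_referrer(referrer: str) -> Optional[str]:
    """Return the assistant label for a referrer URL, or None if it is not one."""
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRERS.get(host)


def count_ai_referrals(referrers: Iterable[str]) -> Counter:
    """Count visits per assistant, ignoring every other referrer."""
    labels = (classify_referrer(r) for r in referrers)
    return Counter(label for label in labels if label)


if __name__ == "__main__":
    # Hypothetical export of referrer URLs.
    sample = [
        "https://chatgpt.com/",
        "https://www.perplexity.ai/search?q=agency+pricing",
        "https://www.google.com/",
        "https://chat.openai.com/",
        "",
    ]
    print(count_ai_referrals(sample))
```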

Where to go next

  • The Citation Ledger — why mentions and citations became the web’s real currency of trust.
  • The Machine-Readable Shopfront — how structured data turns your business into something machines can recommend.
  • Want your business legible to the new librarians? Our AI workflows practice builds the machine-readable layer — and the answers worth quoting.