How to Get Cited by ChatGPT & AI Overviews

AI search optimization has no guaranteed citation formula. The durable work is familiar: make pages crawlable and indexable, publish useful original information, and keep important facts accurate. Platform-specific crawler controls can make content eligible to be accessed, but they cannot promise a mention, a citation, or a rank.

For twenty years, a search engine behaved like a card catalogue. You asked for plumbers in Austin or how to write a welcome email, received a drawer of links, and went into the stacks to do your own reading. SEO grew up around the position of the card.

Some search experiences now place a composed answer on the librarian’s desk. A source may be named, linked, summarised, or absent from that answer. That changes what teams need to observe, but it does not repeal the old library rules or create a secret syllabus for machines.

The catalogue did not burn

Google’s guidance for AI features in Search is explicit: normal SEO foundations still apply. A page still needs to be accessible, indexable, useful to its intended audience, and supported by the same technical basics that help Google understand ordinary search pages.

Google requires no llms.txt file, special schema, content chunking, answer-first opening, question headings, or ideal page length for those features. Those devices may be proposed as tactics elsewhere, but they are not admission tickets to Google’s AI experiences. A clear opening, descriptive heading, short paragraph, or FAQ can help a reader when the subject calls for it. Those are editorial choices, not machine tolls.

Google’s people-first content guidance makes the more useful distinction. It asks whether a page serves an existing audience, demonstrates first-hand knowledge or depth, provides clear sourcing, and leaves the reader feeling they learned enough to pursue their goal. It also warns against writing to a preferred word count. A page should be as long as the work requires and no longer.

That is less glamorous than a new acronym, but it is sturdier. The claims still need receipts. The advice in The Proof Machine applies here too: replace unsupported superlatives with facts a skeptical reader can inspect.

Two library cards, two different permissions

OpenAI separates search access from training access. Its ChatGPT Search publisher guidance says that sites should allow OAI-SearchBot if they want their content to be discoverable and linked in ChatGPT search. Allowing that crawler matters for inclusion, but it does not guarantee top placement. OpenAI says ranking uses multiple factors; it does not publish a lever that buys the first citation.

The distinction matters because the names on the cards are easy to confuse. OpenAI’s crawler documentation says GPTBot is for training, and its controls are independent from OAI-SearchBot. A publisher can make one decision about possible search inclusion and another about whether content may be used to improve generative models.

Crawler permission is access, not a citation guarantee. Allowing OAI-SearchBot removes one access barrier for ChatGPT Search; it does not establish that a particular page will be fetched for a particular query, ranked above another source, quoted, or cited. Blocking it can remove that route to inclusion, but allowing it is still only an eligibility decision.

That is the sober way to read robots.txt: as a permissions ledger. Decide deliberately which systems may enter, document why, and revisit the choice when the business model changes. Do not turn an unlocked door into a promise about what a visitor will do inside.
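One way to keep that ledger honest is to test what a robots.txt file actually permits for each crawler. The sketch below uses Python's standard `urllib.robotparser`. The sample policy (search crawler allowed, training crawler blocked), the `example.com` URL, and the helper name `crawler_permissions` are illustrative assumptions, not a recommendation; only the OAI-SearchBot and GPTBot tokens come from the guidance discussed above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy for illustration only: allow the search crawler,
# block the training crawler, and leave every other agent allowed.
SAMPLE_ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def crawler_permissions(robots_txt, url, agents):
    """Return {agent: may_fetch} for one URL under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


ledger = crawler_permissions(
    SAMPLE_ROBOTS_TXT,
    "https://example.com/guides/pricing",  # hypothetical page
    ["OAI-SearchBot", "GPTBot"],
)
```

Under this sample policy, `ledger` reports that OAI-SearchBot may fetch the page and GPTBot may not. The result describes access only; it says nothing about whether the page will be fetched, ranked, or cited.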

The credentials desk has no secret handshake

Structured data remains useful when it accurately represents visible page content. It can support ordinary rich-result eligibility, but generative AI features do not require it. There is no special schema that switches on AI citations.

Use supported markup for its real job: helping search systems interpret eligible page types and qualify them for relevant conventional search treatments. Keep names, dates, prices, locations, and other marked-up facts consistent with what a person can read on the page. Test the implementation, fix errors, and avoid marking up claims the page does not substantiate.
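The consistency check can be roughed out in a few lines. This is a minimal sketch, assuming the page's JSON-LD and its visible text have already been extracted as strings. The business name, phone numbers, and the helper name `facts_missing_from_page` are hypothetical. It only does a case-insensitive substring match, so it is a first-pass flag, not a substitute for a proper rich-result test.

```python
import json

# Hypothetical markup and page text, for illustration only.
SAMPLE_JSON_LD = """
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Example Plumbing Co",
  "telephone": "+1-512-555-0100"
}
"""
SAMPLE_VISIBLE_TEXT = "Example Plumbing Co serves Austin. Call +1-512-555-0199 for a quote."


def facts_missing_from_page(json_ld, visible_text, fields):
    """List marked-up fields whose values do not appear in the visible text."""
    data = json.loads(json_ld)
    # Normalise case and whitespace so layout differences do not cause false alarms.
    page = " ".join(visible_text.lower().split())
    missing = []
    for field in fields:
        value = data.get(field)
        if value is None:
            continue  # field is not marked up, so there is nothing to compare
        needle = " ".join(str(value).lower().split())
        if needle not in page:
            missing.append(field)
    return missing


mismatches = facts_missing_from_page(
    SAMPLE_JSON_LD, SAMPLE_VISIBLE_TEXT, ["name", "telephone"]
)
```

Here `mismatches` contains `"telephone"`, because the marked-up number differs from the one a reader sees. Which side is wrong is an editorial question the script cannot answer.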

The same honesty applies to bylines. Name an author or reviewer only when that person actually wrote or reviewed the work and their role is verified. A fabricated expert is not a credibility layer; it is a false label on the spine. If the organisation is the accountable source, say so plainly.

Technical care still matters. Canonicals, internal links, crawlability, indexability, page experience, and accurate visible content all belong in a sound website build. They are good library maintenance. None should be sold as a private corridor into an AI answer.

Stock the shelves with something worth finding

The strongest content opportunity is rarely a formatting trick. It is information the business can publish with a genuine reason to know it: current prices and their conditions, service areas, product constraints, a documented process, original observations, or a comparison grounded in real delivery experience.

First-party knowledge is not automatically worthy of trust. It becomes useful when readers can see its limits. If you publish original data, state when it was collected, how it was gathered, what was included, and what the sample cannot establish. If you make a factual claim about the wider world, use a primary source where one exists. If a number cannot be verified, remove it rather than decorating the paragraph with borrowed precision.

Structure that information for the person who came to use it. Put definitions where ambiguity would otherwise slow the reader. Use question headings when people genuinely ask those questions. Add an FAQ when it resolves distinct objections that the main article does not already answer. Start with a conclusion when urgency makes that the kind thing to do; begin with a story when context does more work. No one arrangement guarantees how a generative system will treat the page.

For a local business, this often means maintaining the facts customers repeatedly need: location, availability, scope, pricing approach, and evidence of completed work. Our guide to local visibility treats those details as an operating system rather than a one-time optimisation. The Living Profile makes the same case for business listings: accuracy is ongoing work, not a plaque hung once.

Use this reference-desk audit

Treat AI visibility as an observation practice, not a ranking promise.

Check the ordinary search foundation. Confirm that important pages are crawlable, indexable, internally linked, canonicalised correctly, and useful for the intent they claim to serve.
Separate crawler decisions. Review OAI-SearchBot and GPTBot independently. Record the business reason for allowing or blocking each instead of copying a generic robots.txt recipe.
Inventory first-party evidence. Find claims only your organisation can substantiate, then add the dates, scope, method, examples, or source material a reader needs to evaluate them.
Edit for people. Remove throat-clearing, padding, duplicate sections, and vague boasts. Add headings, summaries, lists, or questions only where they improve comprehension.
Validate structured data for its ordinary purpose. Use supported types that match visible content and evaluate them against the rich-result feature they are meant to support.
Measure a stable set of queries. Record the system, date, prompt, response, and landing-page analytics on a regular cadence. Keep screenshots or exports when decisions depend on the observation.
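Part of the first audit step can be scripted. The sketch below assumes a page's HTML is already available as a string and uses only the standard-library `html.parser`. The class and function names are hypothetical. It reads the robots meta tag and the canonical link and nothing else: it does not inspect HTTP headers such as `X-Robots-Tag`, robots.txt, or internal links, and it cannot judge whether the page is useful.

```python
from html.parser import HTMLParser


class IndexabilitySignals(HTMLParser):
    """Collect robots meta directives and the canonical URL from one page."""

    def __init__(self):
        super().__init__()
        self.robots_directives = []
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.robots_directives += [
                part.strip().lower() for part in content.split(",") if part.strip()
            ]
        elif tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href") or None


def page_signals(html, expected_url):
    """Summarise the on-page signals for comparison against the intended URL."""
    parser = IndexabilitySignals()
    parser.feed(html)
    parser.close()
    return {
        "noindex": "noindex" in parser.robots_directives,
        "canonical": parser.canonical,
        "canonical_matches": parser.canonical == expected_url,
    }
```

Calling `page_signals(html, "https://example.com/guide")` on a page that carries a `noindex` directive would flag it before anyone spends time on the later steps.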

A useful scorecard records mention, citation, cited URL, factual accuracy, referral traffic, and lead outcomes; it does not promise rankings. A mention without a link is different from a citation. A citation to the wrong URL is different from one to the intended page. An accurate answer that sends no visit is different from a referral that becomes a qualified lead. Keeping those fields separate prevents a flattering anecdote from becoming a fictional KPI.
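Those fields can be kept apart in a small record type. This is one possible shape, not a standard: the `Observation` class, its field names, and the status labels are assumptions made for illustration. The URL comparison only ignores a trailing slash.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Observation:
    """One recorded answer from one system for one prompt on one date."""

    system: str
    observed_on: date
    prompt: str
    response: str
    mentioned: bool               # brand named anywhere in the answer
    cited_url: Optional[str]      # URL the answer linked to, if any
    intended_url: str             # page the team hoped would be cited
    factually_accurate: bool
    referral_visits: int = 0      # from landing-page analytics
    qualified_leads: int = 0

    def citation_status(self) -> str:
        """Keep mention, citation, and wrong-URL citation as separate outcomes."""
        if self.cited_url is None:
            return "mention_only" if self.mentioned else "absent"
        if self.cited_url.rstrip("/") == self.intended_url.rstrip("/"):
            return "cited_intended_url"
        return "cited_other_url"


# Hypothetical observation for illustration only.
sample = Observation(
    system="ChatGPT Search",
    observed_on=date(2025, 1, 15),
    prompt="emergency plumber pricing in Austin",
    response="(saved response text)",
    mentioned=True,
    cited_url=None,
    intended_url="https://example.com/pricing",
    factually_accurate=True,
)
```

For this sample, `sample.citation_status()` returns `"mention_only"`: the brand was named but no link was given. Accuracy, referrals, and leads stay in their own fields so that none of them can stand in for another.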

The results will move because answers, indexes, queries, locations, and products move. Report the sample and date, describe what changed, and resist turning one response into a universal law.

Questions from the reference desk

Does Google require llms.txt for AI features?

No. Google’s current guidance says no new machine-readable file is required beyond the normal technical foundations for Search. A team may evaluate other file conventions for other purposes, but it should not describe llms.txt as a Google AI requirement or a citation lever.

Is structured data still worth maintaining?

Yes, when it accurately supports an eligible ordinary search feature or helps keep page information consistent. Its value does not depend on claiming that generative systems require it, because they do not.

Should GPTBot be allowed for ChatGPT Search inclusion?

GPTBot is the training crawler, not the search crawler. The relevant control for possible ChatGPT Search inclusion is OAI-SearchBot, and OpenAI documents the two controls as independent. Choose each policy on its own merits.

Where to go next

The Citation Ledger — a framework for separating a brand mention from a source citation.
The Machine-Readable Shopfront — how consistent business facts travel across the systems that use them.
Need help turning operational knowledge into useful customer journeys? Explore our AI workflows practice.
