How-to · 14 May 2026 · 11 min read

How LLMs decide what to cite. The 6 signals that drive AI recommendations

The Princeton GEO paper plus 18 months of practitioner data, distilled into the actual signals ChatGPT, Claude, Gemini, and Perplexity weigh when picking sources.

Why does ChatGPT cite some brands and not others for the same query? The mechanism is more knowable than "AI is a black box" suggests. This piece walks through the actual decision pipeline LLMs run when answering a real-world recommendation query, plus the six signals that determine which sources end up in the response.

The framework synthesizes the 2024 Princeton GEO paper (which empirically tested signal effects across 3 LLMs and 40K queries) with ~18 months of practitioner data from real client audits. It's not theory. It's the model that holds up when you reverse-engineer hundreds of cited responses across categories.

The 4-stage decision pipeline

Every LLM that produces a recommendation runs roughly the same pipeline, with implementation details varying per engine. Understanding the pipeline is the precondition for understanding the signals.

Stage 1: Query understanding

The LLM parses your input into intent + entities + constraints. It decides whether to run a real-time web search or rely on training data; time-sensitive or entity-specific queries typically trigger a search.

If web search is triggered, the LLM moves to retrieval. If not, the response comes purely from training data, and your brand's visibility depends on what was in the model's training corpus.

Stage 2: Retrieval

The LLM runs one or more web searches (Bing for ChatGPT/Copilot, Google for Gemini, Perplexity's own index for Perplexity, etc.). Typically retrieves 10–30 results, optionally re-ranks them by semantic match to the original query.

Critical implication: to be available for citation, you must first be retrievable. That means good SEO underpins GEO. Pages that don't rank in the underlying search engine's top results almost never make it into the LLM's candidate set.

Stage 3: Source selection

From the retrieved candidates, the LLM picks 3–7 sources to ground its response. This is where the six signals (below) actually operate: the model favors candidates that score well on them.

Stage 4: Synthesis

The LLM generates the response, weaving information from the selected sources. Attribution is shown as inline citations, numbered footnotes, or a sources panel, depending on the engine. Brand mentions vary in form accordingly.

Your GEO work targets Stage 3 (source selection). That's the addressable point in the pipeline.
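The four stages can be sketched as one function. This is purely illustrative: `Page`, `answer_query`, and the thresholds (top 30 retrieved, top 5 selected) are assumptions standing in for engine internals nobody outside the vendors can see.

```python
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    rank: int          # position in the underlying search engine's results
    signal_score: int  # 0-6, how many of the six signals the page carries

def answer_query(query: str, needs_fresh_data: bool, index: list[Page]) -> list[Page]:
    """Illustrative 4-stage pipeline: understand -> retrieve -> select -> synthesize."""
    # Stage 1: query understanding. No web search means no retrieval at all.
    if not needs_fresh_data:
        return []  # answer comes purely from training data

    # Stage 2: retrieval. Only pages ranking in the engine's top results
    # ever enter the candidate set (SEO underpins GEO).
    candidates = [p for p in index if p.rank <= 30]

    # Stage 3: source selection. The six signals operate here.
    candidates.sort(key=lambda p: p.signal_score, reverse=True)
    selected = candidates[:5]

    # Stage 4: synthesis would weave the selected sources into prose.
    return selected
```

Note what the sketch makes concrete: a page with all six signals still gets zero citations if it never clears Stage 2.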

The 6 signals that move citation

The Princeton paper tested ~20 signal candidates and found six that consistently moved citation rate by 30–40% per signal. Practitioner audits since have confirmed and refined these. In rough order of leverage:

1. Authoritative quoting (cite others to be cited)

Pages that themselves cite credible external sources get cited more by LLMs. This is counterintuitive but empirically robust. LLMs use "does this source cite others" as a proxy for "is this source intellectually honest."

Tactical: every content page should include 2–5 external citations to authoritative sources (research, government data, established industry reports). It costs nothing and meaningfully lifts citation eligibility.
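Checking the 2–5 external-citation floor is scriptable. A rough sketch with the stdlib HTML parser, counting off-domain links as a proxy for external citations (the class name and the "off-domain = citation" heuristic are my assumptions, not a standard):

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class ExternalLinkCounter(HTMLParser):
    """Count <a href> links pointing off-domain, a rough proxy for external citations."""
    def __init__(self, own_domain: str):
        super().__init__()
        self.own_domain = own_domain
        self.external = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        host = urlparse(dict(attrs).get("href", "")).netloc
        # Relative links have an empty netloc; same-domain links don't count.
        if host and host != self.own_domain:
            self.external += 1
```

Run it over a rendered page and flag anything under 2 external links for a rewrite pass.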

2. Statistics and data

Pages with original numbers (research, surveys, internal data) are cited 30–40% more than narrative pages on the same topic. LLMs strongly prefer sources where they can pull a number to anchor the response.

Tactical: every brand should publish at least one "X by the numbers" page per quarter. Doesn't have to be huge research. Even a small internal data point + clear visualization beats a fluffy blog post 10× for citation purposes.

3. Fluency and clarity

Well-written, scannable content beats walls of text. The LLM has to extract usable text. If your prose is dense, ambiguous, or full of marketing fluff, it's harder to extract. Pages that score well on Flesch readability metrics also score well on citation rate.

Tactical: short paragraphs (3–4 sentences), descriptive subheads, bullets over walls of prose, plain language over marketing copy.
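Flesch readability, mentioned above, is easy to approximate in a few lines. This is a crude sketch: the real formula needs proper syllable counting, and the vowel-group heuristic here only gets you a directional score.

```python
import re

def flesch_reading_ease(text: str) -> float:
    """Approximate Flesch Reading Ease:
    206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).
    Syllables are estimated by counting vowel groups per word (crude)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(max(1, len(re.findall(r"[aeiouyAEIOUY]+", w))) for w in words)
    n = max(1, len(words))
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)
```

Higher is easier to read; plain short-word prose scores far above dense marketing copy, which is exactly the extraction property the LLM is rewarding.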

4. Comprehensiveness

Pages that cover a topic fully are cited over pages that cover only part of it. LLMs prefer one comprehensive source they can extract from over multiple fragments they have to combine. This is why "definitive guide"-style content outperforms short blog posts in GEO measurement.

Tactical: pillar pages that answer the entire question. Not a teaser leading to gated content.

5. Freshness

Heavy weight for time-sensitive queries. A page updated 2 months ago beats a page updated 18 months ago on the same topic. For evergreen topics, freshness matters less but still measurably.

Tactical: review and update top-traffic pages quarterly. Update the <lastmod> in the sitemap. Add an explicit "updated MM/YYYY" line in the article.
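The <lastmod> update can be automated. A minimal sketch using the stdlib XML parser against the standard sitemap namespace; the function name and single-URL scope are my own choices:

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def touch_lastmod(sitemap_xml: str, page_url: str, new_date: str) -> str:
    """Set <lastmod> (YYYY-MM-DD) on the <url> entry matching page_url."""
    ET.register_namespace("", NS)  # keep the default namespace on output
    root = ET.fromstring(sitemap_xml)
    for url in root.findall(f"{{{NS}}}url"):
        loc = url.find(f"{{{NS}}}loc")
        if loc is not None and loc.text == page_url:
            lastmod = url.find(f"{{{NS}}}lastmod")
            if lastmod is None:
                lastmod = ET.SubElement(url, f"{{{NS}}}lastmod")
            lastmod.text = new_date
    return ET.tostring(root, encoding="unicode")
```

Wire it into the quarterly review so the sitemap date and the visible "updated MM/YYYY" line never drift apart.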

6. Semantic uniqueness

If your content says the same thing as everyone else's content, you're fungible. The LLM picks the highest-authority source. If your content says something the others don't, that's where you get specifically cited.

Tactical: pick one strong opinion or framing per article that nobody else in your category has stated as clearly. That's the citation hook. Bland consensus content is invisible.

Stack effects

These signals stack multiplicatively, not additively. A page that has all six gets cited far more than 6× a page with one. From practitioner data:

Signals present | Relative citation rate
0–1             | Baseline (1×)
2–3             | ~2× baseline
4–5             | ~5× baseline
All 6           | ~9× baseline

The implication: the marginal value of moving from "some signals" to "all signals" is enormous. Most pages that miss citation are missing 4+ signals, not just one. The fix is rarely a single tweak. It's a coordinated rebuild.
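The multiplicative claim is internally consistent with the per-signal numbers. If each present signal multiplies citation odds by a constant factor r, the "all 6 → ~9×" row implies r ≈ 1.44, i.e. roughly the 30–40% per-signal lift quoted earlier (a back-of-envelope model, not something the data pins down exactly):

```python
# Constant-factor model: six signals compounding to ~9x baseline.
r = 9 ** (1 / 6)           # per-signal lift implied by the "all 6 -> ~9x" row
print(round(r, 2))         # ~1.44, i.e. a ~40% lift per signal
print(round(1.44 ** 6, 1)) # compounds back to roughly 9x baseline
```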

What about the off-site signals?

The six above are on-page signals. Off-site signals (third-party citations, comparison articles, Reddit/Quora presence, G2/Capterra rows) operate through a separate mechanism: they affect which pages and sources show up in the retrieval set in the first place.

That's why the 5 gap patterns include both kinds. Three of them (Missing stats page, Comparison gap, Unparseable site) are on-page. Two (Reddit vacuum, Empty third-party listings) are off-site. The audit catches both because both affect citation rate, just at different stages of the pipeline.

How to use this in practice

Pick any page on your site you'd like to be cited for a specific query. Score it 0–6 on the signals above. Be honest. Most pages score 1–2.

For each missing signal, you have a specific, addressable lever. That's the real product of understanding the mechanism. Clarity on what to actually change.
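The scoring exercise above fits in a few lines. Signal names and the tier mapping come straight from this article's table; the function itself is just a convenience sketch:

```python
SIGNALS = [
    "authoritative_quoting",
    "statistics_and_data",
    "fluency_and_clarity",
    "comprehensiveness",
    "freshness",
    "semantic_uniqueness",
]

def score_page(checklist: dict[str, bool]) -> tuple[int, str]:
    """Count which of the six signals a page carries; map to the rough tiers above."""
    n = sum(checklist.get(s, False) for s in SIGNALS)
    tier = {0: "1x", 1: "1x", 2: "~2x", 3: "~2x", 4: "~5x", 5: "~5x", 6: "~9x"}[n]
    return n, tier
```

Each False in the checklist is a named, addressable lever; the output tier tells you how far a coordinated rebuild could move the page.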

The black-box framing of AI is partly true (we don't know the exact internal weights) but mostly a marketing trope. The mechanism is observable, the signals are testable, and the work is real engineering. The agencies that win this category over the next 24 months will be the ones operating at this level of mechanism awareness. Not the ones selling "AI optimization" as a vague service.

Ready to measure?

Get your free AI visibility scorecard.

See exactly how often ChatGPT, Claude, Gemini, and Perplexity cite your brand for your buyers' questions. Free 30-min discovery call. The audit is yours either way.

Request the scorecard

Tagged: #GEO #LLMs #citations #research