Study · May 2026

May 5, 2026 · 400 probes · 4 models · 10 brands · 10 queries

GEO Citation Study — how LLMs see your brand today

Empirical analysis across Claude, ChatGPT, Perplexity and Gemini: which brands get cited, which disappear, and why? Including a reproducible probing setup so you can run the same test on your own brand.

We tested 10 Hidden Champions from the German Mittelstand across 10 query types — from direct brand questions to recommendation prompts. The hypothesis that B2B brands systematically vanish in LLMs is not supported by this sample. What we find instead: four models with very different answer styles, a consistent top-3, and individual brands that drop only on specific queries.


01 · Top line

The numbers in four sentences

What 400 probes across 4 models and 10 brands consistently show.

Overall citation rate

98%

Average across all four models. These 10 Hidden Champions are clearly present in the answers; they have not "disappeared".

Mention density

11.3

Average brand mentions per response. Range 4.9 to 17.6.

Visibility index

88/100

Composite score from citation rate, mention density and list position.

Avg. response length

503 w.

Average word count per response. Range: 263 (Perplexity) to 814 (Gemini).

Headline finding

The Hidden Champions in this study do not disappear from the models — they show up in over 95% of relevant queries. What does differ strongly is how models answer: how deeply they embed a brand, and which sources they pull from.

02 · Brand ranking

Who is visible — and who is less so

Visibility index per brand, averaged across all 4 models. 100 = perfectly cited, high mention density, frequently first in lists.

#	Brand	Type	Visibility index	Citation rate	Density	Best model
1	Knauf	B2B	95	100%	13.6	ChatGPT (GPT-5)
2	Stihl	Consumer	92	100%	11.8	ChatGPT (GPT-5)
3	Miele	Consumer	91	100%	11.7	Gemini 2.5 Pro
4	Festo	B2B	90	100%	12.9	ChatGPT (GPT-5)
5	Kärcher	Mixed	88	100%	8.8	Gemini 2.5 Pro
6	Würth	B2B	87	95%	9.9	Gemini 2.5 Pro
7	Sennheiser	Consumer	87	98%	11.3	ChatGPT (GPT-5)
8	Trumpf	B2B	86	98%	11.1	ChatGPT (GPT-5)
9	Liebherr	B2B	86	95%	10.6	Gemini 2.5 Pro
10	Hilti	B2B	82	90%	11.3	ChatGPT (GPT-5)

03 · Model divergence

The four models don't think alike

Same brands, same queries — four very different answer styles.

ChatGPT (GPT-5)

Visibility index 96

Citation rate: 98.0% · Mention density: 17.6 · Avg. words: 577

High-volume profiler. GPT-5 writes long-form (577 words on avg.), repeats brand names often (density 17.6) and tends to produce structured top-N lists.

Gemini 2.5 Pro

Visibility index 95

Citation rate: 99.0% · Mention density: 16.2 · Avg. words: 814

Longest output. Gemini 2.5 Pro produces the most verbose answers (814 words, density 16.2). Reasoning tokens must be budgeted generously.

Claude Sonnet 4.5

Visibility index 82

Citation rate: 100.0% · Mention density: 6.5 · Avg. words: 359

Structured and concise. Claude Sonnet 4.5 answers more compactly (359 words, density 6.5) — high hit-rate, less wordy noise.

Perplexity Sonar Pro

Visibility index 80

Citation rate: 93.0% · Mention density: 4.9 · Avg. words: 263

Source-anchored. Perplexity Sonar Pro stays brief (263 words, density 4.9) and often produces inline citation markers — closest to a classic search experience.

Take-away

The models do not differ primarily in whether they know a brand, but in how they embed it. GPT-5 and Gemini generate verbose, dense profiles (>500 words, high mention density). Claude and Perplexity answer more concisely with a clear focus on sources. If you want to "rank" a brand, you should not just measure "am I mentioned", but "in what style, with what depth, with which sources".
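The three per-response measures used above (cited or not, mention density, word count) can be sketched in a few lines. This is a minimal illustration, not the study's own code: the function names `countMentions`, `wordCount` and `profileResponse` are hypothetical, and the matching rule (case-insensitive, whole-word) is an assumption, since the exact rule is not documented on this page.

```javascript
// Hypothetical helpers: a sketch of per-response metrics, not the repo's code.

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Count case-insensitive, whole-word occurrences of any of the given names.
// Assumption: a "mention" is any such occurrence of the brand or an alias.
function countMentions(text, names) {
  const pattern = names.map(escapeRegExp).join("|");
  // Unicode-aware boundaries so names like "Kärcher" or "Würth" match cleanly.
  const re = new RegExp(
    `(?<![\\p{L}\\p{N}])(?:${pattern})(?![\\p{L}\\p{N}])`,
    "giu"
  );
  return (text.match(re) || []).length;
}

// Count whitespace-separated words.
function wordCount(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Combine the measures for one model response.
function profileResponse(text, names) {
  const mentions = countMentions(text, names);
  return { cited: mentions > 0, mentions, words: wordCount(text) };
}

const sample =
  "Kärcher builds cleaning equipment. Reviewers often rate kärcher highly.";
console.log(profileResponse(sample, ["Kärcher"]));
```

One design consequence of the whole-word rule: inflected forms such as the German genitive "Kärchers" are not counted unless they are passed in as an alias.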

04 · B2B vs. consumer

The expected bias does not exist in this sample

Hypothesis going in: B2B Hidden Champions disappear more often than consumer brands. The data says otherwise. Kärcher, classed as Mixed, is counted in neither group.

Consumer brands (n=12: 3 brands × 4 models)

89.6 visibility

Citation rate: 99.2%

Mention density: 11.6

Examples: Sennheiser, Stihl, Miele

B2B brands (n=24: 6 brands × 4 models)

87.5 visibility

Citation rate: 96.3%

Mention density: 11.6

Examples: Trumpf, Hilti, Würth, Festo, Knauf, Liebherr

Difference

2.1 points

Visibility gap: +2.1 (consumer ahead)

Citation gap: 2.9 pts

Assessment: Within noise. No systematic bias against B2B in this sample.

What this means: The common assumption "B2B has a visibility problem in LLMs" does not hold for established Hidden Champions with strong Wikipedia/press footprint. The brands probed here are global market leaders with decades-long PR trails — and that is exactly the trail the models read.

The flip side: brands without Wikipedia entries, without German business-press coverage, without strong industry-media backlinks — those could in fact disappear from the answers. This study cannot test that, because it deliberately targets established brands. The next iteration should explicitly include "mid-tier" Mittelstand companies.

05 · Disappearance points

Where a brand drops out — and why

Brand × model combinations with a citation rate below 50%. Such drop-outs are the exception rather than the rule.

This sample contains no real drop-outs: no brand falls below a 50% citation rate on any model. That is itself a finding: established Hidden Champions are reliably visible across all four models tested.

The interesting pattern is not "disappearance" but list position: a brand that lands at position 6 or 7 in a "top providers" list is essentially invisible to the end user. This study captures position rank for list queries — see the CSV for details.
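How a position rank might be read out of a list-style answer can be sketched as follows. This is an illustration under stated assumptions, not the study's parser: the function name `listPosition` is hypothetical, and it assumes list items start a line with "1." or "1)" followed by the item text.

```javascript
// Hypothetical helper: a sketch of list-rank extraction, not the repo's code.
// Returns the number of the first numbered list item that mentions the brand,
// or null if no numbered item mentions it.
// Assumption: items look like "1. Name ..." or "2) Name ..." at line start.
function listPosition(text, brand) {
  const needle = brand.toLowerCase();
  for (const line of text.split(/\r?\n/)) {
    const m = line.match(/^\s*(\d+)[.)]\s+(.*)$/);
    if (m && m[2].toLowerCase().includes(needle)) {
      return Number(m[1]);
    }
  }
  return null;
}

// Illustrative response text with placeholder vendors.
const answer = [
  "Top providers in the category:",
  "1. Vendor A: broad portfolio",
  "2. Vendor B: strong in services",
  "3. **Hilti**: fastening and demolition systems",
].join("\n");

console.log(listPosition(answer, "Hilti"));
console.log(listPosition(answer, "Stihl"));
```

Answers that use bullet points or running prose instead of numbers would need a different rule, for example the order of first mention.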

06 · Source clusters

Where models draw their answers from

Distribution of cited source clusters across all 400 probes.

Company website

179

Other

152

Social (LinkedIn / X)

Wikipedia

Government / research

German business press

Industry / trade press

International business media

Top sources across all brands

kununu.com · 22× cited · 10 brands

liebherr.com · 21× cited · 2 brands

festo.com · 17× cited · 1 brand

sennheiser.com · 17× cited · 1 brand

trumpf.com · 17× cited · 1 brand

glassdoor.de · 16× cited · 9 brands

hilti.group · 16× cited · 2 brands

stihl.de · 14× cited · 1 brand

wuerth.com · 13× cited · 2 brands

knauf.de · 13× cited · 1 brand

Observation

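Across all models, the brands' own domains dominate as primary sources: the company-website cluster leads with a count of 179, ahead of "Other" with 152. The employer-review sites kununu.com and glassdoor.de are the most widely spread individual domains, cited for 10 and 9 of the 10 brands respectively. Wikipedia and German business and industry press are visibly present, while international outlets like Reuters/Bloomberg appear noticeably less often than they would for a US-only probe.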
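Assigning a cited URL to one of the clusters above could look roughly like this. The cluster names are taken from this page; everything else is an assumption for illustration: the function names `clusterFor` and `tallyClusters` are hypothetical, the domain rules are deliberately incomplete (press and government clusters are left out), and any unmatched domain simply falls into "Other", which may not match how the study classified it.

```javascript
// Hypothetical helpers: a sketch of source clustering, not the repo's code.

// Map a cited URL to a source cluster.
// Assumption: simple hostname rules; unmatched hosts fall back to "Other".
function clusterFor(url, companyDomains) {
  const host = new URL(url).hostname.replace(/^www\./, "");
  // True if the host is the domain itself or one of its subdomains.
  const matches = (domain) => host === domain || host.endsWith("." + domain);

  if (companyDomains.some(matches)) return "Company website";
  if (matches("wikipedia.org")) return "Wikipedia";
  if (["linkedin.com", "x.com", "twitter.com"].some(matches)) {
    return "Social (LinkedIn / X)";
  }
  return "Other";
}

// Count how often each cluster occurs in a list of cited URLs.
function tallyClusters(urls, companyDomains) {
  const counts = {};
  for (const url of urls) {
    const cluster = clusterFor(url, companyDomains);
    counts[cluster] = (counts[cluster] || 0) + 1;
  }
  return counts;
}

// Illustrative input only; these URLs are not taken from the study data.
const cited = [
  "https://www.festo.com/de/de/",
  "https://de.wikipedia.org/wiki/Festo",
  "https://www.kununu.com/de/festo",
];
console.log(tallyClusters(cited, ["festo.com"]));
```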

07 · Query types

Which questions surface a brand best

Citation rate per query type. Helps you understand which user searches your brand actually competes in.

#	Query type	Description	Citation rate
1	brand_direct	Direct brand profile	100%
2	brand_leadership	Brand leadership / market position	100%
4	comparison	Brand-vs-brand comparison	100%
5	innovation	Innovation track record	100%
6	reputation	Reputation / customer perception	100%
8	recommendation	Personal recommendation	100%
10	future_outlook	5-year outlook	100%
9	news_recency	Recent news / changes	98%
3	category_leader	Top providers in category	95%
7	hidden_champion	Hidden Champions of German Mittelstand	83%

08 · Reproduce

Measure your own brand

The full probing setup is open. Clone the repo, add your brand, run it — and compare your brand to the 10 Hidden Champions here.

What you need

Node.js 20+
API keys: Anthropic, OpenAI, Perplexity, Google AI (Gemini)
An .env file with the four keys
~$5 of API credits per 100 probes

Setup in 3 steps

$ git clone https://github.com/craid/geo-citation-study
$ cd geo-citation-study && npm install
$ cp .env.example .env    # then add your four API keys to .env
$ node probe.js --brand=your_brand --smoke    # 1 brand × 4 models × 10 queries
$ node analyze.js    # aggregation + CSV export

Data files for this study

probes.csv — all 400 individual probes with sentiment, mentions, source clusters
aggregated.csv — brand × model rollups
sources.csv — top sources with frequency per cluster
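The step from individual probes to brand × model rollups can be sketched as below. This is a simplified illustration, not the logic of analyze.js: the function name `rollup` and the field names `brand`, `model`, `cited` and `mentions` are assumptions about what a parsed row of probes.csv might contain, and the sample rows are made up.

```javascript
// Hypothetical helper: a sketch of the brand × model aggregation.
// Assumption: each probe row has brand, model, cited (boolean), mentions (number).
function rollup(probes) {
  const groups = new Map();
  for (const p of probes) {
    const key = `${p.brand}|${p.model}`;
    const g = groups.get(key) ||
      { brand: p.brand, model: p.model, n: 0, cited: 0, mentions: 0 };
    g.n += 1;
    g.cited += p.cited ? 1 : 0;
    g.mentions += p.mentions;
    groups.set(key, g);
  }
  // Turn the running totals into rates per brand × model pair.
  return [...groups.values()].map((g) => ({
    brand: g.brand,
    model: g.model,
    probes: g.n,
    citationRate: g.cited / g.n,      // share of probes that cite the brand
    mentionDensity: g.mentions / g.n, // average mentions per response
  }));
}

// Made-up rows for illustration only.
const rows = [
  { brand: "your_brand", model: "model_a", cited: true, mentions: 12 },
  { brand: "your_brand", model: "model_a", cited: false, mentions: 0 },
  { brand: "your_brand", model: "model_b", cited: true, mentions: 4 },
];
console.log(rollup(rows));
```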

Methodology in two minutes

Models: Claude Sonnet 4.5 · ChatGPT (GPT-5) · Perplexity Sonar Pro · Gemini 2.5 Pro
Brands: Trumpf, Sennheiser, Hilti, Würth, Stihl, Miele, Festo, Knauf, Liebherr, Kärcher
Queries: 10 templates (brand-direct, category leader, Hidden Champion, comparison, recommendation, recency, outlook ...)
Probes: 10 brands × 10 queries × 4 models = 400 requested probes; 400 successfully evaluated (0 errors).
Parameters: Temperature 0.3 (reasoning models: default), max_tokens 1,500 (16k for reasoning models), German system prompt, factual research instruction.
Sampling: 1 run per brand × query × model. Limitation: no multi-run for stability — next iteration with n=3.
Visibility index: Composite (0-100): 60% citation rate, 30% mention density (cap 10), 10% inverse list position. Higher is better.
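The composite can be written out as a small function. The weights (60/30/10) and the density cap of 10 come from the methodology above; the rest is an assumption for illustration: the function name `visibilityIndex` is hypothetical, density is scaled linearly up to the cap, and the "inverse list position" is modelled as a linear score from 1.0 (rank 1) down to 0.1 (rank 10), with 0 when the brand is not in a list. The study's actual normalization may differ.

```javascript
// Hypothetical helper: a sketch of the composite score, not the repo's code.
// citationRate: 0..1, mentionDensity: average mentions per response,
// listPosition: 1-based rank in list answers, or null if not listed.
function visibilityIndex({ citationRate, mentionDensity, listPosition, maxPosition = 10 }) {
  // Density is capped at 10 mentions and scaled to 0..1 (assumed linear).
  const densityScore = Math.min(mentionDensity, 10) / 10;

  // Assumed normalization: rank 1 scores 1.0, rank maxPosition scores 1/maxPosition.
  const positionScore = listPosition == null
    ? 0
    : Math.max(0, (maxPosition - listPosition + 1) / maxPosition);

  const score = 0.6 * citationRate + 0.3 * densityScore + 0.1 * positionScore;
  return Math.round(score * 100);
}

// Always cited, density above the cap, first in lists.
console.log(visibilityIndex({ citationRate: 1, mentionDensity: 12, listPosition: 1 }));

// Cited in 90% of probes, 5 mentions on average, never in a list.
console.log(visibilityIndex({ citationRate: 0.9, mentionDensity: 5, listPosition: null }));
```

Because citation rate carries 60% of the weight, a brand that is always cited but rarely repeated and poorly placed still scores at least 60 under this sketch.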

CRAiD

Want a deep probe for your brand?

We build GEO measurement setups for brand teams that want to know how LLMs describe them today — and how that shifts when new sources, new models or new competitors enter the picture.

hello@craid.de craid.de/contact
