How to Structure Your Pages to Get Cited by LLMs
Learn how to structure your web pages to maximize your chances of being cited by ChatGPT, Perplexity, and Gemini: format, information density, and semantic markup.
LLMs don’t read pages like humans do: they look for dense answers, precise definitions, and verifiable information. Sandro, cofounder of Gemeos Agency, shares the structuring rules that can materially increase your chances of being cited in AI answers.
Understanding what LLMs look for on a page
When an LLM or AI search engine crawls a page, it tries to extract clear, self-contained units of information. An “information unit” is a block of text that answers a specific question, contains verifiable facts, and can be understood without the context of the whole page.
Long, flowing, narrative pages are hard to extract information from. Pages structured into independent sections, with explicit headings and direct answers, are much easier to cite.
1. Use the “Question > Direct Answer > Explanation” structure
For each section of your page, apply this framework:
- The H2 or H3 heading asks the question (or states the main claim)
- The first paragraph answers it directly in 1 to 2 sentences
- The following paragraphs expand with examples, data, and context
This format closely mirrors the question-and-answer units that LLMs extract to build their answers.
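To audit existing pages against this framework, a small script can flag sections whose first paragraph is not a short, direct answer. The following is a minimal sketch, not a definitive tool: the function names, the `(heading, paragraphs)` input shape, and the two-sentence threshold are assumptions made for illustration, and the sentence counter is a rough heuristic.

```python
import re


def count_sentences(paragraph: str) -> int:
    """Rough count: split on ., ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", paragraph.strip())
    return len([part for part in parts if part])


def sections_missing_direct_answer(sections, max_sentences=2):
    """Return headings whose first paragraph is missing or longer than max_sentences.

    `sections` is a hypothetical list of (heading, [paragraph, ...]) tuples.
    """
    flagged = []
    for heading, paragraphs in sections:
        if not paragraphs:
            flagged.append(heading)
        elif not 1 <= count_sentences(paragraphs[0]) <= max_sentences:
            flagged.append(heading)
    return flagged


sections = [
    ("What is an information unit?",
     ["An information unit is a block of text that answers one question.",
      "It can be understood without the rest of the page."]),
    ("Why does structure matter?",
     ["Pages vary a lot. Some are narrative. Others are structured."]),
]
print(sections_missing_direct_answer(sections))
```

Sections flagged by this check are candidates for moving the direct answer up to the first one or two sentences.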
2. Add explicit definitions
Define the key terms in your field directly in your content. LLMs are trained on millions of definitions: when your page includes a clear one on a topic, it becomes a potential source for every question related to that term.
Ideal definition format:
3. Include numbers and specific facts
LLMs tend to favor content that includes statistics, percentages, and verifiable figures. Replace vague wording with concrete data.
- Vague: “Fast loading improves conversion rate”
- Specific: “A site that loads in under 2 seconds converts, on average, 15% better than a site that loads in 5 seconds”
Cite your sources in the text (“According to a 2023 Google study”). Even approximate figures or sourced industry estimates increase LLM confidence in your content.
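One way to find vague wording in a draft is to list the sentences that contain no figure at all. The sketch below is a crude heuristic under stated assumptions: the function name is hypothetical, "has a figure" is reduced to "contains a digit", and sentence splitting is naive. It only points at sentences worth a second look.

```python
import re


def sentences_without_figures(text: str) -> list[str]:
    """Return the sentences of `text` that contain no digit."""
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if s and not re.search(r"\d", s)]


draft = (
    "Fast loading improves conversion rate. "
    "A site that loads in under 2 seconds converts 15% better."
)
for sentence in sentences_without_figures(draft):
    print("No figure:", sentence)
```

Not every sentence needs a number, so the output is a review list rather than a list of errors.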
4. Use lists and tables
Bulleted lists, numbered lists, and tables are the easiest formats for an LLM to extract. They represent information units that are already structured and ready to be folded into an answer.
Rules to follow:
- Each list item must stand on its own and make sense without the context of the other items
- Tables should have clear column headers
- Prefer short lists (3 to 7 items) over long, exhaustive ones
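The list-length rule above can be checked mechanically on a plain-text draft. This is a minimal sketch, assuming bullets are written as lines starting with "- " and that any other line (including a blank one) ends the current list; both function names are hypothetical.

```python
def bullet_lists(text: str) -> list[list[str]]:
    """Group consecutive lines starting with '- ' into lists of items."""
    lists, current = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("- "):
            current.append(stripped[2:])
        elif current:
            # Any non-bullet line closes the list being collected.
            lists.append(current)
            current = []
    if current:
        lists.append(current)
    return lists


def lists_outside_range(text: str, low: int = 3, high: int = 7) -> list[list[str]]:
    """Return the lists whose item count falls outside low..high."""
    return [items for items in bullet_lists(text) if not low <= len(items) <= high]


draft = "Intro\n- a\n- b\n\nMore\n- c\n- d\n- e\n"
print(lists_outside_range(draft))
```

The 3-to-7 bounds are taken from the rule above and exposed as parameters so they can be adjusted per page type.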
5. Structure your HTML semantically
AI crawlers read the heading hierarchy (H1, H2, H3) the same way SEO bots do. Stick to a strict hierarchy: one H1 per page, H2s for main sections, H3s for subsections.
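The hierarchy rule can be verified with Python's standard `html.parser` module. The sketch below is illustrative only: the `HeadingCollector` class and `heading_issues` function are hypothetical names, and the check covers just two conditions, exactly one H1 and no skipped levels when going deeper.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Record the level of each h1..h6 start tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_issues(html: str) -> list[str]:
    """Return human-readable problems found in the heading hierarchy."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    levels = parser.levels

    issues = []
    h1_count = levels.count(1)
    if h1_count != 1:
        issues.append(f"expected exactly one h1, found {h1_count}")
    for previous, current in zip(levels, levels[1:]):
        # Going deeper by more than one level skips part of the hierarchy.
        if current > previous + 1:
            issues.append(f"h{previous} is followed by h{current}, skipping a level")
    return issues


page = "<h1>Guide</h1><h2>Format</h2><h3>Lists</h3><h2>Markup</h2>"
print(heading_issues(page))  # prints []
```

An empty result means the page has a single H1 and never jumps, for example, from an H1 straight to an H3.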
Conclusion
Structuring your pages for LLMs doesn’t mean rewriting everything. It’s mostly a matter of editorial discipline: direct answer first, precise data, explicit headings, lists instead of prose.
- Use case 1: redesign an agency’s service pages so they get cited for “[specialty] agency” queries in Perplexity
- Use case 2: optimize existing blog articles to increase their chances of being used as a source by ChatGPT
- Use case 3: create an industry glossary page to capture definitional LLM queries