
GraphRAG

Abstract

GraphRAG extends retrieval-augmented generation by building a knowledge graph — entities, relationships, and community summaries — on top of your document corpus. This page covers the architecture, indexing pipeline, local vs global query modes, available implementations, and the real cost and complexity tradeoffs before you commit to it.

Warning

GraphRAG is significantly more expensive to index than standard RAG. Every document requires multiple LLM calls for entity extraction, relationship extraction, and community summarization. Read the Limitations section before starting a project with it.


Why Standard RAG Falls Short

Standard RAG retrieves text chunks based on semantic similarity. That works well for direct factual lookups — "What is the refund policy?" or "What does function X do?" — but breaks down for questions that require synthesizing information across many documents or reasoning about relationships.

Questions standard RAG handles poorly:

  • "What are the main themes across all 200 research papers in this corpus?"
  • "How does the acquisition of Company X relate to the restructuring described in the annual reports?"
  • "What are all the risk factors mentioned across these insurance filings, and how do they cluster?"

The core problem is multi-hop reasoning: answering these questions requires connecting information from multiple, non-adjacent chunks that wouldn't rank highly for any single similarity search.

| Question Type | Standard RAG | GraphRAG |
| --- | --- | --- |
| Direct factual lookup | Strong | Strong |
| Single-document summary | Strong | Strong |
| Cross-document theme synthesis | Weak | Strong |
| Entity relationship mapping | Weak | Strong |
| Holistic corpus overview | Fails | Strong |
| Real-time or frequently updated data | Strong | Weak |
| Small corpus (<50 docs) | Adequate | Overkill |

What GraphRAG Adds

Where standard RAG stores text chunks + vectors, GraphRAG builds a richer intermediate representation:

  • Entities — named things extracted from your documents (people, organizations, locations, concepts)
  • Relationships — typed connections between entities ("Company X acquired Company Y", "Dr. Smith authored Study Z")
  • Communities — groups of densely connected entities, detected algorithmically (Leiden algorithm)
  • Community summaries — LLM-generated natural language summaries of what each community represents

This knowledge graph sits between your raw documents and the LLM, enabling queries that traverse relationships rather than just matching text similarity.
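
The four artifacts above can be pictured as a small data model. This is an illustrative sketch, not the actual Microsoft GraphRAG schema — the class and field names are assumptions chosen to mirror the list above:

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str
    type: str              # e.g. "person", "organization", "concept"
    description: str

@dataclass
class Relationship:
    source: str            # entity name
    target: str            # entity name
    description: str       # e.g. "acquired", "authored"

@dataclass
class Community:
    id: int
    entity_names: list[str]
    summary: str = ""      # filled in later by the LLM summarization step

# A tiny two-entity graph with one relationship and one community:
entities = [
    Entity("Company X", "organization", "Parent firm"),
    Entity("Company Y", "organization", "Acquired subsidiary"),
]
relationships = [Relationship("Company X", "Company Y", "acquired")]
communities = [Community(0, ["Company X", "Company Y"])]
```

Note that community summaries start empty: they only exist after the summarization phase of indexing runs.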


Microsoft GraphRAG Architecture

Phase 1 — Indexing Pipeline

The indexing pipeline transforms raw documents into a queryable knowledge graph. This is the expensive part.

flowchart TD
    A([Document Corpus]) --> B[Chunk Documents]
    B --> C[Entity Extraction\nLLM per chunk]
    B --> D[Relationship Extraction\nLLM per chunk]
    C --> E[(Knowledge Graph)]
    D --> E
    E --> F[Community Detection\nLeiden Algorithm]
    F --> G[Community Summarization\nLLM per community]
    G --> H([Query Ready])

    style A fill:#0284c7,color:#fff
    style B fill:#0d9488,color:#fff
    style C fill:#0d9488,color:#fff
    style D fill:#0d9488,color:#fff
    style E fill:#14b8a6,color:#fff
    style F fill:#0284c7,color:#fff
    style G fill:#d97706,color:#fff
    style H fill:#16a34a,color:#fff

Entity Extraction — For each document chunk, the LLM is prompted to extract named entities (people, organizations, locations, concepts) and their descriptions. This is one LLM call per chunk.

Relationship Extraction — A second LLM call per chunk extracts typed relationships between the entities found. Result: a directed graph of (entity_a) -[relationship]-> (entity_b).

Community Detection — The Leiden algorithm runs on the entity-relationship graph to identify clusters of densely connected entities. These become "communities" — coherent topical groupings.
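
To make the grouping step concrete, here is a deliberately minimal stand-in. Microsoft GraphRAG uses the Leiden algorithm (hierarchical and modularity-based); on a toy graph, plain connected components already illustrate the idea of grouping densely linked entities. The entity names and edges are invented for illustration:

```python
from collections import defaultdict

# Toy entity-relationship edges (illustrative, undirected for clustering)
edges = [
    ("Company X", "Company Y"),
    ("Company Y", "Company Z"),
    ("Dr. Smith", "Study A"),
    ("Dr. Smith", "Vaccine Trial"),
]

adj = defaultdict(set)
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

def communities(adj):
    """Group entities into connected components (the coarsest 'communities')."""
    seen, groups = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, group = [node], set()
        while stack:
            n = stack.pop()
            if n in group:
                continue
            group.add(n)
            stack.extend(adj[n] - group)
        seen |= group
        groups.append(sorted(group))
    return sorted(groups)

print(communities(adj))
# Two clusters emerge: the acquisition chain and the research group
```

A real run replaces this with Leiden, which also subdivides large components into tighter, hierarchical clusters rather than just splitting on connectivity.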

Community Summarization — Each community gets a natural language summary generated by the LLM. These summaries are what global queries actually search over.
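
The call pattern across these four steps is what drives the cost. The sketch below stubs out the LLM entirely just to count calls: two per chunk (entity and relationship extraction) plus one per community. Function names and prompts are illustrative:

```python
calls = []

def call_llm(prompt: str) -> str:
    # Stub: a real pipeline sends this prompt to a chat-completion API.
    calls.append(prompt)
    return "(model output)"

def index_corpus(chunks, communities):
    # In the real pipeline, communities are only known after extraction
    # and Leiden clustering; they are passed in here to keep the sketch short.
    for chunk in chunks:
        call_llm(f"Extract named entities and descriptions from:\n{chunk}")
        call_llm(f"Extract typed relationships between entities in:\n{chunk}")
    for community in communities:
        call_llm(f"Summarize what this group of entities represents:\n{community}")

index_corpus(chunks=["chunk 1", "chunk 2", "chunk 3"], communities=["c0"])
print(len(calls))  # 3 chunks × 2 calls + 1 community = 7
```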

Note

For a 1,000-document corpus you might expect 5,000–20,000 LLM calls during indexing, depending on document length and entity density. At GPT-4o pricing, this can cost $50–500+ per full index build. Plan accordingly and test on a sample first.
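
A quick back-of-envelope estimate follows directly from the call pattern. All inputs below are assumptions — substitute your own chunking stats and current model pricing:

```python
def estimate_index_calls(n_docs: int, chunks_per_doc: int, n_communities: int) -> int:
    # 2 LLM calls per chunk (entity + relationship extraction),
    # plus 1 summarization call per detected community.
    return n_docs * chunks_per_doc * 2 + n_communities

# Assumed: 1,000 docs, ~5 chunks each, ~300 communities detected
calls = estimate_index_calls(n_docs=1000, chunks_per_doc=5, n_communities=300)
print(calls)  # 10300 — inside the 5,000–20,000 range quoted above
```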

Phase 2 — Query Pipeline

GraphRAG supports two fundamentally different query modes:

flowchart TD
    A([User Query]) --> B{Query Mode}
    B -- Local --> C[Entity Match\nVector Search]
    B -- Global --> D[Community Summary\nAggregation]
    C --> E[Graph Traversal\nNeighborhood Context]
    E --> F[Entity + Relationship\n+ Chunk Context]
    D --> G[Map: Score summaries\nper community]
    G --> H[Reduce: Synthesize\nacross top communities]
    F --> I[LLM Response]
    H --> I

    style A fill:#0284c7,color:#fff
    style B fill:#d97706,color:#fff
    style C fill:#0d9488,color:#fff
    style D fill:#0d9488,color:#fff
    style E fill:#14b8a6,color:#fff
    style F fill:#14b8a6,color:#fff
    style G fill:#0284c7,color:#fff
    style H fill:#0284c7,color:#fff
    style I fill:#16a34a,color:#fff

Local search starts with a vector similarity lookup against entities, then traverses the graph neighborhood — pulling in related entities, relationships, and the source text chunks. This gives targeted, relationship-aware context for specific entity questions.
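
A toy version of local search, with substring matching standing in for the vector similarity lookup and a plain dict standing in for the graph store. Entity names and relationships are invented:

```python
# entity -> list of (relationship, neighbor) pairs; illustrative data
graph = {
    "Company X": [("acquired", "Company Y"), ("headquartered in", "City Z")],
    "Company Y": [("acquired by", "Company X")],
}

def local_search_context(query: str) -> list[str]:
    """Match the query to an entity, then pull its 1-hop neighborhood as context."""
    context = []
    for entity, edges in graph.items():
        if entity.lower() in query.lower():       # stand-in for vector match
            for rel, neighbor in edges:           # graph traversal
                context.append(f"{entity} -[{rel}]-> {neighbor}")
    return context

print(local_search_context("What did Company X announce in Q3?"))
```

The real implementation also pulls in the source text chunks behind each matched entity, so the LLM sees both graph structure and grounding text.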

Global search ignores local graph traversal entirely. Instead, it runs a map-reduce across community summaries: each community is scored for relevance to the query, then the top communities are synthesized into a final answer. This is the mode that enables holistic corpus-level questions.
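
The map-reduce shape of global search can be sketched in a few lines. Scoring here is naive keyword overlap purely for illustration — the real system asks an LLM to rate each community summary's relevance, then synthesizes an answer from the top-scoring ones:

```python
# community id -> LLM-generated summary (illustrative)
summaries = {
    0: "risk factors in property insurance filings include flood and wildfire",
    1: "corporate acquisitions and restructuring events",
    2: "clinical trial results for vaccines",
}

def global_search(query: str, top_k: int = 2) -> list[int]:
    words = set(query.lower().split())
    # Map: score every community summary against the query
    scored = [(len(words & set(s.split())), cid) for cid, s in summaries.items()]
    # Reduce: keep the top-k relevant communities for answer synthesis
    scored.sort(reverse=True)
    return [cid for score, cid in scored[:top_k] if score > 0]

print(global_search("what risk factors appear across insurance filings"))
```

Because the map step touches every community, cost grows with corpus structure rather than query complexity — which is why global queries are the expensive mode.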


Local vs Global Query Modes

| Query Type | Mode | Example |
| --- | --- | --- |
| Specific entity question | Local | "What did Company X announce in Q3?" |
| Relationship question | Local | "How is Dr. Smith connected to the vaccine trial?" |
| Cross-corpus theme synthesis | Global | "What are the main risks discussed across all reports?" |
| Holistic overview | Global | "Summarize the key findings across this research corpus" |
| Entity comparison | Local | "How do the approaches of Organization A and Organization B differ?" |
| Trend analysis across documents | Global | "How has the sentiment around Topic Y changed over the corpus?" |

When in doubt: if your question names a specific entity, use local. If your question is about patterns across the whole corpus, use global. Global queries are more expensive — they scale with the number of communities, not the size of the query.
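
That rule of thumb is simple enough to automate. A rough router under assumed inputs — the entity list would come from your own index, and the global-cue keywords are illustrative guesses, not part of any GraphRAG API:

```python
# Entities known to the index (assumed); corpus-wide cue words (assumed)
KNOWN_ENTITIES = {"company x", "dr. smith", "organization a"}
GLOBAL_CUES = {"themes", "across", "overall", "all", "corpus", "trend"}

def choose_mode(query: str) -> str:
    """Route a query to local or global search per the heuristic above."""
    q = query.lower()
    if any(entity in q for entity in KNOWN_ENTITIES):
        return "local"                    # names a specific entity
    if GLOBAL_CUES & set(q.split()):
        return "global"                   # asks about corpus-wide patterns
    return "local"                        # default to the cheaper mode

print(choose_mode("What did Company X announce in Q3?"))       # local
print(choose_mode("What are the main themes across reports?")) # global
```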


GraphRAG vs Standard RAG Comparison

| Dimension | Standard RAG | GraphRAG |
| --- | --- | --- |
| Indexing cost | Low (embedding only) | High (many LLM calls) |
| Indexing time | Minutes | Hours to days |
| Query cost | Low | Medium (local) to High (global) |
| Query latency | Low (ms–seconds) | Medium (local) to High (global, 10s+) |
| Setup complexity | Low | High |
| Cross-document reasoning | Weak | Strong |
| Direct factual lookup | Strong | Strong |
| Real-time / live data | Works | Does not work well |
| Corpus size minimum | Any | >50–100 documents to be useful |
| Hallucination risk | Low (grounded in chunks) | Medium (entity/relationship extraction can hallucinate) |

Implementations

Microsoft GraphRAG — The reference Python implementation from Microsoft Research. Open source. Supports the full indexing pipeline, both query modes, and incremental indexing (partial). Configuration-heavy but flexible. Best documentation of any implementation. Start here if you're seriously evaluating GraphRAG.

LightRAG — A simpler, faster-indexing alternative that approximates the GraphRAG approach with fewer LLM calls. Less mature than Microsoft GraphRAG, smaller community. Worth considering if indexing cost is the primary constraint, but expect rougher edges.

Neo4j GraphRAG — Uses Neo4j as the graph database backend with LangChain integration. Good fit for teams already running Neo4j or who need to expose the knowledge graph for use beyond RAG (e.g., analytics, visualization). The graph is a first-class queryable artifact, not just a RAG index.


Limitations

Expensive indexing — The cost scales with corpus size and entity density. Every new document that enters the corpus requires re-extraction and may trigger community re-detection. GraphRAG is not suitable for corpora that change frequently.

Static corpus assumption — GraphRAG is designed for batch indexing of a stable document set. If your documents change daily, the graph goes stale and re-indexing is costly. Standard RAG handles incremental updates far better.

Hallucinations in extraction — LLMs extract entities and relationships, and they make mistakes. Entities get merged that shouldn't be, relationships get invented, and proper names get mangled. The quality of the knowledge graph depends directly on the quality of the extraction prompts and the LLM used.

Cold start requirement — A knowledge graph over 10 documents is not useful. You need a substantial corpus — typically 50+ documents, ideally hundreds — for community detection to produce meaningful groupings and for global search to return coherent answers.

Operational complexity — The Microsoft GraphRAG pipeline generates a large number of parquet files and intermediate artifacts. Understanding what failed during a large indexing run requires digging into logs and pipeline outputs. It is not a simple deployment.


Tip

Start with standard RAG. It covers 80–90% of enterprise use cases, costs a fraction of GraphRAG to build and operate, and is far easier to debug. Add GraphRAG only when you have a concrete, validated requirement for multi-hop reasoning or holistic corpus queries that standard RAG demonstrably cannot answer.



Next Steps

  • RAG Fundamentals — core concepts that GraphRAG builds on top of
  • RAG Evaluation — how to measure whether GraphRAG actually improves answer quality for your use case
  • Vector Databases — the underlying storage layer GraphRAG still relies on