Microsoft GraphRAG

Status: ✅ Available Now

Microsoft GraphRAG uses hierarchical community detection (Leiden clustering) and LLM-generated community reports for deep thematic analysis and global reasoning. Ported from nano-graphrag.

Overview

Based on Microsoft's research, this algorithm builds a hierarchical understanding of document collections through:

  1. Entity extraction with gleaning - Identify entities and relationships, re-extracting to catch missed ones
  2. Leiden clustering - Detect communities at multiple hierarchical levels
  3. Community reports - LLM-generated summaries of each community
  4. Multi-level retrieval - Query at different levels of abstraction

Key Innovation: Instead of just storing entities, the algorithm creates hierarchical community reports that capture themes and patterns across the entire knowledge graph.

Installation

bash
pnpm add @graphrag-js/microsoft

Quick Start

typescript
import { createGraph } from '@graphrag-js/core';
import { microsoftGraph } from '@graphrag-js/microsoft';
import { openai } from '@ai-sdk/openai';

const graph = createGraph({
  model: openai('gpt-4o-mini'),
  embedding: openai.embedding('text-embedding-3-small'),
  provider: microsoftGraph({
    entityTypes: ['organization', 'person', 'geo', 'event'],
    entityExtractMaxGleaning: 1,
    maxGraphClusterSize: 10,
  }),
});

await graph.insert('Your documents...');

// Local search (entity neighborhoods + community context)
const local = await graph.query('What do we know about Sarah Chen?', { mode: 'local' });

// Global search (map-reduce over community reports)
const global = await graph.query('What are the overarching themes?', { mode: 'global' });

// Naive search (direct vector search on chunks)
const naive = await graph.query('Tell me about LinguaAI', { mode: 'naive' });

Configuration

microsoftGraph(config)

typescript
interface MicrosoftGraphConfig {
  entityTypes?: string[];                   // default: ["organization", "person", "geo", "event"]
  entityExtractMaxGleaning?: number;        // default: 1
  entitySummaryMaxTokens?: number;          // default: 500
  graphClusterAlgorithm?: 'leiden';         // default: "leiden"
  maxGraphClusterSize?: number;             // default: 10
  graphClusterSeed?: number;                // default: 0xDEADBEEF
  communityReportMaxTokens?: number;        // default: 12000
  communityReportLlmKwargs?: Record<string, any>;
  concurrency?: number;                     // default: 8
  forceSubCommunities?: boolean;            // default: false
}

Parameters:

  • entityTypes (default: ["organization", "person", "geo", "event"])

    • Types of entities for the LLM to extract
    • Customize based on your domain
  • entityExtractMaxGleaning (default: 1)

    • Extra extraction passes per chunk to find missed entities
    • Higher = better recall, more LLM calls
    • 0 = no gleaning, 1-3 = recommended range
  • entitySummaryMaxTokens (default: 500)

    • When merged entity descriptions exceed this, LLM summarizes them
  • maxGraphClusterSize (default: 10)

    • Maximum size of a Leiden community before it is subdivided into sub-communities; smaller values yield a finer-grained, deeper hierarchy
  • communityReportMaxTokens (default: 12000)

    • Max context size for community report generation
  • concurrency (default: 8)

    • Max concurrent LLM calls during extraction

Query Parameters

typescript
interface MicrosoftQueryParams {
  mode?: 'local' | 'global' | 'naive';     // default: "global"
  onlyNeedContext?: boolean;                 // default: false
  responseType?: string;                     // default: "Multiple Paragraphs"
  level?: number;                            // default: 2
  topK?: number;                             // default: 20

  // Local search token budgets
  localMaxTokenForTextUnit?: number;         // default: 4000
  localMaxTokenForLocalContext?: number;      // default: 4800
  localMaxTokenForCommunityReport?: number;  // default: 3200
  localCommunitySingleOne?: boolean;         // default: false

  // Global search parameters
  globalMinCommunityRating?: number;         // default: 0
  globalMaxConsiderCommunity?: number;       // default: 512
  globalMaxTokenForCommunityReport?: number; // default: 16384

  // Naive search
  naiveMaxTokenForTextUnit?: number;         // default: 12000
}

Query Modes

Local Search

Entity-centric retrieval: find relevant entities, then expand to their neighborhoods, edges, community reports, and source chunks.

typescript
const result = await graph.query('What is TechCorp?', { mode: 'local' });

How it works:

  1. Embed query → vector search entity index
  2. For each matched entity, gather:
    • Entity description + degree
    • Connected edges and their descriptions
    • Community reports the entity belongs to
    • Source text chunks
  3. Truncate each section to token budgets
  4. Send assembled context to LLM
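
As a rough illustration of steps 3–4, context assembly might look like the sketch below. The types, helper names, and the 4-characters-per-token estimate are assumptions for this sketch, not the package's actual internals; only the default budgets (4800 / 3200 / 4000 tokens) come from the documented parameters.

```typescript
// Illustrative sketch only — shapes and names are assumptions.
interface EntityHit {
  name: string;
  description: string;
  degree: number;
}

// Crude token estimate (~4 characters per token) used only in this sketch.
const approxTokens = (s: string): number => Math.ceil(s.length / 4);

// Keep whole items until the section's token budget is exhausted.
function truncateToBudget(items: string[], maxTokens: number): string[] {
  const kept: string[] = [];
  let used = 0;
  for (const item of items) {
    const cost = approxTokens(item);
    if (used + cost > maxTokens) break;
    kept.push(item);
    used += cost;
  }
  return kept;
}

// Assemble the context sections with the default budgets from above
// (edges: 4800, reports: 3200, chunks: 4000 tokens).
function assembleLocalContext(
  entities: EntityHit[],
  edges: string[],
  reports: string[],
  chunks: string[],
): string {
  const sections: [string, string[]][] = [
    ['Entities', entities.map((e) => `${e.name} (degree ${e.degree}): ${e.description}`)],
    ['Relationships', truncateToBudget(edges, 4800)],
    ['Community Reports', truncateToBudget(reports, 3200)],
    ['Sources', truncateToBudget(chunks, 4000)],
  ];
  return sections
    .map(([title, items]) => `-----${title}-----\n${items.join('\n')}`)
    .join('\n');
}
```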

Best for: Specific questions about entities and their relationships.

Token budget controls:

typescript
await graph.query('question', {
  mode: 'local',
  localMaxTokenForTextUnit: 4000,         // Source chunks
  localMaxTokenForLocalContext: 4800,      // Edges context
  localMaxTokenForCommunityReport: 3200,  // Community reports
});

Global Search

Map-reduce over community reports for broad thematic questions.

typescript
const result = await graph.query('What are the overarching themes?', { mode: 'global' });

How it works:

  1. Retrieve all community reports from the highest level
  2. Map phase: For each batch of reports, extract key points (description + score) via LLM
  3. Reduce phase: Aggregate key points, sort by score, send to LLM for final answer
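
A deterministic sketch of that map-reduce flow is below. The `Report` and `KeyPoint` shapes are assumptions, and `mapBatch` is a stand-in for the real LLM map phase; only the filtering/capping semantics mirror the documented `globalMinCommunityRating` and `globalMaxConsiderCommunity` parameters.

```typescript
// Illustrative shapes — not the package's real types.
interface Report {
  rating: number;
  body: string;
}
interface KeyPoint {
  description: string;
  score: number;
}

// Map phase stand-in: the real pipeline asks an LLM to extract scored
// key points from each batch of reports; here we fake it deterministically.
function mapBatch(batch: Report[]): KeyPoint[] {
  return batch.map((r) => ({ description: r.body.slice(0, 80), score: r.rating }));
}

function globalSearch(
  reports: Report[],
  minRating = 0,
  maxCommunities = 512,
): KeyPoint[] {
  // Filter and cap communities, mirroring globalMinCommunityRating /
  // globalMaxConsiderCommunity.
  const considered = reports
    .filter((r) => r.rating >= minRating)
    .slice(0, maxCommunities);

  // Map over batches of reports...
  const batchSize = 4;
  const points: KeyPoint[] = [];
  for (let i = 0; i < considered.length; i += batchSize) {
    points.push(...mapBatch(considered.slice(i, i + batchSize)));
  }

  // ...then reduce: aggregate and sort by score. The top points would
  // feed the final LLM call that writes the answer.
  return points.sort((a, b) => b.score - a.score);
}
```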

Best for: Broad questions about themes, patterns, and summaries across the entire dataset.

Filtering by quality:

typescript
await graph.query('themes?', {
  mode: 'global',
  globalMinCommunityRating: 3,        // Skip low-quality communities
  globalMaxConsiderCommunity: 256,     // Limit number of communities
});

Naive Search

Direct vector search on text chunks, bypassing the knowledge graph.

typescript
const result = await graph.query('What is TechCorp?', { mode: 'naive' });

How it works:

  1. Embed query → vector search chunk index
  2. Retrieve chunk text from KV store
  3. Truncate to token budget
  4. Send to LLM

Best for: Simple factoid questions, baseline comparison.

Context-Only Mode

Get the assembled context without an LLM answer:

typescript
const result = await graph.query('question', {
  mode: 'local',
  onlyNeedContext: true,
});
console.log(result.context); // Raw context string

How It Works

1. Indexing Pipeline

Documents

Chunk into pieces (stored in KV + vector store)

For each chunk:
  ├─ LLM entity/relationship extraction
  ├─ Gleaning: re-extract to find missed entities
  └─ Parse records: ("entity"<|>NAME<|>TYPE<|>DESCRIPTION)

Merge duplicate entities (most common type wins)

Merge duplicate edges (sum weights, join descriptions)

Summarize long descriptions via LLM

Upsert entities + edges into graph store

Embed entities into vector store

Leiden clustering (hierarchical community detection)

Generate community reports via LLM (deepest levels first)

Store community reports in KV store
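
The two merge steps in the middle of the pipeline can be sketched as follows, using the rules stated above ("most common type wins"; "sum weights, join descriptions"). The record shapes are illustrative, not the package's real types:

```typescript
// Illustrative shapes for extracted records.
interface ExtractedEntity {
  name: string;
  type: string;
  description: string;
}
interface ExtractedEdge {
  weight: number;
  description: string;
}

// Merge duplicate entity records: the most common type wins, and the
// distinct descriptions are joined. (In the real pipeline, a joined
// description longer than entitySummaryMaxTokens is summarized by the LLM.)
function mergeEntities(records: ExtractedEntity[]): ExtractedEntity {
  const counts = new Map<string, number>();
  for (const r of records) counts.set(r.type, (counts.get(r.type) ?? 0) + 1);
  const type = [...counts.entries()].sort((a, b) => b[1] - a[1])[0][0];
  const description = [...new Set(records.map((r) => r.description))].join(' ');
  return { name: records[0].name, type, description };
}

// Merge duplicate edge records: sum weights, join descriptions.
function mergeEdges(records: ExtractedEdge[]): ExtractedEdge {
  return {
    weight: records.reduce((sum, r) => sum + r.weight, 0),
    description: records.map((r) => r.description).join(' '),
  };
}
```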

2. Entity Extraction Detail

Each chunk is processed with an extraction prompt that produces structured records:

("entity"<|>"TECHCORP"<|>"ORGANIZATION"<|>"TechCorp is a technology company.")
##
("entity"<|>"AI DIVISION"<|>"ORGANIZATION"<|>"AI Division builds ML models.")
##
("relationship"<|>"TECHCORP"<|>"AI DIVISION"<|>"TechCorp contains AI Division."<|>8)

Gleaning re-prompts the LLM: "MANY entities were missed. Add them." The LLM decides whether to continue (YES/NO), and new entities are merged with existing ones.
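
Parsing the delimited records shown above might look like this hypothetical sketch; the package's actual parsing details may differ:

```typescript
// Parsed record union — illustrative, not the library's real types.
type ExtractionRecord =
  | { kind: 'entity'; name: string; type: string; description: string }
  | { kind: 'relationship'; source: string; target: string; description: string; weight: number };

function parseRecords(output: string): ExtractionRecord[] {
  const records: ExtractionRecord[] = [];
  // Records are separated by "##"; each is wrapped in parentheses.
  for (const raw of output.split('##')) {
    const m = raw.trim().match(/^\((.*)\)$/s);
    if (!m) continue;
    // Fields are separated by <|>; surrounding quotes are stripped.
    const fields = m[1].split('<|>').map((f) => f.trim().replace(/^"|"$/g, ''));
    if (fields[0] === 'entity' && fields.length >= 4) {
      records.push({
        kind: 'entity',
        name: fields[1],
        type: fields[2],
        description: fields[3],
      });
    } else if (fields[0] === 'relationship' && fields.length >= 5) {
      records.push({
        kind: 'relationship',
        source: fields[1],
        target: fields[2],
        description: fields[3],
        weight: Number(fields[4]),
      });
    }
  }
  return records;
}
```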

3. Community Report Generation

For each detected community, the LLM generates a structured report:

json
{
  "title": "AI Research Community",
  "summary": "A cluster of organizations focused on AI...",
  "rating": 7.5,
  "rating_explanation": "High impact on the technology sector",
  "findings": [
    { "summary": "...", "explanation": "..." }
  ]
}

Reports are generated bottom-up: deeper community levels first, with sub-community reports feeding into parent community context.
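
That bottom-up ordering can be sketched as below. The `Community` shape, the level-numbering convention (larger level = deeper), and the `summarize` callback are assumptions for this sketch:

```typescript
// Illustrative community shape — not the library's real type.
interface Community {
  id: string;
  level: number; // assumed: larger = deeper in the hierarchy
  parent?: string;
}

// Generate reports deepest-first, so each parent community's summarizer
// receives its sub-communities' already-generated reports as context.
function generateReportsBottomUp(
  communities: Community[],
  summarize: (c: Community, subReports: string[]) => string,
): Map<string, string> {
  const reports = new Map<string, string>();
  const byDepth = [...communities].sort((a, b) => b.level - a.level);
  for (const c of byDepth) {
    const subReports = communities
      .filter((child) => child.parent === c.id)
      .map((child) => reports.get(child.id) ?? '');
    reports.set(c.id, summarize(c, subReports));
  }
  return reports;
}
```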

Graph Structure

Nodes (Entities)

Each entity is stored as a graph node with:

typescript
{
  entity_type: 'ORGANIZATION',
  description: 'TechCorp is a technology company specializing in AI.',
  source_id: 'chunk-0<SEP>chunk-1',  // Source chunks
}

Edges (Relationships)

typescript
{
  weight: 8.0,
  description: 'TechCorp contains the AI Division.',
  source_id: 'chunk-0',
  order: 1,
}

Vector Indexes

  • entities - Entity embeddings (name + description)
  • chunks - Text chunk embeddings

KV Stores

  • Text chunks - Full chunk content keyed by chunk ID
  • Community reports - JSON reports keyed by community ID

Usage Examples

Basic Usage

typescript
import { createGraph } from '@graphrag-js/core';
import { microsoftGraph } from '@graphrag-js/microsoft';
import { openai } from '@ai-sdk/openai';

const graph = createGraph({
  model: openai('gpt-4o-mini'),
  embedding: openai.embedding('text-embedding-3-small'),
  provider: microsoftGraph(),
});

await graph.insert('Your documents...');
const result = await graph.query('What are the main themes?', { mode: 'global' });

Custom Entity Types

typescript
const graph = createGraph({
  provider: microsoftGraph({
    entityTypes: ['company', 'product', 'technology', 'person'],
    entityExtractMaxGleaning: 2,  // More thorough extraction
  }),
});

With External Storage

typescript
import { neo4jGraph } from '@graphrag-js/neo4j';
import { qdrantVector } from '@graphrag-js/qdrant';
import { redisKV } from '@graphrag-js/redis';

const graph = createGraph({
  model: openai('gpt-4o-mini'),
  embedding: openai.embedding('text-embedding-3-small'),
  provider: microsoftGraph({
    entityTypes: ['person', 'organization', 'location', 'event'],
    maxGraphClusterSize: 10,
    concurrency: 16,
  }),
  storage: {
    graph: neo4jGraph({
      url: 'bolt://localhost:7687',
      username: 'neo4j',
      password: 'password',
    }),
    vector: qdrantVector({ url: 'http://localhost:6333' }),
    kv: redisKV({ host: 'localhost', port: 6379 }),
  },
});

High Concurrency for Large Datasets

typescript
const graph = createGraph({
  provider: microsoftGraph({
    concurrency: 16,              // More parallel LLM calls
    entityExtractMaxGleaning: 0,  // Skip gleaning for speed
    entitySummaryMaxTokens: 300,  // Shorter summaries
  }),
});

Tuning Global Search Quality

typescript
// Strict: only high-quality community reports
const result = await graph.query('themes?', {
  mode: 'global',
  globalMinCommunityRating: 5,
  globalMaxConsiderCommunity: 128,
});

// Broad: include all communities
const result = await graph.query('themes?', {
  mode: 'global',
  globalMinCommunityRating: 0,
  globalMaxConsiderCommunity: 512,
});

When to Use

Best For

  • Thematic analysis - "What are the overarching themes in this dataset?"
  • Research documents - Academic papers, reports, policy documents
  • Large document collections - Discovering global patterns
  • Exploratory analysis - "What's this dataset about?"
  • Multi-level reasoning - Questions at different levels of abstraction

Not Ideal For

  • Simple factoid queries - Overkill for basic Q&A (use Similarity Graph)
  • Real-time applications - Slow indexing (community reports take time)
  • Budget-constrained projects - Many LLM calls for extraction + reports
  • Frequently updated data - Expensive to rebuild communities

Comparison

| Feature             | Microsoft     | LightRAG      | Fast          | Similarity |
|---------------------|---------------|---------------|---------------|------------|
| Community detection | ✅ Leiden     | ❌ No         | ❌ No         | ❌ No      |
| Community reports   | ✅ Yes        | ❌ No         | ❌ No         | ❌ No      |
| Entity extraction   | ✅ + gleaning | ✅ + gleaning | ✅ + gleaning | ❌ No      |
| Global reasoning    | Excellent     | Good          | Fair          | Basic      |
| Indexing cost       | $$$$$ Highest | $$$ Medium    | $$ Low        | $ Lowest   |
| Indexing speed      | Slowest       | Medium        | Fast          | Fastest    |
| Query quality       | Excellent     | Very Good     | Good          | Basic      |

Storage Requirements

Microsoft GraphRAG requires all three storage types:

  • Graph store - For entities, relationships, and clustering (Neo4j recommended for Leiden)
  • Vector store - For entity and chunk embeddings
  • KV store - For text chunks and community reports

typescript
import { memoryGraph, memoryVector, memoryKV } from '@graphrag-js/memory';

// In-memory (development/prototyping)
const graph = createGraph({
  provider: microsoftGraph(),
  storage: {
    graph: memoryGraph,
    vector: memoryVector,
    kv: memoryKV,
  },
});

Performance Characteristics

Indexing

| Operation           | Notes                            |
|---------------------|----------------------------------|
| Entity extraction   | 1 + gleaning LLM calls per chunk |
| Community detection | Leiden clustering (graph store)  |
| Community reports   | 1 LLM call per community         |
| Embedding           | 1 embedding call per entity      |

Querying

| Mode   | Speed                            | Quality                        |
|--------|----------------------------------|--------------------------------|
| Local  | Fast (vector + graph lookup)     | High for entity queries        |
| Global | Medium (map-reduce over reports) | Excellent for thematic queries |
| Naive  | Fast (vector search only)        | Baseline                       |

Troubleshooting

No Entities Extracted

Problem: extractEntities returns false

Solutions:

  • Check that your documents contain meaningful text (not too short)
  • Adjust entityTypes to match your domain
  • Increase entityExtractMaxGleaning for more thorough extraction
  • Verify your LLM model can handle the extraction prompt

Poor Global Query Results

Problem: Global search returns generic answers

Solutions:

  • Check community reports exist (verify clustering ran)
  • Increase globalMaxConsiderCommunity
  • Lower globalMinCommunityRating to include more communities
  • Use a stronger model for community report generation

Slow Indexing

Solutions:

  • Increase concurrency (default: 8, try 16-32)
  • Decrease entityExtractMaxGleaning to 0
  • Use a faster/cheaper model for extraction
  • Reduce chunk sizes to process fewer tokens

Reference Implementation

Based on Microsoft's GraphRAG research and the nano-graphrag project, from which this implementation was ported.
Released under the Elastic License 2.0.