Algorithm Overview
GraphRAG.js provides multiple graph RAG algorithms, each with different strategies for building and querying knowledge graphs. All algorithms share the same API but use different approaches under the hood.
The Core Concept
All GraphRAG algorithms follow this pattern:
Documents → Graph Construction → Query Processing → Answer Generation
But they differ in:
- What nodes/edges represent (chunks, entities, facts, statements)
- How the graph is built (extraction, similarity, clustering)
- How queries are answered (vector search, traversal, PageRank, communities)
Available Algorithms
| Algorithm | Status | Best For | Complexity | Cost |
|---|---|---|---|---|
| Similarity Graph | ✅ Available | Quick prototyping, baselines | Low | Low |
| LightRAG | 🚧 Coming Soon | General purpose, balanced | Medium | Medium |
| Microsoft GraphRAG | ✅ Available | Deep thematic analysis | High | High |
| Fast GraphRAG | 🚧 Coming Soon | Speed, cost efficiency | Medium | Low |
| AWS GraphRAG | 🚧 Coming Soon | Multi-hop reasoning | High | Medium-High |
Quick Comparison
Similarity Graph ✅
Available Now
The simplest baseline: chunks as nodes, similarity as edges.
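```typescript
import { similarityGraph } from '@graphrag-js/similarity';
// createGraph is assumed to be imported from the GraphRAG.js core package.

const graph = createGraph({
  provider: similarityGraph({
    similarityThreshold: 0.7, // link chunks whose cosine similarity exceeds 0.7
  }),
});
```
How it works: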
- Chunk documents
- Create edges between similar chunks (cosine > threshold)
- Query: Vector search + BFS expansion
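The three steps above can be sketched in a few functions. This is a minimal illustration, not the library's implementation: the `Chunk` type and the `cosine`, `buildEdges`, and `queryGraph` helpers are hypothetical names, and embeddings are assumed to be precomputed.

```typescript
// Illustrative sketch only: these types and functions are hypothetical,
// not part of the GraphRAG.js API. Chunk vectors are assumed precomputed.
type Chunk = { id: string; vector: number[] };

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dotProduct = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dotProduct / Math.sqrt(normA * normB);
}

// Link every pair of chunks whose similarity exceeds the threshold.
function buildEdges(chunks: Chunk[], threshold: number): Map<string, string[]> {
  const adjacency = new Map<string, string[]>();
  for (const chunk of chunks) adjacency.set(chunk.id, []);
  for (let i = 0; i < chunks.length; i++) {
    for (let j = i + 1; j < chunks.length; j++) {
      if (cosine(chunks[i].vector, chunks[j].vector) > threshold) {
        adjacency.get(chunks[i].id)!.push(chunks[j].id);
        adjacency.get(chunks[j].id)!.push(chunks[i].id);
      }
    }
  }
  return adjacency;
}

// Vector search for the topK seed chunks, then BFS expansion up to maxHops.
function queryGraph(
  chunks: Chunk[],
  adjacency: Map<string, string[]>,
  queryVector: number[],
  topK: number,
  maxHops: number,
): string[] {
  const seeds = chunks
    .slice()
    .sort((a, b) => cosine(b.vector, queryVector) - cosine(a.vector, queryVector))
    .slice(0, topK)
    .map((chunk) => chunk.id);
  const visited = new Set<string>(seeds);
  const result = seeds.slice();
  let frontier = seeds;
  for (let hop = 0; hop < maxHops; hop++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const neighbor of adjacency.get(id) || []) {
        if (!visited.has(neighbor)) {
          visited.add(neighbor);
          next.push(neighbor);
          result.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return result;
}
```

The pairwise loop is quadratic in the number of chunks, which is acceptable for a sketch; a real index would more likely use an approximate nearest-neighbor search to find candidate edges.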
Pros:
- ✅ Simple to understand
- ✅ Fast setup
- ✅ Low cost
- ✅ Good baseline
Cons:
- ❌ No entity extraction
- ❌ Limited relationship understanding
- ❌ No global reasoning
LightRAG 🚧
Coming Soon
Dual-level retrieval with entities and relationships embedded separately.
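```typescript
import { lightrag } from '@graphrag-js/lightrag';
// createGraph is assumed to be imported from the GraphRAG.js core package.

const graph = createGraph({
  provider: lightrag({
    entityTypes: ['person', 'organization', 'location'], // entity types to extract
    maxGleanings: 1, // extra extraction passes per chunk
  }),
});
```
How it works: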
- Extract entities and relationships via LLM
- Create two separate vector indexes (entities + relations)
- Query modes:
  - Local: Search entity vectors
  - Global: Search relationship vectors
  - Hybrid: Combine both
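To make the three query modes concrete, here is a small sketch of dual-index retrieval. It is an assumption-laden illustration rather than the LightRAG provider itself: `IndexedItem`, `QueryMode`, `topK`, and `retrieve` are hypothetical names, and vectors are assumed to be normalized so a dot product can stand in for similarity.

```typescript
// Illustrative sketch only: hypothetical names, not the GraphRAG.js API.
// Vectors are assumed to be unit-normalized, so dot product ranks by similarity.
type IndexedItem = { id: string; vector: number[] };
type QueryMode = 'local' | 'global' | 'hybrid';

function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Brute-force nearest-neighbor search over one index.
function topK(index: IndexedItem[], queryVector: number[], k: number): string[] {
  return index
    .slice()
    .sort((a, b) => dot(b.vector, queryVector) - dot(a.vector, queryVector))
    .slice(0, k)
    .map((item) => item.id);
}

// Local mode searches entities, global mode searches relations, hybrid does both.
function retrieve(
  mode: QueryMode,
  entityIndex: IndexedItem[],
  relationIndex: IndexedItem[],
  queryVector: number[],
  k: number,
): { entities: string[]; relations: string[] } {
  return {
    entities: mode === 'global' ? [] : topK(entityIndex, queryVector, k),
    relations: mode === 'local' ? [] : topK(relationIndex, queryVector, k),
  };
}
```

Keeping the two indexes separate is what lets a single query vector be matched against entity descriptions, relationship descriptions, or both, without re-indexing.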
Pros:
- ✅ Balanced cost/performance
- ✅ Good for general use cases
- ✅ Fast incremental updates
- ✅ Multiple query modes
Cons:
- ❌ No community detection
- ❌ No hierarchical summaries
Status: 🚧 Implementation in progress
Microsoft GraphRAG ✅
Available Now
Hierarchical community detection with summarized reports. Ported from nano-graphrag.
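```typescript
import { microsoftGraph } from '@graphrag-js/microsoft';
// createGraph is assumed to be imported from the GraphRAG.js core package.

const graph = createGraph({
  provider: microsoftGraph({
    entityTypes: ['organization', 'person', 'geo', 'event'], // entity types to extract
    entityExtractMaxGleaning: 1, // extra extraction passes per chunk
    maxGraphClusterSize: 10, // upper bound on community size
  }),
});
```
How it works: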
- Extract entities and relationships via LLM (with gleaning)
- Run Leiden clustering to detect communities
- Generate hierarchical community reports via LLM
- Query modes:
  - Local: Entity neighborhoods + community context
  - Global: Map-reduce over community reports
  - Naive: Pure vector search baseline
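The global mode's map-reduce step can be sketched as follows. This is a simplified illustration under stated assumptions: `CommunityReport`, `KeyPoint`, and `globalQuery` are hypothetical names, and the map and reduce callbacks stand in for LLM calls (they are synchronous here only to keep the sketch runnable; real calls would be asynchronous).

```typescript
// Illustrative sketch only: hypothetical names, not the GraphRAG.js API.
// mapStep and reduceStep stand in for LLM calls and would be async in practice.
type CommunityReport = { communityId: string; summary: string };
type KeyPoint = { text: string; score: number };

function globalQuery(
  reports: CommunityReport[],
  question: string,
  mapStep: (report: CommunityReport, question: string) => KeyPoint[],
  reduceStep: (points: KeyPoint[], question: string) => string,
  maxPoints: number,
): string {
  // Map: ask each community report for scored key points relevant to the question.
  let allPoints: KeyPoint[] = [];
  for (const report of reports) {
    allPoints = allPoints.concat(mapStep(report, question));
  }
  // Keep only relevant points, highest score first, within a fixed budget.
  const ranked = allPoints
    .filter((point) => point.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxPoints);
  // Reduce: synthesize one answer from the surviving points.
  return reduceStep(ranked, question);
}
```

Every report triggers at least one map call, which is where the "many LLM calls" cost of this approach comes from at query time, in addition to the cost of generating the reports during indexing.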
Pros:
- ✅ Best for thematic analysis
- ✅ Global reasoning capabilities
- ✅ Hierarchical understanding
- ✅ Well-researched (Microsoft)
Cons:
- ❌ Expensive (many LLM calls for reports)
- ❌ Slow indexing
- ❌ Complex setup
Status: ✅ Complete, 29 tests passing
Fast GraphRAG 🚧
Coming Soon
PageRank-based retrieval without expensive community detection.
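```typescript
import { fastGraph } from '@graphrag-js/fast';
// createGraph is assumed to be imported from the GraphRAG.js core package.

const graph = createGraph({
  provider: fastGraph({
    pagerank: {
      damping: 0.85, // probability of following an edge instead of teleporting
      maxIterations: 100, // cap on power-iteration steps
    },
  }),
});
```
How it works: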
- Extract entities and relationships via LLM
- No community detection (saves cost!)
- Query: Personalized PageRank from seed entities
- Token-budget truncation for context
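Personalized PageRank differs from ordinary PageRank in that the random walk teleports back to the seed entities instead of to a uniform distribution. The sketch below shows one way to compute it by power iteration, using the same `damping` and `maxIterations` parameters as the configuration above; the function name, the adjacency-list shape, and the `tolerance` parameter are assumptions made for illustration.

```typescript
// Illustrative sketch only: hypothetical function, not the GraphRAG.js API.
// graph maps each node to its outgoing neighbors; every neighbor must also be a key.
function personalizedPageRank(
  graph: Record<string, string[]>,
  seeds: string[],
  damping: number = 0.85,
  maxIterations: number = 100,
  tolerance: number = 1e-8,
): Record<string, number> {
  const nodes = Object.keys(graph);
  const validSeeds = seeds.filter((s) => Object.prototype.hasOwnProperty.call(graph, s));
  if (validSeeds.length === 0) throw new Error('at least one seed must be in the graph');

  // Teleport distribution: uniform over the (assumed unique) seed entities.
  const teleport: Record<string, number> = {};
  for (const node of nodes) teleport[node] = 0;
  for (const seed of validSeeds) teleport[seed] = 1 / validSeeds.length;

  let rank: Record<string, number> = { ...teleport };
  for (let iteration = 0; iteration < maxIterations; iteration++) {
    const next: Record<string, number> = {};
    for (const node of nodes) next[node] = 0;

    // Push each node's rank along its outgoing edges.
    let danglingMass = 0;
    for (const node of nodes) {
      const out = graph[node];
      if (out.length === 0) {
        danglingMass += rank[node]; // no edges: mass is returned to the seeds
        continue;
      }
      for (const target of out) next[target] += rank[node] / out.length;
    }

    // Mix in the teleport step and measure how much the ranks changed.
    let delta = 0;
    for (const node of nodes) {
      const value =
        (1 - damping) * teleport[node] +
        damping * (next[node] + danglingMass * teleport[node]);
      delta += Math.abs(value - rank[node]);
      next[node] = value;
    }
    rank = next;
    if (delta < tolerance) break;
  }
  return rank;
}
```

Nodes that cannot be reached from any seed end up with a score of zero, which is the mechanism behind the "may miss disconnected clusters" limitation noted for this approach.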
Pros:
- ✅ Fast and cheap
- ✅ No community overhead
- ✅ Good incremental updates
- ✅ PageRank naturally surfaces importance
Cons:
- ❌ No global summaries
- ❌ Relies on good entity extraction
- ❌ May miss disconnected clusters
Status: 🚧 Planned for Phase 6
AWS GraphRAG 🚧
Coming Soon
Fact-centric hierarchical graph: chunks → statements → facts → entities.
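```typescript
import { awsGraph } from '@graphrag-js/aws';
// createGraph is assumed to be imported from the GraphRAG.js core package.

const graph = createGraph({
  provider: awsGraph({
    semantic: {
      beamWidth: 5, // candidate paths kept per search step
      maxPaths: 10, // maximum fact chains returned
    },
  }),
});
```
How it works: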
- Extract statements (propositions) from chunks
- Extract facts (subject/relation/object triples) from statements
- Extract entities from facts
- Build hierarchical graph
- Query modes:
  - Traversal: Top-down (chunk vectors) + bottom-up (entity keywords)
  - Semantic: Beam search through fact chains
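The semantic mode's beam search can be illustrated with a small sketch over subject/relation/object triples. Everything here is an assumption made for illustration: the `Fact` and `FactPath` types, the `beamSearchFacts` function, the `maxDepth` parameter, and the idea that each fact carries a precomputed relevance `score` for the query. Only `beamWidth` and `maxPaths` come from the configuration above.

```typescript
// Illustrative sketch only: hypothetical names, not the GraphRAG.js API.
// Each fact's score is assumed to be a precomputed relevance to the query.
type Fact = { subject: string; relation: string; object: string; score: number };
type FactPath = { facts: Fact[]; score: number };

function beamSearchFacts(
  facts: Fact[],
  startEntity: string,
  maxDepth: number,
  beamWidth: number,
  maxPaths: number,
): FactPath[] {
  let beam: FactPath[] = [{ facts: [], score: 0 }];
  const found: FactPath[] = [];

  for (let depth = 0; depth < maxDepth; depth++) {
    const candidates: FactPath[] = [];
    for (const path of beam) {
      // A path continues from the object of its last fact.
      const tail =
        path.facts.length === 0 ? startEntity : path.facts[path.facts.length - 1].object;
      for (const fact of facts) {
        if (fact.subject !== tail) continue;
        if (path.facts.indexOf(fact) !== -1) continue; // never reuse a fact in one chain
        candidates.push({
          facts: path.facts.concat(fact),
          score: path.score + fact.score,
        });
      }
    }
    if (candidates.length === 0) break;
    // Prune: keep only the beamWidth best chains for the next step.
    candidates.sort((a, b) => b.score - a.score);
    beam = candidates.slice(0, beamWidth);
    found.push(...beam);
  }

  return found.sort((a, b) => b.score - a.score).slice(0, maxPaths);
}
```

Summing scores favors longer chains when all scores are positive; a length-normalized score would be a reasonable alternative if short, direct chains should rank higher.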
Pros:
- ✅ Best for multi-hop reasoning
- ✅ Explicit fact representation
- ✅ Cross-document connections
- ✅ Statement-level granularity
Cons:
- ❌ Complex extraction pipeline
- ❌ Many LLM calls
- ❌ Higher latency
Status: 🚧 Planned for Phase 7
Choosing an Algorithm
By Use Case
Prototyping / Baseline → Use Similarity Graph ✅
General Purpose RAG → Use LightRAG 🚧 (when available)
Thematic Analysis / Research → Use Microsoft GraphRAG ✅
Fast / Cost-Effective → Use Fast GraphRAG 🚧 (when available)
Multi-Hop Reasoning → Use AWS GraphRAG 🚧 (when available)
By Dataset Size
< 10K documents → Any algorithm works
10K - 100K documents → Similarity Graph ✅ or Fast GraphRAG 🚧
100K - 1M documents → Fast GraphRAG 🚧 or LightRAG 🚧
> 1M documents → Fast GraphRAG 🚧 with distributed storage
By Query Type
Factoid questions ("What is X?") → Similarity Graph ✅ or LightRAG 🚧
Relationship queries ("How are X and Y related?") → LightRAG 🚧 or Fast GraphRAG 🚧
Thematic questions ("What are the main themes?") → Microsoft GraphRAG ✅
Multi-hop questions ("If X, then Y, then what?") → AWS GraphRAG 🚧
By Budget
Low cost → Similarity Graph ✅ (no LLM extraction) or Fast GraphRAG 🚧
Medium cost → LightRAG 🚧 or AWS GraphRAG 🚧
High cost → Microsoft GraphRAG ✅ (many LLM calls for community reports)
Implementation Roadmap
| Phase | Algorithm | Status | ETA |
|---|---|---|---|
| 3 | Similarity Graph | ✅ Complete | Available Now |
| 4 | Microsoft GraphRAG | ✅ Complete | Available Now |
| 5 | LightRAG (default) | ⬜ Planned | TBD |
| 6 | Fast GraphRAG | ⬜ Planned | TBD |
| 7 | AWS GraphRAG | ⬜ Planned | TBD |
See ROADMAP.md for detailed implementation status.
Algorithm Details
Graph Structure Comparison
| Algorithm | Nodes | Edges | Indexes |
|---|---|---|---|
| Similarity | Chunks | Similarity | Chunk vectors |
| LightRAG | Entities, Chunks | Relations, Contains | Entity vectors, Relation vectors, Chunk vectors |
| Microsoft | Entities, Communities | Relations, MemberOf | Entity vectors, Chunk vectors |
| Fast | Entities | Relations | Entity vectors (HNSW) |
| AWS | Chunks, Statements, Facts, Entities | Contains, Extracts, References | Chunk vectors, Statement vectors |
Query Processing Comparison
| Algorithm | Query Processing |
|---|---|
| Similarity | Embed query → Vector search → BFS expansion |
| LightRAG | Embed query → Dual vector search (entities + relations) → Expand → LLM |
| Microsoft | Embed query → Extract entities → Search + Community reports → LLM |
| Fast | Embed query → Entity search → Personalized PageRank → LLM |
| AWS | Embed query → Traversal (top-down + bottom-up) or Beam search → LLM |
Performance Characteristics
Indexing Speed
| Algorithm | Speed | Cost |
|---|---|---|
| Similarity | ⚡⚡⚡ Fastest | $ Cheapest |
| Fast | ⚡⚡ Fast | $$ Low |
| LightRAG | ⚡ Medium | $$$ Medium |
| AWS | 🐌 Slow | $$$$ High |
| Microsoft | 🐌🐌 Slowest | $$$$$ Highest |
Query Speed
| Algorithm | Speed | Quality |
|---|---|---|
| Similarity | ⚡⚡⚡ Fastest | ⭐⭐ Basic |
| Fast | ⚡⚡ Fast | ⭐⭐⭐ Good |
| LightRAG | ⚡ Medium | ⭐⭐⭐⭐ Very Good |
| AWS | 🐌 Slow | ⭐⭐⭐⭐ Very Good |
| Microsoft | 🐌 Slow | ⭐⭐⭐⭐⭐ Excellent |
Next Steps
- Similarity Graph - Available now ✅
- LightRAG - Coming soon 🚧
- Microsoft GraphRAG - Available now ✅
- Fast GraphRAG - Coming soon 🚧
- AWS GraphRAG - Coming soon 🚧