Storage Overview
GraphRAG.js uses a three-layer storage architecture to handle different aspects of graph RAG:
- Graph Store - Stores nodes and edges (entities, relationships, communities)
- Vector Store - Stores embeddings for similarity search
- Key-Value Store - Stores metadata, chunks, and other document data
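To make the division of responsibilities concrete, the three layers can be pictured as three small interfaces. The sketch below is illustrative only: the names GraphStore, VectorStore, KVStore, the Sketch* classes, and all of their methods are hypothetical and are not taken from the GraphRAG.js API.

```typescript
// Hypothetical interfaces for illustration; not the GraphRAG.js storage API.
interface GraphStore {
  upsertNode(id: string, properties: Record<string, unknown>): void;
  getNode(id: string): Record<string, unknown> | undefined;
  addEdge(source: string, target: string, relation: string): void;
  neighbors(id: string): string[];
}

interface VectorStore {
  upsert(id: string, embedding: number[]): void;
  get(id: string): number[] | undefined;
}

interface KVStore<T> {
  set(key: string, value: T): void;
  get(key: string): T | undefined;
}

// Map-backed implementations, just enough to show what each layer holds.
class SketchGraphStore implements GraphStore {
  private nodes = new Map<string, Record<string, unknown>>();
  private edges: { source: string; target: string; relation: string }[] = [];

  upsertNode(id: string, properties: Record<string, unknown>): void {
    this.nodes.set(id, properties);
  }

  getNode(id: string): Record<string, unknown> | undefined {
    return this.nodes.get(id);
  }

  addEdge(source: string, target: string, relation: string): void {
    this.edges.push({ source, target, relation });
  }

  // Outgoing neighbors only; edges are directed in this sketch.
  neighbors(id: string): string[] {
    return this.edges.filter((e) => e.source === id).map((e) => e.target);
  }
}

class SketchVectorStore implements VectorStore {
  private vectors = new Map<string, number[]>();

  upsert(id: string, embedding: number[]): void {
    this.vectors.set(id, embedding);
  }

  get(id: string): number[] | undefined {
    return this.vectors.get(id);
  }
}

class SketchKVStore<T> implements KVStore<T> {
  private entries = new Map<string, T>();

  set(key: string, value: T): void {
    this.entries.set(key, value);
  }

  get(key: string): T | undefined {
    return this.entries.get(key);
  }
}
```

In this picture, an extracted entity would typically touch all three layers: a node in the graph store, an embedding in the vector store, and the source chunk text in the key-value store.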
Storage Architecture
typescript
const graph = createGraph({
  model: openai("gpt-4o-mini"),
  embedding: openai.embedding("text-embedding-3-small"),
  storage: {
    graph: neo4jGraph({ url: "bolt://localhost:7687", ... }),
    vector: qdrantVector({ url: "http://localhost:6333", ... }),
    kv: redisKV({ host: "localhost", port: 6379 })
  }
});
Each layer can use a different backend, or you can use a unified storage solution.
Available Storage Packages
Graph Stores
| Package | Database | Best For | Status |
|---|---|---|---|
| @graphrag-js/memory | In-memory (Cytoscape) | Development, testing | ✅ |
| @graphrag-js/neo4j | Neo4j + GDS | Production, community detection | ✅ |
| @graphrag-js/dozerdb | DozerDB (Neo4j-compatible) | Open-source production | ✅ |
| @graphrag-js/falkordb | FalkorDB (Redis-based) | Lightweight production | ✅ |
Vector Stores
| Package | Database | Best For | Status |
|---|---|---|---|
| @graphrag-js/memory | In-memory cosine | Development, testing | ✅ |
| @graphrag-js/qdrant | Qdrant | High-performance vector search | ✅ |
| @graphrag-js/pgvector | PostgreSQL + pgvector | SQL-based workflows | ✅ |
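The "In-memory cosine" entry in the table above presumably refers to brute-force cosine similarity over vectors held in memory. A minimal sketch of that technique follows; the function names cosineSimilarity and topK are illustrative and are not part of @graphrag-js/memory.

```typescript
// Illustrative only: brute-force cosine similarity search over an in-memory map.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat similarity with a zero vector as 0 rather than dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored vector against the query and keep the k best matches.
function topK(
  query: number[],
  vectors: Map<string, number[]>,
  k: number
): { id: string; score: number }[] {
  const scored: { id: string; score: number }[] = [];
  vectors.forEach((vector, id) => {
    scored.push({ id, score: cosineSimilarity(query, vector) });
  });
  return scored.sort((x, y) => y.score - x.score).slice(0, k);
}
```

Every query scans all stored vectors, so cost grows linearly with the collection size. That is fine for development and tests, and it is the main reason a dedicated vector database is suggested for production.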
Key-Value Stores
| Package | Database | Best For | Status |
|---|---|---|---|
| @graphrag-js/memory | In-memory Map | Development, testing | ✅ |
| @graphrag-js/memory | JSON files | Persistence without DB | ✅ |
| @graphrag-js/redis | Redis | Production KV storage | ✅ |
Quick Comparison
Development & Testing
Recommended: Use @graphrag-js/memory for all three layers.
typescript
import { memoryGraph, memoryVector, memoryKV } from '@graphrag-js/memory';
const graph = createGraph({
  // ... model config
  storage: {
    graph: memoryGraph,
    vector: memoryVector,
    kv: memoryKV,
  }
});
Production
Option 1: Specialized Stack
typescript
import { neo4jGraph } from '@graphrag-js/neo4j';
import { qdrantVector } from '@graphrag-js/qdrant';
import { redisKV } from '@graphrag-js/redis';
const graph = createGraph({
  // ... model config
  storage: {
    graph: neo4jGraph({ url: "bolt://localhost:7687", ... }),
    vector: qdrantVector({ url: "http://localhost:6333", ... }),
    kv: redisKV({ host: "localhost", port: 6379 })
  }
});
Option 2: PostgreSQL-Only Stack
typescript
import { pgVector } from '@graphrag-js/pgvector';
import { memoryGraph, memoryKV } from '@graphrag-js/memory';
// Note: Use pgvector for vectors and memory for the graph until a PostgreSQL graph store is implemented
const graph = createGraph({
  // ... model config
  storage: {
    graph: memoryGraph, // or neo4j/falkordb
    vector: pgVector({ host: "localhost", database: "graphrag", ... }),
    kv: memoryKV, // or redis
  }
});
Option 3: Redis-Based Stack
typescript
import { falkorGraph } from '@graphrag-js/falkordb';
import { redisKV } from '@graphrag-js/redis';
import { memoryVector } from '@graphrag-js/memory';
const graph = createGraph({
  // ... model config
  storage: {
    graph: falkorGraph({ host: "localhost", port: 6379 }),
    vector: memoryVector, // or qdrant/pgvector
    kv: redisKV({ host: "localhost", port: 6379 })
  }
});
Storage Requirements by Algorithm
Different algorithms have different storage requirements:
Similarity Graph
- Graph Store: Optional (stores similarity edges)
- Vector Store: Required (primary data structure)
- KV Store: Required (chunk metadata)
LightRAG
- Graph Store: Required (entities + relations)
- Vector Store: Required (dual vectors for entities and relations)
- KV Store: Required (chunk metadata)
Microsoft GraphRAG
- Graph Store: Required (entities, relations, communities)
- Vector Store: Required (entity and chunk vectors)
- KV Store: Required (chunks, community reports)
Fast GraphRAG
- Graph Store: Required (entities + relations for PageRank)
- Vector Store: Required (entity vectors)
- KV Store: Required (chunk metadata)
AWS GraphRAG
- Graph Store: Required (hierarchical fact graph)
- Vector Store: Required (chunk and statement vectors)
- KV Store: Required (chunks, statements, facts)
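The requirements above can be captured as a small lookup table and checked before a graph is built. This is a sketch under stated assumptions: the algorithm keys (such as "similarity-graph"), the STORAGE_REQUIREMENTS constant, and the missingLayers helper are hypothetical names chosen for illustration, and GraphRAG.js may validate storage configuration differently or not at all.

```typescript
type Layer = "graph" | "vector" | "kv";
type Requirement = "required" | "optional";

// Requirements transcribed from the list above.
// The algorithm keys are hypothetical identifiers, not library constants.
const STORAGE_REQUIREMENTS: Record<string, Record<Layer, Requirement>> = {
  "similarity-graph": { graph: "optional", vector: "required", kv: "required" },
  "lightrag": { graph: "required", vector: "required", kv: "required" },
  "microsoft-graphrag": { graph: "required", vector: "required", kv: "required" },
  "fast-graphrag": { graph: "required", vector: "required", kv: "required" },
  "aws-graphrag": { graph: "required", vector: "required", kv: "required" },
};

// Returns the required layers that the given storage config does not provide.
function missingLayers(
  algorithm: string,
  configured: Partial<Record<Layer, unknown>>
): Layer[] {
  const requirements = STORAGE_REQUIREMENTS[algorithm];
  if (!requirements) {
    throw new Error(`Unknown algorithm: ${algorithm}`);
  }
  const layers: Layer[] = ["graph", "vector", "kv"];
  return layers.filter(
    (layer) => requirements[layer] === "required" && configured[layer] === undefined
  );
}
```

With this table, only the similarity graph can run without a graph store; every other algorithm listed needs all three layers.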
Performance Considerations
Latency
| Storage | Vector Search | Graph Traversal | KV Lookup |
|---|---|---|---|
| Memory | < 1ms | < 1ms | < 1ms |
| Neo4j | N/A | 10-50ms | N/A |
| Qdrant | 5-20ms | N/A | N/A |
| pgvector | 10-30ms | N/A | N/A |
| FalkorDB | N/A | 5-15ms | N/A |
| Redis | N/A | N/A | 1-5ms |
Scalability
| Storage | Max Nodes/Vectors | Horizontal Scaling |
|---|---|---|
| Memory | Limited by RAM | No |
| Neo4j | Billions | Yes (Enterprise) |
| Qdrant | Billions | Yes |
| pgvector | Millions | Yes (with sharding) |
| FalkorDB | Millions | Yes (via Redis cluster) |
| Redis | Billions (keys) | Yes (cluster mode) |
Cost
| Storage | Hosting Cost | License |
|---|---|---|
| Memory | Free | MIT |
| Neo4j | Medium-High | Community (GPL) / Enterprise |
| Qdrant | Low-Medium | Apache 2.0 |
| pgvector | Low | PostgreSQL License |
| FalkorDB | Low | SSPL / Commercial |
| Redis | Low-Medium | RSALv2 / Commercial |
Namespace Isolation
All storage backends support namespace isolation for multi-tenancy:
typescript
// Each namespace gets its own isolated storage
const graph1 = createGraph({
  namespace: "tenant-1",
  storage: { ... }
});
const graph2 = createGraph({
  namespace: "tenant-2",
  storage: { ... }
});
How namespaces are implemented:
- Memory: Separate in-memory instances
- Neo4j: Label-based isolation (namespace__tenant-1)
- DozerDB: Label-based isolation (namespace__tenant-1)
- Qdrant: Collection prefixes (namespace_tenant-1_index)
- pgvector: Table prefixes (namespace_tenant_1_index)
- FalkorDB: Graph name prefixes
- Redis: Key prefixes (namespace:tenant-1:key)
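The naming patterns listed above can be reproduced with a few string helpers. This is only one reading of those patterns: it treats "namespace" as a literal prefix and "index" and "key" as placeholders, and the helper names are hypothetical. The actual backends may construct their identifiers differently.

```typescript
// Illustrative helpers that reproduce the naming patterns listed above.
// Function names are hypothetical and not part of any @graphrag-js package.

// Neo4j / DozerDB: label-based isolation.
const neo4jLabel = (ns: string): string => `namespace__${ns}`;

// Qdrant: collection name prefixed with the namespace.
const qdrantCollection = (ns: string, index: string): string =>
  `namespace_${ns}_${index}`;

// pgvector: table name prefixed with the namespace. Hyphens are replaced with
// underscores because unquoted PostgreSQL identifiers cannot contain hyphens,
// which would explain the tenant_1 form in the list (an assumption).
const pgvectorTable = (ns: string, index: string): string =>
  `namespace_${ns.replace(/-/g, "_")}_${index}`;

// Redis: colon-separated key prefix.
const redisKey = (ns: string, key: string): string => `namespace:${ns}:${key}`;
```

Because each tenant's data lives under a distinct label, collection, table, or key prefix, two graphs with different namespaces can share the same database instance without reading each other's records.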
Migration & Persistence
Development to Production
typescript
// Step 1: Export from memory storage
const data = await graph.export('json');
// Step 2: Initialize production storage
const prodGraph = createGraph({
  storage: {
    graph: neo4jGraph({ ... }),
    vector: qdrantVector({ ... }),
    kv: redisKV({ ... })
  }
});
// Step 3: Import data
await prodGraph.import(data);
Backup Strategies
Each storage backend has its own backup approach:
- Memory: Use export() to save to JSON
- Neo4j: Use neo4j-admin dump or GDS backup
- Qdrant: Use collection snapshots
- pgvector: Use pg_dump
- FalkorDB: Use Redis RDB/AOF persistence
- Redis: Use RDB snapshots or AOF logs
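For the memory backend, a JSON backup amounts to serializing the in-memory structures and rebuilding them later. The sketch below shows that round trip for a plain Map; the snapshot and restore functions are hypothetical and do not describe how the library's export() is implemented.

```typescript
// Illustrative snapshot/restore for a Map-backed store.
// Not the library's export() implementation.

// Serialize the map as a JSON array of [key, value] pairs.
function snapshot<T>(store: Map<string, T>): string {
  return JSON.stringify(Array.from(store.entries()));
}

// Rebuild a map from a string produced by snapshot().
function restore<T>(json: string): Map<string, T> {
  return new Map(JSON.parse(json) as [string, T][]);
}
```

The resulting string can be written to disk or object storage. Values must be JSON-serializable for the round trip to be lossless, so typed arrays, Dates, and class instances would need their own encoding.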
Next Steps
- Memory Storage - In-memory and file-based storage
- Neo4j - Graph database with GDS support
- DozerDB - Open-source Neo4j-compatible graph database
- Qdrant - High-performance vector search
- PostgreSQL + pgvector - SQL-based vector storage
- FalkorDB - Redis-based graph database
- Redis - Key-value storage