Storage Interfaces

GraphRAG.js uses three storage abstractions to persist your knowledge graph data.

Overview

GraphRAG.js separates storage into three concerns:

  1. GraphStore: Stores entities, relationships, and graph structure
  2. VectorStore: Stores embeddings for similarity search
  3. KVStore: Stores key-value metadata and configuration

Each interface can use a different backend, allowing you to mix and match storage solutions.

Storage Configuration

Configure storage when creating a graph:

typescript
import { createGraph } from "@graphrag-js/core";
import { memoryStorage } from "@graphrag-js/memory";
// `openai` and `lightrag` are assumed to be imported from your model and provider packages.

const graph = createGraph({
  model: openai("gpt-4o-mini"),
  embedding: openai.embedding("text-embedding-3-small"),
  provider: lightrag(),
  storage: memoryStorage(),  // All three stores in-memory
});

Or configure each store separately:

typescript
import { neo4jGraph } from "@graphrag-js/neo4j";
import { qdrantVector } from "@graphrag-js/qdrant";
import { redisKV } from "@graphrag-js/redis";

const graph = createGraph({
  model: openai("gpt-4o-mini"),
  embedding: openai.embedding("text-embedding-3-small"),
  provider: lightrag(),
  storage: {
    graph: neo4jGraph({ url: "bolt://localhost:7687", ... }),
    vector: qdrantVector({ url: "http://localhost:6333", ... }),
    kv: redisKV({ host: "localhost", port: 6379 }),
  },
});

GraphStore Interface

Stores graph structure: entities (nodes) and relationships (edges).

Interface Definition

typescript
interface GraphStore {
  upsertNode(node: GNode): Promise<void>;
  upsertEdge(edge: GEdge): Promise<void>;
  getNode(id: string): Promise<GNode | null>;
  getNeighbors(id: string, direction?: "in" | "out" | "both"): Promise<GNode[]>;
  query(cypher: string, params?: any): Promise<any>;
  deleteNode(id: string): Promise<void>;
  deleteEdge(source: string, target: string, type?: string): Promise<void>;
  close(): Promise<void>;
}

Data Types

GNode (Entity):

typescript
interface GNode {
  id: string;
  label: string;
  properties: {
    name: string;
    type: string;
    description: string;
    [key: string]: any;
  };
}

GEdge (Relationship):

typescript
interface GEdge {
  source: string;
  target: string;
  type: string;
  properties: {
    description: string;
    weight?: number;
    [key: string]: any;
  };
}
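
To make the two shapes concrete, here is a small sketch that builds two nodes and one edge between them. The interfaces are copied locally so the snippet compiles on its own; the IDs, labels, and the `WORKS_AT` relationship type are made-up values for illustration, not names the library requires.

```typescript
// Local copies of the GNode and GEdge shapes shown above.
interface GNode {
  id: string;
  label: string;
  properties: { name: string; type: string; description: string; [key: string]: any };
}

interface GEdge {
  source: string;
  target: string;
  type: string;
  properties: { description: string; weight?: number; [key: string]: any };
}

// Two entities. All values are illustrative.
const alice: GNode = {
  id: "person-alice",
  label: "Person",
  properties: { name: "Alice", type: "person", description: "A software engineer" },
};

const acme: GNode = {
  id: "org-acme",
  label: "Organization",
  properties: { name: "Acme", type: "organization", description: "A fictional company" },
};

// An edge references its endpoints by node ID; `weight` is optional.
const worksAt: GEdge = {
  source: alice.id,
  target: acme.id,
  type: "WORKS_AT",
  properties: { description: "Alice is employed by Acme", weight: 1.0 },
};
```

Because of the index signature on `properties`, extra fields (for example a hypothetical `since: 2021`) can be attached without changing the interface.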

Available Implementations

In-Memory (Development)

typescript
import { memoryGraph } from "@graphrag-js/memory";

storage: {
  graph: memoryGraph(),
}

Use for: Development, testing, small datasets

JSON File (Development)

typescript
import { jsonGraph } from "@graphrag-js/memory";

storage: {
  graph: jsonGraph({
    filePath: "./data/graph.json",
  }),
}

Use for: Persistent local storage, small datasets

Neo4j (Production)

typescript
import { neo4jGraph } from "@graphrag-js/neo4j";

storage: {
  graph: neo4jGraph({
    url: "bolt://localhost:7687",
    username: "neo4j",
    password: "password",
    database: "neo4j",  // optional
  }),
}

Use for: Production graph databases, complex queries, Cypher support

FalkorDB (Production)

typescript
import { falkorGraph } from "@graphrag-js/falkordb";

storage: {
  graph: falkorGraph({
    host: "localhost",
    port: 6379,
    graphName: "my-graph",
  }),
}

Use for: Redis-based graph database, high performance


VectorStore Interface

Stores embeddings for fast similarity search.

Interface Definition

typescript
interface VectorStore {
  upsert(vectors: VectorRecord[]): Promise<void>;
  query(vector: number[], topK: number, filter?: any): Promise<VectorQueryResult[]>;
  delete(ids: string[]): Promise<void>;
  close(): Promise<void>;
}

Data Types

VectorRecord:

typescript
interface VectorRecord {
  id: string;
  vector: number[];
  metadata?: Record<string, any>;
}

VectorQueryResult:

typescript
interface VectorQueryResult {
  id: string;
  score: number;
  metadata?: Record<string, any>;
}

Available Implementations

In-Memory (Development)

typescript
import { memoryVector } from "@graphrag-js/memory";

storage: {
  vector: memoryVector(),
}

Features: Cosine similarity, linear search

Use for: Development, testing, small datasets (<10k vectors)
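
As a rough illustration of what "cosine similarity, linear search" means, the sketch below scores every stored vector against the query and keeps the top K. It is not the library's actual implementation; `cosineSimilarity` and `linearSearch` are hypothetical helper names, and the record types are copied locally from the definitions above.

```typescript
interface VectorRecord {
  id: string;
  vector: number[];
  metadata?: Record<string, any>;
}

interface VectorQueryResult {
  id: string;
  score: number;
  metadata?: Record<string, any>;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vector dimensions must match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Linear search: score every record, sort by descending score, keep the first topK.
function linearSearch(
  records: VectorRecord[],
  query: number[],
  topK: number,
): VectorQueryResult[] {
  return records
    .map((r) => ({ id: r.id, score: cosineSimilarity(query, r.vector), metadata: r.metadata }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

const hits = linearSearch(
  [
    { id: "a", vector: [1, 0] },
    { id: "b", vector: [0, 1] },
    { id: "c", vector: [1, 1] },
  ],
  [1, 0],
  2,
);
```

Every query touches every record, so cost grows linearly with the number of vectors. That is why this approach suits small datasets and an indexed store is the better fit for larger ones.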

Qdrant (Production)

typescript
import { qdrantVector } from "@graphrag-js/qdrant";

storage: {
  vector: qdrantVector({
    url: "http://localhost:6333",
    apiKey: "your-api-key",  // optional
    collectionName: "embeddings",
    dimension: 1536,
  }),
}

Features: HNSW index, metadata filtering, high performance

Use for: Production vector search, large datasets (>10k vectors)

PostgreSQL + pgvector (Production)

typescript
import { pgVector } from "@graphrag-js/pgvector";

storage: {
  vector: pgVector({
    connectionString: "postgresql://user:pass@localhost:5432/db",
    tableName: "embeddings",
    dimension: 1536,
  }),
}

Features: PostgreSQL extension, SQL queries, ACID transactions

Use for: Existing PostgreSQL infrastructure, transactional guarantees


KVStore Interface

Stores key-value metadata and configuration.

Interface Definition

typescript
interface KVStore {
  get(key: string): Promise<any>;
  set(key: string, value: any): Promise<void>;
  delete(key: string): Promise<void>;
  keys(pattern?: string): Promise<string[]>;
  close(): Promise<void>;
}
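
The interface is small enough that a `Map` can satisfy it. The sketch below is one possible implementation, not the library's own. It makes two assumptions that the interface does not specify: `get` resolves to `null` for a missing key, and `keys(pattern)` treats `*` as a glob wildcard. `mapKVStore` and `globToRegExp` are hypothetical names.

```typescript
interface KVStore {
  get(key: string): Promise<any>;
  set(key: string, value: any): Promise<void>;
  delete(key: string): Promise<void>;
  keys(pattern?: string): Promise<string[]>;
  close(): Promise<void>;
}

// Assumption: patterns are glob-style, where "*" matches any run of characters.
function globToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters except "*", then turn "*" into ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function mapKVStore(): KVStore {
  const data = new Map<string, any>();
  return {
    async get(key: string): Promise<any> {
      // Assumption: a missing key resolves to null rather than undefined.
      return data.has(key) ? data.get(key) : null;
    },
    async set(key: string, value: any): Promise<void> {
      data.set(key, value);
    },
    async delete(key: string): Promise<void> {
      data.delete(key);
    },
    async keys(pattern?: string): Promise<string[]> {
      const all = Array.from(data.keys());
      if (pattern === undefined) return all;
      const re = globToRegExp(pattern);
      return all.filter((k) => re.test(k));
    },
    async close(): Promise<void> {
      // Nothing to release for an in-memory map; drop the data.
      data.clear();
    },
  };
}
```

A typical use would be `await store.set("user:1", {...})` followed by `await store.keys("user:*")` to list only the keys under that prefix.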

Available Implementations

In-Memory (Development)

typescript
import { memoryKV } from "@graphrag-js/memory";

storage: {
  kv: memoryKV(),
}

Use for: Development, testing, ephemeral data

JSON File (Development)

typescript
import { jsonKV } from "@graphrag-js/memory";

storage: {
  kv: jsonKV({
    filePath: "./data/metadata.json",
  }),
}

Use for: Persistent local storage, configuration files

Redis (Production)

typescript
import { redisKV } from "@graphrag-js/redis";

storage: {
  kv: redisKV({
    host: "localhost",
    port: 6379,
    password: "your-password",  // optional
    db: 0,                      // optional
  }),
}

Features: High performance, TTL support, atomic operations

Use for: Production key-value storage, caching, distributed systems


Storage Patterns

Development Setup

Fast setup with in-memory storage:

typescript
import { memoryStorage } from "@graphrag-js/memory";

const graph = createGraph({
  // ...
  storage: memoryStorage(),  // All three stores in-memory
});

Local Development with Persistence

Use JSON files for persistent local development:

typescript
import { jsonGraph, jsonKV, memoryVector } from "@graphrag-js/memory";

const graph = createGraph({
  // ...
  storage: {
    graph: jsonGraph({ filePath: "./data/graph.json" }),
    vector: memoryVector(),  // Vectors don't persist well in JSON
    kv: jsonKV({ filePath: "./data/metadata.json" }),
  },
});

Production Setup

Use production-grade databases:

typescript
import { neo4jGraph } from "@graphrag-js/neo4j";
import { qdrantVector } from "@graphrag-js/qdrant";
import { redisKV } from "@graphrag-js/redis";

const graph = createGraph({
  // ...
  storage: {
    graph: neo4jGraph({
      url: process.env.NEO4J_URL,
      username: process.env.NEO4J_USER,
      password: process.env.NEO4J_PASSWORD,
    }),
    vector: qdrantVector({
      url: process.env.QDRANT_URL,
      apiKey: process.env.QDRANT_API_KEY,
      collectionName: "prod-embeddings",
    }),
    kv: redisKV({
      host: process.env.REDIS_HOST,
      port: Number.parseInt(process.env.REDIS_PORT ?? "6379", 10),  // falls back to the default Redis port
      password: process.env.REDIS_PASSWORD,
    }),
  },
});
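
Values read from `process.env` are `undefined` when a variable is unset, so it can help to validate them before passing them to the storage factories. The helpers below are a sketch of that idea; `requireEnv` and `parsePort` are hypothetical names and are not part of GraphRAG.js.

```typescript
// The environment is passed in explicitly so the helpers are easy to test.
type Env = Record<string, string | undefined>;

// Return the variable's value, or fail fast with a clear message if it is unset or empty.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Parse a TCP port, using `fallback` when the variable is unset or empty.
function parsePort(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw === "") return fallback;
  const port = Number(raw);
  // Rejects NaN, fractions, and out-of-range values.
  if (Math.floor(port) !== port || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${raw}`);
  }
  return port;
}

// Usage sketch (in Node.js you would pass process.env):
//   url: requireEnv(process.env, "NEO4J_URL"),
//   port: parsePort(process.env.REDIS_PORT, 6379),
```

Failing at startup with the variable's name in the message is usually easier to diagnose than a connection error raised later by the database driver.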

Hybrid Setup

Mix in-memory and external storage:

typescript
import { memoryGraph, memoryKV } from "@graphrag-js/memory";
import { qdrantVector } from "@graphrag-js/qdrant";

const graph = createGraph({
  // ...
  storage: {
    graph: memoryGraph(),     // Small graph, keep in memory
    vector: qdrantVector({    // Large vectors, use Qdrant
      url: process.env.QDRANT_URL,
    }),
    kv: memoryKV(),          // Metadata, keep in memory
  },
});

Implementing Custom Storage

Any object that satisfies the GraphStore, VectorStore, or KVStore interface can serve as a storage backend:

Custom GraphStore Example

typescript
import { GraphStore, GNode, GEdge } from "@graphrag-js/core";

export function customGraphStore(config: any): GraphStore {
  return {
    async upsertNode(node: GNode) {
      // Your implementation
    },

    async upsertEdge(edge: GEdge) {
      // Your implementation
    },

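    async getNode(id: string) {
      // Your implementation: look up the node by ID
      return null;  // placeholder: return the node, or null if it does not exist
    },

    async getNeighbors(id: string, direction = "both") {
      // Your implementation: collect adjacent nodes for the given direction
      return [];  // placeholder: return the neighboring nodes
    },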

    async query(cypher: string, params?: any) {
      // Your implementation (optional)
      throw new Error("Custom query not supported");
    },

    async deleteNode(id: string) {
      // Your implementation
    },

    async deleteEdge(source: string, target: string, type?: string) {
      // Your implementation
    },

    async close() {
      // Cleanup resources
    },
  };
}
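
For a filled-in version of the skeleton above, here is a minimal sketch backed by a `Map` and an array. It assumes that `"out"` means neighbors reached along edges where the given node is the source and `"in"` means the reverse, and that deleting a node also removes its edges. The source does not spell out either behavior. The interfaces are copied locally so the snippet stands alone, and `mapGraphStore` is a hypothetical name.

```typescript
interface GNode {
  id: string;
  label: string;
  properties: { name: string; type: string; description: string; [key: string]: any };
}
interface GEdge {
  source: string;
  target: string;
  type: string;
  properties: { description: string; weight?: number; [key: string]: any };
}
type Direction = "in" | "out" | "both";
interface GraphStore {
  upsertNode(node: GNode): Promise<void>;
  upsertEdge(edge: GEdge): Promise<void>;
  getNode(id: string): Promise<GNode | null>;
  getNeighbors(id: string, direction?: Direction): Promise<GNode[]>;
  query(cypher: string, params?: any): Promise<any>;
  deleteNode(id: string): Promise<void>;
  deleteEdge(source: string, target: string, type?: string): Promise<void>;
  close(): Promise<void>;
}

function mapGraphStore(): GraphStore {
  const nodes = new Map<string, GNode>();
  let edges: GEdge[] = [];

  return {
    async upsertNode(node: GNode): Promise<void> {
      nodes.set(node.id, node); // insert or overwrite by ID
    },
    async upsertEdge(edge: GEdge): Promise<void> {
      // An edge is identified by (source, target, type); replace it if present.
      edges = edges.filter(
        (e) => !(e.source === edge.source && e.target === edge.target && e.type === edge.type),
      );
      edges.push(edge);
    },
    async getNode(id: string): Promise<GNode | null> {
      return nodes.get(id) ?? null;
    },
    async getNeighbors(id: string, direction: Direction = "both"): Promise<GNode[]> {
      const ids = new Set<string>();
      for (const e of edges) {
        if (direction !== "in" && e.source === id) ids.add(e.target); // outgoing
        if (direction !== "out" && e.target === id) ids.add(e.source); // incoming
      }
      const result: GNode[] = [];
      ids.forEach((nid) => {
        const n = nodes.get(nid);
        if (n) result.push(n);
      });
      return result;
    },
    async query(): Promise<any> {
      throw new Error("Cypher queries are not supported by this sketch");
    },
    async deleteNode(id: string): Promise<void> {
      nodes.delete(id);
      // Assumption: deleting a node also removes edges that touch it.
      edges = edges.filter((e) => e.source !== id && e.target !== id);
    },
    async deleteEdge(source: string, target: string, type?: string): Promise<void> {
      // Without a type, remove every edge between the two nodes.
      edges = edges.filter(
        (e) =>
          !(e.source === source && e.target === target && (type === undefined || e.type === type)),
      );
    },
    async close(): Promise<void> {
      // No external resources to release.
    },
  };
}
```

A store like this is only suitable for tests or very small graphs, since `getNeighbors` scans every edge on each call.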

Custom VectorStore Example

typescript
import { VectorStore, VectorRecord, VectorQueryResult } from "@graphrag-js/core";

export function customVectorStore(config: any): VectorStore {
  return {
    async upsert(vectors: VectorRecord[]) {
      // Your implementation
    },

    async query(vector: number[], topK: number, filter?: any) {
      // Your implementation: find the topK most similar vectors
      return [];  // placeholder: return the matching results
    },

    async delete(ids: string[]) {
      // Your implementation
    },

    async close() {
      // Cleanup resources
    },
  };
}

Storage Comparison

GraphStore Options

| Store | Type | Best For | Pros | Cons |
| --- | --- | --- | --- | --- |
| memoryGraph | In-memory | Development | Fast, no setup | Not persistent |
| jsonGraph | File | Local dev | Persistent, simple | Slow for large graphs |
| neo4jGraph | Database | Production | Cypher, mature | Requires Neo4j |
| falkorGraph | Database | Production | Fast, Redis-based | Requires FalkorDB |

VectorStore Options

| Store | Type | Best For | Pros | Cons |
| --- | --- | --- | --- | --- |
| memoryVector | In-memory | Development | Fast, no setup | Not persistent, no index |
| qdrantVector | Database | Production | HNSW, filters, fast | Requires Qdrant |
| pgVector | Database | Production | PostgreSQL, ACID | Requires setup |

KVStore Options

| Store | Type | Best For | Pros | Cons |
| --- | --- | --- | --- | --- |
| memoryKV | In-memory | Development | Fast, no setup | Not persistent |
| jsonKV | File | Local dev | Persistent, simple | Slow for many keys |
| redisKV | Database | Production | Fast, distributed | Requires Redis |

Best Practices

1. Use Appropriate Storage for Each Stage

  • Development: In-memory or JSON files
  • Staging: Mix of in-memory and external databases
  • Production: External databases for all three stores

2. Configure Connection Pooling

For production databases, configure connection limits:

typescript
storage: {
  graph: neo4jGraph({
    url: process.env.NEO4J_URL,
    maxConnectionPoolSize: 50,
  }),
}

3. Handle Connection Errors

Implement retry logic and error handling:

typescript
try {
  await graph.insert(documents);
} catch (error) {
  if (error instanceof StorageError) {
    // Retry or fallback logic
  }
}
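
The "retry" branch above could be filled in with exponential backoff. The sketch below is one way to do it under stated assumptions: the `StorageError` class here is a local stand-in (check where your installed version actually exports it), and `withRetry` is a hypothetical helper, not a library function.

```typescript
// Local stand-in for the StorageError referenced above (assumption).
class StorageError extends Error {}

// Run `operation`, retrying only on StorageError, with delays of
// baseDelayMs, 2x, 4x, ... between attempts. Other errors are rethrown at once.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (!(error instanceof StorageError)) throw error; // not retryable
      lastError = error;
      if (attempt < maxAttempts) {
        const delay = baseDelayMs * Math.pow(2, attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // Every attempt failed with a StorageError; surface the last one.
  throw lastError;
}

// Usage sketch: await withRetry(() => graph.insert(documents));
```

Retrying only on storage errors keeps programming mistakes (for example a `TypeError`) from being masked by repeated attempts.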

4. Always Close Connections

typescript
const graph = createGraph(options);
try {
  await graph.insert(documents);
} finally {
  await graph.close();  // Cleanup all storage connections
}

5. Use Namespaces for Multi-Tenancy

typescript
const graph = createGraph({
  // ...
  namespace: "tenant-123",
  storage: sharedStorage,  // Same storage, isolated data
});
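
How GraphRAG.js isolates namespaces internally is not described here. One common technique, shown as a sketch below, is to prefix every key with the namespace so that tenants sharing a backend never see each other's data. `MinimalKV` and `withNamespace` are hypothetical names, and the interface is cut down to `get` and `set` to keep the example short.

```typescript
// A reduced key-value interface, just enough to demonstrate prefixing.
interface MinimalKV {
  get(key: string): Promise<any>;
  set(key: string, value: any): Promise<void>;
}

// Wrap a store so that every key is transparently prefixed with "<namespace>:".
function withNamespace(inner: MinimalKV, namespace: string): MinimalKV {
  const prefix = `${namespace}:`;
  return {
    get: (key: string) => inner.get(prefix + key),
    set: (key: string, value: any) => inner.set(prefix + key, value),
  };
}

// One shared backing store, viewed through two tenant-specific wrappers.
const shared = new Map<string, any>();
const sharedKV: MinimalKV = {
  async get(key: string) {
    return shared.has(key) ? shared.get(key) : null;
  },
  async set(key: string, value: any) {
    shared.set(key, value);
  },
};

const tenantA = withNamespace(sharedKV, "tenant-123");
const tenantB = withNamespace(sharedKV, "tenant-456");
```

With this setup, `tenantA.set("settings", ...)` writes the key `tenant-123:settings` in the shared map, and `tenantB.get("settings")` looks up `tenant-456:settings`, so the two tenants stay isolated.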

Released under the Elastic License 2.0.