Redis Key-Value Storage

The @graphrag-js/redis package provides high-performance key-value storage using Redis, ideal for metadata, chunks, and document storage.

Installation

bash
pnpm add @graphrag-js/redis

Features

  • High Performance - Sub-millisecond operations
  • Pipelining - Batch operations for efficiency
  • JSON Serialization - Automatic data encoding
  • Namespace Isolation - Multi-tenant support
  • Field Filtering - Partial data retrieval
  • Horizontal Scaling - Redis Cluster support

Prerequisites

Redis Instance

Option 1: Docker (Recommended)

bash
docker run -d \
  --name redis \
  -p 6379:6379 \
  -v redis-data:/data \
  redis:7-alpine

Option 2: Redis with Persistence

bash
docker run -d \
  --name redis \
  -p 6379:6379 \
  -v redis-data:/data \
  redis:7-alpine \
  redis-server --appendonly yes --save 60 1

Option 3: Redis Cloud

  1. Sign up at Redis Cloud
  2. Create a database
  3. Get the connection URL

Verify Installation

bash
redis-cli ping
# PONG

redis-cli set test "hello"
redis-cli get test

Quick Start

typescript
import { createGraph } from '@graphrag-js/core';
import { lightrag } from '@graphrag-js/lightrag';
import { redisKV } from '@graphrag-js/redis';
import { openai } from '@ai-sdk/openai';

const graph = createGraph({
  model: openai('gpt-4o-mini'),
  embedding: openai.embedding('text-embedding-3-small'),
  provider: lightrag(),
  storage: {
    kv: redisKV({
      host: 'localhost',
      port: 6379,
    }),
  }
});

await graph.insert('Your documents...');
const result = await graph.query('Your question?');

Configuration

redisKV(config)

typescript
interface RedisKVConfig {
  host?: string;       // Redis host (default: 'localhost')
  port?: number;       // Redis port (default: 6379)
  password?: string;   // Redis password (optional)
  db?: number;         // Database number (default: 0)

  // ioredis options
  family?: 4 | 6;      // IP version
  connectionName?: string;
  keepAlive?: number;
  noDelay?: boolean;
  retryStrategy?: (times: number) => number | void;
}
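
For example, a capped exponential backoff can be supplied via `retryStrategy`. The sketch below follows ioredis's signature; `backoffRetry` and its limits are illustrative choices, not package defaults:

```typescript
// Capped exponential backoff matching ioredis's retryStrategy signature.
// Returning a number schedules the next reconnect attempt after that many
// milliseconds; returning undefined stops retrying.
function backoffRetry(times: number): number | void {
  const maxAttempts = 10; // assumed cutoff, tune for your deployment
  if (times > maxAttempts) return; // undefined => give up
  // 100ms, 200ms, 400ms, ... capped at 3 seconds
  return Math.min(100 * 2 ** (times - 1), 3000);
}
```

Pass it as `retryStrategy: backoffRetry` in the `redisKV` config.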

Connection Examples

typescript
// Local instance
redisKV({
  host: 'localhost',
  port: 6379,
})

// Password-protected
redisKV({
  host: 'localhost',
  port: 6379,
  password: 'your-password',
})

// Redis Cloud
redisKV({
  host: 'redis-12345.c123.us-east-1.ec2.cloud.redislabs.com',
  port: 12345,
  password: 'your-cloud-password',
})

// Multiple databases (tenant isolation)
redisKV({
  host: 'localhost',
  port: 6379,
  db: 0,  // Database 0 for tenant 1
})

redisKV({
  host: 'localhost',
  port: 6379,
  db: 1,  // Database 1 for tenant 2
})

Usage Examples

Basic KV Operations

typescript
import { redisKV } from '@graphrag-js/redis';

const kvStore = redisKV({
  host: 'localhost',
  port: 6379,
})('my-namespace');

// Store data
await kvStore.upsert({
  'doc-1': {
    text: 'Document content...',
    metadata: { author: 'Alice', year: 2024 },
  },
  'doc-2': {
    text: 'Another document...',
    metadata: { author: 'Bob', year: 2024 },
  },
});

// Retrieve by ID
const doc = await kvStore.getById('doc-1');

// Retrieve multiple
const docs = await kvStore.getByIds(['doc-1', 'doc-2']);

// Retrieve with field filtering
const filtered = await kvStore.getByIds(['doc-1', 'doc-2'], ['text']);
// Only returns { text: '...' } for each document

// Get all keys
const keys = await kvStore.allKeys();

// Filter existing keys
const existing = await kvStore.filterKeys(['doc-1', 'doc-2', 'doc-3']);
// Returns Set(['doc-1', 'doc-2']) if doc-3 doesn't exist

// Delete all data
await kvStore.drop();

Namespace Isolation

typescript
// Different namespaces, same Redis instance
const tenant1KV = redisKV({ host: 'localhost' })('tenant-1');
const tenant2KV = redisKV({ host: 'localhost' })('tenant-2');

// Completely isolated
await tenant1KV.upsert({ 'key-1': 'value-1' });
await tenant2KV.upsert({ 'key-1': 'value-1' });

// Keys are stored as:
// tenant-1:key-1
// tenant-2:key-1
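
The prefixing shown above can be sketched as two small helpers (`toRedisKey` and `fromRedisKey` are illustrative names, not part of the package API):

```typescript
// Illustrative helpers mirroring the `<namespace>:<id>` key layout above.
function toRedisKey(namespace: string, id: string): string {
  return `${namespace}:${id}`;
}

// Returns the bare id for keys in this namespace, or null for foreign keys.
function fromRedisKey(namespace: string, key: string): string | null {
  const prefix = `${namespace}:`;
  return key.startsWith(prefix) ? key.slice(prefix.length) : null;
}
```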

Field Filtering

typescript
interface Document {
  id: string;
  text: string;
  embedding: number[];
  metadata: {
    title: string;
    author: string;
    date: string;
  };
}

// Store full documents
await kvStore.upsert({
  'doc-1': {
    id: 'doc-1',
    text: 'Long document text...',
    embedding: [0.1, 0.2, ...],  // 1536 dimensions
    metadata: { title: 'AI Paper', author: 'Alice', date: '2024-01-01' },
  },
});

// Retrieve only specific fields (saves bandwidth)
const lightweight = await kvStore.getByIds(['doc-1'], ['id', 'metadata']);
// Returns: { id: 'doc-1', metadata: {...} }
// Embedding array not transferred
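
The observable behavior of field filtering amounts to picking the requested top-level fields. A minimal sketch (`pickFields` is an illustrative helper, not the package's internal implementation):

```typescript
// Keep only the requested top-level fields of a stored value; with no field
// list, return the value unchanged (mirrors getByIds' observable behavior).
function pickFields<T extends Record<string, unknown>>(
  value: T,
  fields?: string[],
): Partial<T> {
  if (!fields) return value;
  const out: Partial<T> = {};
  for (const f of fields) {
    if (f in value) out[f as keyof T] = value[f as keyof T];
  }
  return out;
}
```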

Performance Optimization

Pipelining

Redis KV uses pipelining automatically for batch operations:

typescript
// Single round-trip for 1000 keys
await kvStore.upsert({
  'key-1': 'value-1',
  'key-2': 'value-2',
  // ... 1000 items
});

// Single round-trip for checks
const existing = await kvStore.filterKeys(keys);  // e.g. an array of 1000 keys
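
For payloads too large for a single pipeline, splitting the upsert into fixed-size batches keeps each round-trip bounded. A sketch, where `chunkEntries` and the 500-item default are assumptions rather than package behavior:

```typescript
// Split a large upsert payload into fixed-size batches of [key, value] pairs
// so each pipeline round-trip stays bounded.
function chunkEntries<T>(
  data: Record<string, T>,
  chunkSize = 500,
): Array<Array<[string, T]>> {
  const entries = Object.entries(data);
  const chunks: Array<Array<[string, T]>> = [];
  for (let i = 0; i < entries.length; i += chunkSize) {
    chunks.push(entries.slice(i, i + chunkSize));
  }
  return chunks;
}
```

Usage: `for (const chunk of chunkEntries(payload)) await kvStore.upsert(Object.fromEntries(chunk));`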

Connection Behavior

ioredis maintains a single long-lived connection per client and reconnects automatically; the options below tune that behavior:

typescript
const kvStore = redisKV({
  host: 'localhost',
  port: 6379,
  // Connection behavior settings
  lazyConnect: false,
  maxRetriesPerRequest: 3,
  enableReadyCheck: true,
  enableOfflineQueue: true,
})('my-namespace');

Data Persistence

RDB Snapshots

bash
# Enable RDB persistence
docker run -d \
  --name redis \
  -p 6379:6379 \
  -v redis-data:/data \
  redis:7-alpine \
  redis-server --save 900 1 --save 300 10 --save 60 10000

Configuration:

  • save 900 1: Snapshot after 900s if at least 1 key changed
  • save 300 10: Snapshot after 300s if at least 10 keys changed
  • save 60 10000: Snapshot after 60s if at least 10000 keys changed
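
The rules combine with OR semantics: a snapshot fires as soon as any rule's window has elapsed with at least that many changes. A small model of that decision:

```typescript
// A `save <seconds> <changes>` rule set triggers a snapshot when ANY rule's
// window has elapsed with at least that many changed keys.
type SaveRule = { seconds: number; changes: number };

function shouldSnapshot(
  rules: SaveRule[],
  secondsSinceLastSave: number,
  changesSinceLastSave: number,
): boolean {
  return rules.some(
    (r) =>
      secondsSinceLastSave >= r.seconds && changesSinceLastSave >= r.changes,
  );
}
```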

AOF (Append-Only File)

bash
# Enable AOF persistence
docker run -d \
  --name redis \
  -p 6379:6379 \
  -v redis-data:/data \
  redis:7-alpine \
  redis-server --appendonly yes --appendfsync everysec

AOF sync modes:

  • always: Sync after every write (slowest, safest)
  • everysec: Sync every second (default)
  • no: Let OS decide when to sync (fastest, least safe)

Hybrid Persistence

bash
# RDB + AOF (recommended for production)
redis-server \
  --save 900 1 \
  --appendonly yes \
  --appendfsync everysec

Production Deployment

Docker Compose

yaml
version: '3.8'
services:
  redis:
    image: redis:7-alpine
    command: >
      redis-server
      --appendonly yes
      --appendfsync everysec
      --save 900 1
      --save 300 10
      --requirepass your-secure-password
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "redis-cli", "--raw", "incr", "ping"]
      interval: 10s
      timeout: 3s
      retries: 5

volumes:
  redis-data:

Redis Sentinel (High Availability)

yaml
version: '3.8'
services:
  redis-master:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    ports:
      - "6379:6379"
    volumes:
      - redis-master-data:/data

  redis-replica:
    image: redis:7-alpine
    command: redis-server --replicaof redis-master 6379 --appendonly yes
    depends_on:
      - redis-master
    volumes:
      - redis-replica-data:/data

  redis-sentinel:
    image: redis:7-alpine
    command: >
      redis-sentinel /etc/redis/sentinel.conf
    depends_on:
      - redis-master
      - redis-replica
    ports:
      - "26379:26379"
    volumes:
      - ./sentinel.conf:/etc/redis/sentinel.conf

volumes:
  redis-master-data:
  redis-replica-data:

Redis Cluster (Horizontal Scaling)

For very large datasets:

yaml
version: '3.8'
services:
  redis-1:
    image: redis:7-alpine
    command: redis-server --cluster-enabled yes --cluster-config-file nodes.conf
    ports:
      - "7001:6379"

  redis-2:
    image: redis:7-alpine
    command: redis-server --cluster-enabled yes --cluster-config-file nodes.conf
    ports:
      - "7002:6379"

  redis-3:
    image: redis:7-alpine
    command: redis-server --cluster-enabled yes --cluster-config-file nodes.conf
    ports:
      - "7003:6379"

Then create the cluster:

bash
redis-cli --cluster create \
  127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 \
  --cluster-replicas 0

Monitoring

Redis CLI

bash
# Monitor real-time commands
redis-cli monitor

# Get server info
redis-cli info

# Memory stats
redis-cli info memory

# Check key pattern
redis-cli keys "my-namespace:*"

# Count keys
redis-cli dbsize

Memory Usage

bash
# Get memory usage of specific key
redis-cli memory usage my-namespace:doc-1

# Get largest keys
redis-cli --bigkeys

# Memory doctor
redis-cli memory doctor

Performance Testing

bash
# Benchmark
redis-benchmark -t set,get -n 100000 -q

# Custom benchmark
redis-benchmark -t set -n 100000 -d 1024 -q

Troubleshooting

Connection Refused

Error: Error: connect ECONNREFUSED 127.0.0.1:6379

Solution:

  1. Verify Redis is running: docker ps | grep redis
  2. Confirm port 6379 is exposed and not blocked by a firewall
  3. Test the connection: redis-cli ping

Out of Memory

Error: OOM command not allowed when used memory > 'maxmemory'

Solution:

bash
# Set max memory
redis-cli config set maxmemory 2gb
redis-cli config set maxmemory-policy allkeys-lru

# Or in Docker
docker run -d redis:7-alpine \
  redis-server --maxmemory 2gb --maxmemory-policy allkeys-lru

Eviction policies:

  • noeviction: Return an error when the memory limit is reached (default)
  • allkeys-lru: Evict least recently used keys
  • volatile-lru: Evict least recently used keys among those with a TTL set
  • allkeys-random: Evict random keys

Slow Operations

Problem: Operations taking too long

Solution:

  1. Use pipelining for batch operations
  2. Check network latency between client and server
  3. Compress large values client-side before storing
  4. Monitor with redis-cli --latency
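
A client-side timing wrapper (a rough complement to `redis-cli --latency`) can flag slow calls from application code; `timed` and the 50ms threshold are illustrative:

```typescript
// Wrap an async operation and warn when it exceeds a latency budget.
async function timed<T>(
  label: string,
  op: () => Promise<T>,
  slowMs = 50,
): Promise<T> {
  const start = Date.now();
  try {
    return await op();
  } finally {
    const elapsed = Date.now() - start;
    if (elapsed > slowMs) {
      console.warn(`[slow] ${label} took ${elapsed}ms`);
    }
  }
}
```

Usage: `const doc = await timed('getById', () => kvStore.getById('doc-1'));`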

Backup Strategy

bash
# Manual backup (RDB)
redis-cli bgsave

# Copy RDB file
docker cp redis:/data/dump.rdb ./backup/dump-$(date +%Y%m%d).rdb

# Restore
docker cp ./backup/dump-20240101.rdb redis:/data/dump.rdb
docker restart redis

# Export values to JSON (rough dump; values containing newlines won't round-trip)
redis-cli --scan --pattern "my-namespace:*" | \
  xargs redis-cli mget | \
  jq -R -s 'split("\n")[:-1]' > backup.json

Cost Considerations

| Deployment       | Cost           | Best For        |
| ---------------- | -------------- | --------------- |
| Self-hosted      | Free + hosting | Full control    |
| Redis Cloud Free | Free (30MB)    | Testing         |
| Redis Cloud      | $5+/month      | Managed hosting |
| AWS ElastiCache  | $15+/month     | AWS ecosystem   |
| Azure Cache      | $15+/month     | Azure ecosystem |

Benchmarks

On M1 Mac (local Redis):

| Operation  | Latency | Throughput |
| ---------- | ------- | ---------- |
| SET        | 0.5-1ms | 50K ops/s  |
| GET        | 0.5-1ms | 50K ops/s  |
| MSET (100) | 2-5ms   | 20K ops/s  |
| MGET (100) | 2-5ms   | 20K ops/s  |

Best Practices

Key Naming

typescript
// Use consistent namespaces
const kvStore = redisKV({ host: 'localhost' })('app');

// Keys stored as:
// app:doc-1
// app:doc-2

// For multi-tenant:
const tenant1 = redisKV({ host: 'localhost' })('tenant-1');
const tenant2 = redisKV({ host: 'localhost' })('tenant-2');

// tenant-1:doc-1
// tenant-2:doc-1

Connection Management

typescript
// Reuse connections by creating stores from one factory
const kvFactory = redisKV({ host: 'localhost' });
const kv1 = kvFactory('namespace-1');
const kv2 = kvFactory('namespace-2');

// Close when done
await kv1.close();

Error Handling

typescript
try {
  await kvStore.upsert({ 'key-1': 'value-1' });
} catch (error) {
  if (error.message.includes('READONLY')) {
    // Redis is in read-only mode (replica)
    console.error('Cannot write to replica');
  } else if (error.message.includes('OOM')) {
    // Out of memory
    console.error('Redis out of memory');
  }
}
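
The branches above can be factored into a small classifier so callers can decide whether to retry, shed writes, or fail fast (`classifyRedisError` is an illustrative helper, not part of the package):

```typescript
// Map common Redis error messages to coarse categories.
type RedisErrorKind = 'readonly-replica' | 'out-of-memory' | 'unknown';

function classifyRedisError(error: Error): RedisErrorKind {
  if (error.message.includes('READONLY')) return 'readonly-replica';
  if (error.message.includes('OOM')) return 'out-of-memory';
  return 'unknown';
}
```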

Released under the Elastic License 2.0.