Skip to main content
On this page

Hybrid retrieval combining knowledge graphs (Neo4j) with vector embeddings for more accurate, explainable AI responses.


Overview

Graph RAG extends traditional vector-based RAG by incorporating knowledge graphs. Instead of just finding semantically similar chunks, it traverses relationships between entities to provide more comprehensive and contextually accurate answers.

Traditional RAG:                       Graph RAG:
┌───────────────────┐                  ┌───────────────────┐
│  User Question    │                  │  User Question    │
│        ↓          │                  │        ↓          │
│  Embed Question   │                  │  Embed Question   │
│        ↓          │                  │        ↓          │
│  Vector Search    │                  │  Vector Search ───┼──┐
│        ↓          │                  │        ↓          │  │
│  Top-K Chunks     │                  │  Top-K Chunks     │  │ Graph
│        ↓          │                  │        ↓          │  │ Traversal
│  LLM Response     │                  │  Entity Linking ←─┼──┘
└───────────────────┘                  │        ↓          │
                                       │  Relationship     │
                                       │  Context          │
                                       │        ↓          │
                                       │  LLM Response     │
                                       └───────────────────┘

Why Graph RAG?

Aspect Vector RAG Graph RAG Winner
Semantic similarity Excellent Good Vector
Multi-hop reasoning Poor Excellent Graph
Relationship queries Cannot Native Graph
Explainability Low High Graph
Setup complexity Simple Complex Vector
Maintenance Low Medium Vector

When to Use Graph RAG

Use Case Why
"What features depend on Drizzle?" Multi-hop traversal
"How are users related to this item?" Relationship discovery
"Show me the prerequisite pages" Path finding
"What concepts are connected to auth?" Knowledge exploration

When Vector RAG is Better

Use Case Why
"Find docs about error handling" Semantic similarity
"What does this error mean?" Content retrieval
Vague, exploratory queries Broad semantic match
Large unstructured document sets Scale

Architecture

Combine vector similarity with graph traversal for best results:

┌─────────────────────────────────────────────────────────────────────┐
│                      Hybrid RAG Pipeline                             │
├─────────────────────────────────────────────────────────────────────┤
│  User Question                                                       │
│       ↓                                                              │
│  ┌─────────────┐     ┌─────────────┐                                │
│  │   Mistral   │     │   Groq      │                                │
│  │  Embedding  │     │   NER       │  (Entity extraction)           │
│  └──────┬──────┘     └──────┬──────┘                                │
│         ↓                   ↓                                        │
│  ┌─────────────┐     ┌─────────────┐                                │
│  │   Neon      │     │   Neo4j     │                                │
│  │  pgvector   │     │   Aura      │                                │
│  │ (semantic)  │     │ (relations) │                                │
│  └──────┬──────┘     └──────┬──────┘                                │
│         ↓                   ↓                                        │
│  ┌─────────────────────────────────┐                                │
│  │      Merge & Rank Results       │                                │
│  └─────────────┬───────────────────┘                                │
│                ↓                                                     │
│  ┌─────────────────────────────────┐                                │
│  │  ★ encode(context) → TOON ★    │  (30-60% token savings)        │
│  └─────────────┬───────────────────┘                                │
│                ↓                                                     │
│  ┌─────────────────────────────────┐                                │
│  │     Groq LLM Response           │                                │
│  └─────────────────────────────────┘                                │
└─────────────────────────────────────────────────────────────────────┘

Search Methods

Method Best For Description
Local Search Entity-specific queries Fan out from specific entities to neighbors
Global Search Thematic questions Use community summaries for broad understanding
Hybrid Search Most queries Combine vector + graph for comprehensive results

Implementation

1. Knowledge Graph Construction

First, extend the existing Neo4j model (from graph.md) with document entities:

// src/lib/server/graph/types.ts
// Add to existing types

export interface DocumentNode {
  id: string;
  title: string;
  content: string;
  path: string;
  chunkIndex: number;
  embedding?: number[];  // Store in Neo4j or reference pgvector
}

export interface EntityNode {
  id: string;
  name: string;
  type: 'concept' | 'feature' | 'page' | 'person' | 'component';
  description?: string;
}

// Relationship types for RAG
export type RagRelationType =
  | 'MENTIONS'        // Document mentions entity
  | 'RELATES_TO'      // Entity relates to entity
  | 'PART_OF'         // Entity is part of entity
  | 'DEPENDS_ON'      // Technical dependency
  | 'IMPLEMENTS'      // Feature implements concept
  | 'AUTHORED_BY';    // Document authored by person
// Additional schema for Graph RAG
(:Document {
  id: string,
  title: string,
  path: string,
  chunkIndex: number,
  content: string
})

(:Entity {
  id: string,
  name: string,
  type: string,
  description: string
})

(:Document)-[:MENTIONS]->(:Entity)
(:Entity)-[:RELATES_TO]->(:Entity)
(:Entity)-[:PART_OF]->(:Entity)

2. Entity Extraction with LLM

Use Groq to extract entities and relationships from documents:

// src/lib/server/ai/entity-extraction.ts
import { generateObject } from 'ai';
import { providers } from './providers';
import * as v from 'valibot';

const EntitySchema = v.object({
  entities: v.array(v.object({
    name: v.string(),
    type: v.picklist(['concept', 'feature', 'page', 'person', 'component']),
    description: v.optional(v.string()),
  })),
  relationships: v.array(v.object({
    source: v.string(),
    target: v.string(),
    type: v.picklist(['RELATES_TO', 'PART_OF', 'DEPENDS_ON', 'IMPLEMENTS']),
  })),
});

export async function extractEntities(text: string, title: string) {
  const result = await generateObject({
    model: providers.chat('llama-3.3-70b-versatile'),
    schema: EntitySchema,
    prompt: `Extract entities and relationships from this document.

Document Title: ${title}

Content:
${text}

Extract:
1. Named entities (concepts, features, components, people)
2. Relationships between entities
3. Be specific and avoid generic terms`,
  });

  return result.object;
}

3. Document Ingestion Pipeline

// src/lib/server/graph/ingestion.ts
import { embed } from 'ai';
import { providers } from '$lib/server/ai/providers';
import { extractEntities } from '$lib/server/ai/entity-extraction';
import { getSession } from './index';
import { nanoid } from 'nanoid';

interface Document {
  title: string;
  path: string;
  content: string;
}

export async function ingestDocument(doc: Document) {
  const session = await getSession();

  try {
    // 1. Chunk document
    const chunks = chunkDocument(doc.content);

    // 2. Process each chunk
    for (let i = 0; i < chunks.length; i++) {
      const chunk = chunks[i];
      const chunkId = `doc_${nanoid(8)}`;

      // 3. Generate embedding (Mistral)
      const { embedding } = await embed({
        model: providers.embeddings.embedding('mistral-embed'),
        value: chunk,
      });

      // 4. Extract entities (Groq)
      const { entities, relationships } = await extractEntities(chunk, doc.title);

      // 5. Store document chunk
      await session.run(`
        CREATE (d:Document {
          id: $id,
          title: $title,
          path: $path,
          chunkIndex: $index,
          content: $content
        })
      `, {
        id: chunkId,
        title: doc.title,
        path: doc.path,
        index: i,
        content: chunk
      });

      // 6. Store embedding in pgvector (for semantic search)
      await storeEmbeddingInPostgres(chunkId, embedding);

      // 7. Create/merge entities and relationships
      for (const entity of entities) {
        await session.run(`
          MERGE (e:Entity {name: $name})
          ON CREATE SET e.id = $id, e.type = $type, e.description = $description
          WITH e
          MATCH (d:Document {id: $docId})
          MERGE (d)-[:MENTIONS]->(e)
        `, {
          id: `ent_${nanoid(8)}`,
          name: entity.name,
          type: entity.type,
          description: entity.description ?? '',
          docId: chunkId,
        });
      }

      for (const rel of relationships) {
        await session.run(`
          MATCH (s:Entity {name: $source})
          MATCH (t:Entity {name: $target})
          MERGE (s)-[:${rel.type}]->(t)
        `, {
          source: rel.source,
          target: rel.target,
        });
      }
    }

    console.log(`Ingested: ${doc.title} (${chunks.length} chunks)`);
  } finally {
    await session.close();
  }
}

function chunkDocument(content: string, maxChars = 1000): string[] {
  const chunks: string[] = [];
  const paragraphs = content.split('\n\n');
  let current = '';

  for (const para of paragraphs) {
    if ((current + para).length > maxChars && current) {
      chunks.push(current.trim());
      current = para;
    } else {
      current += '\n\n' + para;
    }
  }

  if (current.trim()) {
    chunks.push(current.trim());
  }

  return chunks;
}

4. Hybrid Retrieval

// src/lib/server/graph/retrieval.ts
import { embed } from 'ai';
import { providers } from '$lib/server/ai/providers';
import { getSession } from './index';
import { db } from '$lib/server/db';
import { sql } from 'drizzle-orm';

interface RetrievalResult {
  documents: Array<{
    id: string;
    title: string;
    content: string;
    score: number;
    source: 'vector' | 'graph';
  }>;
  entities: Array<{
    name: string;
    type: string;
    relatedEntities: string[];
  }>;
}

export async function hybridRetrieve(
  query: string,
  options = { vectorLimit: 5, graphDepth: 2 }
): Promise<RetrievalResult> {
  // 1. Embed query with Mistral
  const { embedding } = await embed({
    model: providers.embeddings.embedding('mistral-embed'),
    value: query,
  });

  // 2. Vector search (pgvector)
  const vectorResults = await db.execute(sql`
    SELECT
      id,
      title,
      content,
      1 - (embedding <=> ${embedding}::vector) as score
    FROM document_embeddings
    ORDER BY embedding <=> ${embedding}::vector
    LIMIT ${options.vectorLimit}
  `);

  // 3. Extract entities from query for graph lookup
  const queryEntities = await extractQueryEntities(query);

  // 4. Graph traversal (Neo4j)
  const session = await getSession();
  let graphResults: any[] = [];
  let relatedEntities: any[] = [];

  try {
    if (queryEntities.length > 0) {
      // Find documents mentioning related entities
      const graphQuery = await session.run(`
        UNWIND $entities AS entityName
        MATCH (e:Entity {name: entityName})
        OPTIONAL MATCH (e)-[:RELATES_TO|DEPENDS_ON|PART_OF*1..${options.graphDepth}]-(related:Entity)
        OPTIONAL MATCH (d:Document)-[:MENTIONS]->(e)
        OPTIONAL MATCH (d2:Document)-[:MENTIONS]->(related)
        RETURN DISTINCT
          d.id AS docId,
          d.title AS title,
          d.content AS content,
          collect(DISTINCT related.name) AS relatedEntities
      `, { entities: queryEntities });

      graphResults = graphQuery.records
        .filter(r => r.get('docId'))
        .map(r => ({
          id: r.get('docId'),
          title: r.get('title'),
          content: r.get('content'),
          relatedEntities: r.get('relatedEntities'),
        }));

      // Get entity context
      const entityQuery = await session.run(`
        UNWIND $entities AS entityName
        MATCH (e:Entity {name: entityName})
        OPTIONAL MATCH (e)-[r]-(related:Entity)
        RETURN e.name AS name, e.type AS type,
               collect(DISTINCT related.name) AS relatedEntities
      `, { entities: queryEntities });

      relatedEntities = entityQuery.records.map(r => ({
        name: r.get('name'),
        type: r.get('type'),
        relatedEntities: r.get('relatedEntities'),
      }));
    }
  } finally {
    await session.close();
  }

  // 5. Merge and deduplicate results
  const seenIds = new Set<string>();
  const documents: RetrievalResult['documents'] = [];

  // Add vector results first (higher confidence for semantic match)
  for (const row of vectorResults.rows) {
    if (!seenIds.has(row.id)) {
      seenIds.add(row.id);
      documents.push({
        id: row.id,
        title: row.title,
        content: row.content,
        score: row.score,
        source: 'vector',
      });
    }
  }

  // Add graph results (relationship-based)
  for (const doc of graphResults) {
    if (!seenIds.has(doc.id)) {
      seenIds.add(doc.id);
      documents.push({
        id: doc.id,
        title: doc.title,
        content: doc.content,
        score: 0.8, // Base score for graph-retrieved docs
        source: 'graph',
      });
    }
  }

  // Sort by score
  documents.sort((a, b) => b.score - a.score);

  return { documents, entities: relatedEntities };
}

async function extractQueryEntities(query: string): Promise<string[]> {
  // Simple NER - for production, use Groq with structured output
  const session = await getSession();
  try {
    // Match query words against known entities
    const result = await session.run(`
      WITH split(toLower($query), ' ') AS words
      MATCH (e:Entity)
      WHERE any(word IN words WHERE toLower(e.name) CONTAINS word)
      RETURN e.name AS name
      LIMIT 10
    `, { query });

    return result.records.map(r => r.get('name'));
  } finally {
    await session.close();
  }
}

5. RAG Endpoint with Graph Context

// src/routes/api/ai/chat/+server.ts
import { json } from '@sveltejs/kit';
import { streamText, convertToModelMessages, type UIMessage } from 'ai';
import { encode } from '@toon-format/toon';
import { providers } from '$lib/server/ai/providers';
import { aiConfig } from '$lib/server/ai/config';
import { hybridRetrieve } from '$lib/server/graph/retrieval';
import type { RequestHandler } from './$types';

export const POST: RequestHandler = async ({ request, locals }) => {
  if (!locals.user) {
    return json({ error: 'Unauthorized' }, { status: 401 });
  }

  try {
    const { messages }: { messages: UIMessage[] } = await request.json();
    const lastMessage = messages[messages.length - 1];

    // Get relevant context via hybrid retrieval
    const { documents, entities } = await hybridRetrieve(
      lastMessage.content,
      { vectorLimit: 5, graphDepth: 2 }
    );

    // Format context as TOON (30-60% token savings)
    const context = encode({
      documents: documents.map(d => ({
        title: d.title,
        content: d.content,
        relevance: d.score.toFixed(2),
        source: d.source,
      })),
      entities: entities.map(e => ({
        name: e.name,
        type: e.type,
        related: e.relatedEntities.join(', '),
      })),
    });

    const systemPrompt = `${aiConfig.systemPrompt}

Use this context to answer questions. The context includes:
- Documents (with relevance scores)
- Entities and their relationships

Context:
${context}

When answering:
1. Cite specific documents when relevant
2. Explain relationships between concepts
3. Be concise and accurate`;

    const result = streamText({
      model: providers.chat(aiConfig.models.chat),
      system: systemPrompt,
      messages: await convertToModelMessages(messages),
    });

    return result.toUIMessageStreamResponse();
  } catch (error) {
    console.error('Chat API error:', error);
    return json({ error: 'Failed to process chat request' }, { status: 500 });
  }
};

Entity Linking

Connect mentions in documents to canonical entities:

// src/lib/server/graph/entity-linking.ts
import { generateObject } from 'ai';
import { providers } from '$lib/server/ai/providers';
import { getSession } from './index';
import * as v from 'valibot';

const LinkingSchema = v.object({
  matches: v.array(v.object({
    mention: v.string(),
    canonicalEntity: v.string(),
    confidence: v.number(),
  })),
});

export async function linkEntities(text: string): Promise<Map<string, string>> {
  const session = await getSession();

  try {
    // Get existing entities from graph
    const existingResult = await session.run(`
      MATCH (e:Entity)
      RETURN e.name AS name, e.type AS type
      LIMIT 100
    `);

    const existingEntities = existingResult.records.map(r => ({
      name: r.get('name'),
      type: r.get('type'),
    }));

    // Use LLM to link mentions to entities
    const result = await generateObject({
      model: providers.chat('llama-3.3-70b-versatile'),
      schema: LinkingSchema,
      prompt: `Match entity mentions in the text to canonical entities.

Text:
${text}

Known entities:
${existingEntities.map(e => `- ${e.name} (${e.type})`).join('\n')}

For each entity mentioned in the text, find the best matching canonical entity.
Consider variations like abbreviations, synonyms, and different phrasings.`,
    });

    const links = new Map<string, string>();
    for (const match of result.object.matches) {
      if (match.confidence > 0.7) {
        links.set(match.mention, match.canonicalEntity);
      }
    }

    return links;
  } finally {
    await session.close();
  }
}

Community Detection

For global search, build community summaries:

// src/lib/server/graph/communities.ts
import { generateText } from 'ai';
import { providers } from '$lib/server/ai/providers';
import { getSession } from './index';

export async function buildCommunitySummaries() {
  const session = await getSession();

  try {
    // Use Neo4j's community detection (requires GDS library)
    // For Aura, use simpler clustering based on relationships

    // Group entities by their connections
    const communities = await session.run(`
      MATCH (e:Entity)-[r]-(connected:Entity)
      WITH e, collect(DISTINCT connected.name) AS neighbors
      WHERE size(neighbors) > 2
      RETURN e.name AS entity, e.type AS type, neighbors
      ORDER BY size(neighbors) DESC
      LIMIT 20
    `);

    // Generate summary for each community
    for (const record of communities.records) {
      const entity = record.get('entity');
      const neighbors = record.get('neighbors');

      const summary = await generateText({
        model: providers.chat('llama-3.3-70b-versatile'),
        prompt: `Summarize this entity cluster in 2-3 sentences:

Central entity: ${entity}
Related entities: ${neighbors.join(', ')}

Focus on what connects these entities and their overall theme.`,
      });

      // Store summary
      await session.run(`
        MATCH (e:Entity {name: $name})
        SET e.communitySummary = $summary
      `, { name: entity, summary: summary.text });
    }
  } finally {
    await session.close();
  }
}

Cypher Queries for RAG

// Documents related through shared entities
MATCH (d1:Document {path: $path})-[:MENTIONS]->(e:Entity)<-[:MENTIONS]-(d2:Document)
WHERE d1 <> d2
RETURN d2.title, d2.path, count(e) AS sharedEntities
ORDER BY sharedEntities DESC
LIMIT 5

Entity Neighborhood

// Get entity with 2-hop neighborhood
MATCH (e:Entity {name: $entityName})
OPTIONAL MATCH path = (e)-[*1..2]-(related:Entity)
RETURN e, collect(DISTINCT related) AS neighborhood

Document-Entity Graph

// Get subgraph for a query context
UNWIND $entities AS entityName
MATCH (e:Entity {name: entityName})
OPTIONAL MATCH (e)-[r]-(connected)
RETURN e, r, connected

Path Finding

// Find connection path between two concepts
MATCH path = shortestPath(
  (start:Entity {name: $from})-[*..5]-(end:Entity {name: $to})
)
RETURN path

LangChain.js Integration

For more complex chains, use LangChain.js:

// src/lib/server/graph/langchain.ts
import { Neo4jGraph } from '@langchain/community/graphs/neo4j_graph';
import { GraphCypherQAChain } from 'langchain/chains/graph_qa/cypher';
import { ChatGroq } from '@langchain/groq';

let graph: Neo4jGraph | null = null;

export async function getGraphChain() {
  if (!graph) {
    graph = await Neo4jGraph.initialize({
      url: process.env.NEO4J_URI!,
      username: process.env.NEO4J_USER!,
      password: process.env.NEO4J_PASSWORD!,
    });
  }

  const model = new ChatGroq({
    apiKey: process.env.GROQ_API_KEY,
    model: 'llama-3.3-70b-versatile',
    temperature: 0,
  });

  return GraphCypherQAChain.fromLLM({
    llm: model,
    graph,
    returnDirect: false,
    returnIntermediateSteps: true,
  });
}

// Usage
export async function graphQA(question: string) {
  const chain = await getGraphChain();
  const result = await chain.invoke({ query: question });
  return {
    answer: result.result,
    cypher: result.intermediateSteps?.[0]?.query,
  };
}

Performance Considerations

Concern Solution
Embedding latency Batch embeddings, cache results
Graph query speed Indexes on frequently queried properties
LLM extraction cost Run during ingestion, not query time
Token usage Use TOON format (30-60% savings)
Cold starts Connection pooling, keep-alive
// Create indexes for faster lookups
CREATE INDEX entity_name IF NOT EXISTS FOR (e:Entity) ON (e.name);
CREATE INDEX entity_type IF NOT EXISTS FOR (e:Entity) ON (e.type);
CREATE INDEX document_path IF NOT EXISTS FOR (d:Document) ON (d.path);
CREATE FULLTEXT INDEX entity_search IF NOT EXISTS FOR (e:Entity) ON EACH [e.name, e.description];

Comparison: Vector RAG vs Graph RAG

Query Type Vector RAG Graph RAG Hybrid
"Explain authentication" Good Good Best
"What depends on Drizzle?" Poor Excellent Excellent
"How are forms and validation related?" Fair Excellent Excellent
"Find error handling docs" Excellent Fair Excellent
"Prerequisites for this page" Poor Excellent Excellent

Limitations

Limitation Mitigation
Graph construction cost Batch processing, incremental updates
Entity extraction errors Review and manual correction
Relationship inference Use domain-specific ontologies
LangChain.js maturity Fallback to raw Neo4j driver
No TypeScript SDK Use neo4j-driver + custom wrappers

File Structure

src/lib/server/
├── graph/
│   ├── index.ts           # Driver connection (existing)
│   ├── types.ts           # Node/relationship types (extend)
│   ├── queries.ts         # Read operations (existing)
│   ├── mutations.ts       # Write operations (existing)
│   ├── ingestion.ts       # Document ingestion pipeline (new)
│   ├── retrieval.ts       # Hybrid retrieval (new)
│   ├── entity-linking.ts  # Entity linking (new)
│   ├── communities.ts     # Community detection (new)
│   └── langchain.ts       # LangChain.js integration (new)
└── ai/
    ├── providers.ts       # Multi-provider setup (existing)
    ├── config.ts          # Configuration (existing)
    └── entity-extraction.ts  # LLM entity extraction (new)

Checklist

  • Extend Neo4j schema with Document/Entity nodes
  • Create entity extraction with Groq
  • Build document ingestion pipeline
  • Implement hybrid retrieval
  • Add entity linking
  • Create indexes for performance
  • Integrate with chat endpoint
  • Add community summaries (optional)
  • Set up LangChain.js for complex queries (optional)


Sources

← Back to Blueprint