The project has no test infrastructure yet. This document defines what to build, why, and how.
The target is an AI agent testing loop: implement → run tests → parse output → fix failures → repeat. Claude Code does this natively. The infrastructure's job is to give the agent clear, structured feedback so that loop terminates quickly.
Container Execution Model
Nothing is installed on the host machine except Podman. Claude Code runs on the host but all tools (Bun, Vitest, Biome, node_modules) live inside the v10r container. Every command must go through podman exec.
HOST (Podman + Claude Code only)
│
│  edits files at /home/ad/dev/velociraptor/
│  runs: podman exec v10r <cmd>
│
└──► CONTAINER v10r (WORKDIR /app)
     ├── Bun runtime
     ├── node_modules (named volume)
     ├── Source at /app/ (bind mount ← host)
     ├── Vitest runs here
     ├── PGlite runs here (in-process)
     └── Biome runs here
Bind mount (.:/app in compose.yaml): file edits on the host are instantly visible in the container. No copy step.
Exit codes pass through: podman exec returns whatever the inner command returns. Hooks work correctly — a failing test exits non-zero.
No path translation needed: Vitest and Biome run from the project root against src/. Host paths never enter the commands.
Container must be running: podman exec fails if v10r isn't up. If you're coding, the dev server is running — this is always true in practice.
No wrapper scripts: Use podman exec v10r <cmd> directly. Explicit is better.
Test Strategy
Two tools. No more.
| Tool | Purpose |
|---|---|
| Vitest 4.x | All automated tests — unit, integration, Svelte components |
| Claude Chrome extension | Browser interaction, E2E validation |
Why Vitest and not Bun test
Every module in $lib/server/ eventually imports $env/static/private or $env/dynamic/private. These are SvelteKit virtual modules — Vite resolves them at build time. Bun test has no Vite plugin system and cannot resolve them. Tests fail immediately on import (oven-sh/bun#5541, oven-sh/bun#10712 — both still open as of early 2026). Additionally, PGlite itself produces out-of-bounds WASM errors with bun test (oven-sh/bun#15032).
Vitest with the @sveltejs/kit/vite plugin resolves $lib/*, $env/*, and all SvelteKit aliases automatically. Same tool handles both server-side unit/integration tests and Svelte component tests.
Vitest 4.x: Use the projects field (not workspace, deprecated in Vitest 3.2) for multi-project configuration. Browser Mode is now stable in Vitest 4.0 but not needed for this project's initial phases.
Why no Playwright
The Claude Chrome extension handles browser interaction. Playwright would duplicate this capability and add configuration overhead for zero benefit. vitest-browser-svelte (real browser component testing via Playwright) is the future direction but requires ~300MB Chromium in the container — not justified until jsdom-based tests hit real limitations.
Why vi.mock and not dependency injection
Every query and mutation module imports db as a module-level singleton (import { db } from '../index'). Converting to DI would change 30+ function signatures across 8+ files plus all call sites — high cost, zero production benefit. The multi-client core architecture explicitly chose module-level imports over DI containers (zero-cost imports vs 50–200ms cold start on serverless). vi.mock('$lib/server/db', ...) is a one-line swap at a clean module boundary.
Directory Structure
Tests live next to source (co-located). A test file for service.ts sits at service.test.ts.
src/
  lib/
    server/
      errors/
        index.ts
        index.test.ts          ← unit test
      retrieval/
        rank.ts
        rank.test.ts           ← unit test (pure logic, highest value)
        chunk.ts
        chunk.test.ts          ← unit test (pure logic)
      db/
        notifications/
          queries.ts
          queries.test.ts      ← integration test (PGlite)
      notifications/
        service.ts
        service.test.ts        ← integration test (PGlite)
      test/                    ← shared test utilities (not a test runner target)
        db.ts                  ← PGlite setup + migrate from Drizzle migrations
        fixtures.ts            ← test data factories
        helpers.ts             ← common assertions
    components/
      primitives/
        button/
          Button.svelte
          Button.test.ts       ← component test
Database Test Isolation
PGlite (unit and integration tests)
PGlite runs WASM Postgres in-process inside the v10r container. No external DB, no network, no shared state. Each test file gets its own database instance — parallel-safe by default. PGlite installs as an npm devDependency into the named node_modules volume — no host installation needed. Target version: 0.3.14+ (PostgreSQL 17.4 based).
Schema sync via migrate: Use the documented migrate function from drizzle-orm/pglite/migrator with generated SQL migration files. This is the stable, officially supported path. Migration files are generated transiently by bun run db:test-schema (which runs drizzle-kit generate) into drizzle/. The drizzle/ folder is gitignored — the project uses a push-only workflow for the live Neon database (drizzle-kit push), so migration files exist only as a test artifact.
Why not pushSchema: The pushSchema function from drizzle-kit/api is undocumented and has two confirmed bugs that affect this project directly:
- Interactive prompt bug (drizzle-orm#4531) — hangs waiting for user input when adding unique constraints or renaming columns. Manifests as a silent test timeout, not a clear error.
- pgSchema handling (drizzle-orm#1181, drizzle-orm#4796): zero confirmed examples working with non-public schemas. This project uses 6 custom schemas with 12 schema-scoped enums and 9 cross-schema foreign keys, which is the maximum risk surface.
Dev workflow stays unchanged: The project uses drizzle-kit push exclusively for the live Neon database. Migration files are generated separately for the test harness via bun run db:test-schema. The drizzle/ folder is gitignored — it exists transiently for test runs only. After schema changes, regenerate with db:test-schema before running tests.
Driver difference: Production uses drizzle-orm/neon-serverless. Tests use drizzle-orm/pglite. Same query API, different connection. Swap via vi.mock.
pgvector: PGlite supports pgvector via @electric-sql/pglite/vector. Must be loaded explicitly via the extensions option — without it, any column using vector(1536) fails with "type vector does not exist." The CREATE EXTENSION must run before migrations that reference the vector type.
Custom schemas: This project uses 6 non-default PostgreSQL schemas (auth, showcase, rag, ai, jobs, notifications). PGlite starts with only public. The migrate function handles these correctly because generated migration files contain explicit CREATE SCHEMA IF NOT EXISTS statements with proper ordering.
Cross-schema foreign keys: Almost every schema references auth.user.id (9 cross-schema FKs total). The auth schema must always be created first. Migration files encode this ordering automatically. Schemas cannot be loaded in isolation — load them incrementally:
| Phase | Schemas | Tables | Why |
|---|---|---|---|
| First | jobs | 1 | Standalone, zero FKs; validates the PGlite + migration pipeline |
| Then | auth + notifications | 10 | Cross-schema FKs, real business queries |
| Then | auth + ai | 6 | Another cross-schema domain |
| Last | auth + rag | 7 | Adds pgvector, custom types, post-migration SQL |
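The ordering constraint can be made concrete with a small dependency sort. This is a sketch only: the deps map is a hypothetical, hand-written summary of the cross-schema foreign keys described here (domain schemas pointing at auth), not something read from the migration files, which already encode the order. The showcase schema is left out because its dependencies are not stated in this document.

```typescript
// Hypothetical summary of cross-schema FK dependencies
// (schema -> schemas it references). Assumed, not generated.
type SchemaDeps = Record<string, string[]>;

const deps: SchemaDeps = {
  jobs: [], // standalone, zero FKs
  auth: [], // assumed to reference no other schema
  notifications: ['auth'],
  ai: ['auth'],
  rag: ['auth'],
};

// Depth-first topological sort: a schema is emitted only after
// every schema it references has been emitted.
function creationOrder(graph: SchemaDeps): string[] {
  const ordered: string[] = [];
  const state = new Map<string, 'visiting' | 'done'>();

  const visit = (name: string): void => {
    if (state.get(name) === 'done') return;
    if (state.get(name) === 'visiting') {
      throw new Error(`circular schema dependency at "${name}"`);
    }
    state.set(name, 'visiting');
    for (const dep of graph[name] ?? []) visit(dep);
    state.set(name, 'done');
    ordered.push(name);
  };

  for (const name of Object.keys(graph)) visit(name);
  return ordered;
}

const order = creationOrder(deps);
```

Whatever the exact order among independent schemas, auth always lands before notifications, ai, and rag, which is the property the phased rollout in the table relies on.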
Out-of-schema DDL: src/lib/server/db/rag/setup.ts contains raw SQL for a generated tsvector column (search_vector), an HNSW index (chunk_embedding_hnsw_idx), a GIN index on search_vector, and a seed row in embedding_model. These live outside Drizzle schema definitions and are not handled by any schema sync mechanism. They require separate client.exec(sql) after migrations — only when testing RAG-specific queries.
Index support: PGlite supports GiST (for range types), GIN (for tsvector/jsonb), and B-tree indexes natively — no extensions needed. HNSW indexes (pgvector) are theoretically supported but unverified in WASM at scale — functional correctness is expected, not performance parity with native Postgres.
Snapshot/restore (future optimization): PGlite's dumpDataDir/loadDataDir API can cut test init time from ~4.8s to ~1.3s by creating the schema once and restoring per suite instead of re-running migrations. The API is documented and stable, but the pattern is experimental at scale. Consider this if the test suite grows slow.
// src/lib/server/test/db.ts
import { PGlite } from '@electric-sql/pglite';
import { vector } from '@electric-sql/pglite/vector';
import { drizzle } from 'drizzle-orm/pglite';
import { migrate } from 'drizzle-orm/pglite/migrator';
import * as schema from '$lib/server/db/schema';

export async function createTestDb() {
  const client = new PGlite({ extensions: { vector } });
  await client.exec('CREATE EXTENSION IF NOT EXISTS vector');
  const db = drizzle(client, { schema });
  // Apply migrations from generated SQL files (documented, stable API)
  await migrate(db, { migrationsFolder: './drizzle' });
  return { db, client };
}
Fallback if migrate hits issues: Use drizzle-kit export (newly documented) to output DDL SQL, capture to a schema.sql file, and client.exec(sql) directly. This eliminates all dependency on internal APIs.
Inject into tests via vi.mock:
// In any test file that needs the DB
vi.mock('$lib/server/db', async () => {
const { createTestDb } = await import('$lib/server/test/db');
const { db } = await createTestDb();
return { db };
});
Cleanup: Each test file creates one PGlite instance (via vi.mock factory, called once per module). Vitest runs files in parallel across workers — this is safe. Use afterAll to shut down the client:
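// In test files using PGlite
import type { PGlite } from '@electric-sql/pglite';
import { afterAll, vi } from 'vitest';

// vi.mock is hoisted above ordinary declarations, so the shared holder
// is created with vi.hoisted to guarantee it exists when the factory runs.
const holder = vi.hoisted(() => ({ client: undefined as PGlite | undefined }));

vi.mock('$lib/server/db', async () => {
  const { createTestDb } = await import('$lib/server/test/db');
  const { db, client } = await createTestDb();
  holder.client = client;
  return { db };
});

afterAll(async () => {
  await holder.client?.close();
});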
Neo4j (unit and integration tests)
Mock the cypher() function. It's a single import — one mock covers all graph calls.
vi.mock('$lib/server/graph', () => ({
cypher: vi.fn().mockResolvedValue({ records: [] }),
}));
For tests that need specific graph responses, configure the mock per test:
vi.mocked(cypher).mockResolvedValueOnce({
records: [{ get: (key: string) => mockData[key] }],
});
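A small helper can keep those per-test responses readable. The sketch below assumes, as the mock above does, that callers only use records[i].get(key); makeRecord and makeResult are hypothetical names and are not part of the Neo4j driver.

```typescript
// Hypothetical helpers for building mocked cypher() results.
// Assumes production code only calls record.get(key) on each record.
type MockRecord = { get: (key: string) => unknown };

function makeRecord(data: Record<string, unknown>): MockRecord {
  return {
    get: (key: string) => {
      if (!(key in data)) {
        // Fail loudly so a typo in a test key does not silently return undefined.
        throw new Error(`mock record has no key "${key}"`);
      }
      return data[key];
    },
  };
}

function makeResult(rows: Array<Record<string, unknown>>): { records: MockRecord[] } {
  return { records: rows.map(makeRecord) };
}

const result = makeResult([{ name: 'Ada', degree: 3 }]);
```

In a test this would be passed to the mock as vi.mocked(cypher).mockResolvedValueOnce(makeResult([...])).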
Test Aura instance: only if graph features require E2E validation. Not needed for unit or integration coverage.
Test Data Factories
Factories produce valid test objects with sensible defaults. Override only what the test cares about.
// src/lib/server/test/fixtures.ts
// Assumes the schema module exports both row types; without the explicit
// import, Notification would resolve to the DOM global of the same name.
import type { Notification, User } from '$lib/server/db/schema';

export function makeUser(overrides?: Partial<User>): User {
  return {
    id: crypto.randomUUID(),
    email: `test-${crypto.randomUUID()}@example.com`,
    name: 'Test User',
    createdAt: new Date(),
    updatedAt: new Date(),
    ...overrides,
  };
}

export function makeNotification(overrides?: Partial<Notification>): Notification {
  return {
    id: crypto.randomUUID(),
    userId: crypto.randomUUID(),
    type: 'system',
    title: 'Test notification',
    body: null,
    isRead: false,
    createdAt: new Date(),
    readAt: null,
    archivedAt: null,
    ...overrides,
  };
}
Claude Code Hooks
Hook commands run on the host machine (where Claude Code runs). Since all tools live in the container, every command uses podman exec v10r.
PostToolUse — lint only
Runs after every Edit or Write. Fast (sub-second). Biome only — no tests.
{
"hooks": {
"PostToolUse": [
{
"matcher": "Edit|Write",
"hooks": [{
"type": "command",
"command": "podman exec v10r bun biome check --write --reporter json src/"
}]
}
]
}
}
No path translation needed — Biome runs against src/ from the container's /app workdir.
Why not run tests here: Running full tests after every edit causes thrashing. The agent gets failure reports for partially-complete work and starts fixing intermediate states before the feature is done.
Stop hook — quality gate
Runs when the agent declares completion. Uses a shell script that reads Claude Code's hook input JSON, guards against an infinite loop, and runs validation.
{
"hooks": {
"Stop": [
{
"hooks": [{
"type": "command",
"command": ".claude/hooks/stop-gate.sh"
}]
}
]
}
}
#!/bin/bash
# .claude/hooks/stop-gate.sh

# Read hook input from stdin
INPUT=$(cat)
STOP_HOOK_ACTIVE=$(echo "$INPUT" | jq -r '.stop_hook_active')

# CRITICAL: prevent infinite loop. When stop_hook_active is true,
# we are already in forced-continuation from a previous block.
# Allow the agent to stop to avoid looping forever.
if [ "$STOP_HOOK_ACTIVE" = "true" ]; then
  exit 0
fi

# Run full validation inside the container.
# Uses bash (not sh) for pipefail support. Without it,
# piping to tail swallows the exit code from vitest.
OUTPUT=$(podman exec v10r bash -c 'set -o pipefail; bun run check && bun biome check . && bun vitest run 2>&1 | tail -80' 2>&1)
EXIT_CODE=$?

if [ "$EXIT_CODE" -ne 0 ]; then
  # Block the stop and feed the failure output back to the agent.
  # jq builds the JSON so that quotes and newlines in OUTPUT are escaped;
  # interpolating OUTPUT into a hand-written string would emit invalid JSON.
  jq -n --arg out "$OUTPUT" '{decision: "block", reason: ("Validation failed:\n" + $out)}'
  exit 0
fi

# All checks passed: allow stop
exit 0
Why bash -c not sh -c: POSIX sh does not support set -o pipefail. Without pipefail, vitest run 2>&1 | tail -80 returns tail's exit code (always 0), silently masking test failures. Using bash -c ensures the pipeline returns vitest's exit code.
Why stop_hook_active check: Claude Code sets stop_hook_active: true in the hook input when the agent is already in forced-continuation from a previous block. Without this check, the agent loops infinitely: fail → block → retry → fail → block → ... This is a confirmed issue (claude-code#10205).
Why JSON output (not exit code 2): Exit code 2 with stderr is supposed to block the stop, but has a known bug in plugin-installed hooks (claude-code#10412). JSON {"decision": "block", "reason": "..."} on stdout with exit 0 is the reliable approach.
Risk: If pre-existing failures exist, the agent loops trying to fix them (until stopped by stop_hook_active). Fix all pre-existing failures before enabling this hook.
Container down: If v10r isn't running, podman exec fails and the output contains "no such container" — a clear signal, not a confusing test failure.
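The gate's decision rule is easier to check when written as a pure function. The sketch below restates the shell script's logic in TypeScript for illustration only; decideStop and its parameter shapes are hypothetical, and the hook itself remains the shell script. Unlike the script, which skips validation entirely when stop_hook_active is true, this function takes an already-computed validation result so that it stays side-effect free.

```typescript
// Illustration of the stop-gate decision rule; not a replacement for stop-gate.sh.
interface HookInput {
  stop_hook_active?: boolean;
}

interface ValidationResult {
  exitCode: number;
  output: string;
}

// Returns the JSON to print on stdout, or null to print nothing (allow the stop).
function decideStop(input: HookInput, validation: ValidationResult): string | null {
  // Already in a forced continuation: let the agent stop to avoid looping forever.
  if (input.stop_hook_active === true) return null;

  // Validation passed: allow the stop.
  if (validation.exitCode === 0) return null;

  // JSON.stringify escapes quotes and newlines in the captured output,
  // so the payload stays valid JSON whatever the tools printed.
  return JSON.stringify({
    decision: 'block',
    reason: `Validation failed:\n${validation.output}`,
  });
}
```

The escaping step is the part most easily lost in a shell version: test output routinely contains double quotes and newlines, and either one breaks a JSON string assembled by plain interpolation.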
Machine-readable output
Prefix every command below with podman exec v10r when running from the host.
| Tool | Format | Agent benefit |
|---|---|---|
| bun vitest run | Console (default) | File paths, line numbers, diff output |
| bun biome check --reporter json | Structured JSON | File path, line number, fix suggestions |
| bun vitest run --reporter=junit --outputFile=./junit.xml | JUnit XML | CI integration |
Configuration Files
vitest.config.ts
import { defineConfig } from 'vitest/config';
import { sveltekit } from '@sveltejs/kit/vite';

export default defineConfig({
  plugins: [sveltekit()],
  test: {
    include: ['src/**/*.test.ts'],
    environment: 'node',
    globals: true,
    testTimeout: 15_000,
    setupFiles: ['src/lib/server/test/vitest.setup.ts'],
  },
});
The sveltekit() plugin resolves all SvelteKit virtual modules — $lib/*, $env/*, $app/*. Without it, every server-side test fails on import.
testTimeout: 15_000: PGlite WASM startup + migration apply can take 3–5 seconds per test file. The default 5000ms timeout causes false failures. Set higher globally, keep individual tests fast.
setupFiles: Runs before each test file, inside that file's worker. Used for global mocking.
Vitest 4.x note: The workspace field is deprecated since Vitest 3.2. Use projects for multi-project configuration if needed later. The simple single-project config above works for the initial phases.
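If the suite later needs separate environments, a projects layout might look like the sketch below. The project names and include globs are hypothetical, and the components project assumes jsdom is installed as a devDependency; this is a config fragment for orientation, not something the initial phases require.

```typescript
// vitest.config.ts (sketch; project names and globs are assumptions)
import { defineConfig } from 'vitest/config';
import { sveltekit } from '@sveltejs/kit/vite';

export default defineConfig({
  plugins: [sveltekit()],
  test: {
    projects: [
      {
        // Inherit plugins and shared options from the root config.
        extends: true,
        test: {
          name: 'server',
          include: ['src/lib/server/**/*.test.ts'],
          environment: 'node',
        },
      },
      {
        extends: true,
        test: {
          name: 'components',
          include: ['src/lib/components/**/*.test.ts'],
          environment: 'jsdom',
        },
      },
    ],
  },
});
```

A single project can then be run with the --project flag, for example bun vitest run --project server.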
vitest.setup.ts
// src/lib/server/test/vitest.setup.ts
import { vi } from 'vitest';
// Prevent job schedulers from starting. Both check globalThis
// sentinels before starting setInterval loops. Without this,
// any transitive import of hooks.server.ts starts real schedulers.
globalThis.__v10r_scheduler = 'test';
globalThis.__v10r_delivery_scheduler = 'test';
// $env/dynamic/private resolves to an empty object in Vitest because
// no SvelteKit adapter calls server.init(). Redirect to process.env
// so that values from .env.test are available.
vi.mock('$env/dynamic/private', () => ({
env: process.env,
}));
// $app/environment — needed if any import chain touches schedulers or SSR guards
vi.mock('$app/environment', () => ({
building: false,
browser: false,
dev: true,
version: 'test',
}));
Why $env/dynamic/private mock: $env/static/private works via the sveltekit() plugin (reads .env at Vite startup). But $env/dynamic/private depends on adapter initialization that never happens during vitest run. Without this mock, any module importing env from $env/dynamic/private gets an empty object. Known open issue: sveltejs/kit#9564.
Why scheduler sentinels: hooks.server.ts imports $lib/server/jobs/scheduler and $lib/server/jobs/delivery-scheduler as side effects. Both start setInterval loops unless a globalThis sentinel is set. The sentinels are the cleanest containment — no need to mock the platform module.
Why $app/environment mock: Both schedulers check building from $app/environment. Other modules (sidebar.svelte.ts) check browser. Providing deterministic values prevents runtime detection logic from firing.
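How such a sentinel guard might look inside a scheduler module is sketched below. The scheduler source is not shown in this document, so the function name, the interval handling, and the exact check are assumptions; only the globalThis.__v10r_scheduler key comes from the setup file above.

```typescript
// Hypothetical shape of a sentinel-guarded scheduler start.
// Only the __v10r_scheduler key is taken from vitest.setup.ts; the rest is assumed.
type Timer = ReturnType<typeof setInterval>;

const globals = globalThis as unknown as Record<string, unknown>;

function startScheduler(tick: () => void, intervalMs: number): Timer | null {
  // Any truthy sentinel (an earlier timer, or the string 'test') means: do not start.
  if (globals['__v10r_scheduler']) return null;

  const timer = setInterval(tick, intervalMs);
  globals['__v10r_scheduler'] = timer;
  return timer;
}

// With the test sentinel in place, no interval is created.
globals['__v10r_scheduler'] = 'test';
const started = startScheduler(() => {}, 1_000);
```

Under this assumed design, the same check that blocks a double start in production is what lets the setup file disable the scheduler without mocking the module.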
.env.test
# Fake values for $env/static/private — only needed when tests import
# modules that read these at module load time. Pure logic tests
# (rank.ts, chunk.ts, errors/) don't need any of these.
DATABASE_URL=postgresql://test:test@localhost/test
BETTER_AUTH_SECRET=a-test-secret-that-is-at-least-32-chars
BETTER_AUTH_URL=http://localhost:5173
GITHUB_CLIENT_ID=test
GITHUB_CLIENT_SECRET=test
GOOGLE_CLIENT_ID=test
GOOGLE_CLIENT_SECRET=test
NEO4J_URI=neo4j+s://test.databases.neo4j.io
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=test
Why this is needed: $lib/server/auth/index.ts throws at module load if BETTER_AUTH_SECRET is shorter than 32 characters. $lib/server/db/index.ts reads DATABASE_URL at module load. The sveltekit() plugin reads .env.test automatically for $env/static/private values. These are never used for real connections — PGlite replaces the DB, and auth is mocked.
package.json scripts
{
"scripts": {
"test": "vitest run",
"test:watch": "vitest",
"db:test-schema": "bunx drizzle-kit generate",
"validate": "bun run check && bun biome check . && bun run test"
}
}
validate is the quality gate command. From the host: podman exec v10r bun run validate.
db:test-schema generates migration SQL files in drizzle/ from the TypeScript schema definitions. The drizzle/ folder is gitignored — these files exist only as a test artifact for PGlite's migrate function. Run after schema changes to keep test migrations in sync. The live Neon database is managed exclusively via drizzle-kit push.
Implementation Phases
Phase 0 — Foundation
- Add vitest to devDependencies in package.json
- Create vitest.config.ts with sveltekit() plugin, testTimeout, and setupFiles
- Create src/lib/server/test/vitest.setup.ts: global mocks and scheduler sentinels
- Create .env.test with fake values for $env/static/private modules
- Add test scripts to package.json (test, test:watch, db:test-schema)
- Restart container so deps install: podman compose down && podman compose up -d
- Ensure .svelte-kit/ types exist; the sveltekit() plugin needs them. The dev server creates them automatically on startup (svelte-kit sync). If running tests before the dev server has started, run podman exec v10r bun run check first (which triggers svelte-kit sync).
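No tests yet. Verify from host: podman exec v10r bun run test should report "No test files found." Vitest exits with code 1 in that case unless --passWithNoTests is set, so a non-zero exit here is expected and still confirms that the pipeline runs.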
Phase 1 — Pure logic tests
Start with modules that have zero external dependencies — no DB, no mocks, maximum signal. These validate the full pipeline (host → container → Vitest → results) while testing the highest-value algorithmic code.
Priority order:
- $lib/server/rawrag/rank.ts: RRF algorithm, deduplication, fusion. Four functions, deterministic math, highest ROI. The ranking logic is non-trivial and could easily have off-by-one errors.
- $lib/server/rawrag/chunk.ts: text splitting with overlap and hierarchy. Only needs crypto.subtle (available in Bun). Complex enough to have real bugs.
- $lib/server/errors/index.ts: ServerError class, toStatus() mapping, JSON serialization. Pure logic, validates the pipeline.
- $lib/schemas/showcase/: Valibot schema rules. Test with v.safeParse(). The realtimeSchema password-confirm cross-field validation is the most interesting.
// src/lib/server/errors/index.test.ts
import { describe, expect, it } from 'vitest';
import { ServerError } from './index';
describe('ServerError', () => {
it('sets kind and message', () => {
const err = new ServerError('db', 'something failed');
expect(err.kind).toBe('db');
expect(err.message).toBe('something failed');
});
it('is an Error instance', () => {
expect(new ServerError('ai', 'provider unavailable')).toBeInstanceOf(Error);
});
it('maps to 500 by default', () => {
const err = new ServerError('unknown', 'test');
expect(err.toStatus()).toBe(500);
});
it('serializes to JSON', () => {
const err = new ServerError('db', 'connection lost', 'ECONNREFUSED');
const json = err.toJSON();
expect(json).toEqual({
name: 'ServerError',
kind: 'db',
message: 'connection lost',
code: 'ECONNREFUSED',
status: 500,
});
});
});
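For orientation, a class shape that would satisfy the test above is sketched here. It is inferred from the assertions only; the real errors/index.ts may differ, and the set of kinds and the status mapping in particular are assumptions.

```typescript
// A sketch inferred from the test above; the real module may differ.
// The kinds listed and the all-500 mapping are assumptions.
type ErrorKind = 'db' | 'ai' | 'unknown';

interface ServerErrorJson {
  name: string;
  kind: ErrorKind;
  message: string;
  code: string | undefined;
  status: number;
}

class ServerError extends Error {
  readonly kind: ErrorKind;
  readonly code: string | undefined;

  constructor(kind: ErrorKind, message: string, code?: string) {
    super(message);
    this.name = 'ServerError';
    this.kind = kind;
    this.code = code;
  }

  // Every kind maps to 500 here; a real mapping would likely be finer-grained.
  toStatus(): number {
    return 500;
  }

  // Explicit serialization: Error fields such as message are not enumerable,
  // so JSON.stringify would drop them without this method.
  toJSON(): ServerErrorJson {
    return {
      name: this.name,
      kind: this.kind,
      message: this.message,
      code: this.code,
      status: this.toStatus(),
    };
  }
}
```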
Goal: verify the full pipeline works end-to-end. Claude Code (host) runs podman exec v10r bun run test → Vitest resolves $lib paths inside container → results flow back to host.
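As an indication of what a rank.ts test would pin down, here is a generic reciprocal rank fusion sketch. It is the textbook formula (each document's score is the sum of 1 / (k + rank) over the lists it appears in, with k = 60 as the commonly used constant), not the project's implementation; the function name and signature are hypothetical.

```typescript
// Textbook reciprocal rank fusion; NOT the project's rank.ts.
// Each input list is ordered best-first and holds document ids.
function reciprocalRankFusion(
  lists: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();

  for (const list of lists) {
    const seen = new Set<string>();
    list.forEach((id, index) => {
      // Deduplicate within a list: only the best (first) position counts.
      if (seen.has(id)) return;
      seen.add(id);
      // Ranks are 1-based; an off-by-one here shifts every score.
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }

  // Highest fused score first; ties broken by id for deterministic output.
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score || a.id.localeCompare(b.id),
  );
}

const fused = reciprocalRankFusion([
  ['a', 'b'],
  ['b', 'c'],
]);
```

A document that appears in both lists outranks one that tops a single list, which is the behavior a first unit test for the fusion step would typically assert.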
Phase 2 — DB tests
- Add @electric-sql/pglite to devDependencies in package.json (drizzle-orm and drizzle-kit are already dependencies)
- Restart container: podman compose down && podman compose up -d
- Generate migration files: podman exec v10r bun run db:test-schema
- Verify migration files exist in drizzle/ and contain CREATE SCHEMA IF NOT EXISTS for all 6 schemas
- Create src/lib/server/test/db.ts: PGlite setup with migrate
- Create src/lib/server/test/fixtures.ts: data factories
- Validate incrementally: start with the jobs schema (1 table, zero FKs) to prove the pipeline, then auth + notifications (10 tables, cross-schema FKs) for real business logic
- Write first DB test for notification queries
Decision tree if migrate fails:
- Step A: Check migration file contents. Do they include CREATE SCHEMA IF NOT EXISTS for each custom schema? If not, add CREATE SCHEMA statements to createTestDb() before migrate().
- Step B: If migration files reference types that PGlite doesn't support, use drizzle-kit export to capture DDL SQL and client.exec(sql) directly.
- Step C: As last resort, try pushSchema from drizzle-kit/api with createRequire workaround. Pin drizzle-kit to the exact working version.
// src/lib/server/db/notifications/queries.test.ts
import { beforeEach, describe, expect, it, vi } from 'vitest';
import { makeUser, makeNotification } from '$lib/server/test/fixtures';
vi.mock('$lib/server/db', async () => {
const { createTestDb } = await import('$lib/server/test/db');
const { db } = await createTestDb();
return { db };
});
describe('getNotifications', () => {
it('returns notifications for user', async () => {
const { getNotifications } = await import('./queries');
const user = makeUser();
// seed test data, then assert
const results = await getNotifications(user.id, 10, 0);
expect(results).toBeInstanceOf(Array);
});
});
Phase 3 — Stop hook
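- Create .claude/hooks/stop-gate.sh (see Stop hook section above)
- Make it executable: chmod +x .claude/hooks/stop-gate.sh
- Add Stop hook to .claude/settings.local.json
- Make jq available to the hook script (needed to parse the JSON input). The script runs on the host, so a jq installed in the container is only reachable by piping through podman exec -i v10r jq; otherwise install jq on the host, or parse with grep/sed if jq is unavailable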
Every agent session must pass podman exec v10r bun run validate before completing.
Prerequisite: Phase 1 and Phase 2 tests pass cleanly. Never enable the Stop hook with failing tests.
Phase 4 — Expand coverage
Priority order based on codebase analysis (highest value first):
Data access layer (primary test boundary):
- $lib/server/db/notifications/mutations.ts: race condition in getOrCreateSettings(), IDOR protection via userId scoping in markAsRead(). High bug potential.
- $lib/server/db/notifications/queries.ts: core CRUD with auth scoping.
- $lib/server/db/ai/mutations.ts: multi-step operations with ownership verification.
- $lib/server/db/errors.ts: classifyDbError(), classifyCode(), safeDbMessage().
Orchestration layer:
- $lib/server/notifications/service.ts: NotificationService.send() coordinates DB insert, SSE push, and external routing. Requires mocking SSE and router in addition to the DB.
- $lib/server/rawrag/index.ts: 176 lines of pure orchestration over embedding, three retrieval tiers, and ranking. Highly testable with mocked tier functions, no DB needed.
Auth and security:
- $lib/server/auth/guards.ts: requireAuth, requireApiUser, requireAdmin. Takes App.Locals as a plain argument; mock the object, assert redirect/error behavior.
- hooks.server.ts: security middleware (headers, CSRF, auth, rate limiting). Isolate as a separate concern: it has side-effect imports (schedulers), Redis connection, and feature logging that must be fully mocked. Mock auth.api.getSession, @upstash/ratelimit, construct RequestEvent objects. 176 lines of critical security logic.
Load functions:
- Form showcase load functions (forms/basics/contact, forms/validation/server, etc.): import only from sveltekit-superforms and $lib/schemas/. No DB, easiest load function tests. Call load() directly with a mock event object.
Components:
- Alert.svelte: self-contained, internal $state(true) for visibility, dismiss flow. Good first component test.
- Primitives (Switch, Toggle, Progress): prop-to-DOM behavior, validates custom spacing tokens.
State machines (high value, tested as factories):
- createRawragTrace() from rawrag-trace.svelte.ts: streaming annotation processing, cursor tracking, step status transitions. Pure factory, no external imports.
- createDockState() from dock.state.svelte.ts: tab activation, panel close, split creation. Pure factory.
What NOT to test (thin adapters):
- Route handlers (+page.server.ts, +server.ts): they are 15–25 line adapters that parse params and call domain functions. If the domain functions work, the routes work. Testing them requires constructing RequestEvent objects with high ceremony for low signal.
- Structural components (Card.svelte, BackLink.svelte, Kbd.svelte): pure slots/snippets, no logic.
- Canvas/WebGL components (NetworkGraph.svelte, Scene.svelte): impossible in jsdom.
Side-effect containment: The globalThis sentinels in vitest.setup.ts prevent scheduler startup. No earlier test should import anything that chains to hooks.server.ts. The vi.mock('$lib/server/db', ...) pattern is safe because query/mutation modules only import from '../index' (the DB client) — they do not chain through hooks.
Svelte 5 state testing note: All .svelte.ts state files in this project use the context factory pattern (createToastState(), createDockState(), etc.), not module-level $state singletons. This means the flushSync() requirement for external reactive state does not apply — call the factory directly in tests.
Phase 5 — AGENTS.md and CI
- Write AGENTS.md at project root (see template below)
- Add .github/workflows/test.yml running vitest run and biome check on PRs
AGENTS.md Template
Place at project root. All AI coding agents read this file alongside CLAUDE.md.
# AGENTS.md
Supplementary instructions for AI coding agents working in this project.
Read CLAUDE.md first. This file adds testing-specific guidance.
## Container-First Execution
Nothing is installed on the host except Podman. All commands run inside the `v10r` container:
podman exec v10r <command>
Source is bind-mounted (host `.` → container `/app`). File edits are instantly visible.
The container must be running (`podman compose up -d`).
## Test Commands
| From host | What it does |
|-----------|-------------|
| `podman exec v10r bun run test` | Run all tests once |
| `podman exec v10r bun run test:watch` | Watch mode during development |
| `podman exec v10r bun run validate` | Full quality gate before declaring done |
| `podman exec v10r bun vitest run src/lib/server/errors/` | Run tests for one module |
| `podman exec v10r bun biome check --write src/lib/server/notifications/service.ts` | Lint and fix one file |
## Test Patterns
**Co-location:** Tests live next to source. `service.ts` → `service.test.ts`.
**DB isolation:** Use PGlite for any test touching the database.
Mock the DB module with `vi.mock('$lib/server/db', ...)` — see `src/lib/server/test/db.ts`.
Schema is applied via `migrate` from `drizzle-orm/pglite/migrator` using generated migration files.
**Neo4j:** Mock `cypher()` for all unit and integration tests.
Only use Test Aura if the test cannot be written any other way.
**Factories:** Use `src/lib/server/test/fixtures.ts` for test data.
Override only what the test cares about.
**Schema changes:** After modifying schema files in `src/lib/server/db/schema/`,
regenerate test migrations: `podman exec v10r bun run db:test-schema`.
## Boundaries
- **Never** run commands on the host — always `podman exec v10r`
- **Never** modify migration files in `drizzle/` by hand — they are generated from schema definitions
- **Never** commit `.env` or `.env.local` (`.env.test` is safe to commit — it contains only fake values)
- **Never** delete or skip a failing test — fix the code or fix the test
- **Never** use `bun test` as the test runner — use `vitest` only
- **Never** install packages on the host — add to `package.json`, restart container
- **Never** import `hooks.server.ts` in unit/integration tests — it starts real schedulers and Redis connections
Known Issues and Gotchas
| Issue | Impact | Mitigation |
|---|---|---|
| vi.stubEnv() does not affect $env modules | Env var mocking broken | Mock $env/dynamic/private in vitest.setup.ts (sveltejs/kit#9564) |
| $lib/server/auth throws if BETTER_AUTH_SECRET < 32 chars | Module load crash in tests | .env.test with a 32+ char fake secret |
| Job schedulers start on import of hooks.server.ts | Real setInterval loops in tests | globalThis.__v10r_scheduler = 'test' sentinel in setup file |
| RAG setup SQL lives outside Drizzle schema | Migrations don't create tsvector, HNSW, GIN | Separate client.exec(sql) post-migration for RAG tests only |
| drizzle-kit v1 beta breaking changes | Migration generation may change | Pin drizzle-kit to exact working version |
| Stop hook exit 2 ignored in plugin hooks | Hook doesn't block agent | Use JSON output {"decision": "block"} with exit 0 (claude-code#10412) |
| Stop hook infinite loop | Agent never terminates | Check stop_hook_active field in hook input (claude-code#10205) |
| POSIX sh lacks pipefail | Pipe to tail masks exit codes | Use bash -c with set -o pipefail |
| PGlite WASM startup + migrations take 3–5 s per test file | Default 5 s test timeout causes false failures | Set testTimeout: 15_000 in vitest config |
| Named node_modules volume persists across recreations | Stale packages after removal | Use podman compose down -v for clean installs |
| Better Auth CLI generates pgTable not pgSchema | Wrong schema namespace in generated migrations | Manual patch needed (better-auth#6606) |
| Bun test cannot resolve SvelteKit virtual modules | Tests fail on import | Use Vitest only (oven-sh/bun#5541, oven-sh/bun#10712) |
| PGlite + Bun test: WASM out-of-bounds errors | PGlite crashes under Bun test runner | Use Vitest + Node (oven-sh/bun#15032) |
| Vitest workspace field deprecated | Warning in Vitest 3.2+ | Use projects field instead |
Rejected Approaches
| Approach | Why rejected |
|---|---|
| pushSchema from drizzle-kit/api | Undocumented; interactive prompt bug (#4531) hangs in CI; pgSchema support unverified (#1181, #4796). 6 custom schemas + 12 enums + 9 cross-schema FKs = maximum risk surface |
| Dependency injection for DB swap | 30+ function signatures change, all call sites affected, zero production benefit. Architecture explicitly chose module imports over DI |
| Testcontainers (real Postgres in Docker) | Container-startup latency per test run, adds Docker daemon dependency inside Podman container. PGlite is faster and sufficient |
| pg-mem | TypeScript reimplementation of Postgres, not real Postgres. No timezone support, approximate numerics, no Drizzle adapter |
| Playwright for component tests | Adds ~300MB Chromium to container. jsdom via @testing-library/svelte is sufficient for current component complexity |
| Neon Testing library (branch-per-test) | Real cloud Postgres branches per test file. Production parity but adds network latency, API key dependency, cost at scale. No independent practitioner reports yet — revisit in 6 months |
References
Official documentation
| Source | Relevance |
|---|---|
| Svelte Testing | Official Vitest + @sveltejs/kit/vite setup |
| Drizzle + PGlite connect | Official Drizzle/PGlite connection docs |
| Drizzle migrate docs | migrate function — stable API |
| Drizzle-kit export docs | DDL SQL export — documented fallback |
| PGlite API | dumpDataDir, loadDataDir, exec, query, close |
| PGlite Extensions | pgvector, btree_gist, pg_trgm extension loading |
| PGlite ORM support | Drizzle listed as officially supported |
| PGlite Benchmarks | Query performance data |
| Vitest 4.0 release | Browser Mode now stable, projects field |
| Claude Code Hooks | PostToolUse and Stop hook configuration |
| Biome CLI | --reporter json flag, --write flag |
Community implementations
| Source | Relevance |
|---|---|
| PGlite + Drizzle tutorial | PGlite test setup with Drizzle ORM (team serving 2.1M users) |
| rphlmr/drizzle-vitest-pg | Reference implementation: Drizzle + PGlite + Vitest (uses migrate, not pushSchema) |
| 1300 tests in 25s benchmark | PGlite performance at scale |
| PGlite snapshot pattern | Snapshot/restore for faster tests (experimental) |
| sveltest.dev | vitest-browser-svelte patterns reference |
Issues and discussions
| Source | Relevance |
|---|---|
| sveltejs/kit#9564 | vi.stubEnv vs $env modules — still open |
| sveltejs/kit#8180 | $env/dynamic/private in Vitest — resolved |
| drizzle-orm#4531 | pushSchema interactive prompt bug |
| drizzle-orm#1181 | pgSchema push silent failure |
| drizzle-orm#4796 | push generates DROP SCHEMA for custom schemas |
| drizzle-orm#4205 | pushSchema community usage and gotchas |
| drizzle-orm#2532 | PGlite with migrate function — confirmed working |
| oven-sh/bun#5541 | Bun test cannot resolve $app/environment |
| oven-sh/bun#10712 | Bun test cannot handle $env/dynamic/* |
| oven-sh/bun#15032 | Bun bundler + PGlite WASM errors |
| testing-library/svelte#284 | Svelte 5 support status and gotchas |
| claude-code#10205 | Stop hook infinite loop issue |
| claude-code#10412 | Stop hook exit code bug |
Architecture and patterns
| Source | Relevance |
|---|---|
| AGENTS.md Standard | AGENTS.md file format and purpose |
| AGENTS.md: lessons from 2500+ repos | What makes AGENTS.md files useful |
| Self-Improving Coding Agents | Agent testing loops, structured feedback |
| Neon Testing library | Branch-per-test with real Neon Postgres — future option |