This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Evalite is a TypeScript-native, local-first tool, built on Vitest, for testing LLM-powered apps. It lets developers write evaluations (evals) as .eval.ts files that run like tests.
The primary configuration method is evalite.config.ts. While vitest.config.ts is still supported for backward compatibility, it is not documented and evalite.config.ts should be used for all configuration needs.
Development mode (recommended for working on Evalite itself):
pnpm run dev
This runs:
- TypeScript type checker on the evalite package
- Tests in the evalite-tests package
- Live reload for both packages
Build all packages:
pnpm build
This builds the evalite package first, then evalite-ui, copying UI assets to packages/evalite/dist/ui.
Run CI pipeline (build, test, lint):
pnpm ci
Test the example evals:
pnpm run example
# Or: cd packages/example && pnpm evalite watch
Run single package tests:
cd packages/evalite && pnpm test
cd packages/evalite-tests && pnpm test
Lint a package:
cd packages/evalite && pnpm lint
When working on specific packages in this monorepo, use pnpm's --filter flag to run commands on specific packages.
Build a specific package:
pnpm --filter evalite build
pnpm --filter evalite-ui build
Run tests for a specific package:
pnpm --filter evalite-tests test
Run dev mode for a specific package:
pnpm --filter evalite dev
pnpm --filter evalite-ui dev
Lint a specific package:
pnpm --filter evalite lint
pnpm --filter evalite-tests lint
pnpm supports several filter patterns:
- --filter evalite: Run task for the evalite package only
- --filter evalite...: Run task for evalite and all its dependencies
- --filter ...evalite: Run task for evalite and all packages that depend on it
- --filter "./packages/*": Run task for all packages in the packages directory
- --filter "!evalite": Run task for all packages except evalite
Working on the main evalite package:
# Build evalite and watch for changes
pnpm --filter evalite dev
# Run tests after making changes
pnpm --filter evalite test
Working on the UI:
# Build evalite first, then start UI dev server
pnpm run build:evalite && pnpm --filter evalite-ui dev
Working on integration tests:
# Ensure evalite is built before running tests
pnpm run build && pnpm --filter evalite-tests test
Use filters when:
- You need to run commands on specific packages
- You want to avoid changing directories
- You're running build, test, or lint tasks
Direct package commands are fine for:
- Quick one-off commands (like pnpm install)
- Running the evalite CLI itself (e.g., cd packages/example && pnpm evalite watch)
- When already in the package directory
This is a pnpm workspace:
- packages/evalite: Main package that users install. Exports the evalite() function, CLI binary (evalite), server, database layer, and utilities. Built with TypeScript.
- packages/evalite-core: Shared core utilities (currently appears to be deprecated or minimal)
- packages/evalite-tests: Integration tests for evalite functionality
- packages/example: Example eval files demonstrating usage patterns (e.g., example.eval.ts, traces.eval.ts)
- apps/evalite-ui: React-based web UI that displays eval results. Built with Vite, TanStack Router, and Tailwind. Gets copied to packages/evalite/dist/ui during build via the after-build script.
- apps/evalite-docs: Documentation site
Eval files: Files matching *.eval.ts (or .eval.mts) that contain evalite() calls. These define:
- A dataset (via data() function returning input/expected pairs)
- A task (the LLM interaction to test)
- Scorers (functions that evaluate output quality)
- Optional columns for custom data display
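As a rough illustration of how the dataset, task, and scorers fit together, here is a self-contained sketch. The types and the runEvalSketch helper are hypothetical and exist only for this example; they are not Evalite's exported API. In a real eval file these pieces are passed to evalite() instead.

```typescript
// Hypothetical local types mirroring the pieces listed above.
// These are NOT Evalite's exported types.
type DataPoint<I, E> = { input: I; expected: E };
type Scorer<I, O, E> = (args: { input: I; output: O; expected: E }) => number;

interface EvalDefinition<I, O, E> {
  data: () => Promise<DataPoint<I, E>[]>;
  task: (input: I) => Promise<O>;
  scorers: Scorer<I, O, E>[];
}

// Runs the task for every data point concurrently and returns,
// per data point, the average of all scorer results.
async function runEvalSketch<I, O, E>(
  def: EvalDefinition<I, O, E>
): Promise<number[]> {
  const points = await def.data();
  return Promise.all(
    points.map(async ({ input, expected }) => {
      const output = await def.task(input);
      const scores = def.scorers.map((s) => s({ input, output, expected }));
      return scores.reduce((a, b) => a + b, 0) / Math.max(scores.length, 1);
    })
  );
}

// A scorer returns a number between 0 and 1; this one checks exact equality.
const exactMatch: Scorer<string, string, string> = ({ output, expected }) =>
  output === expected ? 1 : 0;

const greetingEval: EvalDefinition<string, string, string> = {
  data: async () => [
    { input: "Hello", expected: "Hello World!" },
    { input: "Bye", expected: "Goodbye!" },
  ],
  task: async (input) => `${input} World!`, // stand-in for an LLM call
  scorers: [exactMatch],
};
```

Calling runEvalSketch(greetingEval) yields one averaged score per data point, which is roughly the shape of information the results view displays.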
Execution flow:
- The evalite CLI uses Vitest under the hood to discover and run *.eval.ts files
- Each eval creates a Vitest describe block with concurrent it tests for each data point
- Results are stored in a SQLite database (evalite.db)
- A Fastify server serves the UI and provides WebSocket updates during runs
- Files (images, audio, etc.) are saved to the .evalite directory
Key architecture points:
- Uses Vitest's inject("cwd") to get the working directory
- Supports async iterables (streaming) from tasks via executeTask()
- Files in input/output/expected are automatically detected and saved using createEvaliteFileIfNeeded()
- Traces can be reported via reportTraceLocalStorage for nested LLM calls
- Integrates with AI SDK via evalite/ai-sdk export (provides traceAISDKModel())
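The streaming point above can be pictured with a small sketch of the general pattern: if a task returns an async iterable, its chunks are drained and joined before scoring. The helper names here (isAsyncIterable, collectTaskOutput, streamingTask) are hypothetical, and this is not the actual executeTask() implementation.

```typescript
// Hypothetical helpers illustrating the general pattern only;
// this is NOT Evalite's executeTask().
function isAsyncIterable(value: unknown): value is AsyncIterable<unknown> {
  return (
    typeof value === "object" &&
    value !== null &&
    Symbol.asyncIterator in value
  );
}

// Returns plain values unchanged; drains async iterables into one string.
async function collectTaskOutput(result: unknown): Promise<unknown> {
  if (!isAsyncIterable(result)) return result;
  const chunks: string[] = [];
  for await (const chunk of result) {
    chunks.push(String(chunk));
  }
  return chunks.join("");
}

// Stand-in for a task that streams its output word by word.
async function* streamingTask(input: string): AsyncGenerator<string> {
  for (const word of input.split(" ")) {
    yield word + " ";
  }
}
```

With this shape, scorers always receive a fully materialized output regardless of whether the task streamed it.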
SQLite database (evalite.db) stores:
- Runs (full or partial)
- Evals (distinct eval names with metadata)
- Results (individual test case results with scores, traces, columns)
- Scores and traces are stored as JSON
Key queries in packages/evalite/src/db.ts:
- getEvals(), getResults(), getScores(), getTraces()
- getMostRecentRun(), getPreviousCompletedEval()
The Fastify server in packages/evalite/src/server.ts:
- Serves the UI from dist/ui/
- Provides REST API at /api/* (menu-items, server-state, evals, results, etc.)
- WebSocket endpoint at /api/socket for live updates during eval runs
Linking for local development: If you need to test the global evalite command locally:
pnpm build
cd packages/evalite && npm link
Node version: Requires Node.js >= 22
Environment setup for examples: Create a .env file in packages/example with:
OPENAI_API_KEY=your-api-key
File extensions: Both .eval.ts and .eval.mts files are supported (see changeset #151)
To add a changeset, write a new file to the .changeset directory.
The file should be named 0000-your-change.md. Decide yourself whether to make it a patch, minor, or major change.
The format of the file should be:
---
"evalite": patch
---
Description of the change.
The description of the change should be user-facing, describing which features were added or bugs were fixed.
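If it helps to generate the file body programmatically, the following is a minimal sketch. The formatChangeset helper and the sample description are hypothetical, and the blank line after the closing --- is an assumption based on common changeset files rather than something stated here.

```typescript
type Bump = "patch" | "minor" | "major";

// Hypothetical helper: builds the text of a changeset file in the
// format described above (front matter, then a user-facing description).
function formatChangeset(pkg: string, bump: Bump, description: string): string {
  return [
    "---",
    `"${pkg}": ${bump}`,
    "---",
    "", // assumed blank line between front matter and description
    description.trim(),
    "",
  ].join("\n");
}

// Illustrative description only; write one that matches the real change.
const body = formatChangeset(
  "evalite",
  "patch",
  "Fixed a crash when an eval defines no scorers."
);
```

The resulting string can then be written to a file such as .changeset/0000-your-change.md.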