This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
- `bun install` - Install dependencies
- `bun run build` - Build the project using tsdown, outputs to `dist/`
- `bun run lint` - Run Biome linter and formatter
- `bun run typecheck` - Run TypeScript type checking using tsgo
- `bun test` - Run unit tests
- `bun release` - Full release process: lint, typecheck, test, build, and version bump

- `bunx . [url]` - Run the MCP server for a specific URL
  - Example: `bunx . https://example.com --concurrency 5 --match "/**" --cache`
This is a Model Context Protocol (MCP) server that fetches websites and exposes them as tools for AI assistants.
- MCP Server (`server.ts`): Creates dynamic tools based on the fetched site:
  - `indexOf{SiteName}` - Returns a paginated list of all pages
  - `getDocumentOf{SiteName}` - Returns the content of specific pages
  - Uses YAML format for responses
  - Implements file-based caching in `~/.cache/sitemcp`
- Site Fetcher (`fetch-site.ts`): Queue-based concurrent web crawler:
  - Uses Cheerio for link extraction
  - Uses Happy DOM + Readability for content extraction
  - Respects match patterns and same-origin policy
  - Handles redirects and non-HTML content gracefully
- CLI (`cli.ts`): Command-line interface using the gunshi framework:
  - Validates inputs and configures the server
  - Supports options for concurrency, caching, content selectors, etc.
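
The queue-based crawling described for the Site Fetcher can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `fetch-site.ts`: the names `crawl` and `LinkFetcher` are hypothetical, link extraction is injected as a function rather than done with Cheerio, and match patterns, redirects, and content extraction are left out. It only shows a URL queue drained with a concurrency cap and a same-origin check.

```typescript
// Hypothetical type: given a page URL, resolve to the raw hrefs found on it.
type LinkFetcher = (url: string) => Promise<string[]>;

// Hypothetical sketch of a queue-based crawler with a concurrency cap.
// Resolves to every same-origin URL reachable from startUrl.
function crawl(
  startUrl: string,
  fetchLinks: LinkFetcher,
  concurrency = 3, // mirrors the documented default
): Promise<string[]> {
  const start = new URL(startUrl);
  const origin = start.origin;
  const seen = new Set<string>([start.href]);
  const queue: string[] = [start.href];
  let active = 0; // number of fetches currently in flight

  return new Promise<string[]>((resolve, reject) => {
    const pump = (): void => {
      // Start new fetches while there is spare capacity and queued work.
      while (active < concurrency && queue.length > 0) {
        const url = queue.shift() as string;
        active++;
        fetchLinks(url)
          .then((links) => {
            for (const link of links) {
              const next = new URL(link, url); // resolve relative hrefs
              next.hash = ""; // treat /page and /page#top as the same page
              if (next.origin === origin && !seen.has(next.href)) {
                seen.add(next.href);
                queue.push(next.href);
              }
            }
          })
          .then(
            () => {
              active--;
              pump(); // a slot freed up, so try to schedule more work
            },
            (error) => reject(error),
          );
      }
      // Done only when nothing is queued and nothing is in flight.
      if (active === 0 && queue.length === 0) resolve([...seen]);
    };
    pump();
  });
}
```

The crawl finishes only when the queue is empty and no request is in flight, since an in-flight page may still add new URLs to the queue.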
- Content Extraction: Uses Mozilla's Readability library to extract article content intelligently
- Concurrent Fetching: Queue-based system with configurable concurrency (default: 3)
- Tool Naming: Dynamic tool names generated from site URL (domain/subdomain/pathname strategies)
- Response Format: YAML for better readability in AI contexts
- Logging: Custom logger outputs to stderr (stdout reserved for MCP protocol)
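
The stderr-only logging rule above might look something like the sketch below. The `createLogger` factory, its `prefix` argument, and the injectable `sink` are assumptions made for illustration and are not taken from the repository; the only point is that log lines go through `console.error` (stderr), so stdout carries nothing but MCP protocol messages.

```typescript
// Hypothetical sink type: receives one formatted log line.
type Sink = (line: string) => void;

// Hypothetical logger factory. The default sink is console.error, which
// writes to stderr and therefore never pollutes the stdout protocol stream.
function createLogger(
  prefix: string,
  sink: Sink = (line) => console.error(line),
) {
  const format = (level: string, message: string): string =>
    `[${prefix}] ${level}: ${message}`;

  return {
    info: (message: string): void => sink(format("info", message)),
    warn: (message: string): void => sink(format("warn", message)),
    error: (message: string): void => sink(format("error", message)),
  };
}

// Usage: everything below ends up on stderr.
const logger = createLogger("sitemcp");
logger.info("fetching https://example.com");
```

Making the sink injectable keeps the logger easy to unit test without capturing process streams.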
Tests use Bun's built-in test runner and focus on utility functions. Run individual tests with:
- `bun test utils.test.ts` - Run a specific test file
- `bun test -t "test name"` - Run tests matching a pattern