Introduction
Holdpoint is a universal eval-guard for AI coding agents. It enforces a set of checkpoints — shell commands and agent instructions — that must pass before any agent can mark a task as done. You define the rules once in a single checks.yaml file. Holdpoint generates the correct adapter for whichever agent you use.
Holdpoint is not tied to any specific agent. It works with any AI coding tool that exposes a hook surface, completion event, or instruction-injection mechanism — including GitHub Copilot CLI, Claude Code, Cursor, and others.
There are two kinds of checks:
- cmd checks: a shell command (e.g. pnpm test) that Holdpoint runs automatically. If it exits non-zero, the agent is blocked from completing the task.
- prompt checks: an instruction that Holdpoint surfaces to the agent (e.g. "Update the OpenAPI spec"). The agent reads it and must act before marking the task done.
How it works
- Define checks in checks.yaml: one file at your project root declares all cmd and prompt checks, optional file-scope filters, and conditions.
- Run holdpoint init: Holdpoint detects your agent and stack, then generates adapter files that hook into the agent's completion mechanism.
- The adapter enforces checks at task completion: when the agent tries to finish a task, the adapter runs all relevant checks. Failures block completion and surface the issue.
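The completion gate described in these steps can be sketched roughly as follows. This is a minimal illustration, not Holdpoint's actual implementation: the `Check` and `GateResult` types and the `evaluateGate` function are hypothetical names, and the shell runner is injected as a plain function so the logic stays independent of any process API.

```typescript
// Hypothetical model: a check is either a cmd check or a prompt check.
type Check =
  | { id: string; label: string; cmd: string }
  | { id: string; label: string; prompt: string };

interface GateResult {
  blocked: boolean;       // true when at least one cmd check failed
  failures: string[];     // ids of cmd checks that exited non-zero
  instructions: string[]; // prompt texts to surface to the agent
}

// `run` is an assumed callback that executes a shell command and returns its exit code.
function evaluateGate(checks: Check[], run: (cmd: string) => number): GateResult {
  const failures: string[] = [];
  const instructions: string[] = [];
  for (const check of checks) {
    if ("cmd" in check) {
      // cmd check: a non-zero exit code blocks completion
      if (run(check.cmd) !== 0) failures.push(check.id);
    } else {
      // prompt check: nothing is executed, the text is surfaced to the agent
      instructions.push(check.prompt);
    }
  }
  return { blocked: failures.length > 0, failures, instructions };
}

// Usage with a fake runner in which only the lint command fails.
const outcome = evaluateGate(
  [
    { id: "lint", label: "ESLint", cmd: "pnpm lint" },
    { id: "changelog", label: "Add CHANGELOG entry", prompt: "Add an entry to CHANGELOG.md." },
  ],
  (cmd) => (cmd === "pnpm lint" ? 1 : 0),
);
console.log(outcome.blocked); // true
```

Injecting the runner keeps the decision logic testable without spawning real processes.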
The adapter mechanism varies by agent:
| Agent | Mechanism | Generated files |
|---|---|---|
| GitHub Copilot CLI | beforeTaskComplete hook in extension.mjs | .github/hooks/holdpoint.json, .github/hooks/holdpoint-check.mjs, .github/holdpoint/generated/checks.immutable.json |
| Claude Code | PostToolUse + Stop hooks in settings.json | .claude/settings.json |
| Cursor | .cursorrules instruction injection — agent reads rules and self-enforces | .cursorrules (appended) |
Note: for Cursor, Holdpoint writes the checks into .cursorrules so the agent reads and follows them. cmd checks are listed as instructions for the agent to run manually, not enforced by a runtime hook.
Installation
Run in your project root (git repo required):
npx holdpoint@latest init
Or with the shell installer:
curl -fsSL https://holdpoint.dev/install.sh | sh
Holdpoint auto-detects your agent type and project stack. You can also pass flags:
# Explicit stack + agent
npx holdpoint init --stack=typescript --agent=copilot
# Available stacks: typescript, python, go, nextjs, fullstack
# Available agents: copilot, claude, cursor
Requirements: Node.js 18+, an active git repository, and one of the supported agents installed.
checks.yaml reference
The checks.yaml file lives at your project root and is the single source of truth for all Holdpoint checks. A minimal example:
version: 1
context:
  guides: {}
conditions: []
checks:
  - id: typecheck
    label: "TypeScript type check"
    cmd: "pnpm typecheck"
  - id: docs-updated
    label: "Update documentation"
    when: docs
    prompt: "Ensure all public APIs changed in this task are documented."
version
Always 1. Required.
session_context_files
Optional list of file paths (relative to repo root) injected as agent context at session start. Useful for injecting project-specific guides or conventions.
session_context_files:
  - MASTER_PROMPT.md
  - AGENT_CONTEXT.md
context
Named guide text injected into every task. Use context.guides to add key/value pairs:
context:
  guides:
    architecture: >
      This project uses a hexagonal architecture.
      Always keep domain logic independent of infrastructure.
conditions
Conditions gate checks — a check with a conditionId only runs when its condition evaluates to true.
conditions:
  - id: has-openapi
    operator: file_exists
    path: openapi.yaml
  - id: has-env-token
    operator: env_var_set
    variable: API_TOKEN
  - id: schema-has-users
    operator: file_contains
    path: prisma/schema.prisma
    contains: "model User"
  - id: server-healthy
    operator: shell_returns_0
    command: "curl -sf http://localhost:3000/health"
| Operator | Required fields | Description |
|---|---|---|
| file_exists | path | True if the file exists on disk |
| file_contains | path, contains | True if the file exists and contains the substring |
| env_var_set | variable | True if the environment variable is non-empty |
| shell_returns_0 | command | True if the shell command exits with code 0 |
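As a rough illustration of the four operators in the table, the sketch below evaluates a condition against an injected host. The `Condition` type mirrors the documented fields; the `Host` interface and the `evaluateCondition` function are hypothetical and are not Holdpoint's API. They exist only so the logic can run without touching the disk or spawning a shell.

```typescript
type Condition =
  | { id: string; operator: "file_exists"; path: string }
  | { id: string; operator: "file_contains"; path: string; contains: string }
  | { id: string; operator: "env_var_set"; variable: string }
  | { id: string; operator: "shell_returns_0"; command: string };

// Hypothetical host abstraction (an assumption of this sketch).
interface Host {
  readFile(path: string): string | undefined; // undefined when the file is missing
  env: Record<string, string | undefined>;
  exitCode(command: string): number;
}

function evaluateCondition(condition: Condition, host: Host): boolean {
  switch (condition.operator) {
    case "file_exists":
      return host.readFile(condition.path) !== undefined;
    case "file_contains": {
      // True only if the file exists and contains the substring
      const text = host.readFile(condition.path);
      return text !== undefined && text.indexOf(condition.contains) !== -1;
    }
    case "env_var_set": {
      // True only if the variable is set to a non-empty value
      const value = host.env[condition.variable];
      return value !== undefined && value !== "";
    }
    case "shell_returns_0":
      return host.exitCode(condition.command) === 0;
  }
}
```

A real host would back `readFile`, `env`, and `exitCode` with the file system, the process environment, and a shell. The fake-friendly interface is a design choice of this sketch only.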
checks
An array of check definitions. Each check is either a cmd check (has a cmd field) or a prompt check (has a prompt field).
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | yes | Unique identifier for the check |
| label | string | yes | Human-readable display name |
| cmd | string | cmd checks | Shell command — exits non-zero to block task completion |
| prompt | string | prompt checks | Instruction the agent must act on before finishing |
| when | string | no | File filter: named scope or regex — see below |
| conditionId | string | no | Gate this check behind a condition |
| on | string | no | Hook event — currently only before_done (default) |
checks:
  # cmd check: runs a shell command
  - id: lint
    label: "ESLint"
    cmd: "pnpm lint"
  # cmd check with file filter: only runs when frontend files change
  - id: typecheck-frontend
    label: "TypeScript - frontend"
    when: frontend
    cmd: "pnpm tsc --noEmit"
  # prompt check: instruction surfaced to the agent
  - id: changelog
    label: "Add CHANGELOG entry"
    prompt: "Add an entry to CHANGELOG.md describing what changed in this task."
  # prompt check with condition gate
  - id: openapi-sync
    label: "OpenAPI spec up to date"
    when: backend
    conditionId: has-openapi
    prompt: "Update openapi.yaml to reflect any API route changes."
File filters (when:)
The when field narrows which checks activate based on which files changed. If omitted, the check always runs. Holdpoint ships with 16 named scopes covering the most common patterns across GitHub repos.
When no git-staged files are detected (e.g. running holdpoint check without staged changes), all checks run regardless of their when filter.
| Scope | Fires when changed files match |
|---|---|
| (absent) | Every task — no file filter applied |
| frontend | **/*.tsx, **/*.jsx, **/*.css, **/*.scss, **/tailwind.config.*, apps/** |
| backend | **/api/**, **/server/**, **/routes/**, **/controllers/**, packages/*/src/** |
| socket | **/socket/**, **/ws/**, **/websocket/** |
| visual | **/*.stories.{ts,tsx}, **/__screenshots__/**, **/*.snap |
| python | **/*.py, **/*.pyi, **/requirements*.txt, **/pyproject.toml |
| go | **/*.go, **/go.mod, **/go.sum |
| rust | **/*.rs, **/Cargo.toml, **/Cargo.lock |
| java | **/*.java, **/*.kt, **/*.gradle, **/pom.xml |
| ruby | **/*.rb, **/Gemfile, **/Rakefile |
| database | **/*.sql, **/migrations/**, **/db/**, **/prisma/**, **/*.prisma |
| prisma | **/prisma/**, **/*.prisma — focused subset for Prisma-specific checks |
| testing | **/*.test.*, **/*.spec.*, **/__tests__/**, **/tests/**, **/spec/** |
| infra | **/Dockerfile*, **/docker-compose.*, **/*.tf, **/k8s/** |
| ci | **/.github/workflows/**, **/.circleci/**, **/Jenkinsfile, **/.gitlab-ci.yml |
| docs | **/*.mdx, **/*.rst, **/docs/**, **/documentation/** |
| structural | package.json, tsconfig*, go.mod, Cargo.toml, Dockerfile*, docker-compose*, *.tf, openapi.*, .github/workflows/*.yml, vitest/jest/playwright configs, linter configs, and more — any file whose change signals the project's dependency graph or toolchain has shifted |
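The activation rules for when can be sketched as below: a check without a filter always runs, every check runs when no changed files are detected, and otherwise at least one changed path must match. This is an illustrative sketch, not Holdpoint's code. The `SCOPES` table and the `shouldRun` function are hypothetical, and the go scope is approximated with a regex even though the table above documents it as a list of globs. Values that are not scope names are treated as JavaScript regexes.

```typescript
// Hypothetical scope table; only one scope is modelled, as a regex approximation of its globs.
const SCOPES: Record<string, RegExp> = {
  go: /(\.go|(^|\/)go\.(mod|sum))$/,
};

function shouldRun(when: string | undefined, changedFiles: string[]): boolean {
  if (when === undefined) return true;        // no filter: runs on every task
  if (changedFiles.length === 0) return true; // nothing staged: all checks run
  const pattern = Object.prototype.hasOwnProperty.call(SCOPES, when)
    ? SCOPES[when]      // named scope
    : new RegExp(when); // anything else is treated as a regex
  // The pattern is tested against each changed file's repo-relative path.
  return changedFiles.some((file) => pattern.test(file));
}

console.log(shouldRun("go", ["cmd/server/main.go"])); // true
console.log(shouldRun("go", ["README.md"]));          // false
```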
You can also use any JavaScript regex as the when value. It is tested against each changed file's repo-relative path:
checks:
  - id: e2e
    label: "E2E tests for builder"
    when: "^apps/builder/src/"
    cmd: "pnpm test:e2e"
For project-specific paths, define named aliases in a top-level patterns: map so checks stay readable:
patterns:
  api-routes: "^src/api/"
  openapi-spec: "openapi\\.(yaml|yml|json)$"
checks:
  - id: openapi-lint
    label: "Lint OpenAPI spec"
    when: openapi-spec
    cmd: "npx redocly lint openapi.yaml"
Note: built-in scope names (frontend, structural, etc.) cannot be overridden in patterns.
Supported agents
Holdpoint generates agent-specific adapter files from your checks.yaml. Run holdpoint update after any change to regenerate them.
GitHub Copilot CLI
Holdpoint registers a beforeTaskComplete extension hook. Before Copilot marks a task done, the hook reads git-staged files, runs all matching deterministic checks with a 60-second timeout, and blocks completion if any fail.
Generated files:
- .github/hooks/holdpoint.json — hook registration
- .github/hooks/holdpoint-check.mjs — self-contained check runner
- .github/holdpoint/generated/checks.immutable.json — parsed config
Claude Code
Holdpoint adds PostToolUse and Stop hook entries to .claude/settings.json. The hooks delegate to npx holdpoint check at runtime, which reads the current checks.yaml directly.
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Task",
        "hooks": [{ "type": "command", "command": "npx holdpoint check" }]
      }
    ],
    "Stop": [
      { "type": "command", "command": "npx holdpoint check" }
    ]
  }
}
Cursor
Holdpoint appends a structured instruction block to .cursorrules. The block lists all checks the agent must carry out before marking a task complete. Because Cursor does not expose a programmatic hook, enforcement depends on the agent reading and following the instructions.
Visual builder
The visual builder lets you create and edit checks.yaml without writing YAML by hand. Open it with:
npx @holdpoint/cli@alpha builder
The builder has two views:
- Graph view — an n8n-style node canvas. Nodes represent triggers (hook events), file filters, checks (cmd/prompt), and conditions. Drag and connect them to define your configuration. Use the side panel to edit node properties.
- List view — displays checks grouped by hook event and file filter. Supports inline create, edit, and delete without leaving the list. Useful for quickly scanning or bulk-editing checks.
Both views are bidirectionally synced. Use the Export YAML button in the toolbar to copy the generated config.
CLI reference
| Command | Description |
|---|---|
| holdpoint init [--stack] [--agent] | Install Holdpoint — detects stack + agent automatically |
| holdpoint check [--staged] | Run all deterministic checks; surface prompt checks |
| holdpoint evolve [--apply] | Scan project and propose (or apply) new checks |
| holdpoint validate | Validate checks.yaml against the schema and print errors |
| holdpoint update | Regenerate adapter files from the current checks.yaml |
| holdpoint builder | Open the visual builder on localhost:4321 |
holdpoint check
Reads git-staged files to determine which checks to run (via when: filter matching). If no staged files are found, all checks run. Use --staged to always scope to staged files only.
A cmd check fails when its command exits non-zero; Holdpoint then prints the command's shell output. prompt checks are displayed as a list of instructions; they are not automatically enforced as commands.
holdpoint update
Must be run after any change to checks.yaml. Regenerates all adapter files. The holdpoint-sync check in the default configuration enforces this automatically when checks.yaml is staged.
holdpoint evolve
Scans the project filesystem, detects languages, frameworks, and tooling, then diffs the result against the current checks.yaml. In dry-run mode (default) it prints proposed new checks and any stale checks whose when: pattern matches zero files. Pass --apply to write all proposals to checks.yaml and regenerate engine files automatically.
The MASTER_PROMPT.md installed by holdpoint init instructs your AI agent to run holdpoint evolve --apply whenever the project structure changes — closing the zero-config evolution loop.
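The stale-check rule mentioned above (a when: pattern that matches zero files) can be sketched as follows. This is an assumption-laden illustration rather than the evolve implementation: `FilteredCheck` and `findStaleChecks` are hypothetical names, and only regex filters are handled. Named scopes would first need to be resolved to their patterns.

```typescript
interface FilteredCheck {
  id: string;
  when?: string; // assumed here to be a JavaScript regex source, not a named scope
}

// Returns the ids of checks whose filter matches none of the project's files.
function findStaleChecks(checks: FilteredCheck[], projectFiles: string[]): string[] {
  const stale: string[] = [];
  for (const check of checks) {
    if (check.when === undefined) continue; // unfiltered checks can never be stale
    const pattern = new RegExp(check.when);
    if (!projectFiles.some((file) => pattern.test(file))) stale.push(check.id);
  }
  return stale;
}
```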
Stack templates
holdpoint init generates a starter checks.yaml based on a stack template. Templates are pre-configured with common checks and appropriate when: file filters.
| Stack | Cmd checks | Prompt checks |
|---|---|---|
| typescript | eslint, tsc | JSDoc coverage, type-hint review |
| python | ruff, mypy, pytest (when: python) | docstrings, type-hints (when: python) |
| go | go build, go vet, go test (when: go) | GoDoc review (when: go), test coverage (when: testing) |
| nextjs | eslint, tsc, next build, lighthouse (when: frontend) | visual regression, accessibility, SEO, OpenAPI (when: backend) |
| fullstack | eslint, tsc, pytest, openapi-diff, playwright | visual check, accessibility, type-hints, db-migrations (when: database), PR description |
Auto-detection: Holdpoint reads project files to select the best template: go.mod for Go, pyproject.toml or requirements.txt for Python, next.config.* for Next.js, and so on.
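A marker-file lookup of this kind might look like the sketch below. It covers only the three markers named above. The `detectStack` function, the `Stack` type, and the order in which markers are tried are assumptions made for illustration; the real detection logic and its precedence are not documented here.

```typescript
type Stack = "go" | "python" | "nextjs";

// `rootFiles` is assumed to be the list of file names at the project root.
function detectStack(rootFiles: string[]): Stack | undefined {
  const has = (test: (name: string) => boolean): boolean => rootFiles.some(test);
  // Assumed precedence: the most specific marker is tried first.
  if (has((name) => /^next\.config\./.test(name))) return "nextjs";
  if (has((name) => name === "go.mod")) return "go";
  if (has((name) => name === "pyproject.toml" || name === "requirements.txt")) return "python";
  return undefined; // no known marker found
}

console.log(detectStack(["go.mod", "main.go"])); // "go"
```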
Advanced
Conditions
Use conditionId to gate a check behind a runtime condition. A gated check is skipped entirely when its condition is false — it does not appear in the output at all. Useful for checks that only make sense in certain project setups (e.g. an OpenAPI check that only applies when an openapi.yaml exists).
conditions:
  - id: has-openapi
    operator: file_exists
    path: openapi.yaml
checks:
  - id: openapi-sync
    label: "OpenAPI in sync"
    conditionId: has-openapi # skipped if openapi.yaml does not exist
    when: backend
    prompt: "Update openapi.yaml for any changed API routes."
Custom when: regex
Any string that is not a named scope is treated as a JavaScript regex and tested against each changed file path. Anchor with ^ to match from the repo root:
checks:
  - id: builder-e2e
    label: "Builder E2E tests"
    when: "^apps/builder/src/" # only runs when builder source changes
    cmd: "pnpm --filter @holdpoint/builder test:e2e"
Multi-agent projects
holdpoint init detects and configures a single agent. If your project uses multiple agents (e.g. Copilot and Claude Code), run holdpoint init --agent=copilot for the first agent, then manually edit the config for the second agent and run holdpoint update. Automatic multi-agent support is planned.
session_context_files
Files listed under session_context_files are read at session start and injected as additional context into the agent via the sessionStart hook. Useful for injecting project conventions, architectural guides, or onboarding notes that the agent should know before starting any task:
session_context_files:
  - MASTER_PROMPT.md # project conventions and holdpoint config guide
  - AGENT_CONTEXT.md # current repo state, what works, what's broken
Keeping generated files in sync
Holdpoint's own checks.yaml includes a holdpoint-sync check that runs npx holdpoint update whenever checks.yaml is staged. Add this to your project to enforce the same invariant:
checks:
  - id: holdpoint-sync
    label: "Regenerate adapter files"
    when: "^checks\\.yaml$"
    cmd: "npx holdpoint update"
Open source under the MIT license.