Introduction

Holdpoint is a universal eval-guard for AI coding agents. It enforces a set of checkpoints — shell commands and agent instructions — that must pass before any agent can mark a task as done. You define the rules once in a single checks.yaml file. Holdpoint generates the correct adapter for whichever agent you use.

Holdpoint is not tied to any specific agent. It works with any AI coding tool that exposes a hook surface, completion event, or instruction-injection mechanism — including GitHub Copilot CLI, Claude Code, Cursor, and others.

There are two kinds of checks:

  • cmd checks — a shell command (e.g. pnpm test) that Holdpoint runs automatically. If it exits non-zero, the agent is blocked from completing the task.
  • prompt checks — an instruction that Holdpoint surfaces to the agent (e.g. "Update the OpenAPI spec"). The agent reads it and must act before marking the task done.

How it works

  1. Define checks in checks.yaml — one file at your project root declares all cmd and prompt checks, optional file-scope filters, and conditions.
  2. Run holdpoint init — Holdpoint detects your agent and stack, then generates adapter files that hook into the agent's completion mechanism.
  3. The adapter enforces checks at task completion — when the agent tries to finish a task, the adapter runs all relevant checks. Failures block completion and surface the issue.
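
The three steps above can be pictured as a small gate function. This is a minimal sketch under stated assumptions, not Holdpoint's actual adapter code: the `Check` and `GateResult` shapes, the `runGate` function, and the injected `runCmd` callback are hypothetical names chosen for illustration, and the command runner is passed in so the example stays self-contained.

```typescript
// Hypothetical shapes for illustration; not Holdpoint's real internal types.
type Check =
  | { id: string; label: string; cmd: string }
  | { id: string; label: string; prompt: string };

interface GateResult {
  blocked: boolean;       // true when at least one cmd check failed
  failures: string[];     // ids of the failing cmd checks
  instructions: string[]; // prompt texts to surface to the agent
}

// runCmd is injected (an assumption of this sketch) and returns an exit code.
function runGate(checks: Check[], runCmd: (cmd: string) => number): GateResult {
  const failures: string[] = [];
  const instructions: string[] = [];
  for (const check of checks) {
    if ("cmd" in check) {
      // cmd check: a non-zero exit code blocks completion.
      if (runCmd(check.cmd) !== 0) failures.push(check.id);
    } else {
      // prompt check: surfaced to the agent rather than executed.
      instructions.push(check.prompt);
    }
  }
  return { blocked: failures.length > 0, failures, instructions };
}

// Usage with a fake runner in which only "pnpm test" fails.
const gateResult = runGate(
  [
    { id: "lint", label: "ESLint", cmd: "pnpm lint" },
    { id: "test", label: "Unit tests", cmd: "pnpm test" },
    { id: "changelog", label: "Add CHANGELOG entry", prompt: "Add an entry to CHANGELOG.md." },
  ],
  (cmd) => (cmd === "pnpm test" ? 1 : 0),
);
console.log(gateResult);
```

Injecting the runner keeps the gate logic separate from process spawning, which is also what makes it easy to exercise with a fake.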

The adapter mechanism varies by agent:

| Agent | Mechanism | Generated files |
| --- | --- | --- |
| GitHub Copilot CLI | beforeTaskComplete hook in extension.mjs | .github/hooks/holdpoint.json, .github/hooks/holdpoint-check.mjs, .github/holdpoint/generated/checks.immutable.json |
| Claude Code | PostToolUse + Stop hooks in settings.json | .claude/settings.json |
| Cursor | .cursorrules instruction injection; the agent reads the rules and self-enforces | .cursorrules (appended) |

Cursor does not expose a programmatic hook, so Holdpoint injects instructions into .cursorrules for the agent to read and follow. cmd checks are listed as instructions for the agent to run manually; they are not enforced by a runtime hook.

Installation

Run in your project root (git repo required):

npx holdpoint@latest init

Or with the shell installer:

curl -fsSL https://holdpoint.dev/install.sh | sh

Holdpoint auto-detects your agent type and project stack. You can also pass flags:

# Explicit stack + agent
npx holdpoint init --stack=typescript --agent=copilot

# Available stacks: typescript, python, go, nextjs, fullstack
# Available agents: copilot, claude, cursor

Requirements: Node.js 18+, an active git repository, and one of the supported agents installed.

checks.yaml reference

The checks.yaml file lives at your project root and is the single source of truth for all Holdpoint checks. A minimal example:

checks.yaml
version: 1

context:
  guides: {}

conditions: []

checks:
  - id: typecheck
    label: "TypeScript type check"
    cmd: "pnpm typecheck"

  - id: docs-updated
    label: "Update documentation"
    when: docs
    prompt: "Ensure all public APIs changed in this task are documented."

version

Always 1. Required.

session_context_files

Optional list of file paths (relative to repo root) injected as agent context at session start. Useful for injecting project-specific guides or conventions.

session_context_files:
  - MASTER_PROMPT.md
  - AGENT_CONTEXT.md

context

Named guide text injected into every task. Use context.guides to add key/value pairs:

context:
  guides:
    architecture: >
      This project uses a hexagonal architecture.
      Always keep domain logic independent of infrastructure.

conditions

Conditions gate checks — a check with a conditionId only runs when its condition evaluates to true.

conditions:
  - id: has-openapi
    operator: file_exists
    path: openapi.yaml

  - id: has-env-token
    operator: env_var_set
    variable: API_TOKEN

  - id: schema-has-users
    operator: file_contains
    path: prisma/schema.prisma
    contains: "model User"

  - id: server-healthy
    operator: shell_returns_0
    command: "curl -sf http://localhost:3000/health"

| Operator | Required fields | Description |
| --- | --- | --- |
| file_exists | path | True if the file exists on disk |
| file_contains | path, contains | True if the file exists and contains the substring |
| env_var_set | variable | True if the environment variable is non-empty |
| shell_returns_0 | command | True if the shell command exits with code 0 |
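
One way to read the operator table is as a single evaluation function. The sketch below is illustrative only: the `Condition` union mirrors the fields listed above, while the `Probes` interface and the `evaluate` function are hypothetical names, and filesystem, environment, and shell access are injected rather than taken from Node.js so the example has no dependencies.

```typescript
// Condition shapes mirror the operator table; Probes is a hypothetical
// stand-in for filesystem, environment, and shell access.
type Condition =
  | { id: string; operator: "file_exists"; path: string }
  | { id: string; operator: "file_contains"; path: string; contains: string }
  | { id: string; operator: "env_var_set"; variable: string }
  | { id: string; operator: "shell_returns_0"; command: string };

interface Probes {
  readFile(path: string): string | undefined; // undefined when the file is missing
  env: Record<string, string | undefined>;
  exitCode(command: string): number;
}

function evaluate(condition: Condition, probes: Probes): boolean {
  switch (condition.operator) {
    case "file_exists":
      return probes.readFile(condition.path) !== undefined;
    case "file_contains": {
      const text = probes.readFile(condition.path);
      return text !== undefined && text.indexOf(condition.contains) !== -1;
    }
    case "env_var_set": {
      // The table says "non-empty", so an empty string counts as unset.
      const value = probes.env[condition.variable];
      return value !== undefined && value !== "";
    }
    case "shell_returns_0":
      return probes.exitCode(condition.command) === 0;
  }
}

// Usage with fake probes: one file on disk, an empty API_TOKEN, a failing server.
const fakeProbes: Probes = {
  readFile: (path) => (path === "prisma/schema.prisma" ? "model User {\n  id Int @id\n}" : undefined),
  env: { API_TOKEN: "" },
  exitCode: () => 1,
};
const schemaHasUsers = evaluate(
  { id: "schema-has-users", operator: "file_contains", path: "prisma/schema.prisma", contains: "model User" },
  fakeProbes,
);
const hasOpenapi = evaluate({ id: "has-openapi", operator: "file_exists", path: "openapi.yaml" }, fakeProbes);
const hasToken = evaluate({ id: "has-env-token", operator: "env_var_set", variable: "API_TOKEN" }, fakeProbes);
console.log({ schemaHasUsers, hasOpenapi, hasToken });
```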

checks

An array of check definitions. Each check is either a cmd check (has a cmd field) or a prompt check (has a prompt field).

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | yes | Unique identifier for the check |
| label | string | yes | Human-readable display name |
| cmd | string | cmd checks | Shell command; a non-zero exit blocks task completion |
| prompt | string | prompt checks | Instruction the agent must act on before finishing |
| when | string | no | File filter: named scope or regex (see File filters below) |
| conditionId | string | no | Gate this check behind a condition |
| on | string | no | Hook event; currently only before_done (default) |

checks.yaml
checks:
  # cmd check — runs a shell command
  - id: lint
    label: "ESLint"
    cmd: "pnpm lint"

  # cmd check with file filter — only runs when frontend files change
  - id: typecheck-frontend
    label: "TypeScript — frontend"
    when: frontend
    cmd: "pnpm tsc --noEmit"

  # prompt check — instruction surfaced to the agent
  - id: changelog
    label: "Add CHANGELOG entry"
    prompt: "Add an entry to CHANGELOG.md describing what changed in this task."

  # prompt check with condition gate
  - id: openapi-sync
    label: "OpenAPI spec up to date"
    when: backend
    conditionId: has-openapi
    prompt: "Update openapi.yaml to reflect any API route changes."
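
The field rules above lend themselves to a small structural check. The following is a rough sketch, not the logic behind holdpoint validate: `RawCheck` and `validateChecks` are hypothetical names, and two of the rules (exactly one of cmd or prompt per check, and conditionId having to match a declared condition) are assumptions inferred from the descriptions rather than documented behaviour.

```typescript
// RawCheck is a hypothetical shape for one parsed entry under `checks:`.
interface RawCheck {
  id?: string;
  label?: string;
  cmd?: string;
  prompt?: string;
  when?: string;
  conditionId?: string;
  on?: string;
}

// Returns human-readable problems; an empty list means nothing was flagged.
function validateChecks(checks: RawCheck[], conditionIds: string[]): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  checks.forEach((check, index) => {
    const name = check.id ?? `#${index}`;
    if (!check.id) problems.push(`${name}: missing id`);
    if (!check.label) problems.push(`${name}: missing label`);
    if (check.id) {
      if (seen.has(check.id)) problems.push(`${name}: duplicate id`);
      seen.add(check.id);
    }
    // Assumption: exactly one of cmd / prompt must be present.
    const hasCmd = check.cmd !== undefined;
    const hasPrompt = check.prompt !== undefined;
    if (hasCmd === hasPrompt) problems.push(`${name}: needs exactly one of cmd or prompt`);
    // Assumption: a conditionId must refer to a declared condition.
    if (check.conditionId !== undefined && conditionIds.indexOf(check.conditionId) === -1) {
      problems.push(`${name}: unknown conditionId "${check.conditionId}"`);
    }
    // The field table lists before_done as the only hook event.
    if (check.on !== undefined && check.on !== "before_done") {
      problems.push(`${name}: unsupported hook event "${check.on}"`);
    }
  });
  return problems;
}

// Usage: one duplicate id and one check with neither cmd nor prompt.
const problems = validateChecks(
  [
    { id: "lint", label: "ESLint", cmd: "pnpm lint" },
    { id: "lint", label: "Lint review", prompt: "Review the lint output." },
    { id: "openapi-sync", label: "OpenAPI spec up to date", conditionId: "has-openapi" },
  ],
  ["has-openapi"],
);
console.log(problems);
```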

File filters (when:)

The when field narrows which checks activate based on which files changed. If omitted, the check always runs. Holdpoint ships with 16 named scopes covering the most common patterns across GitHub repos.

When no git-staged files are detected (e.g. running holdpoint check without staged changes), all checks run regardless of their when filter.

| Scope | Fires when changed files match |
| --- | --- |
| (absent) | Every task; no file filter applied |
| frontend | **/*.tsx, **/*.jsx, **/*.css, **/*.scss, **/tailwind.config.*, apps/** |
| backend | **/api/**, **/server/**, **/routes/**, **/controllers/**, packages/*/src/** |
| socket | **/socket/**, **/ws/**, **/websocket/** |
| visual | **/*.stories.{ts,tsx}, **/__screenshots__/**, **/*.snap |
| python | **/*.py, **/*.pyi, **/requirements*.txt, **/pyproject.toml |
| go | **/*.go, **/go.mod, **/go.sum |
| rust | **/*.rs, **/Cargo.toml, **/Cargo.lock |
| java | **/*.java, **/*.kt, **/*.gradle, **/pom.xml |
| ruby | **/*.rb, **/Gemfile, **/Rakefile |
| database | **/*.sql, **/migrations/**, **/db/**, **/prisma/**, **/*.prisma |
| prisma | **/prisma/**, **/*.prisma (focused subset for Prisma-specific checks) |
| testing | **/*.test.*, **/*.spec.*, **/__tests__/**, **/tests/**, **/spec/** |
| infra | **/Dockerfile*, **/docker-compose.*, **/*.tf, **/k8s/** |
| ci | **/.github/workflows/**, **/.circleci/**, **/Jenkinsfile, **/.gitlab-ci.yml |
| docs | **/*.mdx, **/*.rst, **/docs/**, **/documentation/** |
| structural | package.json, tsconfig*, go.mod, Cargo.toml, Dockerfile*, docker-compose*, *.tf, openapi.*, .github/workflows/*.yml, vitest/jest/playwright configs, linter configs, and more: any file whose change signals the project's dependency graph or toolchain has shifted |

You can also use any JavaScript regex as the when value. It is tested against each changed file's repo-relative path:

checks:
  - id: e2e
    label: "E2E tests for builder"
    when: "^apps/builder/src/"
    cmd: "pnpm test:e2e"

Named scopes use glob matching (minimatch). Plain strings that are not a named scope are treated as JavaScript regexes. An invalid regex will throw at runtime.
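
The matching rules above can be sketched as a single predicate. This is an approximation for illustration, not Holdpoint's matcher: `shouldRun` and `SCOPE_APPROXIMATIONS` are hypothetical names, and because the real named scopes are minimatch globs, two of them are hand-translated into regexes here purely to keep the example dependency-free.

```typescript
// Approximation: hand-written regex equivalents for two named scopes.
// Holdpoint itself matches named scopes with minimatch globs.
const SCOPE_APPROXIMATIONS = new Map<string, RegExp>([
  ["go", /(^|\/)([^/]*\.go|go\.mod|go\.sum)$/],
  ["docs", /(\.mdx|\.rst)$|(^|\/)(docs|documentation)\//],
]);

function shouldRun(when: string | undefined, changedFiles: string[]): boolean {
  // No filter, or no staged files detected: the check runs.
  if (when === undefined || changedFiles.length === 0) return true;
  // Known scope name, otherwise a JavaScript regex.
  // new RegExp throws a SyntaxError when the pattern is invalid.
  const matcher = SCOPE_APPROXIMATIONS.get(when) ?? new RegExp(when);
  return changedFiles.some((file) => matcher.test(file));
}

// Usage
console.log(shouldRun("go", ["cmd/server/main.go"]));                       // named scope, matching file
console.log(shouldRun("go", ["README.md"]));                                // named scope, no matching file
console.log(shouldRun("^apps/builder/src/", ["apps/builder/src/App.tsx"])); // custom regex
console.log(shouldRun("docs", []));                                         // nothing staged
```

Anchoring a custom pattern with ^ matters here because the regex is tested against the repo-relative path, so an unanchored `apps/builder` would also match `packages/apps/builder`.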

For project-specific paths, define named aliases in a top-level patterns: map so checks stay readable:

checks.yaml
patterns:
  api-routes: "^src/api/"
  openapi-spec: "openapi\\.(yaml|yml|json)$"

checks:
  - id: openapi-lint
    label: "Lint OpenAPI spec"
    when: openapi-spec
    cmd: "npx redocly lint openapi.yaml"

Pattern values are JavaScript regex strings matched against changed file paths. Built-in scope names (frontend, structural, etc.) cannot be overridden in patterns.
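
To make the relationship between built-in scopes, patterns aliases, and raw regexes concrete, here is a hedged sketch of how a when value might be resolved. The names `resolveWhen`, `ResolvedWhen`, and `BUILT_IN_SCOPES` are hypothetical, the lookup order (built-in, then alias, then raw regex) is an assumption, and rejecting an overriding alias with an error is also an assumption: the text above only says that built-ins cannot be overridden.

```typescript
// The 16 built-in scope names from the scope table.
const BUILT_IN_SCOPES = new Set<string>([
  "frontend", "backend", "socket", "visual", "python", "go", "rust", "java",
  "ruby", "database", "prisma", "testing", "infra", "ci", "docs", "structural",
]);

type ResolvedWhen =
  | { kind: "scope"; name: string }    // matched with globs
  | { kind: "regex"; source: string }; // JavaScript regex source

// Assumed order: built-in scope, then alias from `patterns`, then raw regex.
function resolveWhen(when: string, patterns: Record<string, string>): ResolvedWhen {
  for (const alias of Object.keys(patterns)) {
    if (BUILT_IN_SCOPES.has(alias)) {
      // Assumption: an overriding alias is rejected rather than ignored.
      throw new Error(`patterns cannot override built-in scope "${alias}"`);
    }
  }
  if (BUILT_IN_SCOPES.has(when)) return { kind: "scope", name: when };
  const aliased = Object.prototype.hasOwnProperty.call(patterns, when) ? patterns[when] : undefined;
  return { kind: "regex", source: aliased ?? when };
}

// Usage with the patterns map from the example above.
const examplePatterns = {
  "api-routes": "^src/api/",
  "openapi-spec": "openapi\\.(yaml|yml|json)$",
};
console.log(resolveWhen("openapi-spec", examplePatterns)); // alias expands to its regex
console.log(resolveWhen("frontend", examplePatterns));     // built-in scope
console.log(resolveWhen("^apps/web/", examplePatterns));   // raw regex passes through
```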

Supported agents

Holdpoint generates agent-specific adapter files from your checks.yaml. Run holdpoint update after any change to regenerate them.

GitHub Copilot CLI

Holdpoint registers a beforeTaskComplete extension hook. Before Copilot marks a task done, the hook reads git-staged files, runs all matching deterministic checks with a 60-second timeout, and blocks completion if any fail.

Generated files:

  • .github/hooks/holdpoint.json — hook registration
  • .github/hooks/holdpoint-check.mjs — self-contained check runner
  • .github/holdpoint/generated/checks.immutable.json — parsed config

Claude Code

Holdpoint adds PostToolUse and Stop hook entries to .claude/settings.json. The hooks delegate to npx holdpoint check at runtime, which reads the current checks.yaml directly.

.claude/settings.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Task",
        "hooks": [{ "type": "command", "command": "npx holdpoint check" }]
      }
    ],
    "Stop": [
      {
        "hooks": [{ "type": "command", "command": "npx holdpoint check" }]
      }
    ]
  }
}

Cursor

Holdpoint appends a structured instruction block to .cursorrules. The block lists all checks the agent must carry out before marking a task complete. Because Cursor does not expose a programmatic hook, enforcement depends on the agent reading and following the instructions.

Visual builder

The visual builder lets you create and edit checks.yaml without writing YAML by hand. Open it with:

npx @holdpoint/cli@alpha builder

The builder has two views:

  • Graph view — an n8n-style node canvas. Nodes represent triggers (hook events), file filters, checks (cmd/prompt), and conditions. Drag and connect them to define your configuration. Use the side panel to edit node properties.
  • List view — displays checks grouped by hook event and file filter. Supports inline create, edit, and delete without leaving the list. Useful for quickly scanning or bulk-editing checks.

Both views are bidirectionally synced. Use the Export YAML button in the toolbar to copy the generated config.

CLI reference

| Command | Description |
| --- | --- |
| holdpoint init [--stack] [--agent] | Install Holdpoint; detects stack and agent automatically |
| holdpoint check [--staged] | Run all deterministic checks; surface prompt checks |
| holdpoint evolve [--apply] | Scan project and propose (or apply) new checks |
| holdpoint validate | Validate checks.yaml against the schema and print errors |
| holdpoint update | Regenerate adapter files from the current checks.yaml |
| holdpoint builder | Open the visual builder on localhost:4321 |

holdpoint check

Reads git-staged files to determine which checks to run (via when: filter matching). If no staged files are found, all checks run. Use --staged to always scope to staged files only.

cmd checks exit non-zero on failure and print the shell output. prompt checks are displayed as a list of instructions — they are not automatically enforced as commands.

holdpoint update

Must be run after any change to checks.yaml. Regenerates all adapter files. The holdpoint-sync check in the default configuration enforces this automatically when checks.yaml is staged.

holdpoint evolve

Scans the project filesystem, detects languages, frameworks, and tooling, then diffs the result against the current checks.yaml. In dry-run mode (default) it prints proposed new checks and any stale checks whose when: pattern matches zero files. Pass --apply to write all proposals to checks.yaml and regenerate engine files automatically.
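
The stale-check part of the dry run can be illustrated with a few lines of code. This is a simplified sketch, not the evolve implementation: `FilteredCheck` and `findStaleChecks` are hypothetical names, and every when value is treated as a regex string here, whereas named scopes would need Holdpoint's glob tables.

```typescript
// Hypothetical minimal shape: only the fields this sketch needs.
interface FilteredCheck {
  id: string;
  when?: string;
}

// Returns the ids of checks whose `when` pattern matches none of the project files.
// Assumption: every `when` value is a JavaScript regex string.
function findStaleChecks(checks: FilteredCheck[], projectFiles: string[]): string[] {
  const stale: string[] = [];
  for (const check of checks) {
    if (check.when === undefined) continue; // unfiltered checks are never stale
    const pattern = new RegExp(check.when);
    if (!projectFiles.some((file) => pattern.test(file))) stale.push(check.id);
  }
  return stale;
}

// Usage: the builder app has been removed, so its E2E check matches nothing.
const staleIds = findStaleChecks(
  [
    { id: "e2e", when: "^apps/builder/src/" },
    { id: "api-tests", when: "^src/api/" },
    { id: "lint" },
  ],
  ["src/api/users.ts", "README.md"],
);
console.log(staleIds);
```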

The MASTER_PROMPT.md installed by holdpoint init instructs your AI agent to run holdpoint evolve --apply whenever the project structure changes — closing the zero-config evolution loop.

Stack templates

holdpoint init generates a starter checks.yaml based on a stack template. Templates are pre-configured with common checks and appropriate when: file filters.

| Stack | Cmd checks | Prompt checks |
| --- | --- | --- |
| typescript | eslint, tsc | JSDoc coverage, type-hint review |
| python | ruff, mypy, pytest (when: python) | docstrings, type-hints (when: python) |
| go | go build, go vet, go test (when: go) | GoDoc review (when: go), test coverage (when: testing) |
| nextjs | eslint, tsc, next build, lighthouse (when: frontend) | visual regression, accessibility, SEO, OpenAPI (when: backend) |
| fullstack | eslint, tsc, pytest, openapi-diff, playwright | visual check, accessibility, type-hints, db-migrations (when: database), PR description |

Auto-detection: Holdpoint reads project files to select the best template, for example go.mod for Go, pyproject.toml or requirements.txt for Python, and next.config.* for Next.js.
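
Marker-file detection of this kind might look roughly like the following. The sketch covers only the three markers named above; `detectStack` is a hypothetical name, the order in which the rules are tried is an assumption, and returning undefined when nothing matches stands in for whatever fallback Holdpoint actually uses.

```typescript
type Stack = "typescript" | "python" | "go" | "nextjs" | "fullstack";

// rootFiles: file names found at the project root.
// Assumption: rules are tried in this order and the first match wins.
function detectStack(rootFiles: string[]): Stack | undefined {
  const has = (name: string): boolean => rootFiles.indexOf(name) !== -1;
  if (rootFiles.some((file) => /^next\.config\.[^/]+$/.test(file))) return "nextjs";
  if (has("go.mod")) return "go";
  if (has("pyproject.toml") || has("requirements.txt")) return "python";
  return undefined; // no marker recognised by this sketch
}

// Usage
console.log(detectStack(["package.json", "next.config.mjs"]));
console.log(detectStack(["go.mod", "main.go"]));
console.log(detectStack(["requirements.txt", "app.py"]));
console.log(detectStack(["README.md"]));
```

If detection picks the wrong template, the --stack flag shown under Installation overrides it.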

Advanced

Conditions

Use conditionId to gate a check behind a runtime condition. A gated check is skipped entirely when its condition is false — it does not appear in the output at all. Useful for checks that only make sense in certain project setups (e.g. an OpenAPI check that only applies when an openapi.yaml exists).

conditions:
  - id: has-openapi
    operator: file_exists
    path: openapi.yaml

checks:
  - id: openapi-sync
    label: "OpenAPI in sync"
    conditionId: has-openapi   # skipped if openapi.yaml does not exist
    when: backend
    prompt: "Update openapi.yaml for any changed API routes."

Custom when: regex

Any string that is not a named scope is treated as a JavaScript regex and tested against each changed file path. Anchor with ^ to match from the repo root:

checks:
  - id: builder-e2e
    label: "Builder E2E tests"
    when: "^apps/builder/src/"   # only runs when builder source changes
    cmd: "pnpm --filter @holdpoint/builder test:e2e"

Multi-agent projects

holdpoint init detects and configures a single agent. If your project uses multiple agents (e.g. Copilot and Claude Code), run holdpoint init --agent=copilot for the first agent, then manually edit the config for the second agent and run holdpoint update. Automatic multi-agent support is planned.

session_context_files

Files listed under session_context_files are read at session start and injected as additional context into the agent via the sessionStart hook. Useful for injecting project conventions, architectural guides, or onboarding notes that the agent should know before starting any task:

session_context_files:
  - MASTER_PROMPT.md    # project conventions and holdpoint config guide
  - AGENT_CONTEXT.md    # current repo state, what works, what's broken

Keeping generated files in sync

Holdpoint's own checks.yaml includes a holdpoint-sync check that runs npx holdpoint update whenever checks.yaml is staged. Add this to your project to enforce the same invariant:

checks:
  - id: holdpoint-sync
    label: "Regenerate adapter files"
    when: "^checks\\.yaml$"
    cmd: "npx holdpoint update"

Open source under the MIT license.