vet

Unnamed repository; edit this file 'description' to name the repository.
Log | Files | Refs | README | LICENSE

commit ada2637d6705f8586ecfe8610f6623989779ee88
Author: Andrew Laack <andrew.laack@imbue.com>
Date:   Wed, 28 Jan 2026 07:50:00 -0600

Init commit

Diffstat:
ADEVELOPMENT.md | 108+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
AREADME.md | 183+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 291 insertions(+), 0 deletions(-)

diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md @@ -0,0 +1,108 @@ +# imbue_verify + +Imbue verify is a library and CLI tool for verifying code quality and correctness. + +## Installation + +From the repository root: + +```bash +uv sync --project imbue-verify +``` + +## Usage + +```bash +uv run imbue-verify --help +uv run imbue-verify "description of what the code change accomplishes" +uv run imbue-verify --list-models +uv run imbue-verify "description of what the code change accomplishes" --model claude-opus-4-5-20251101 +uv run imbue-verify --model claude-opus-4-5-20251101 # no goal specified +uv run imbue-verify --model claude-opus-4-5-20251101 --base-commit main # default is HEAD +``` + +## Custom Models + +You can configure custom models (e.g., Ollama, OpenRouter, or other OpenAI-compatible APIs) by creating a `models.json` file in one of these locations: + +- `~/.config/imbue/models.json` - Global configuration +- `<git-repo-root>/models.json` - Project-specific configuration (overrides global) + +Example configuration: + +```json +{ + "providers": { + "openai": { + "name": "OpenAI", + "api_type": "openai_compatible", + "base_url": "https://api.openai.com/v1", + "api_key_env": "OPENAI_API_KEY", + "models": { + "gpt-4o": { + "model_id": "gpt-4o-2024-08-06", + "context_window": 128000, + "max_output_tokens": 16384 + }, + "gpt-4o-mini": { + "model_id": "gpt-4o-mini-2024-07-18", + "context_window": 128000, + "max_output_tokens": 16384 + }, + "o1": { + "model_id": "o1-2024-12-17", + "context_window": 200000, + "max_output_tokens": 100000 + } + } + } + } +} +``` + +## Exit Status + +The following are the **expected** exit status codes for imbue-verify: + +- `0` - Success, no issues found +- `1` - Issues were found in the code +- `2` - Invalid arguments or configuration + +## Concepts + +### Issue identifiers + +Issue identifiers are pieces of logic capable of finding issues in code. We foresee two basic kinds of those: + +1. File-checking ones. + - To check for "objective" issues in existing files. +2. Commit-checking ones. + - To check for the quality of a single commit. + - "Assuming that we can treat the commit message as a requirement, how well does the commit implement it?" + +By default, `imbue_verify` runs all the registered issue identifiers and outputs all the found issues on the standard output in JSON format. + +#### Adding new Issue Identifiers + +If you want to add a new issue identifier, you need to: + +1. Implement the `IssueIdentifier` protocol from `imbue_tools.repo_utils.data_types`. +2. Register the new issue identifier by adding it to `IDENTIFIERS` in `imbue_verify.issue_identifiers.registry`. + +Based on your needs, instead of the above, you can also extend one of the existing batched zero-shot issue identifiers: + - `imbue_verify/issue_identifiers/batched_commit_check.py` + (for commit checking) +In that case you would simply expand the rubric in the prompt. That is actually the preferred way to catch issues at the moment due to efficiency. +Refer to the source code for more details. + +## Development Notes + +### Logging Configuration + +When creating a new entrypoint into imbue_verify, you must call `ensure_core_log_levels_configured()` to register the custom log levels used throughout the codebase. + +```python +from imbue_core.log_utils import ensure_core_log_levels_configured + +ensure_core_log_levels_configured() +``` diff --git a/README.md b/README.md @@ -0,0 +1,183 @@ +# VET : Verify EveryThing + +VET is a standalone verification tool for **code changes** and **coding agent behavior**. + +It reviews git diffs, and optionally an agent's conversation history, to find issues that tests and linters often miss. VET is optimized for use by humans, CI, and coding agents. + +## Installation + +```bash +pip install vet +``` + +## Quickstart + +Run VET in the current repo: + +```bash +vet "Implement X without breaking Y" +``` + +Compare against a base ref/commit: + +```bash +vet "Refactor storage layer" --base-commit main +``` + +## How it works + +VET snapshots the repo and diff, optionally adds a goal and agent conversation, runs LLM checks, then filters/deduplicates findings into a final list of issues. + +TODO: Create rendering pipeline for this. GitHub MD doesn't directly support Graphviz. + +```dot +digraph VET_DataFlow { + rankdir=TB; + node [shape=box, style=rounded]; + + subgraph cluster_inputs { + label="Inputs"; + diff [label="Git diff"]; + goal [label="Goal\n(optional)", style=dashed]; + history [label="Conversation history\n(optional)", style=dashed]; + } + + context [label="Repo snapshot"]; + checks [label="LLM checks"]; + post [label="Filter + deduplicate"]; + issues [label="Issues"]; + + diff -> context; + diff -> checks; + goal -> checks [style=dashed]; + history -> checks [style=dashed]; + context -> checks; + + checks -> post; + post -> issues; +} +``` + +## Why VET + +- **Verification for agentic workflows**: "the agent said it ran tests" is not the same as "all tests ran successfully". +- **CI-friendly safety net**: catches classes of problems that may not be covered by existing tests. +- **Bring-your-own-model**: can run against hosted providers or local/self-hosted OpenAI-compatible endpoints. +- **No telemetry collected by us**: VET does not collect any user data. + +## Output & exit codes + +- Exit code `0`: no issues found +- Exit code `1`: issues found +- Exit code `2`: invalid usage/configuration error + +Output formats: +- `text` +- `json` + +## CI usage + +Recommended CI usage is to run VET with JSON output and display a warning if any issues are found. + +Example: + +```bash +vet --base-commit main --output-format json > vet-report.json +``` + +- If VET exits `0`, no issues were found. +- If VET exits `1`, issues were found (treat as a failing check). +- If VET exits `2`, the invocation/config is invalid (treat as a failing check). + +## Configuration + +### Model configuration + +VET supports custom model definitions using OpenAI-compatible endpoints via JSON config files searched in: + +- `$XDG_CONFIG_HOME/imbue/models.json` (or `~/.config/imbue/models.json`) +- `models.json` at your repo root + +#### Example `models.json` + +```json +{ + "providers": { + "openai": { + "name": "OpenAI", + "api_type": "openai_compatible", + "base_url": "https://api.openai.com/v1", + "api_key_env": "OPENAI_API_KEY", + "models": { + "gpt-4o": { + "model_id": "gpt-4o-2024-08-06", + "context_window": 128000, + "max_output_tokens": 16384 + }, + "gpt-4o-mini": { + "model_id": "gpt-4o-mini-2024-07-18", + "context_window": 128000, + "max_output_tokens": 16384 + }, + "o1": { + "model_id": "o1-2024-12-17", + "context_window": 200000, + "max_output_tokens": 100000 + } + } + } + } +} +``` + +Then: + +```bash +vet "Harden error handling" --model gpt-4o-mini +``` + +### Configuration profiles (TOML) + +VET supports named profiles so teams can standardize CI usage without long CLI invocations. + +Profiles set defaults like model choice, enabled issue codes, output format, and thresholds. + +## Advanced usage + +### Conversation history + +VET can **optionally** ingest agent conversation history via a **history loader command**. + +#### History loader contract + +`--history-loader` runs a shell command and reads **stdout** as plaintext. + +Security note: this executes a command on your machine. Only run history loader commands you trust. + +- Output format: **any text** +- VET treats this as an opaque transcript (it may include user/assistant messages, tool calls, tool results, logs, etc.) +- If you want VET to catch “claimed to run tests” style issues reliably, ensure your transcript includes tool invocations/results (or other evidence), not just prose. + +Example: + +```bash +vet "Fix flaky tests without behavior changes" \ + --history-loader "<command that prints a transcript>" +``` + +#### Example loaders + +In practice you’ll usually wrap your tool's logs/history with a small command that prints a readable transcript: + +```bash +vet --history-loader "vet-history codex --latest" +vet --history-loader "vet-history opencode --session current" +vet --history-loader "vet-history claude-code --project ." +vet --history-loader "vet-history gemini-cli --latest" +``` + +## Privacy / telemetry + +VET does **not** collect telemetry and does not send usage data to external services. + +If you configure VET to use a hosted inference provider, that provider may log requests; selecting a provider is the user’s responsibility.