vet


commit 552a66e3689b914d911f736da4ab1ac7a4a671ea
parent 6fece3b55d2300623e84da444b1f5305fec747d0
Author: andrewlaack-collab <andrew.laack@imbue.com>
Date:   Wed, 11 Feb 2026 23:55:05 +0000

Custom prompts for existing issue codes (#62)

* Address one failure mode

* Updated development documentation version bump note. Added custom prompts for existing issue codes.

* Updated readme for documenting feature. Updated location for models to be consistent with this

* Updated type definition locations to fix dependency direction.

* Updated how specification of custom guides is done to ensure prefix and suffix can be paired.

* Added tests for custom issue guides.

* Centralize issue code loading for codes in use.

* Removed unnecessary wrapper
Diffstat:
M DEVELOPMENT.md | 10 ++++++----
M README.md | 27 ++++++++++++++++++++++++++-
M skills/vet/SKILL.md | 2 +-
M uv.lock | 2 +-
A vet/cli/config/custom_guides_loader_test.py | 163 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M vet/cli/config/loader.py | 52 ++++++++++++++++++++++++++++++++++++++++++++++++----
M vet/cli/config/loader_test.py | 8 ++++----
M vet/cli/main.py | 17 +++++++++++------
M vet/imbue_core/data_types.py | 31 +++++++++++++++++++++++++++++++
M vet/imbue_tools/types/vet_config.py | 7 ++++++-
A vet/issue_identifiers/custom_guides_merge_test.py | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
M vet/issue_identifiers/identification_guides.py | 35 +++++++++++++++++++++++++++++++++++
M vet/issue_identifiers/registry.py | 15 ++++++++++++---
13 files changed, 428 insertions(+), 25 deletions(-)

diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
@@ -51,18 +51,20 @@ Vet is published to PyPI via the `publish-to-pypi.yml` GitHub Actions workflow.

 ### Releasing a new version

-1. Update the version in `pyproject.toml`.
-2. Commit and push the change.
+1. Update the version in `pyproject.toml`
+2. Commit and push the change
 3. Tag the commit and push the tag:
 ```bash
 git tag v0.2.0
 git push origin v0.2.0
 ```
-4. The `Publish to PyPI` workflow will automatically build and publish the package.
-5. After the package is published, update the recommended GitHub action pinned version in the `README.md`
+4. The `Publish to PyPI` workflow will automatically build and publish the package
+5. Update the recommended GitHub action pinned version in the `README.md`
 ```yaml
 - run: pip install verify-everything==0.2.0
 ```
+6. Update the pinned version for this project
+7. Commit and push this change

 ### Why pin the version in the README?

diff --git a/README.md b/README.md
@@ -145,7 +145,7 @@ Output formats:

 Vet supports custom model definitions using OpenAI-compatible endpoints via JSON config files searched in:

-- `$XDG_CONFIG_HOME/imbue/models.json` (or `~/.config/imbue/models.json`)
+- `$XDG_CONFIG_HOME/vet/models.json` (or `~/.config/vet/models.json`)
 - `models.json` at your repo root

 #### Example `models.json`
@@ -194,6 +194,31 @@ Profiles set defaults like model choice, enabled issue codes, output format, and

 See [the example](https://github.com/imbue-ai/vet/blob/main/vet.toml) in this project.

+### Custom issue guides
+
+You can customize the guide text for the issue codes via `guides.toml`. Guide files are loaded from:
+
+- `$XDG_CONFIG_HOME/vet/guides.toml` (or `~/.config/vet/guides.toml`)
+- `guides.toml` at your repo root
+
+```toml
+[logic_error]
+suffix = """
+- Check for integer overflow in arithmetic operations
+"""
+
+[insecure_code]
+replace = """
+- Check for SQL injection: flag any string concatenation or f-string formatting used to build SQL queries rather than parameterized queries
+- Check for XSS: flag user-supplied data rendered into HTML templates without proper escaping or sanitization
+- Check for path traversal: flag file operations where user input flows into file paths without validation against directory traversal (e.g. ../)
+- Check for insecure cryptography: flag use of deprecated or weak algorithms (e.g. MD5, SHA1 for security purposes, DES, RC4)
+- Check for hardcoded credentials: flag passwords, API keys, or tokens embedded directly in source code
+"""
+```
+
+Section keys must be valid issue codes (`vet --list-issue-codes`). Each section supports three optional fields: `prefix` (prepends to built-in guide), `suffix` (appends to built-in guide), and `replace` (fully replaces the built-in guide). `prefix` and `suffix` can be used together, but `replace` is mutually exclusive with the other two. Guide text should be formatted as a list of items starting with `-`.
+
 ## License

 This project is licensed under the [GNU Affero General Public License v3.0 (AGPL-3.0-only)](https://github.com/imbue-ai/vet/blob/main/LICENSE).
diff --git a/skills/vet/SKILL.md b/skills/vet/SKILL.md
@@ -7,7 +7,7 @@ description: Run vet immediately after ANY logical unit of code changes. Do not

 **Run vet immediately after ANY logical unit of code changes. Do not batch changes, do not wait to be asked to run vet, make sure you are proactive.**

-Vet reviews git diffs and conversation history to find issues in code changes and conversation history. It is most effective when run frequently with conversation history, which helps it catch misunderstandings between what was requested and what was implemented.
+Vet reviews git diffs and conversation history to find issues in code changes and conversation history. It is most effective when run frequently with conversation history, which helps it catch misunderstandings between what was requested and what was implemented. Despite this, vet is not a replacement for running tests.

 ## Installation

diff --git a/uv.lock b/uv.lock
@@ -1523,7 +1523,7 @@ wheels = [

 [[package]]
 name = "verify-everything"
-version = "0.1.0"
+version = "0.1.2"
 source = { editable = "." }
 dependencies = [
     { name = "anthropic" },
diff --git a/vet/cli/config/custom_guides_loader_test.py b/vet/cli/config/custom_guides_loader_test.py
@@ -0,0 +1,163 @@
+from __future__ import annotations
+
+import os
+from pathlib import Path
+from unittest.mock import patch
+
+import pytest
+from pydantic import ValidationError
+
+from vet.cli.config.loader import ConfigLoadError
+from vet.cli.config.loader import _load_single_guides_file
+from vet.cli.config.loader import load_custom_guides_config
+from vet.imbue_core.data_types import CustomGuideConfig
+
+
+def _write_guides_toml(path: Path, content: str) -> Path:
+    path.parent.mkdir(parents=True, exist_ok=True)
+    path.write_text(content)
+    return path
+
+
+def test_custom_guide_config_valid_fields() -> None:
+    assert CustomGuideConfig(suffix="text").suffix == "text"
+    assert CustomGuideConfig(prefix="text").prefix == "text"
+    assert CustomGuideConfig(replace="text").replace == "text"
+
+    both = CustomGuideConfig(prefix="before", suffix="after")
+    assert both.prefix == "before"
+    assert both.suffix == "after"
+
+
+def test_custom_guide_config_replace_with_prefix_or_suffix_fails() -> None:
+    with pytest.raises(ValidationError, match="replace"):
+        CustomGuideConfig(replace="text", prefix="text")
+
+
+def test_custom_guide_config_no_fields_fails() -> None:
+    with pytest.raises(ValidationError, match="At least one"):
+        CustomGuideConfig()
+
+
+def test_custom_guide_config_extra_field_fails() -> None:
+    with pytest.raises(ValidationError, match="extra"):
+        CustomGuideConfig.model_validate({"mode": "suffix", "suffix": "text"})
+
+
+def test_load_single_guides_file_valid(tmp_path: Path) -> None:
+    config_file = _write_guides_toml(
+        tmp_path / "guides.toml",
+        """
+[logic_error]
+suffix = "- Check for integer overflow"
+
+[insecure_code]
+replace = "- Check for SQL injection"
+""",
+    )
+
+    result = _load_single_guides_file(config_file)
+
+    assert "logic_error" in result.guides
+    assert result.guides["logic_error"].suffix == "- Check for integer overflow"
+    assert "insecure_code" in result.guides
+    assert result.guides["insecure_code"].replace == "- Check for SQL injection"
+
+
+def test_load_single_guides_file_unknown_issue_code(tmp_path: Path) -> None:
+    config_file = _write_guides_toml(
+        tmp_path / "guides.toml",
+        """
+[not_a_real_code]
+suffix = "text"
+""",
+    )
+
+    with pytest.raises(ConfigLoadError, match="Unknown issue code 'not_a_real_code'"):
+        _load_single_guides_file(config_file)
+
+
+def test_load_single_guides_file_invalid_toml(tmp_path: Path) -> None:
+    config_file = _write_guides_toml(
+        tmp_path / "guides.toml",
+        "this is not [valid toml",
+    )
+
+    with pytest.raises(ConfigLoadError, match="Invalid TOML"):
+        _load_single_guides_file(config_file)
+
+
+def test_load_single_guides_file_invalid_schema(tmp_path: Path) -> None:
+    config_file = _write_guides_toml(
+        tmp_path / "guides.toml",
+        """
+[logic_error]
+""",
+    )
+
+    with pytest.raises(ConfigLoadError, match="Invalid guide configuration"):
+        _load_single_guides_file(config_file)
+
+
+def test_load_custom_guides_no_files(tmp_path: Path) -> None:
+    with patch.dict(os.environ, {"XDG_CONFIG_HOME": str(tmp_path / "nonexistent")}):
+        result = load_custom_guides_config(repo_path=tmp_path)
+        assert result.guides == {}
+
+
+def test_load_custom_guides_project_overrides_global(tmp_path: Path) -> None:
+    xdg_config = tmp_path / "xdg"
+    _write_guides_toml(
+        xdg_config / "vet" / "guides.toml",
+        """
+[logic_error]
+suffix = "- Global suffix"
+""",
+    )
+
+    repo_path = tmp_path / "repo"
+    repo_path.mkdir()
+    _write_guides_toml(
+        repo_path / "guides.toml",
+        """
+[logic_error]
+prefix = "- Project prefix"
+""",
+    )
+
+    with patch.dict(os.environ, {"XDG_CONFIG_HOME": str(xdg_config)}):
+        result = load_custom_guides_config(repo_path=repo_path)
+
+    assert "logic_error" in result.guides
+    guide = result.guides["logic_error"]
+    assert guide.prefix == "- Project prefix"
+    assert guide.suffix is None
+
+
+def test_load_custom_guides_different_codes_merged(tmp_path: Path) -> None:
+    xdg_config = tmp_path / "xdg"
+    _write_guides_toml(
+        xdg_config / "vet" / "guides.toml",
+        """
+[logic_error]
+suffix = "- Global logic error suffix"
+""",
+    )
+
+    repo_path = tmp_path / "repo"
+    repo_path.mkdir()
+    _write_guides_toml(
+        repo_path / "guides.toml",
+        """
+[insecure_code]
+replace = "- Project insecure code replacement"
+""",
+    )
+
+    with patch.dict(os.environ, {"XDG_CONFIG_HOME": str(xdg_config)}):
+        result = load_custom_guides_config(repo_path=repo_path)
+
+    assert "logic_error" in result.guides
+    assert "insecure_code" in result.guides
+    assert result.guides["logic_error"].suffix == "- Global logic error suffix"
+    assert result.guides["insecure_code"].replace == "- Project insecure code replacement"
diff --git a/vet/cli/config/loader.py b/vet/cli/config/loader.py
@@ -6,14 +6,17 @@ from pathlib import Path

 from pydantic import ValidationError

-from vet.imbue_core.agents.configs import LanguageModelGenerationConfig
-from vet.imbue_core.agents.configs import OpenAICompatibleModelConfig
-from vet.imbue_core.agents.llm_apis.common import get_model_max_output_tokens
 from vet.cli.config.cli_config_schema import CliConfigPreset
 from vet.cli.config.cli_config_schema import merge_presets
 from vet.cli.config.cli_config_schema import parse_cli_config_from_dict
 from vet.cli.config.schema import ModelsConfig
 from vet.cli.config.schema import ProviderConfig
+from vet.imbue_core.agents.configs import LanguageModelGenerationConfig
+from vet.imbue_core.agents.configs import OpenAICompatibleModelConfig
+from vet.imbue_core.agents.llm_apis.common import get_model_max_output_tokens
+from vet.imbue_core.data_types import CustomGuideConfig
+from vet.imbue_core.data_types import CustomGuidesConfig
+from vet.imbue_core.data_types import get_valid_issue_code_values


 class ConfigLoadError(Exception):
@@ -66,7 +69,7 @@ def _get_config_file_paths(


 def get_config_file_paths(repo_path: Path | None = None) -> list[Path]:
-    return _get_config_file_paths("imbue", "models.json", "models.json", repo_path)
+    return _get_config_file_paths("vet", "models.json", "models.json", repo_path)


 def _load_single_config_file(config_path: Path) -> ModelsConfig:
@@ -208,3 +211,44 @@ def get_config_preset(
             f"No configuration files found. Create a config at one of these locations:\n{paths_list}"
         )
     return cli_configs[config_name]
+
+
+def get_guides_config_file_paths(repo_path: Path | None = None) -> list[Path]:
+    return _get_config_file_paths("vet", "guides.toml", "guides.toml", repo_path)
+
+
+def _load_single_guides_file(config_path: Path) -> CustomGuidesConfig:
+    try:
+        with open(config_path, "rb") as f:
+            data = tomllib.load(f)
+    except tomllib.TOMLDecodeError as e:
+        raise ConfigLoadError(f"Invalid TOML in {config_path}: {e}") from e
+    except OSError as e:
+        raise ConfigLoadError(f"Cannot read {config_path}: {e}") from e
+
+    all_issue_code_values = get_valid_issue_code_values()
+    guides: dict[str, CustomGuideConfig] = {}
+    for key, value in data.items():
+        if key not in all_issue_code_values:
+            raise ConfigLoadError(
+                f"Unknown issue code '{key}' in {config_path}. " f"Use --list-issue-codes to see valid codes."
+            )
+        if not isinstance(value, dict):
+            raise ConfigLoadError(f"Expected a table for '{key}' in {config_path}, got {type(value).__name__}")
+        try:
+            guides[key] = CustomGuideConfig.model_validate(value)
+        except ValidationError as e:
+            raise ConfigLoadError(f"Invalid guide configuration for '{key}' in {config_path}: {e}") from e
+
+    return CustomGuidesConfig(guides=guides)
+
+
+def load_custom_guides_config(repo_path: Path | None = None) -> CustomGuidesConfig:
+    merged_guides: dict[str, CustomGuideConfig] = {}
+
+    for config_path in get_guides_config_file_paths(repo_path):
+        if config_path.exists():
+            config = _load_single_guides_file(config_path)
+            merged_guides.update(config.guides)
+
+    return CustomGuidesConfig(guides=merged_guides)
diff --git a/vet/cli/config/loader_test.py b/vet/cli/config/loader_test.py
@@ -58,7 +58,7 @@ def test_get_config_file_paths_returns_global_path(tmp_path: Path) -> None:
     with patch.dict(os.environ, {"XDG_CONFIG_HOME": str(tmp_path)}):
         paths = get_config_file_paths(repo_path=None)
         assert len(paths) == 1
-        assert paths[0] == tmp_path / "imbue" / "models.json"
+        assert paths[0] == tmp_path / "vet" / "models.json"


 def test_get_config_file_paths_finds_git_root(tmp_path: Path) -> None:
@@ -72,7 +72,7 @@ def test_get_config_file_paths_finds_git_root(tmp_path: Path) -> None:
     with patch.dict(os.environ, {"XDG_CONFIG_HOME": str(xdg_config)}):
         paths = get_config_file_paths(repo_path=subdir)
         assert len(paths) == 2
-        assert paths[0] == xdg_config / "imbue" / "models.json"
+        assert paths[0] == xdg_config / "vet" / "models.json"
         assert paths[1] == git_root / "models.json"


@@ -187,8 +187,8 @@ def test_load_models_config_loads_project_config(tmp_path: Path) -> None:

 def test_load_models_config_project_overrides_global(tmp_path: Path) -> None:
     xdg_config = tmp_path / "xdg"
-    (xdg_config / "imbue").mkdir(parents=True)
-    global_config = xdg_config / "imbue" / "models.json"
+    (xdg_config / "vet").mkdir(parents=True)
+    global_config = xdg_config / "vet" / "models.json"
     global_config.write_text(
         json.dumps(
             {
diff --git a/vet/cli/main.py b/vet/cli/main.py
@@ -12,6 +12,7 @@ from pathlib import Path
 from loguru import logger

 from vet.imbue_core.data_types import IssueCode
+from vet.imbue_core.data_types import get_valid_issue_code_values
 from vet.imbue_tools.get_conversation_history.get_conversation_history import (
     parse_conversation_history,
 )
@@ -25,6 +26,7 @@ from vet.cli.config.loader import get_cli_config_file_paths
 from vet.cli.config.loader import get_config_preset
 from vet.cli.config.loader import get_max_output_tokens_for_model
 from vet.cli.config.loader import load_cli_config
+from vet.cli.config.loader import load_custom_guides_config
 from vet.cli.config.loader import load_models_config
 from vet.cli.config.loader import validate_api_key_for_model
 from vet.cli.config.schema import ModelsConfig
@@ -242,17 +244,13 @@ def create_parser() -> argparse.ArgumentParser:
     return parser


-def _get_available_issue_codes() -> list[IssueCode]:
-    return [code for code in IssueCode if not code.name.startswith("_DEPRECATED")]
-
-
 # TODO: There are logical groupings of codes we should consider because some issue_codes are associated with the same prompts / categories of issues.
 # This should likely be used to dictate the ordering instead of sorting.
 def list_issue_codes() -> None:
     print("Available issue codes:")
     print()
-    for code in sorted(_get_available_issue_codes(), key=lambda c: c.value):
-        print(f" {code.value}")
+    for code in sorted(get_valid_issue_code_values()):
+        print(f" {code}")


 def list_models(user_config: ModelsConfig | None = None) -> None:
@@ -356,6 +354,12 @@ def main(argv: list[str] | None = None) -> int:
         print(f"Error loading model configuration: {e}", file=sys.stderr)
         return 2

+    try:
+        custom_guides_config = load_custom_guides_config(repo_path)
+    except ConfigLoadError as e:
+        print(f"Error loading custom guides: {e}", file=sys.stderr)
+        return 2
+
     if args.list_issue_codes:
         list_issue_codes()
         return 0
@@ -480,6 +484,7 @@ def main(argv: list[str] | None = None) -> int:
         max_identify_workers=args.max_workers,
         max_output_tokens=max_output_tokens or 20000,
         max_identifier_spend_dollars=args.max_spend,
+        custom_guides_config=custom_guides_config,
     )

     issues = find_issues(
diff --git a/vet/imbue_core/data_types.py b/vet/imbue_core/data_types.py
@@ -75,7 +75,10 @@ There are also things we explicitly don't want to catch with this system:
 from enum import StrEnum
 from typing import Literal

+from pydantic import BaseModel
+from pydantic import ConfigDict
 from pydantic import Field
+from pydantic import model_validator

 from vet.imbue_core.common import generate_id
 from vet.imbue_core.pydantic_serialization import SerializableModel
@@ -237,6 +240,34 @@ class IssueCode(StrEnum):
     _DEPRECATED_LLM_ARTIFACTS_LEFT_IN_CODE = "llm_artifacts_left_in_code"


+def get_valid_issue_code_values() -> set[str]:
+    return {code.value for code in IssueCode if not code.name.startswith("_DEPRECATED")}
+
+
+class CustomGuideConfig(BaseModel):
+    model_config = ConfigDict(frozen=True, extra="forbid")
+
+    prefix: str | None = None
+    suffix: str | None = None
+    replace: str | None = None
+
+    @model_validator(mode="after")
+    def validate_fields(self) -> "CustomGuideConfig":
+        has_prefix_or_suffix = self.prefix is not None or self.suffix is not None
+        has_replace = self.replace is not None
+        if has_replace and has_prefix_or_suffix:
+            raise ValueError("'replace' cannot be used together with 'prefix' or 'suffix'")
+        if not has_replace and not has_prefix_or_suffix:
+            raise ValueError("At least one of 'prefix', 'suffix', or 'replace' must be set")
+        return self
+
+
+class CustomGuidesConfig(BaseModel):
+    model_config = ConfigDict(frozen=True, extra="forbid")
+
+    guides: dict[str, CustomGuideConfig] = Field(default_factory=dict)
+
+
 class IssueLocation(SerializableModel):
     """A location in a file."""

diff --git a/vet/imbue_tools/types/vet_config.py b/vet/imbue_tools/types/vet_config.py
@@ -2,7 +2,9 @@ from pathlib import Path

 from vet.imbue_core.agents.configs import LanguageModelGenerationConfig
 from vet.imbue_core.agents.llm_apis.anthropic_api import AnthropicModelName
+from vet.imbue_core.data_types import CustomGuidesConfig
 from vet.imbue_core.data_types import IssueCode
+from vet.imbue_core.data_types import get_valid_issue_code_values
 from vet.imbue_core.pydantic_serialization import SerializableModel

 DEFAULT_CONFIDENCE_THRESHOLD = 0.8
@@ -23,6 +25,9 @@ class VetConfig(SerializableModel):
     enabled_issue_codes: tuple[IssueCode, ...] | None = None
     disabled_issue_codes: tuple[IssueCode, ...] | None = ()

+    # Custom guides to override built-in guides for issue codes.
+    custom_guides_config: CustomGuidesConfig | None = None
+
     # Todo: Different models for different issue identifiers
     language_model_generation_config: LanguageModelGenerationConfig = LanguageModelGenerationConfig(
         model_name=AnthropicModelName.CLAUDE_4_6_OPUS
@@ -87,7 +92,7 @@ class VetConfig(SerializableModel):


 def get_enabled_issue_codes(config: VetConfig) -> set[IssueCode]:
-    all_issue_code_values = {item.value for item in IssueCode}
+    all_issue_code_values = get_valid_issue_code_values()
     explicitly_enabled = config.enabled_issue_codes or tuple()
     explicitly_disabled = config.disabled_issue_codes or tuple()
     for code in explicitly_enabled + explicitly_disabled:
diff --git a/vet/issue_identifiers/custom_guides_merge_test.py b/vet/issue_identifiers/custom_guides_merge_test.py
@@ -0,0 +1,84 @@
+from __future__ import annotations
+
+import pytest
+
+from vet.imbue_core.data_types import CustomGuideConfig
+from vet.imbue_core.data_types import CustomGuidesConfig
+from vet.imbue_core.data_types import IssueCode
+from vet.issue_identifiers.identification_guides import IssueIdentificationGuide
+from vet.issue_identifiers.identification_guides import apply_custom_guides
+
+
+@pytest.fixture
+def built_in_guides() -> dict[IssueCode, IssueIdentificationGuide]:
+    return {
+        IssueCode.LOGIC_ERROR: IssueIdentificationGuide(
+            issue_code=IssueCode.LOGIC_ERROR,
+            guide="- Built-in logic error guide",
+            additional_guide_for_agent="agent guide",
+            examples=("example1",),
+            exceptions=("exception1",),
+        ),
+        IssueCode.INSECURE_CODE: IssueIdentificationGuide(
+            issue_code=IssueCode.INSECURE_CODE,
+            guide="- Built-in insecure code guide",
+        ),
+    }
+
+
+def test_apply_none_config_returns_unchanged(
+    built_in_guides: dict[IssueCode, IssueIdentificationGuide],
+) -> None:
+    result = apply_custom_guides(built_in_guides, None)
+    assert result is built_in_guides
+
+
+def test_apply_suffix(
+    built_in_guides: dict[IssueCode, IssueIdentificationGuide],
+) -> None:
+    config = CustomGuidesConfig(guides={"logic_error": CustomGuideConfig(suffix="- Custom suffix")})
+    result = apply_custom_guides(built_in_guides, config)
+    assert result[IssueCode.LOGIC_ERROR].guide == "- Built-in logic error guide\n- Custom suffix"
+
+
+def test_apply_prefix(
+    built_in_guides: dict[IssueCode, IssueIdentificationGuide],
+) -> None:
+    config = CustomGuidesConfig(guides={"logic_error": CustomGuideConfig(prefix="- Custom prefix")})
+    result = apply_custom_guides(built_in_guides, config)
+    assert result[IssueCode.LOGIC_ERROR].guide == "- Custom prefix\n- Built-in logic error guide"
+
+
+def test_apply_replace(
+    built_in_guides: dict[IssueCode, IssueIdentificationGuide],
+) -> None:
+    config = CustomGuidesConfig(guides={"logic_error": CustomGuideConfig(replace="- Replacement guide")})
+    result = apply_custom_guides(built_in_guides, config)
+    assert result[IssueCode.LOGIC_ERROR].guide == "- Replacement guide"
+
+
+def test_apply_prefix_and_suffix(
+    built_in_guides: dict[IssueCode, IssueIdentificationGuide],
+) -> None:
+    config = CustomGuidesConfig(guides={"logic_error": CustomGuideConfig(prefix="- Before", suffix="- After")})
+    result = apply_custom_guides(built_in_guides, config)
+    assert result[IssueCode.LOGIC_ERROR].guide == "- Before\n- Built-in logic error guide\n- After"
+
+
+def test_apply_preserves_non_guide_fields(
+    built_in_guides: dict[IssueCode, IssueIdentificationGuide],
+) -> None:
+    config = CustomGuidesConfig(guides={"logic_error": CustomGuideConfig(replace="- Replacement")})
+    result = apply_custom_guides(built_in_guides, config)
+    guide = result[IssueCode.LOGIC_ERROR]
+    assert guide.additional_guide_for_agent == "agent guide"
+    assert guide.examples == ("example1",)
+    assert guide.exceptions == ("exception1",)
+
+
+def test_apply_does_not_modify_other_codes(
+    built_in_guides: dict[IssueCode, IssueIdentificationGuide],
+) -> None:
+    config = CustomGuidesConfig(guides={"logic_error": CustomGuideConfig(suffix="- Custom suffix")})
+    result = apply_custom_guides(built_in_guides, config)
+    assert result[IssueCode.INSECURE_CODE].guide == "- Built-in insecure code guide"
diff --git a/vet/issue_identifiers/identification_guides.py b/vet/issue_identifiers/identification_guides.py
@@ -1,3 +1,6 @@
+from __future__ import annotations
+
+from vet.imbue_core.data_types import CustomGuidesConfig
 from vet.imbue_core.data_types import IssueCode
 from vet.imbue_core.pydantic_serialization import SerializableModel

@@ -452,3 +455,35 @@ ISSUE_CODES_FOR_CONVERSATION_HISTORY_CHECK: tuple[IssueCode, ...] = (
     IssueCode.INSTRUCTION_FILE_DISOBEYED,
     IssueCode.INSTRUCTION_TO_SAVE,
 )
+
+
+def apply_custom_guides(
+    guides_by_code: dict[IssueCode, IssueIdentificationGuide],
+    custom_config: CustomGuidesConfig | None,
+) -> dict[IssueCode, IssueIdentificationGuide]:
+    if custom_config is None or not custom_config.guides:
+        return guides_by_code
+
+    result = dict(guides_by_code)
+    for issue_code_str, custom in custom_config.guides.items():
+        issue_code = IssueCode(issue_code_str)
+        built_in = result[issue_code]
+
+        if custom.replace is not None:
+            merged_guide = custom.replace
+        else:
+            merged_guide = built_in.guide
+            if custom.prefix is not None:
+                merged_guide = custom.prefix + "\n" + merged_guide
+            if custom.suffix is not None:
+                merged_guide = merged_guide + "\n" + custom.suffix
+
+        result[issue_code] = IssueIdentificationGuide(
+            issue_code=issue_code,
+            guide=merged_guide,
+            additional_guide_for_agent=built_in.additional_guide_for_agent,
+            examples=built_in.examples,
+            exceptions=built_in.exceptions,
+        )
+
+    return result
diff --git a/vet/issue_identifiers/registry.py b/vet/issue_identifiers/registry.py
@@ -50,6 +50,10 @@ from vet.issue_identifiers.identification_guides import (
 from vet.issue_identifiers.identification_guides import (
     ISSUE_IDENTIFICATION_GUIDES_BY_ISSUE_CODE,
 )
+from vet.issue_identifiers.identification_guides import (
+    IssueIdentificationGuide,
+)
+from vet.issue_identifiers.identification_guides import apply_custom_guides
 from vet.issue_identifiers.issue_deduplication import deduplicate_issues
 from vet.issue_identifiers.issue_evaluation import filter_issues
 from vet.issue_identifiers.utils import ReturnCapturingGenerator
@@ -118,7 +122,9 @@ def _get_enabled_identifier_names(


 def _build_identifiers(
-    identifiers_to_build: set[IssueIdentifierType], enabled_issue_codes: set[IssueCode]
+    identifiers_to_build: set[IssueIdentifierType],
+    enabled_issue_codes: set[IssueCode],
+    guides_by_issue_code: dict[IssueCode, IssueIdentificationGuide],
 ) -> list[tuple[str, IssueIdentifier]]:
     # Merge the enabled issue codes for each harness
     enabled_issue_codes_per_harness: defaultdict[IssueIdentifierHarness, set[IssueCode]] = defaultdict(set)
@@ -138,7 +144,7 @@ def _build_identifiers(
             (
                 combined_name,
                 harness.make_issue_identifier(
-                    identification_guides=tuple(ISSUE_IDENTIFICATION_GUIDES_BY_ISSUE_CODE[code] for code in issue_codes)
+                    identification_guides=tuple(guides_by_issue_code[code] for code in issue_codes)
                 ),
             )
         )
@@ -180,7 +186,10 @@ def run(
     Run all the registered and configured issue identifiers on the given inputs.
     """
     enabled_issue_codes = get_enabled_issue_codes(config)
-    identifiers = _build_identifiers(_get_enabled_identifier_names(config), enabled_issue_codes)
+    guides_by_issue_code = apply_custom_guides(
+        dict(ISSUE_IDENTIFICATION_GUIDES_BY_ISSUE_CODE), config.custom_guides_config
+    )
+    identifiers = _build_identifiers(_get_enabled_identifier_names(config), enabled_issue_codes, guides_by_issue_code)
     ensure_global_resource_limits(
         max_dollars=(
             config.max_identifier_spend_dollars if config.max_identifier_spend_dollars is not None else float("inf")
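
Note on the merge semantics introduced by this commit: the behaviour of `apply_custom_guides` and `load_custom_guides_config` can be tried in isolation with the rough sketch below. It is not the project's code. `GuideOverride`, `merge_guide_text`, and `layer_overrides` are hypothetical names, and a plain dataclass stands in for the pydantic `CustomGuideConfig` so the snippet needs only the standard library. It assumes, as the diff and its tests suggest, that a later file replaces an earlier file's whole section for a given issue code instead of merging individual fields.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GuideOverride:
    """Hypothetical stand-in for CustomGuideConfig (no pydantic dependency)."""

    prefix: str | None = None
    suffix: str | None = None
    replace: str | None = None

    def __post_init__(self) -> None:
        # Same two rules as the validator in the diff: 'replace' excludes the
        # other fields, and at least one field must be present.
        has_prefix_or_suffix = self.prefix is not None or self.suffix is not None
        has_replace = self.replace is not None
        if has_replace and has_prefix_or_suffix:
            raise ValueError("'replace' cannot be used together with 'prefix' or 'suffix'")
        if not has_replace and not has_prefix_or_suffix:
            raise ValueError("At least one of 'prefix', 'suffix', or 'replace' must be set")


def merge_guide_text(built_in: str, override: GuideOverride) -> str:
    """Combine built-in guide text with one override, joining parts with newlines."""
    if override.replace is not None:
        return override.replace
    merged = built_in
    if override.prefix is not None:
        merged = override.prefix + "\n" + merged
    if override.suffix is not None:
        merged = merged + "\n" + override.suffix
    return merged


def layer_overrides(*layers: dict[str, GuideOverride]) -> dict[str, GuideOverride]:
    """Later layers win per issue code; sections are swapped whole, not field by field."""
    merged: dict[str, GuideOverride] = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Global config sets a suffix, project config sets a prefix for the same code.
global_layer = {"logic_error": GuideOverride(suffix="- Global suffix")}
project_layer = {"logic_error": GuideOverride(prefix="- Project prefix")}

effective = layer_overrides(global_layer, project_layer)
merged_text = merge_guide_text("- Built-in logic error guide", effective["logic_error"])
print(merged_text)
```

In this sketch the global suffix does not survive, because the project-level section for `logic_error` replaces the global one entirely. That mirrors `test_load_custom_guides_project_overrides_global`, which asserts `guide.suffix is None`. A project that wants both pieces of text would need to repeat the global suffix in its own `guides.toml`.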