Local-only CLIs and GitHub Actions that audit AI-agent activity β config drift, capability changes, scope creep, runtime behavior, and live session trajectory. Nothing leaves the machine; every tool is advisory by default.
Pasadena, CA Β· @conalhck
A layered review stack for AI-agent work: one substrate, five PR-time detectors, one live runtime monitor, one meta-reviewer.
Substrate
- agent-gov-core β canonical
Findingschema,mergeFindings, JSONC/TOML/MCP/shell/transcript parsers. Zero runtime deps.
PR-time detectors β each runs standalone as a CLI or GitHub Action.
- ScopeTrail β diff of agent config files (
.claude/settings.json,.mcp.json, Codex sandbox). - PolicyMesh β contradictions across MCP, Claude, Cursor, Codex, Aider configs.
- CapabilityEcho β new network, subprocess,
eval, lifecycle, or workflow-permission signals on the added diff lines. - TaskBound β scope creep: PR diff vs stated task.
- SessionTrail β risky runtime behavior in Cursor/Claude/Codex transcripts (credential reads,
curl|sh, unknown MCP servers, scope escapes).
Live runtime monitor
- AgentPulse β terminal dashboard that classifies live agent sessions (
converging/exploring/stuck/done/drifting/idle). Deterministic, no LLM.
Meta-reviewer
- GovVerdict β ingests JSON reports from the five PR-time detectors, dedupes by fingerprint, renders one consolidated PR review.
Demo
- agent-gov-demo β a rogue PR (#1) that trips all five detectors at once. The PR is deliberately titled "fix: typo in README" β TaskBound is meant to catch that.
Example workflow: agent-gov-review.yml.
- fit-ontology β client-intelligence ontology for personal trainers. Unifies wearables, intake, and ACSM guidelines into one queryable model, with an explainable rules-based reasoning layer.



