Verify what agents actually shipped, arbitrate file collisions, refuse with structured reasons.
Verification confirms publisher identity (repo ownership), not code safety. The security scan covers known CVEs and suspicious install scripts — it cannot prove the absence of malicious code.
### Catch your AI agents when they lie about what they shipped

The whole pitch in one recording: the agent claims two features shipped; git backs one. dos verify answers from the commits, the lie exits 1, and a gate on that exit code refuses the false "done". Every line is the real CLI's verbatim output; scripts/buildcaughtliecast.py re-records it whenever the output changes.

Run a fleet of agents on one repo. The left loop just feels like progress; the right one you can steer. The only…
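
The exit-code gate described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the source does not show the arguments dos verify takes, so the `gate` helper and both stand-in commands are hypothetical. The stand-ins are tiny Python subprocesses that only reproduce the exit codes (0 for a claim git backs, 1 for the lie).

```python
import subprocess
import sys
from typing import Sequence


def gate(verify_cmd: Sequence[str]) -> bool:
    """Run a verifier command and accept a "done" claim only if it exits 0.

    Hypothetical helper: in a real setup verify_cmd would be the dos verify
    invocation, whose arguments are not shown in the source.
    """
    result = subprocess.run(verify_cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Non-zero exit: the claim is not backed by the commits, so refuse it.
        print(f"refused: verifier exited {result.returncode}")
        return False
    print("accepted: verifier exited 0")
    return True


# Stand-in verifiers (assumptions for illustration only): they do nothing
# except exit with the code a real verifier would report.
backed_claim = [sys.executable, "-c", "import sys; sys.exit(0)"]
false_claim = [sys.executable, "-c", "import sys; sys.exit(1)"]
```

Calling `gate(backed_claim)` returns `True` and `gate(false_claim)` returns `False`. The design point is that the gate never reads the agent's own report; it trusts only the verifier's exit status.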