Measure whether your AI agrees with itself using statistical consensus metrics.
Verification confirms publisher identity (repo ownership), not code safety. The security scan covers known CVEs and suspicious install scripts — it cannot prove the absence of malicious code.
One command. Find out if your AI agrees with itself. ConKurrence measures whether multiple AI models produce consistent outputs on your evaluation tasks, using the same psychometric methods trusted in clinical research (Fleiss' κ, Kendall's W, bootstrap confidence intervals). Stop guessing whether your golden dataset is reliable. Know statistically. You now know what is reliable, what needs review, and what should be redesigned before trusting it. Your golden dataset has a hidden…
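To make the metrics named above concrete, here is a minimal sketch of Fleiss' κ with a percentile bootstrap confidence interval, using only the Python standard library. It is an illustration of the statistics, not ConKurrence's own code: the function names `fleiss_kappa` and `bootstrap_ci`, the toy labels, and the choice to resample at the item level are all assumptions made for this example.

```python
import random
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of labels.

    Every item must carry the same number of labels (one per model/rater).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        totals.update(counts)
        # Share of rater pairs that agree on this item.
        pairs = sum(c * (c - 1) for c in counts.values())
        agreement_sum += pairs / (n_raters * (n_raters - 1))
    p_bar = agreement_sum / n_items
    # Agreement expected by chance from the overall label proportions.
    p_e = sum((c / (n_items * n_raters)) ** 2 for c in totals.values())
    if p_e == 1.0:
        # Only one label was ever used, so kappa is undefined.
        # Assumption for this sketch: report it as perfect agreement.
        return 1.0
    return (p_bar - p_e) / (1 - p_e)


def bootstrap_ci(ratings, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for kappa, resampling whole items."""
    rng = random.Random(seed)
    stats = sorted(
        fleiss_kappa([rng.choice(ratings) for _ in ratings])
        for _ in range(n_boot)
    )
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: three models labelling five evaluation items.
ratings = [
    ["pass", "pass", "pass"],
    ["pass", "pass", "fail"],
    ["fail", "fail", "fail"],
    ["pass", "fail", "fail"],
    ["pass", "pass", "pass"],
]

kappa = fleiss_kappa(ratings)  # 4/9, roughly 0.444
ci_low, ci_high = bootstrap_ci(ratings)
print(f"kappa={kappa:.3f}, 95% CI=({ci_low:.3f}, {ci_high:.3f})")
```

With only five items the interval will be wide, which is the point of reporting it: a single κ value on a small golden dataset says little about how stable the agreement is.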