advanced-evaluation

SKILL Workflow community
v0.0.0 corticalstack Unknown

This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment.

Community-submitted skill. Not yet reviewed by the Forge team. Full prompt content may not be available.
Clients: 1
Formats: 1
Skill
Author: corticalstack
Version: 0.0.0
License: Unknown
Category: Workflow
Formats: skill.md
Prompt: Open (see Prompt tab)
Compatibility
Claude: ✓ Supported
Cursor
Copilot
ChatGPT
Gemini

Keywords
skill, claude