Skillstore
EleutherAI

Active
Skills: 1
Categories: 1
Claude, Codex, Code(CC)

Published skills: 1

logprob-prefill-analysis

Analyze model susceptibility to reward hacking

Safe 70

This skill documents how to run prefill sensitivity analysis, which measures how easily AI models can be manipulated into generating exploit code. Researchers use it to compare token-count and logprob metrics as predictors of reward-hacking susceptibility across model checkpoints.

Claude, Codex, Code(CC)

© 2025 Skillstore