Don’t Pass@k: A Bayesian Framework for Large Language Model Evaluation
Under review at ICLR 2026, 2025
This paper introduces a Bayesian framework for evaluating LLMs that improves robustness and reduces computational cost by requiring fewer trials.
Recommended citation: Hariri, M., Samandar, A. (2025). "Don’t Pass@k: A Bayesian Framework for Large Language Model Evaluation." Under review at ICLR 2026.
