Boubdir, Meriem

1 publication

NeurIPS 2024. Elo Uncovered: Robustness and Best Practices in Language Model Evaluation. Meriem Boubdir, Edward Kim, Beyza Ermis, Sara Hooker, Marzieh Fadaee.