Hassell, Jackson

1 publication

ICLR 2026. Towards Reliable Benchmarking: A Contamination Free, Controllable Evaluation Framework for Multi-Step LLM Function Calling. Seiji Maekawa, Jackson Hassell, Pouya Pezeshkpour, Tom Mitchell, Estevam Hruschka.