Irving, Geoffrey

7 publications

ICML 2024 Scalable AI Safety via Doubly-Efficient Debate Jonah Brown-Cohen, Geoffrey Irving, Georgios Piliouras

ICMLW 2024 Scalable AI Safety via Doubly-Efficient Debate Jonah Brown-Cohen, Geoffrey Irving, Georgios Piliouras

NeurIPS 2022 Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models Maribeth Rauh, John Mellor, Jonathan Uesato, Po-Sen Huang, Johannes Welbl, Laura Weidinger, Sumanth Dathathri, Amelia Glaese, Geoffrey Irving, Iason Gabriel, William Isaac, Lisa Anne Hendricks

ICML 2022 Improving Language Models by Retrieving from Trillions of Tokens Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George B. M. van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Oriol Vinyals, Simon Osindero, Karen Simonyan, Jack Rae, Erich Elsen, Laurent Sifre

Distill 2019 AI Safety Needs Social Scientists Geoffrey Irving, Amanda Askell

NeurIPS 2018 Reward Learning from Human Preferences and Demonstrations in Atari Borja Ibarz, Jan Leike, Tobias Pohlen, Geoffrey Irving, Shane Legg, Dario Amodei

NeurIPS 2016 DeepMath - Deep Sequence Models for Premise Selection Geoffrey Irving, Christian Szegedy, Alexander A. Alemi, Niklas Een, François Chollet, Josef Urban