Kramár, János

8 publications

NeurIPS 2024 Improving Sparse Decomposition of Language Model Activations with Gated Sparse Autoencoders Senthooran Rajamanoharan, Arthur Conmy, Lewis Smith, Tom Lieberum, Vikrant Varma, János Kramár, Rohin Shah, Neel Nanda
ICMLW 2024 Improving Sparse Decomposition of Language Model Activations with Gated Sparse Autoencoders Senthooran Rajamanoharan, Arthur Conmy, Lewis Smith, Tom Lieberum, Vikrant Varma, János Kramár, Rohin Shah, Neel Nanda
NeurIPS 2024 On Scalable Oversight with Weak LLMs Judging Strong LLMs Zachary Kenton, Noah Y. Siegel, János Kramár, Jonah Brown-Cohen, Samuel Albanie, Jannis Bulian, Rishabh Agarwal, David Lindner, Yunhao Tang, Noah D. Goodman, Rohin Shah
NeurIPS 2023 Tracr: Compiled Transformers as a Laboratory for Interpretability David Lindner, János Kramár, Sebastian Farquhar, Matthew Rahtz, Tom McGrath, Vladimir Mikulik
IJCAI 2021 A Neural Network Auction for Group Decision Making over a Continuous Space Yoram Bachrach, Ian Gemp, Marta Garnelo, János Kramár, Tom Eccles, Dan Rosenbaum, Thore Graepel
NeurIPS 2020 Learning to Play No-Press Diplomacy with Best Response Policy Iteration Thomas Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas Hudson, Nicolas Porcel, Marc Lanctot, Julien Perolat, Richard Everett, Satinder P. Singh, Thore Graepel, Yoram Bachrach
ICLR 2019 Relational Forward Models for Multi-Agent Learning Andrea Tacchetti, H. Francis Song, Pedro A. M. Mediano, Vinicius Zambaldi, János Kramár, Neil C. Rabinowitz, Thore Graepel, Matthew Botvinick, Peter W. Battaglia
ICLR 2017 Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations David Krueger, Tegan Maharaj, János Kramár, Mohammad Pezeshki, Nicolas Ballas, Nan Rosemary Ke, Anirudh Goyal, Yoshua Bengio, Aaron C. Courville, Christopher J. Pal