Alemi, Alexander A.

14 publications

TMLR 2024. Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models. Avi Singh, John D Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron T Parisi, Abhishek Kumar, Alexander A Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin Fathy Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, Kathleen Kenealy, Kevin Swersky, Kshiteej Mahajan, Laura A Culp, Lechao Xiao, Maxwell Bileschi, Noah Constant, Roman Novak, Rosanne Liu, Tris Warkentin, Yamini Bansal, Ethan Dyer, Behnam Neyshabur, Jascha Sohl-Dickstein, Noah Fiedel
ICML 2024. Scaling Exponents Across Parameterizations and Optimizers. Katie E Everett, Lechao Xiao, Mitchell Wortsman, Alexander A Alemi, Roman Novak, Peter J Liu, Izzeddin Gur, Jascha Sohl-Dickstein, Leslie Pack Kaelbling, Jaehoon Lee, Jeffrey Pennington
ICLR 2024. Small-Scale Proxies for Large-Scale Transformer Training Instabilities. Mitchell Wortsman, Peter J Liu, Lechao Xiao, Katie E Everett, Alexander A Alemi, Ben Adlam, John D Co-Reyes, Izzeddin Gur, Abhishek Kumar, Roman Novak, Jeffrey Pennington, Jascha Sohl-Dickstein, Kelvin Xu, Jaehoon Lee, Justin Gilmer, Simon Kornblith
TMLR 2024. Training LLMs over Neurally Compressed Text. Brian Lester, Jaehoon Lee, Alexander A Alemi, Jeffrey Pennington, Adam Roberts, Jascha Sohl-Dickstein, Noah Constant
ICLR 2023. Weighted Ensemble Self-Supervised Learning. Yangjun Ruan, Saurabh Singh, Warren Richard Morningstar, Alexander A Alemi, Sergey Ioffe, Ian Fischer, Joshua V. Dillon
NeurIPSW 2022. Trajectory Ensembling for Fine Tuning - Performance Gains Without Modifying Training. Louise Anderson-Conway, Vighnesh Birodkar, Saurabh Singh, Hossein Mobahi, Alexander A Alemi
ICMLW 2021. A Closer Look at the Adversarial Robustness of Information Bottleneck Models. Iryna Korshunova, David Stutz, Alexander A Alemi, Olivia Wiles, Sven Gowal
NeurIPS 2021. Does Knowledge Distillation Really Work? Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A Alemi, Andrew G Wilson
ICLR 2020. Neural Tangents: Fast and Easy Infinite Neural Networks in Python. Roman Novak, Lechao Xiao, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Sohl-Dickstein, Samuel S. Schoenholz
NeurIPS 2018. GILBO: One Metric to Measure Them All. Alexander A Alemi, Ian Fischer
NeurIPS 2018. Watch Your Step: Learning Node Embeddings via Graph Attention. Sami Abu-El-Haija, Bryan Perozzi, Rami Al-Rfou, Alexander A Alemi
ICLR 2017. Deep Variational Information Bottleneck. Alexander A. Alemi, Ian Fischer, Joshua V. Dillon, Kevin Murphy
AAAI 2017. Inception-V4, Inception-ResNet and the Impact of Residual Connections on Learning. Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alexander A. Alemi
NeurIPS 2016. DeepMath - Deep Sequence Models for Premise Selection. Geoffrey Irving, Christian Szegedy, Alexander A Alemi, Niklas Een, Francois Chollet, Josef Urban