ML Anthology
Gupta, Kshitij
7 publications
TMLR
2024
Simple and Scalable Strategies to Continually Pre-Train Large Language Models
Adam Ibrahim, Benjamin Thérien, Kshitij Gupta, Mats Leon Richter, Quentin Gregory Anthony, Eugene Belilovsky, Timothée Lesort, Irina Rish
NeurIPSW
2023
ARB: Advanced Reasoning Benchmark for Large Language Models
Tomohiro Sawada, Daniel Paleka, Alexander Havrilla, Pranav Tadepalli, Paula Vidas, Alexander Kranias, John Nay, Kshitij Gupta, Aran Komatsuzaki
ICLR
2023
Broken Neural Scaling Laws
Ethan Caballero, Kshitij Gupta, Irina Rish, David Krueger
ICLRW
2023
Broken Neural Scaling Laws
Ethan Caballero, Kshitij Gupta, Irina Rish, David Krueger
ICMLW
2023
Continual Pre-Training of Large Language Models: How to Re-Warm Your Model?
Kshitij Gupta, Benjamin Thérien, Adam Ibrahim, Mats Leon Richter, Quentin Gregory Anthony, Eugene Belilovsky, Irina Rish, Timothée Lesort
NeurIPSW
2022
Broken Neural Scaling Laws
Ethan Caballero, Kshitij Gupta, Irina Rish, David Krueger
NeurIPS
2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Didolkar, Kshitij Gupta, Anirudh Goyal, Nitesh Bharadwaj Gundavarapu, Alex M Lamb, Nan Rosemary Ke, Yoshua Bengio