Elazar, Yanai

11 publications

AAAI 2025 Calibrating Large Language Models with Sample Consistency Qing Lyu, Kumar Shridhar, Chaitanya Malaviya, Li Zhang, Yanai Elazar, Niket Tandon, Marianna Apidianaki, Mrinmaya Sachan, Chris Callison-Burch
ICLR 2025 Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data Xinyi Wang, Antonis Antoniades, Yanai Elazar, Alfonso Amayuelas, Alon Albalak, Kexun Zhang, William Yang Wang
TMLR 2025 How Many Images Does It Take? Estimating Imitation Thresholds in Text-to-Image Models Sahil Verma, Royi Rassin, Arnav Mohanty Das, Gantavya Bhatt, Preethi Seshadri, Chirag Shah, Jeff Bilmes, Hannaneh Hajishirzi, Yanai Elazar
ICLR 2025 On Linear Representations and Pretraining Data Frequency in Language Models Jack Merullo, Noah A. Smith, Sarah Wiegreffe, Yanai Elazar
TMLR 2024 A Survey on Data Selection for Language Models Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang
CLeaR 2024 Estimating the Causal Effect of Early ArXiving on Paper Acceptance Yanai Elazar, Jiayao Zhang, David Wadden, Bo Zhang, Noah A. Smith
ICMLW 2024 Generalization vs. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data Antonis Antoniades, Xinyi Wang, Yanai Elazar, Alfonso Amayuelas, Alon Albalak, Kexun Zhang, William Yang Wang
NeurIPSW 2024 How Many Van Goghs Does It Take to Van Gogh? Finding the Imitation Threshold Sahil Verma, Royi Rassin, Arnav Mohanty Das, Gantavya Bhatt, Preethi Seshadri, Chirag Shah, Jeff Bilmes, Hannaneh Hajishirzi, Yanai Elazar
NeurIPSW 2024 How Many Van Goghs Does It Take to Van Gogh? Finding the Imitation Threshold Sahil Verma, Royi Rassin, Arnav Mohanty Das, Gantavya Bhatt, Preethi Seshadri, Chirag Shah, Jeff Bilmes, Hannaneh Hajishirzi, Yanai Elazar
NeurIPS 2024 Paloma: A Benchmark for Evaluating Language Model Fit Ian Magnusson, Akshita Bhagia, Valentin Hofmann, Luca Soldaini, Ananya Harsh Jha, Oyvind Tafjord, Dustin Schwenk, Evan Pete Walsh, Yanai Elazar, Kyle Lo, Dirk Groeneveld, Iz Beltagy, Hannaneh Hajishirzi, Noah A. Smith, Kyle Richardson, Jesse Dodge
ICLR 2024 What's in My Big Data? Yanai Elazar, Akshita Bhagia, Ian Helgi Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Evan Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hannaneh Hajishirzi, Noah A. Smith, Jesse Dodge