Gastaldi, Juan Luis

3 publications

ICML 2025 From Language Models over Tokens to Language Models over Characters Tim Vieira, Benjamin Lebrun, Mario Giulianelli, Juan Luis Gastaldi, Brian DuSell, John Terilla, Timothy J. O’Donnell, Ryan Cotterell
ICML 2025 Language Models over Canonical Byte-Pair Encodings Tim Vieira, Tianyu Liu, Clemente Pasti, Yahya Emara, Brian DuSell, Benjamin Lebrun, Mario Giulianelli, Juan Luis Gastaldi, Timothy J. O’Donnell, Ryan Cotterell
ICLR 2025 The Foundations of Tokenization: Statistical and Computational Concerns Juan Luis Gastaldi, John Terilla, Luca Malagutti, Brian DuSell, Tim Vieira, Ryan Cotterell