Mach, David

1 publication

ICLR 2026. Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training. Pierre-Carl Langlais, Pavel Chizhov, Catherine Arnett, Carlos Rosas Hinostroza, Mattia Nee, Eliot Krzysztof Jones, Irène Girard, David Mach, Anastasia Stasenko, Ivan P. Yamshchikov