Scaling up Models and Data with T5x and Seqio

Abstract

Scaling up training datasets and model parameters has benefited neural network-based language models, but it also presents challenges such as distributing computation, avoiding input data bottlenecks, and ensuring reproducible results. We introduce two simple and scalable software libraries that ease these issues: t5x enables training large language models at scale, while seqio enables reproducible input and evaluation pipelines. These open-source libraries have been used to train models with hundreds of billions of parameters on multi-terabyte datasets. Configurations and instructions for T5-like and GPT-like models are also provided. The libraries can be found at https://github.com/google-research/t5x and https://github.com/google/seqio.

Cite

Text

Roberts et al. "Scaling up Models and Data with T5x and Seqio." Machine Learning Open Source Software, 2023.

Markdown

[Roberts et al. "Scaling up Models and Data with T5x and Seqio." Machine Learning Open Source Software, 2023.](https://mlanthology.org/mloss/2023/roberts2023jmlr-scaling/)

BibTeX

@article{roberts2023jmlr-scaling,
  title     = {{Scaling up Models and Data with T5x and Seqio}},
  author    = {Roberts, Adam and Chung, Hyung Won and Mishra, Gaurav and Levskaya, Anselm and Bradbury, James and Andor, Daniel and Narang, Sharan and Lester, Brian and Gaffney, Colin and Mohiuddin, Afroz and Hawthorne, Curtis and Lewkowycz, Aitor and Salcianu, Alex and van Zee, Marc and Austin, Jacob and Goodman, Sebastian and Soares, Livio Baldini and Hu, Haitang and Tsvyashchenko, Sasha and Chowdhery, Aakanksha and Bastings, Jasmijn and Bulian, Jannis and Garcia, Xavier and Ni, Jianmo and Chen, Andrew and Kenealy, Kathleen and Han, Kehang and Casbon, Michelle and Clark, Jonathan H. and Lee, Stephan and Garrette, Dan and Lee-Thorp, James and Raffel, Colin and Shazeer, Noam and Ritter, Marvin and Bosma, Maarten and Passos, Alexandre and Maitin-Shepard, Jeremy and Fiedel, Noah and Omernick, Mark and Saeta, Brennan and Sepassi, Ryan and Spiridonov, Alexander and Newlan, Joshua and Gesmundo, Andrea},
  journal   = {Machine Learning Open Source Software},
  year      = {2023},
  pages     = {1-8},
  volume    = {24},
  url       = {https://mlanthology.org/mloss/2023/roberts2023jmlr-scaling/}
}