INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge

Romanou, Angelika; Foroutan, Negar; Sotnikova, Anna; Nelaturu, Sree Harsha; Singh, Shivalika; Maheshwary, Rishabh; Altomare, Micol; Chen, Zeming; Haggag, Mohamed A.; A, Snegha; Amayuelas, Alfonso; Amirudin, Azril Hafizi; Boiko, Danylo; Chang, Michael; Chim, Jenny; Cohen, Gal; Dalmia, Aditya Kumar; Diress, Abraham; Duwal, Sharad; Dzenhaliou, Daniil; Florez, Daniel Fernando Erazo; Farestam, Fabian; Imperial, Joseph Marvin; Bin Islam, Shayekh; Isotalo, Perttu; Jabbarishiviari, Maral; Karlsson, Börje F.; Khalilov, Eldar; Klamm, Christopher; Koto, Fajri; Krzemiński, Dominik; de Melo, Gabriel Adriano; Montariol, Syrielle; Nan, Yiyang; Niklaus, Joel; Novikova, Jekaterina; Ceron, Johan Samir Obando; Paul, Debjit; Ploeger, Esther; Purbey, Jebish; Rajwal, Swati; Ravi, Selvan Sunitha; Rydell, Sara; Santhosh, Roshan; Sharma, Drishti; Skenduli, Marjana Prifti; Moakhar, Arshia Soltani; Moakhar, Bardia soltani; Tarun, Ayush Kumar; Wasi, Azmine Toushik; Weerasinghe, Thenuka Ovin; Yilmaz, Serhan; Zhang, Mike; Schlag, Imanol; Fadaee, Marzieh; Hooker, Sara; Bosselut, Antoine

ICLR 2025

Abstract

The performance differential of large language models (LLMs) between languages hinders their effective deployment in many regions, inhibiting the potential economic and societal value of generative AI tools in many communities. However, the development of functional LLMs in many languages (i.e., multilingual LLMs) is bottlenecked by the lack of high-quality evaluation resources in languages other than English. Moreover, current practices in multilingual benchmark construction often translate English resources, ignoring the regional and cultural knowledge of the environments in which multilingual systems would be used. In this work, we construct an evaluation suite of 197,243 QA pairs from local exam sources to measure the capabilities of multilingual LLMs in a variety of regional contexts. Our novel resource, INCLUDE, is a comprehensive knowledge- and reasoning-centric benchmark across 44 written languages that evaluates multilingual LLMs for performance in the actual language environments where they would be deployed.

Cite

Text

Romanou et al. "INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge." International Conference on Learning Representations, 2025.

Markdown

[Romanou et al. "INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/romanou2025iclr-include/)

BibTeX

@inproceedings{romanou2025iclr-include,
  title     = {{INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge}},
  author    = {Romanou, Angelika and Foroutan, Negar and Sotnikova, Anna and Nelaturu, Sree Harsha and Singh, Shivalika and Maheshwary, Rishabh and Altomare, Micol and Chen, Zeming and Haggag, Mohamed A. and A, Snegha and Amayuelas, Alfonso and Amirudin, Azril Hafizi and Boiko, Danylo and Chang, Michael and Chim, Jenny and Cohen, Gal and Dalmia, Aditya Kumar and Diress, Abraham and Duwal, Sharad and Dzenhaliou, Daniil and Florez, Daniel Fernando Erazo and Farestam, Fabian and Imperial, Joseph Marvin and Bin Islam, Shayekh and Isotalo, Perttu and Jabbarishiviari, Maral and Karlsson, Börje F. and Khalilov, Eldar and Klamm, Christopher and Koto, Fajri and Krzemiński, Dominik and de Melo, Gabriel Adriano and Montariol, Syrielle and Nan, Yiyang and Niklaus, Joel and Novikova, Jekaterina and Ceron, Johan Samir Obando and Paul, Debjit and Ploeger, Esther and Purbey, Jebish and Rajwal, Swati and Ravi, Selvan Sunitha and Rydell, Sara and Santhosh, Roshan and Sharma, Drishti and Skenduli, Marjana Prifti and Moakhar, Arshia Soltani and Moakhar, Bardia soltani and Tarun, Ayush Kumar and Wasi, Azmine Toushik and Weerasinghe, Thenuka Ovin and Yilmaz, Serhan and Zhang, Mike and Schlag, Imanol and Fadaee, Marzieh and Hooker, Sara and Bosselut, Antoine},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/romanou2025iclr-include/}
}