Evaluation of Geographical Distortions in Language Models
Abstract
Geographic bias in language models (LMs) is an underexplored dimension of model fairness, despite growing attention to other social biases. We investigate whether LMs provide equally accurate representations across all global regions and propose a benchmark of four indicators to detect undertrained and underperforming areas: (i) indirect assessment of geographic training data coverage via tokenizer analysis, (ii) evaluation of basic geographic knowledge, (iii) detection of geographic distortions, and (iv) visualization of performance disparities through maps. Applying this framework to ten widely used encoder- and decoder-based models, we find systematic overrepresentation of Western countries and consistent underrepresentation of several African, Eastern European, and Middle Eastern regions, leading to measurable performance gaps. We further analyse the impact of these biases on downstream tasks, particularly in crisis response, and show that regions most vulnerable to natural disasters are often those with poorer LM coverage. Our findings underscore the need for geographically balanced LMs to ensure equitable and effective global applications.
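As a rough illustration of indicator (i), a tokenizer's subword segmentation of place names can serve as a proxy for how often those names appeared in its training data: names split into many fragments were likely rare. The sketch below assumes a HuggingFace tokenizer; the model name and country list are illustrative examples, not the paper's exact setup.

```python
# Sketch of indicator (i): compare how finely a tokenizer splits place names.
# Higher subword "fertility" (tokens per word) suggests the name was rarer
# in the data the tokenizer was trained on. Model and countries are illustrative.
from transformers import AutoTokenizer

countries = ["France", "Burkina Faso", "Tajikistan", "Papua New Guinea"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

for name in countries:
    subwords = tokenizer.tokenize(name)
    fertility = len(subwords) / len(name.split())
    print(f"{name!r}: {subwords} -> {fertility:.2f} subword tokens per word")
```

Comparing such fertility scores across regions (and across the tokenizers of several models) gives an indirect, training-data-free signal of which areas are likely undertrained.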
Cite
Text
Decoupes et al. "Evaluation of Geographical Distortions in Language Models." Machine Learning, 2025. doi:10.1007/s10994-025-06916-9
Markdown
[Decoupes et al. "Evaluation of Geographical Distortions in Language Models." Machine Learning, 2025.](https://mlanthology.org/mlj/2025/decoupes2025mlj-evaluation/) doi:10.1007/s10994-025-06916-9
BibTeX
@article{decoupes2025mlj-evaluation,
  title   = {{Evaluation of Geographical Distortions in Language Models}},
  author  = {Decoupes, Rémy and Interdonato, Roberto and Roche, Mathieu and Teisseire, Maguelonne and Valentin, Sarah},
  journal = {Machine Learning},
  year    = {2025},
  volume  = {114},
  pages   = {263},
  doi     = {10.1007/s10994-025-06916-9},
  url     = {https://mlanthology.org/mlj/2025/decoupes2025mlj-evaluation/}
}