On the Scaling Laws of Geographical Representation in Language Models

UPDATE: This paper was accepted to LREC-Coling 2024! 🎉

Abstract
Language models have long been shown to embed geographical information in their hidden representations. This line of work has recently been revisited by extending this result to Large Language Models (LLMs). In this paper, we propose to fill the gap between well-established and recent literature by observing how geographical knowledge evolves when scaling language models. We show that geographical knowledge is observable even for tiny models, and that it scales consistently as we increase the model size. Notably, we observe that larger language models cannot mitigate the geographical bias that is inherent to the training data.
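To give a concrete sense of the kind of probing commonly used in this line of work, here is a minimal sketch: a linear probe trained to regress latitude/longitude from a model's hidden representations of place names. The model checkpoint, the toy city list, and the ridge regressor are illustrative choices for this sketch, not the paper's exact setup.

```python
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import Ridge

# Illustrative checkpoint; any causal LM size can be swapped in to study scaling.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

# Toy (name, latitude, longitude) triples; a real probe would use thousands of places.
cities = [("Paris", 48.85, 2.35), ("Tokyo", 35.68, 139.69), ("Lima", -12.05, -77.04)]

def embed(name: str) -> torch.Tensor:
    """Mean-pool the last hidden layer over the tokens of a place name."""
    inputs = tokenizer(name, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)

X = torch.stack([embed(name) for name, _, _ in cities]).numpy()
y = [[lat, lon] for _, lat, lon in cities]

# Linear probe: ridge regression from hidden states to coordinates.
# Probe accuracy on held-out places is what one would track across model sizes.
probe = Ridge(alpha=1.0).fit(X, y)
print(probe.predict(X))
```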

I co-authored this paper with my PhD supervisors Éric Villemonte de la Clergerie and Benoît Sagot from Inria’s ALMAnaCH team.

The PDF version of the paper is available on arXiv (eprint 2402.19406).

Please cite as:

@misc{godey2024scaling,
      title={On the Scaling Laws of Geographical Representation in Language Models}, 
      author={Nathan Godey and Éric de la Clergerie and Benoît Sagot},
      year={2024},
      eprint={2402.19406},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

This work was funded by the PRAIRIE institute as part of a PhD contract at Inria Paris and Sorbonne Université.
