Responsible Reasoning with Large Language Models and the Impact of Proper Nouns

Abstract

Language models with billions of parameters have shown remarkable emergent properties, including the ability to reason over unstructured data. We show that open-science multi-lingual large language models can perform spatial reasoning over two or more entities with significant accuracy. A responsible large language model would perform this spatial reasoning task with the same accuracy regardless of the names chosen for the entities over which the spatial relationships are defined. However, we show that the accuracy of contemporary large language models is affected by the choice of proper nouns even when the underlying task ought to be independent of that choice. We also observe that the conditional log probabilities, or beam scores, of these models' predictions are not well calibrated and do not discriminate between correct and incorrect responses.

Cite

Text

Jha et al. "Responsible Reasoning with Large Language Models and the Impact of Proper Nouns." NeurIPS 2022 Workshops: TSRML, 2022.

Markdown

[Jha et al. "Responsible Reasoning with Large Language Models and the Impact of Proper Nouns." NeurIPS 2022 Workshops: TSRML, 2022.](https://mlanthology.org/neuripsw/2022/jha2022neuripsw-responsible/)

BibTeX

@inproceedings{jha2022neuripsw-responsible,
  title     = {{Responsible Reasoning with Large Language Models and the Impact of Proper Nouns}},
  author    = {Jha, Sumit Kumar and Ewetz, Rickard and Velasquez, Alvaro and Jha, Susmit},
  booktitle = {NeurIPS 2022 Workshops: TSRML},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/jha2022neuripsw-responsible/}
}