Contextual Evaluation of Large Language Models for Classifying Tropical and Infectious Diseases

Abstract

While large language models (LLMs) have shown promise for medical question answering, there is limited work exploring tropical and infectious diseases specifically. We build on an open-source tropical and infectious diseases (TRINDs) dataset, expanding it with demographic and semantic clinical and consumer augmentations to yield 11,000+ prompts. We evaluate LLM performance on these prompts, comparing generalist and medical LLMs, and comparing LLM outcomes to those of human experts. Through systematic experimentation, we demonstrate the benefit of contextual information such as demographics, location, gender, and risk factors for optimal LLM responses. Finally, we develop a prototype of TRINDs-LM, a research tool that provides a playground for exploring how context impacts LLM outputs for health.

Cite

Text

Asiedu et al. "Contextual Evaluation of Large Language Models for Classifying Tropical and Infectious Diseases." NeurIPS 2024 Workshops: AIM-FM, 2024.

Markdown

[Asiedu et al. "Contextual Evaluation of Large Language Models for Classifying Tropical and Infectious Diseases." NeurIPS 2024 Workshops: AIM-FM, 2024.](https://mlanthology.org/neuripsw/2024/asiedu2024neuripsw-contextual/)

BibTeX

@inproceedings{asiedu2024neuripsw-contextual,
  title     = {{Contextual Evaluation of Large Language Models for Classifying Tropical and Infectious Diseases}},
  author    = {Asiedu, Mercy Nyamewaa and Tomasev, Nenad and Ghate, Chintan and Tiyasirichokchai, Tiya and Dieng, Awa and Siwo, Geoffrey and Adudans, Steve and Akande, Oluwatosin Wuraola and Heller, Katherine A},
  booktitle = {NeurIPS 2024 Workshops: AIM-FM},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/asiedu2024neuripsw-contextual/}
}