EchoQA: A Large Collection of Instruction Tuning Data for Echocardiogram Reports
Abstract
We introduce a novel question-answering (QA) dataset using echocardiogram reports sourced from the Medical Information Mart for Intensive Care (MIMIC) data. This dataset is specifically designed to enhance QA systems in cardiology, consisting of 771,244 QA pairs addressing a wide array of cardiac abnormalities and their severity. We compare various large language models (LLMs), including both open-source general models and biomedical-specific models, alongside state-of-the-art closed-source models for zero-shot evaluation. Our results show that fine-tuning LLMs improves performance across various QA metrics, highlighting the validity and value of our dataset. Further, we conduct fine-grained fairness audits to assess the bias-performance trade-off of LLMs across marginalized populations. Our objective is to propel the field forward by establishing a benchmark framework for developing LLM-based AI agents that support clinicians in their daily workflow within the cardiology space. The dataset is intended to support the advancement of natural language models for use in diagnostic decision support systems, with the goal of increasing efficiency in cardiology care.
Cite
Text
Moukheiber et al. "EchoQA: A Large Collection of Instruction Tuning Data for Echocardiogram Reports." NeurIPS 2024 Workshops: SafeGenAi, 2024.
Markdown
[Moukheiber et al. "EchoQA: A Large Collection of Instruction Tuning Data for Echocardiogram Reports." NeurIPS 2024 Workshops: SafeGenAi, 2024.](https://mlanthology.org/neuripsw/2024/moukheiber2024neuripsw-echoqa/)
BibTeX
@inproceedings{moukheiber2024neuripsw-echoqa,
title = {{EchoQA: A Large Collection of Instruction Tuning Data for Echocardiogram Reports}},
author = {Moukheiber, Lama and Moukheiber, Mira and Moukheiber, Dana and Ju, Jae-Woo and Lee, Hyung-Chul},
booktitle = {NeurIPS 2024 Workshops: SafeGenAi},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/moukheiber2024neuripsw-echoqa/}
}