Re-Creation of Creations: A New Paradigm for Lyric-to-Melody Generation

Abstract

Medical Large Multi-modal Models (LMMs) have demonstrated remarkable capabilities in medical data interpretation. However, these models frequently generate hallucinations contradicting source evidence, particularly due to inadequate localization reasoning. This work reveals a critical limitation in current medical LMMs: instead of analyzing relevant pathological regions, they often rely on linguistic patterns or attend to irrelevant image areas when responding to disease-related queries. To address this, we introduce HEAL-MedVQA (Hallucination Evaluation via Localization MedVQA), a comprehensive benchmark designed to evaluate LMMs' localization abilities and hallucination robustness. HEAL-MedVQA features (i) two innovative evaluation protocols to assess visual and textual shortcut learning, and (ii) a dataset of 67K VQA pairs, with doctor-annotated anatomical segmentation masks for pathological regions. To improve visual reasoning, we propose the Localize-before-Answer (LobA) framework, which trains LMMs to localize target regions of interest and self-prompt to emphasize segmented pathological areas, generating grounded and reliable answers. Experimental results demonstrate that our approach significantly outperforms state-of-the-art biomedical LMMs on the challenging HEAL-MedVQA benchmark, advancing robustness in medical VQA.

Cite

Text

Lv et al. "Re-Creation of Creations: A New Paradigm for Lyric-to-Melody Generation." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/853

Markdown

[Lv et al. "Re-Creation of Creations: A New Paradigm for Lyric-to-Melody Generation." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/lv2024ijcai-re/) doi:10.24963/ijcai.2024/853

BibTeX

@inproceedings{lv2024ijcai-re,
  title     = {{Re-Creation of Creations: A New Paradigm for Lyric-to-Melody Generation}},
  author    = {Lv, Ang and Tan, Xu and Qin, Tao and Liu, Tie-Yan and Yan, Rui},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {7708--7716},
  doi       = {10.24963/ijcai.2024/853},
  url       = {https://mlanthology.org/ijcai/2024/lv2024ijcai-re/}
}