Evaluating Fine-Tuning Efficiency of Human-Inspired Learning Strategies in Medical Question Answering

Abstract

Fine-tuning Large Language Models (LLMs) incurs considerable training costs, driving the need for data-efficient training with optimised data ordering. Human-inspired learning strategies offer a solution by organising data according to human learning practices. This study evaluates the fine-tuning efficiency of five human-inspired strategies across four language models, three datasets, and both human- and LLM-labelled data in the context of medical question answering. These strategies achieve a maximum accuracy gain of 1.81% and an average gain of 1.02% across datasets, with interleaved strategies delivering the best average results. However, the best strategy varies across model-dataset combinations, limiting the generalisability of any single strategy's effects. Additionally, LLM-defined question difficulty outperforms human-defined labels in curriculum-based learning, demonstrating the potential of model-generated data as a cost-effective alternative for optimising fine-tuning.
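
The strategies evaluated here reorder the same fine-tuning examples rather than changing the data itself. As an informal illustration (not the paper's code), the sketch below shows two of the orderings named in the abstract: curriculum learning, which sorts questions from easy to hard by a difficulty score, and interleaved learning, which alternates questions across topics. The field names ("difficulty", "topic") and the sample data are hypothetical placeholders.

# Minimal sketch of two human-inspired data orderings for fine-tuning.
# Assumes each example carries a "difficulty" score and a "topic" label;
# these fields are illustrative, not the paper's schema.

from collections import defaultdict
from itertools import zip_longest


def curriculum_order(examples):
    """Curriculum learning: sort examples from easiest to hardest."""
    return sorted(examples, key=lambda ex: ex["difficulty"])


def interleaved_order(examples):
    """Interleaved learning: alternate examples across topics."""
    by_topic = defaultdict(list)
    for ex in examples:
        by_topic[ex["topic"]].append(ex)
    # Take one example per topic per round until all topics are exhausted.
    rounds = zip_longest(*by_topic.values())
    return [ex for rnd in rounds for ex in rnd if ex is not None]


if __name__ == "__main__":
    data = [
        {"question": "Q1", "topic": "cardiology", "difficulty": 0.8},
        {"question": "Q2", "topic": "cardiology", "difficulty": 0.2},
        {"question": "Q3", "topic": "neurology", "difficulty": 0.5},
        {"question": "Q4", "topic": "neurology", "difficulty": 0.9},
    ]
    print([ex["question"] for ex in curriculum_order(data)])   # easy -> hard
    print([ex["question"] for ex in interleaved_order(data)])  # topics mixed

The reordered list would then be passed to a standard fine-tuning loop; the difficulty scores themselves could come from either human annotations or LLM judgements, as compared in the paper.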

Cite

Text

Yang et al. "Evaluating Fine-Tuning Efficiency of Human-Inspired Learning Strategies in Medical Question Answering." NeurIPS 2024 Workshops: FITML, 2024.

Markdown

[Yang et al. "Evaluating Fine-Tuning Efficiency of Human-Inspired Learning Strategies in Medical Question Answering." NeurIPS 2024 Workshops: FITML, 2024.](https://mlanthology.org/neuripsw/2024/yang2024neuripsw-evaluating/)

BibTeX

@inproceedings{yang2024neuripsw-evaluating,
  title     = {{Evaluating Fine-Tuning Efficiency of Human-Inspired Learning Strategies in Medical Question Answering}},
  author    = {Yang, Yushi and Bean, Andrew Michael and McCraith, Robert and Mahdi, Adam},
  booktitle = {NeurIPS 2024 Workshops: FITML},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/yang2024neuripsw-evaluating/}
}