Medical Scientific Table-to-Text Generation with Synthetic Data Under Data Sparsity Constraint

Abstract

An efficient table-to-text summarization system can drastically reduce the manual effort needed to understand tabular data and summarize it in textual reports. In practice, however, the problem is heavily impeded by data sparsity and by the inability of state-of-the-art natural language generation models (such as T5, PEGASUS, and GPT-Neo) to produce coherent and accurate outputs. This is particularly true in the pre-clinical and clinical domains. In this paper, we propose a novel table-to-text approach that tackles these problems with synthetic data generation and a copy mechanism. Experiments show that the proposed method improves the copying of concise and relevant information from tabular data when generating assay validation and toxicology reports.

Cite

Text

Wu et al. "Medical Scientific Table-to-Text Generation with Synthetic Data Under Data Sparsity Constraint." NeurIPS 2022 Workshops: SyntheticData4ML, 2022.

Markdown

[Wu et al. "Medical Scientific Table-to-Text Generation with Synthetic Data Under Data Sparsity Constraint." NeurIPS 2022 Workshops: SyntheticData4ML, 2022.](https://mlanthology.org/neuripsw/2022/wu2022neuripsw-medical/)

BibTeX

@inproceedings{wu2022neuripsw-medical,
  title     = {{Medical Scientific Table-to-Text Generation with Synthetic Data Under Data Sparsity Constraint}},
  author    = {Wu, Heng-Yi and Zhang, Jingqing and Ive, Julia and Li, Tong and Gupta, Vibhor and Chen, Bingyuan and Guo, Yike},
  booktitle = {NeurIPS 2022 Workshops: SyntheticData4ML},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/wu2022neuripsw-medical/}
}