Automating Enterprise Data Engineering with LLMs
Abstract
The automation of data engineering tasks is invaluable for enterprises to increase efficiency and reduce the manual effort associated with handling large amounts of data. Large Language Models (LLMs) have recently shown promising results in enabling this automation. However, data engineering tasks in real-world enterprise scenarios are often more complex than their typical formulations in the scientific community. In this paper, we study the challenges that arise when automating real-world enterprise data engineering tasks with LLMs. As part of the paper, we perform a case study on the task of matching incoming payments to open invoices, an instance of the entity matching problem. We also release a hand-crafted dataset based on the actual enterprise scenario to enable the research community to study the complexity of such enterprise tasks.
Cite
Text
Bodensohn et al. "Automating Enterprise Data Engineering with LLMs." NeurIPS 2024 Workshops: TRL, 2024.
Markdown
[Bodensohn et al. "Automating Enterprise Data Engineering with LLMs." NeurIPS 2024 Workshops: TRL, 2024.](https://mlanthology.org/neuripsw/2024/bodensohn2024neuripsw-automating/)
BibTeX
@inproceedings{bodensohn2024neuripsw-automating,
title = {{Automating Enterprise Data Engineering with LLMs}},
author = {Bodensohn, Jan-Micha and Brackmann, Ulf and Vogel, Liane and Sanghi, Anupam and Binnig, Carsten},
booktitle = {NeurIPS 2024 Workshops: TRL},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/bodensohn2024neuripsw-automating/}
}