eCeLLM: Generalizing Large Language Models for E-Commerce from Large-Scale, High-Quality Instruction Data

Abstract

Despite tremendous efforts to develop effective e-commerce models, conventional e-commerce models show limited success in generalist e-commerce modeling and suffer from unsatisfactory performance on new users and new products – a typical out-of-domain generalization challenge. Meanwhile, large language models (LLMs) demonstrate outstanding performance in generalist modeling and out-of-domain generalizability in many fields. Toward fully unleashing their power for e-commerce, in this paper, we construct ECInstruct, the first open-sourced, large-scale, and high-quality benchmark instruction dataset for e-commerce. Leveraging ECInstruct, we develop eCeLLM, a series of e-commerce LLMs, by instruction-tuning general-purpose LLMs. Our comprehensive experiments and evaluation demonstrate that eCeLLM models substantially outperform baseline models, including the most advanced GPT-4 and the state-of-the-art task-specific models, in in-domain evaluation. Moreover, eCeLLM exhibits excellent generalizability to out-of-domain settings, including unseen products and unseen instructions, highlighting its superiority as a generalist e-commerce model. Both the ECInstruct dataset and the eCeLLM models show great potential in empowering versatile and effective LLMs for e-commerce. ECInstruct and the eCeLLM models are publicly accessible through this link.
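The abstract states that ECInstruct and the eCeLLM models are publicly released. As a rough illustration only (not taken from the paper), the sketch below shows how such artifacts could be loaded with the Hugging Face `datasets` and `transformers` libraries; the repository identifiers (`NingLab/ECInstruct`, `NingLab/eCeLLM-M`), the split name, and the field names are assumptions and should be checked against the official release.

```python
# Minimal sketch of loading the released instruction data and a model checkpoint.
# Repository IDs, split name, and field names below are assumptions, not confirmed
# by the paper; consult the project page linked in the abstract for exact names.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

dataset = load_dataset("NingLab/ECInstruct")                 # assumed dataset repo ID
tokenizer = AutoTokenizer.from_pretrained("NingLab/eCeLLM-M")  # assumed model repo ID
model = AutoModelForCausalLM.from_pretrained("NingLab/eCeLLM-M")

# Format one example as an instruction prompt and generate a response.
example = dataset["train"][0]                                # assumed split name
prompt = f"{example['instruction']}\n\n{example['input']}"   # assumed field names
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```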

Cite

Text

Peng et al. "eCeLLM: Generalizing Large Language Models for E-Commerce from Large-Scale, High-Quality Instruction Data." International Conference on Machine Learning, 2024.

Markdown

[Peng et al. "eCeLLM: Generalizing Large Language Models for E-Commerce from Large-Scale, High-Quality Instruction Data." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/peng2024icml-ecellm/)

BibTeX

@inproceedings{peng2024icml-ecellm,
  title     = {{eCeLLM: Generalizing Large Language Models for E-Commerce from Large-Scale, High-Quality Instruction Data}},
  author    = {Peng, Bo and Ling, Xinyi and Chen, Ziru and Sun, Huan and Ning, Xia},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {40215--40257},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/peng2024icml-ecellm/}
}