Battery Fault: A Comprehensive Dataset and Benchmark for Battery Fault Diagnosis
Abstract
With the rapid adoption of electric vehicles (EVs), battery safety has become an important research focus. Data-driven battery fault diagnosis algorithms built on real-world operational data are a key means of reducing safety risks. However, existing battery datasets are limited by insufficient scale, coarse-grained labels, and poor coverage of real-world operating conditions, which seriously restricts the development of such algorithms. To address these issues, this paper introduces CH-BatteryGen, a large-scale benchmark dataset that is, to the best of our knowledge, the first EV battery system fault diagnosis dataset based on real-world operating conditions. The dataset combines real on-board operation data with mechanism-constrained generative modeling, balancing authenticity and scalability. It covers two mainstream battery chemistries, nickel-cobalt-manganese (NCM) lithium batteries and lithium iron phosphate (LFP) batteries, and includes charging, discharging, and operation data from 1000 electric vehicles. It provides four labels (normal, self-discharge, high-resistance, low-capacity) and three severity-level annotations, supporting two benchmark tasks: fault classification and fault grading. We systematically validate the dataset with traditional machine learning methods (random forest (RF), support vector machine (SVM)) and deep learning models (long short-term memory (LSTM), convolutional neural network (CNN)). The results show that the CNN model performs best in the fault classification task, achieving an F1-score of 0.9280 in the LFP discharging scenario; in the fault grading task, the F1-score reaches 0.8813. The CH-BatteryGen dataset has been open-sourced, with the aim of providing a standardized evaluation platform for battery fault diagnosis algorithms, promoting research in this field, and contributing to the transition to sustainable transportation systems.
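The abstract reports F1-scores for a four-class fault classification task. As a minimal illustration of how such a score can be computed, the sketch below implements a macro-averaged F1 in plain Python. The four label names come from the abstract. The averaging scheme (macro), the function names, and the toy predictions are assumptions made for illustration; the abstract does not state which F1 variant the benchmark uses.

```python
# Label names taken from the abstract; using them as plain strings is an assumption.
LABELS = ["normal", "self-discharge", "high-resistance", "low-capacity"]


def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating it as the positive class (one-vs-rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never appears.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores (hypothetical helper)."""
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy example with made-up predictions (not data from the paper).
y_true = ["normal", "normal", "self-discharge",
          "high-resistance", "low-capacity", "low-capacity"]
y_pred = ["normal", "self-discharge", "self-discharge",
          "high-resistance", "low-capacity", "normal"]
score = macro_f1(y_true, y_pred)
```

Macro averaging weights every class equally, which matters when fault classes are rarer than the normal class. A support-weighted average would be an equally plausible choice and can give a different number on imbalanced data.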
Cite
Text
Liu et al. "Battery Fault: A Comprehensive Dataset and Benchmark for Battery Fault Diagnosis." International Conference on Learning Representations, 2026.
Markdown
[Liu et al. "Battery Fault: A Comprehensive Dataset and Benchmark for Battery Fault Diagnosis." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/liu2026iclr-battery/)
BibTeX
@inproceedings{liu2026iclr-battery,
title = {{Battery Fault: A Comprehensive Dataset and Benchmark for Battery Fault Diagnosis}},
author = {Liu, Qingdi and Fu, Yan and Liu, Lishuo and Lin, Yanke and Xin, Jin and Zhang, Jianfeng and Liu, Cheng Hao and Pan, Lujia and Guo, Dongxu and Zheng, Yuejiu and Li, Qiang},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/liu2026iclr-battery/}
}