Enhancing Deep Batch Active Learning for Regression with Imperfect Data Guided Selection

Abstract

Active learning (AL) reduces annotation costs by selecting the most informative samples based on both model sensitivity and predictive uncertainty. While sensitivity can be measured through parameter gradients in an unsupervised manner, predictive uncertainty is hard to estimate without true labels, especially for regression tasks, reducing the informativeness of actively selected samples. This paper proposes the concept of *auxiliary data* to aid uncertainty estimation for regression tasks. Through detailed theoretical analysis, we show that auxiliary data, despite potential distribution shifts, can provide a promising uncertainty surrogate when properly weighted. This finding motivates our design of AGBAL, a novel AL framework that recalibrates auxiliary data losses through density ratio weighting to obtain reliable uncertainty estimates for sample selection. Extensive experiments show that AGBAL consistently outperforms existing approaches without auxiliary data across diverse synthetic and real-world datasets.
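The core idea behind the recalibration step can be illustrated with a minimal sketch of density ratio weighting. The example below is illustrative only (it is not the paper's released code, and the variable names are hypothetical): auxiliary samples drawn from a shifted distribution are re-weighted by w(x) = p_target(x) / p_aux(x), so that the weighted auxiliary loss approximates the expected loss under the target distribution.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(0)

# Target distribution is N(0, 1); auxiliary data come from a shifted N(1, 1).
x_aux = rng.normal(1.0, 1.0, size=200_000)

# Per-sample squared-error loss of a toy predictor f(x) = 0 against labels y = x.
loss = x_aux ** 2

# Density-ratio weights. The densities are known here for clarity; in practice
# the ratio must be estimated, e.g. with a domain classifier or KDE.
w = gaussian_pdf(x_aux, 0.0, 1.0) / gaussian_pdf(x_aux, 1.0, 1.0)

naive = loss.mean()           # biased: averages under the auxiliary density
weighted = np.mean(w * loss)  # approximates the loss under the target density

print(f"naive={naive:.2f}, weighted={weighted:.2f}")
```

Here the naive average converges to E[x^2] = 2 under the shifted auxiliary distribution, while the ratio-weighted average recovers the target-distribution value E[x^2] = 1, which is the sense in which properly weighted auxiliary losses can serve as an uncertainty surrogate despite distribution shift.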

Cite

Text

Min et al. "Enhancing Deep Batch Active Learning for Regression with Imperfect Data Guided Selection." Advances in Neural Information Processing Systems, 2025.

Markdown

[Min et al. "Enhancing Deep Batch Active Learning for Regression with Imperfect Data Guided Selection." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/min2025neurips-enhancing/)

BibTeX

@inproceedings{min2025neurips-enhancing,
  title     = {{Enhancing Deep Batch Active Learning for Regression with Imperfect Data Guided Selection}},
  author    = {Min, Yinjie and Xu, Furong and Li, Xinyao and Zou, Changliang and Zhou, Yongdao},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/min2025neurips-enhancing/}
}