Worst-Case Feature Risk Minimization for Data-Efficient Learning
Abstract
Deep learning models typically require massive amounts of annotated data to train a strong model for a task of interest. However, data annotation is time-consuming and costly. How to train a satisfactory model using labeled data from a related but distinct domain, or from just a few samples, is thus an important question. To achieve this goal, models should resist overfitting to the specifics of the training data in order to generalize well to new data. This paper proposes a novel Worst-case Feature Risk Minimization (WFRM) method that helps improve model generalization. Specifically, we tackle a minimax optimization problem in feature space at each training iteration. Given the input features, we seek the feature perturbation that maximizes the current training loss, and then minimize the training loss of the worst-case features. By incorporating our WFRM during training, we significantly improve model generalization under distributional shift – Domain Generalization (DG) – and in the low-data regime – Few-shot Learning (FSL). We theoretically analyze WFRM and identify the key reason why it works better than ERM: it induces an empirical risk-based semi-adaptive $L_{2}$ regularization of the classifier weights, enabling a better risk-complexity trade-off. We evaluate WFRM on two data-efficient learning tasks, including three standard DG benchmarks – PACS, VLCS and OfficeHome – and the most challenging FSL benchmark, Meta-Dataset. Despite its simplicity, our method consistently improves various DG and FSL methods, leading to new state-of-the-art performance in all settings. Code and models will be released at https://github.com/jslei/WFRM.
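The minimax step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a linear classifier on top of fixed features and approximates the inner maximization with a single gradient-ascent step inside an $\epsilon$-ball (all names, `eps`, and `lr` are hypothetical choices).

```python
import numpy as np

def softmax_xent(logits, y):
    """Cross-entropy loss and softmax probabilities for one example."""
    z = logits - logits.max()          # numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[y]), p

def wfrm_step(f, y, W, eps=0.1, lr=0.05):
    """One hypothetical WFRM training step.

    Inner max: perturb the feature f within an eps-ball to increase
    the loss (one-step linearized ascent). Outer min: gradient step
    on the classifier weights W at the worst-case feature.
    """
    num_classes = W.shape[0]
    onehot = np.eye(num_classes)[y]

    # gradient of the loss w.r.t. the input feature (clean pass)
    _, p = softmax_xent(W @ f, y)
    g = W.T @ (p - onehot)

    # worst-case feature: ascend along the loss gradient
    f_adv = f + eps * g / (np.linalg.norm(g) + 1e-12)

    # outer minimization: descend on W using the worst-case feature
    loss_adv, p_adv = softmax_xent(W @ f_adv, y)
    grad_W = np.outer(p_adv - onehot, f_adv)
    return W - lr * grad_W, loss_adv
```

Because the loss is convex in the feature for a fixed linear classifier, the one-step ascent direction is guaranteed not to decrease the loss, so `loss_adv` upper-bounds the clean loss at each step.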
Cite
Text
Lei et al. "Worst-Case Feature Risk Minimization for Data-Efficient Learning." Transactions on Machine Learning Research, 2023.
Markdown
[Lei et al. "Worst-Case Feature Risk Minimization for Data-Efficient Learning." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/lei2023tmlr-worstcase/)
BibTeX
@article{lei2023tmlr-worstcase,
title = {{Worst-Case Feature Risk Minimization for Data-Efficient Learning}},
author = {Lei, Jingshi and Li, Da and Xu, Chengming and Fang, Liming and Hospedales, Timothy and Fu, Yanwei},
journal = {Transactions on Machine Learning Research},
year = {2023},
url = {https://mlanthology.org/tmlr/2023/lei2023tmlr-worstcase/}
}