Idling Neurons, Appropriately Lenient Workload During Fine-Tuning Leads to Better Generalization
Abstract
Pre-training on large-scale datasets has become a fundamental technique for training deep neural networks. It provides a better set of initial parameters than random initialization, which reduces the training cost on the target task. Pre-training also provides a wealth of feature representations, which may help improve generalization. However, this potential advantage has received little attention and is often lost under crude fine-tuning. Based on exploratory experiments, this paper rethinks the fine-tuning process and offers a new perspective on understanding it. Moreover, it proposes several plug-and-play fine-tuning strategies as alternatives to plain fine-tuning. All of these strategies preserve pre-trained features better by letting some neurons idle, leading to better generalization.
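To make the idea of "idling neurons" concrete, the sketch below shows one possible way to let a fraction of neurons sit out fine-tuning so that their pre-trained weights are preserved. This is an illustration only, not the strategies proposed in the paper: the `idle_neurons` helper, the `idle_fraction` parameter, the gradient-hook mechanism, and the ResNet-50 usage are all assumptions made here for the example.

```python
# Hypothetical sketch: keep a random subset of neurons "idle" during fine-tuning
# by zeroing the gradients of their incoming weights, so those units retain their
# pre-trained parameters. Illustrative only; not the paper's actual method.
import torch
import torch.nn as nn
import torchvision.models as models


def idle_neurons(model: nn.Module, idle_fraction: float = 0.3, seed: int = 0):
    """Register gradient hooks that freeze a random subset of output neurons
    in every Linear/Conv2d layer (assumed, illustrative parameters)."""
    gen = torch.Generator().manual_seed(seed)
    for module in model.modules():
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            out_units = module.weight.shape[0]
            n_idle = int(idle_fraction * out_units)
            idle_idx = torch.randperm(out_units, generator=gen)[:n_idle]

            mask = torch.ones(out_units)
            mask[idle_idx] = 0.0
            # Broadcast the per-neuron mask over the remaining weight dimensions.
            w_mask = mask.view(-1, *([1] * (module.weight.dim() - 1)))

            # Zero the gradient of the idling neurons so they keep their
            # pre-trained values throughout fine-tuning.
            module.weight.register_hook(lambda g, m=w_mask: g * m.to(g.device))
            if module.bias is not None:
                module.bias.register_hook(lambda g, m=mask: g * m.to(g.device))


# Usage: fine-tune a pre-trained backbone while ~30% of its neurons idle.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
idle_neurons(model, idle_fraction=0.3)            # idle neurons only in the backbone
model.fc = nn.Linear(model.fc.in_features, 10)    # new head for the target task, fully trainable
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
```

In this sketch the idling set is fixed once before training; one could equally resample it per epoch or restrict it to later layers, which is why the abstract describes the strategies as plug-and-play.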
Cite
Text
Niu et al. "Idling Neurons, Appropriately Lenient Workload During Fine-Tuning Leads to Better Generalization." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73668-1_16
Markdown
[Niu et al. "Idling Neurons, Appropriately Lenient Workload During Fine-Tuning Leads to Better Generalization." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/niu2024eccv-idling/) doi:10.1007/978-3-031-73668-1_16
BibTeX
@inproceedings{niu2024eccv-idling,
title = {{Idling Neurons, Appropriately Lenient Workload During Fine-Tuning Leads to Better Generalization}},
author = {Niu, Hongjing and Li, Hanting and Li, Bin and Zhao, Feng},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-73668-1_16},
url = {https://mlanthology.org/eccv/2024/niu2024eccv-idling/}
}