FANS: A Flatness-Aware Network Structure for Generalization in Offline Reinforcement Learning
Abstract
Offline reinforcement learning (RL) aims to learn optimal policies from static datasets while enhancing generalization to out-of-distribution (OOD) data. To mitigate overfitting to suboptimal behaviors in offline datasets, existing methods often relax constraints on the policy and data or extract informative patterns through data-driven techniques. However, there has been limited exploration of structurally guiding the optimization process toward flatter regions of the solution space, which offer better generalization. Motivated by this observation, we present FANS, a generalization-oriented structured network framework that promotes flatter and more robust policy learning by guiding the optimization trajectory through modular architectural design. FANS comprises four key components: (1) Residual Blocks, which facilitate compact and expressive representations; (2) Gaussian Activation, which promotes smoother gradients; (3) Layer Normalization, which mitigates overfitting; and (4) Ensemble Modeling, which reduces estimation variance. By integrating FANS into a standard actor-critic framework, we show that this remarkably simple architecture achieves superior performance across various tasks compared to many existing advanced methods. Moreover, we validate the effectiveness of FANS in mitigating overestimation and promoting generalization, demonstrating the promising potential of architectural design in advancing offline RL.
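The four components named in the abstract can be illustrated with a minimal, dependency-free sketch of an ensemble critic. This is not the authors' implementation: the abstract does not specify the exact form of the Gaussian activation, the ordering of operations inside a block, the layer sizes, or how ensemble outputs are aggregated. The code below assumes an activation of exp(-x^2), a pre-normalization residual block, and mean aggregation, and every function name in it is hypothetical.

```python
import math
import random

def gaussian(x):
    """Gaussian activation, assumed here to be exp(-x^2) elementwise."""
    return [math.exp(-v * v) for v in x]

def layer_norm(x, eps=1e-5):
    """Layer normalization without learned scale/shift (a simplification)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def linear(w, b, x):
    """Affine map: one output per row of w."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def init_linear(rng, n_in, n_out):
    """Small uniform initialization; the scheme is an arbitrary choice."""
    scale = 1.0 / math.sqrt(n_in)
    w = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

def residual_block(params, h):
    """Assumed block layout: h + Linear(Gaussian(LayerNorm(h)))."""
    w, b = params
    z = linear(w, b, gaussian(layer_norm(h)))
    return [hi + zi for hi, zi in zip(h, z)]

def init_critic(rng, in_dim, hidden=16, n_blocks=2):
    """Parameters for one critic; sizes are illustrative, not from the paper."""
    return {
        "inp": init_linear(rng, in_dim, hidden),
        "blocks": [init_linear(rng, hidden, hidden) for _ in range(n_blocks)],
        "out": init_linear(rng, hidden, 1),
    }

def critic_forward(p, x):
    """Input projection, residual blocks, then a normalized scalar head."""
    h = linear(*p["inp"], x)
    for block in p["blocks"]:
        h = residual_block(block, h)
    return linear(*p["out"], layer_norm(h))[0]

def ensemble_q(ensemble, state, action):
    """Evaluate every critic on (state, action); return the mean and all values.

    Mean aggregation is an assumption; the abstract only says that
    ensembling reduces estimation variance.
    """
    x = list(state) + list(action)
    qs = [critic_forward(p, x) for p in ensemble]
    return sum(qs) / len(qs), qs

# Usage: four independently initialized critics on a toy 3-d state, 2-d action.
rng = random.Random(0)
ensemble = [init_critic(rng, in_dim=5) for _ in range(4)]
q_mean, qs = ensemble_q(ensemble, state=[0.1, -0.2, 0.3], action=[0.5, -0.5])
print(q_mean, qs)
```

In an actor-critic setup, a value such as `q_mean` (or another aggregate of `qs`) would serve as the critic signal for the policy update; which aggregate FANS actually uses is not stated in the abstract.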
Cite
Text
Wang et al. "FANS: A Flatness-Aware Network Structure for Generalization in Offline Reinforcement Learning." Advances in Neural Information Processing Systems, 2025.
Markdown
[Wang et al. "FANS: A Flatness-Aware Network Structure for Generalization in Offline Reinforcement Learning." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/wang2025neurips-fans/)
BibTeX
@inproceedings{wang2025neurips-fans,
  title = {{FANS: A Flatness-Aware Network Structure for Generalization in Offline Reinforcement Learning}},
  author = {Wang, Da and Ma, Yi and Guo, Ting and Tang, Hongyao and Wei, Wei and Liang, Jiye},
  booktitle = {Advances in Neural Information Processing Systems},
  year = {2025},
  url = {https://mlanthology.org/neurips/2025/wang2025neurips-fans/}
}