Rank-1 Approximation of Inverse Fisher for Natural Policy Gradients in Deep Reinforcement Learning

Huo, Yingxiao; Dash, Satya Prakash; Stoican, Radu; Kaski, Samuel; Sun, Mingfei

TMLR 2026

Abstract

Natural gradients have long been studied in deep reinforcement learning for their fast convergence properties and covariant weight updates. However, computing natural gradients requires inverting the Fisher Information Matrix (FIM) at each iteration, which is computationally prohibitive. In this paper, we present an efficient and scalable natural policy optimization technique that leverages a rank-1 approximation to the full inverse-FIM. We show theoretically that, under certain conditions, a rank-1 approximation to the inverse-FIM converges faster than policy gradients and, under some conditions, enjoys the same sample complexity as stochastic policy gradient methods. We benchmark our method on a diverse set of environments and show that it achieves superior performance to standard trust-region and actor-critic baselines.
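
The abstract does not say how the rank-1 approximation is constructed, so the following is only an illustrative sketch of why a rank-1 structure makes the inverse cheap, not the authors' algorithm. It assumes a damped rank-1 model of the FIM, F ≈ lam·I + u uᵀ, and applies the Sherman-Morrison identity to obtain the natural-gradient direction in O(d) time without forming or inverting a matrix. The function name `rank1_natural_direction`, the vector `u`, and the damping `lam` are hypothetical and do not come from the paper.

```python
# Illustrative sketch only (assumption, not the paper's method):
# model the Fisher matrix as F ~= lam*I + u u^T and invert it in closed form.

def dot(a, b):
    """Inner product of two equal-length sequences of floats."""
    return sum(x * y for x, y in zip(a, b))

def rank1_natural_direction(grad, u, lam=1e-2):
    """Return (lam*I + u u^T)^{-1} grad without building any matrix.

    Sherman-Morrison gives
        (lam*I + u u^T)^{-1} = (1/lam) * (I - u u^T / (lam + u^T u)),
    so the product needs only two inner products and one vector update.
    `lam` is a hypothetical damping term and must be positive.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    coeff = dot(u, grad) / (lam + dot(u, u))
    return [(g - coeff * ui) / lam for g, ui in zip(grad, u)]

# Usage: with u = [1, 0] and lam = 1, F = [[2, 0], [0, 1]],
# so F^{-1} [1, 2] = [0.5, 2.0].
direction = rank1_natural_direction([1.0, 2.0], [1.0, 0.0], lam=1.0)
```

Under this assumed model the cost per update is linear in the number of parameters, compared with the cubic cost of inverting a dense FIM, which is the kind of saving the abstract refers to.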

Cite

Text

Huo et al. "Rank-1 Approximation of Inverse Fisher for Natural Policy Gradients in Deep Reinforcement Learning." Transactions on Machine Learning Research, 2026.

Markdown

[Huo et al. "Rank-1 Approximation of Inverse Fisher for Natural Policy Gradients in Deep Reinforcement Learning." Transactions on Machine Learning Research, 2026.](https://mlanthology.org/tmlr/2026/huo2026tmlr-rank1/)

BibTeX

@article{huo2026tmlr-rank1,
  title     = {{Rank-1 Approximation of Inverse Fisher for Natural Policy Gradients in Deep Reinforcement Learning}},
  author    = {Huo, Yingxiao and Dash, Satya Prakash and Stoican, Radu and Kaski, Samuel and Sun, Mingfei},
  journal   = {Transactions on Machine Learning Research},
  year      = {2026},
  url       = {https://mlanthology.org/tmlr/2026/huo2026tmlr-rank1/}
}