Trade-Offs of Diagonal Fisher Information Matrix Estimators
Abstract
The Fisher information matrix can be used to characterize the local geometry of the parameter space of neural networks. It elucidates insightful theories and useful tools to understand and optimize neural networks. Given its high computational cost, practitioners often use random estimators and evaluate only the diagonal entries. We examine two popular estimators whose accuracy and sample complexity depend on their associated variances. We derive bounds of the variances and instantiate them in neural networks for regression and classification. We navigate trade-offs for both estimators based on analytical and numerical studies. We find that the variance quantities depend on the non-linearity w.r.t. different parameter groups and should not be neglected when estimating the Fisher information.
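To make the two estimator forms concrete, here is a minimal sketch (not the paper's code) on a toy logistic model, where the diagonal Fisher information has the closed form p(1-p)·x². One estimator averages the squared score over labels sampled from the model; the other uses the negative diagonal of the log-likelihood Hessian, which for this particular model is deterministic and hence has zero variance. All names and the model choice here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy logistic model: p(y=1|x) = sigmoid(theta . x).
# Ground truth: the diagonal Fisher information w.r.t. theta is p(1-p) * x**2.
theta = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 0.3, -0.7])
p = 1.0 / (1.0 + np.exp(-theta @ x))
exact_diag = p * (1 - p) * x**2

# Estimator 1 (first-order): Monte Carlo average of the squared score,
# with labels sampled from the model distribution p(y|x; theta).
n = 100_000
y = rng.binomial(1, p, size=n)            # y ~ p(y|x; theta)
scores = (y[:, None] - p) * x[None, :]    # per-sample score vectors (y - p) * x
est1 = (scores**2).mean(axis=0)

# Estimator 2 (second-order): negative diagonal of the log-likelihood Hessian.
# For logistic regression, -d^2/dtheta^2 log p(y|x) = p(1-p) * x**2 regardless
# of y, so this estimator is exact with zero variance on this model.
est2 = p * (1 - p) * x**2

print(np.max(np.abs(est1 - exact_diag)))  # small Monte Carlo error
print(np.max(np.abs(est2 - exact_diag)))  # exactly zero here
```

The contrast illustrates the paper's point: the two estimators agree in expectation, but their variances differ and depend on the model's structure, so the choice matters for sample efficiency.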
Cite
Text
Soen and Sun. "Trade-Offs of Diagonal Fisher Information Matrix Estimators." Neural Information Processing Systems, 2024. doi:10.52202/079017-0191
BibTeX
@inproceedings{soen2024neurips-tradeoffs,
title = {{Trade-Offs of Diagonal Fisher Information Matrix Estimators}},
author = {Soen, Alexander and Sun, Ke},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-0191},
url = {https://mlanthology.org/neurips/2024/soen2024neurips-tradeoffs/}
}