Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights
Abstract
Learning representations of neural network weights given a model zoo is an emerging and challenging area with many potential applications, from model inspection to neural architecture search or knowledge distillation. Recently, an autoencoder trained on a model zoo was able to learn a hyper-representation, which captures intrinsic and extrinsic properties of the models in the zoo. In this work, we extend hyper-representations for generative use to sample new model weights. We propose layer-wise loss normalization, which we demonstrate is key to generating high-performing models, and several sampling methods based on the topology of hyper-representations. The models generated using our methods are diverse, performant, and capable of outperforming strong baselines as evaluated on several downstream tasks: initialization, ensemble sampling, and transfer learning. Our results indicate the potential of knowledge aggregation from model zoos to new models via hyper-representations, thereby paving the way for novel research directions.
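The layer-wise loss normalization mentioned in the abstract can be pictured as standardizing each layer's weights by statistics computed over the model zoo before applying the reconstruction loss, so that layers with large weight magnitudes do not dominate training. The sketch below is a hypothetical illustration of this idea in PyTorch, not the authors' released implementation; the function name, the `layer_stats` dictionary, and the plain MSE objective are assumptions.

```python
import torch

def layer_wise_normalized_mse(pred_weights, true_weights, layer_stats):
    """Hypothetical sketch of a layer-wise normalized reconstruction loss.

    pred_weights / true_weights: dicts mapping layer name -> weight tensor.
    layer_stats: dict mapping layer name -> (mu, sigma), statistics of that
    layer's weights assumed to be precomputed across the model zoo.
    """
    loss = torch.tensor(0.0)
    for name, target in true_weights.items():
        mu, sigma = layer_stats[name]
        # Standardize prediction and target with the zoo-level per-layer
        # statistics so every layer contributes to the loss on a similar scale.
        p = (pred_weights[name] - mu) / sigma
        t = (target - mu) / sigma
        loss = loss + torch.mean((p - t) ** 2)
    return loss / len(true_weights)
```

In this picture, new weights would be obtained by sampling latent codes from the hyper-representation space (for example, from a density fit to the embeddings of zoo models) and decoding them, which is the generative use the abstract refers to.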
Cite
Text
Schürholt et al. "Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights." Neural Information Processing Systems, 2022.
Markdown
[Schürholt et al. "Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/schurholt2022neurips-hyperrepresentations/)
BibTeX
@inproceedings{schurholt2022neurips-hyperrepresentations,
title = {{Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights}},
author = {Schürholt, Konstantin and Knyazev, Boris and Giró-i-Nieto, Xavier and Borth, Damian},
booktitle = {Neural Information Processing Systems},
year = {2022},
url = {https://mlanthology.org/neurips/2022/schurholt2022neurips-hyperrepresentations/}
}