Reducing Transformer Depth on Demand with Structured Dropout
Cite
Text
Fan et al. "Reducing Transformer Depth on Demand with Structured Dropout." International Conference on Learning Representations, 2020.
Markdown
[Fan et al. "Reducing Transformer Depth on Demand with Structured Dropout." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/fan2020iclr-reducing/)
BibTeX
@inproceedings{fan2020iclr-reducing,
title = {{Reducing Transformer Depth on Demand with Structured Dropout}},
author = {Fan, Angela and Grave, Edouard and Joulin, Armand},
booktitle = {International Conference on Learning Representations},
year = {2020},
url = {https://mlanthology.org/iclr/2020/fan2020iclr-reducing/}
}