Explain My Surprise: Learning Efficient Long-Term Memory by Predicting Uncertain Outcomes
Abstract
In many sequential tasks, a model needs to remember relevant events from the distant past to make correct predictions. Unfortunately, a straightforward application of gradient-based training requires intermediate computations to be stored for every element of a sequence. This becomes prohibitively expensive when a sequence consists of thousands or even millions of elements and, as a result, makes learning very long-term dependencies infeasible. However, the majority of sequence elements can usually be predicted by taking into account only temporally local information. In contrast, predictions affected by long-term dependencies are sparse and characterized by high uncertainty given only local information. We propose \texttt{MemUP}, a new training method that makes it possible to learn long-term dependencies without backpropagating gradients through the whole sequence at once. This method can potentially be applied to any recurrent architecture. An LSTM network trained with \texttt{MemUP} performs better than or comparably to baselines while storing less intermediate data.
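The abstract only outlines the training idea. Below is a minimal, illustrative PyTorch sketch of that idea (not the authors' actual MemUP algorithm): a recurrent model is trained with truncated backpropagation, detaching its hidden state between segments so gradients never span the full sequence, while extra training signal is focused on the timesteps whose local predictions are most uncertain. All module names, shapes, and hyperparameters here are assumptions made for the example.

```python
import torch
import torch.nn as nn

class RecurrentPredictor(nn.Module):
    """Simple LSTM classifier; stands in for any recurrent architecture."""
    def __init__(self, input_dim, hidden_dim, num_classes):
        super().__init__()
        self.rnn = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, x, state=None):
        out, state = self.rnn(x, state)
        return self.head(out), state

def train_on_long_sequence(model, optimizer, x, y, segment_len=64, top_k=4):
    """x: (batch, T, input_dim), y: (batch, T) class labels.
    Hypothetical training loop: truncated backprop plus extra weight on the
    most uncertain timesteps (a crude stand-in for MemUP's target selection)."""
    criterion = nn.CrossEntropyLoss(reduction="none")
    state = None
    for start in range(0, x.size(1), segment_len):
        seg_x = x[:, start:start + segment_len]
        seg_y = y[:, start:start + segment_len]
        logits, state = model(seg_x, state)
        # Per-timestep loss is used here as a simple uncertainty proxy (assumption).
        per_step = criterion(logits.transpose(1, 2), seg_y)  # (batch, seg)
        # Emphasize the k most uncertain steps in this segment.
        k = min(top_k, per_step.size(1))
        hard_loss = per_step.topk(k, dim=1).values.mean()
        loss = per_step.mean() + hard_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Detach the recurrent state so gradients never flow through the whole sequence.
        state = tuple(s.detach() for s in state)

# Example usage with random data (shapes are illustrative).
model = RecurrentPredictor(input_dim=8, hidden_dim=32, num_classes=5)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(2, 512, 8)
y = torch.randint(0, 5, (2, 512))
train_on_long_sequence(model, optimizer, x, y)
```

The key design point the sketch tries to convey is that only per-segment intermediate data is ever kept in memory, while the loss is steered toward the sparse, high-uncertainty predictions that long-term dependencies affect; for the actual target-selection and memory-training procedure, see the paper itself.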
Cite
Text
Sorokin et al. "Explain My Surprise: Learning Efficient Long-Term Memory by Predicting Uncertain Outcomes." Neural Information Processing Systems, 2022.
Markdown
[Sorokin et al. "Explain My Surprise: Learning Efficient Long-Term Memory by Predicting Uncertain Outcomes." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/sorokin2022neurips-explain/)
BibTeX
@inproceedings{sorokin2022neurips-explain,
title = {{Explain My Surprise: Learning Efficient Long-Term Memory by Predicting Uncertain Outcomes}},
author = {Sorokin, Artyom and Buzun, Nazar and Pugachev, Leonid and Burtsev, Mikhail},
booktitle = {Neural Information Processing Systems},
year = {2022},
url = {https://mlanthology.org/neurips/2022/sorokin2022neurips-explain/}
}