Akram, Yassir

6 publications

ICLR 2025. Attention as a Hypernetwork. Simon Schug, Seijin Kobayashi, Yassir Akram, Joao Sacramento, Razvan Pascanu
ICLR 2025. Learning Randomized Algorithms with Transformers. Johannes von Oswald, Seijin Kobayashi, Yassir Akram, Angelika Steger
NeurIPS 2025. Scaling Can Lead to Compositional Generalization. Florian Redhardt, Yassir Akram, Simon Schug
ICLR 2024. Discovering Modular Solutions That Generalize Compositionally. Simon Schug, Seijin Kobayashi, Yassir Akram, Maciej Wolczyk, Alexandra Maria Proca, Johannes von Oswald, Razvan Pascanu, Joao Sacramento, Angelika Steger
NeurIPS 2024. Weight Decay Induces Low-Rank Attention Layers. Seijin Kobayashi, Yassir Akram, Johannes von Oswald
NeurIPSW 2022. Random Initialisations Performing Above Chance and How to Find Them. Frederik Benzing, Simon Schug, Robert Meier, Johannes von Oswald, Yassir Akram, Nicolas Zucchet, Laurence Aitchison, Angelika Steger