Csordás, Róbert

19 publications

NeurIPS 2025. Do Language Models Use Their Depth Efficiently? Róbert Csordás, Christopher D. Manning, Christopher Potts
ICML 2025. Measuring In-Context Computation Complexity via Hidden State Prediction. Vincent Herrmann, Róbert Csordás, Jürgen Schmidhuber
ICLRW 2025. Measuring In-Context Computation Complexity via Hidden State Prediction. Vincent Herrmann, Róbert Csordás, Jürgen Schmidhuber
TMLR 2025. Metalearning Continual Learning Algorithms. Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber
ICLR 2025. MrT5: Dynamic Token Merging for Efficient Byte-Level Language Models. Julie Kallini, Shikhar Murty, Christopher D. Manning, Christopher Potts, Róbert Csordás
NeurIPS 2024. MoEUT: Mixture-of-Experts Universal Transformers. Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber, Christopher Potts, Christopher D. Manning
NeurIPS 2024. SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention. Róbert Csordás, Piotr Piękos, Kazuki Irie, Jürgen Schmidhuber
NeurIPSW 2023. Mindstorms in Natural Language-Based Societies of Mind. Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R. Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanić, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, Jürgen Schmidhuber
ICMLW 2023. Topological Neural Discrete Representation Learning À La Kohonen. Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber
LoG 2022. A Generalist Neural Algorithmic Learner. Borja Ibarz, Vitaly Kurin, George Papamakarios, Kyriacos Nikiforou, Mehdi Bennani, Róbert Csordás, Andrew Joseph Dudzik, Matko Bošnjak, Alex Vitvitskyi, Yulia Rubanova, Andreea Deac, Beatrice Bevilacqua, Yaroslav Ganin, Charles Blundell, Petar Veličković
ICML 2022. A Modern Self-Referential Weight Matrix That Learns to Modify Itself. Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber
ICML 2022. The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention. Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber
ICLR 2022. The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization. Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber
NeurIPSW 2021. A Modern Self-Referential Weight Matrix That Learns to Modify Itself. Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber
ICLR 2021. Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks. Róbert Csordás, Sjoerd van Steenkiste, Jürgen Schmidhuber
NeurIPS 2021. Going Beyond Linear Transformers with Recurrent Fast Weight Programmers. Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber
NeurIPSW 2021. Improving Baselines in the Wild. Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber
NeurIPSW 2021. Learning Adaptive Control Flow in Transformers for Improved Systematic Generalization. Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber
ICLR 2019. Improving Differentiable Neural Computers Through Memory Masking, De-Allocation, and Link Distribution Sharpness Control. Róbert Csordás, Jürgen Schmidhuber