ML Anthology
Saxe, Andrew M.
31 publications
ICLR
2025
A Theory of Initialisation's Impact on Specialisation
Devon Jarvis, Sebastian Lee, Clémentine Carla Juliette Dominé, Andrew M Saxe, Stefano Sarao Mannelli
ICML
2025
Algorithm Development in Neural Networks: Insights from the Streaming Parity Task
Loek Van Rossem, Andrew M Saxe
ICLRW
2025
Distinct Computations Emerge from Compositional Curricula in In-Context Learning
Jin Hwa Lee, Andrew Kyle Lampinen, Aaditya K Singh, Andrew M Saxe
ICLR
2025
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Carla Juliette Dominé, Nicolas Anguita, Alexandra Maria Proca, Lukas Braun, Daniel Kunin, Pedro A. M. Mediano, Andrew M Saxe
ICLR
2025
Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks
Devon Jarvis, Richard Klein, Benjamin Rosman, Andrew M Saxe
NeurIPS
2025
Memory by Accident: A Theory of Learning as a Byproduct of Network Stabilization
Basile Confavreux, Will Dorrell, Nishil Patel, Andrew M Saxe
ICML
2025
Not All Solutions Are Created Equal: An Analytical Dissociation of Functional and Representational Similarity in Deep Linear Neural Networks
Lukas Braun, Erin Grant, Andrew M Saxe
ICML
2025
Strategy Coopetition Explains the Emergence and Transience of In-Context Learning
Aaditya K Singh, Ted Moskovitz, Sara Dragutinović, Felix Hill, Stephanie C.Y. Chan, Andrew M Saxe
ICML
2025
Training Dynamics of In-Context Learning in Linear Attention
Yedi Zhang, Aaditya K Singh, Peter E. Latham, Andrew M Saxe
TMLR
2025
When Are Bias-Free ReLU Networks Effectively Linear Networks?
Yedi Zhang, Andrew M Saxe, Peter E. Latham
NeurIPSW
2024
A Linear Network Theory of Iterated Learning
Devon Jarvis, Richard Klein, Benjamin Rosman, Andrew M Saxe
NeurIPSW
2024
A Theory of Initialisation's Impact on Specialisation
Devon Jarvis, Sebastian Lee, Clémentine Carla Juliette Dominé, Andrew M Saxe, Stefano Sarao Mannelli
NeurIPS
2024
Flexible Task Abstractions Emerge in Linear Networks with Fast and Bounded Units
Kai Sandbrink, Jan P. Bauer, Alexandra M. Proca, Andrew M. Saxe, Christopher Summerfield, Ali Hummos
NeurIPSW
2024
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Carla Juliette Dominé, Nicolas Anguita, Alexandra Maria Proca, Lukas Braun, Daniel Kunin, Pedro A. M. Mediano, Andrew M Saxe
NeurIPSW
2024
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Carla Juliette Dominé, Nicolas Anguita, Alexandra Maria Proca, Lukas Braun, Daniel Kunin, Pedro A. M. Mediano, Andrew M Saxe
ICMLW
2024
Get Rich Quick: Exact Solutions Reveal How Unbalanced Initializations Promote Rapid Feature Learning
Daniel Kunin, Allan Raventos, Clémentine Carla Juliette Dominé, Feng Chen, David Klindt, Andrew M Saxe, Surya Ganguli
ICML
2024
Tilting the Odds at the Lottery: The Interplay of Overparameterisation and Curricula in Neural Networks
Stefano Sarao Mannelli, Yaraslau Ivashynka, Andrew M Saxe, Luca Saglietti
ICML
2024
Understanding Unimodal Bias in Multimodal Deep Linear Networks
Yedi Zhang, Peter E. Latham, Andrew M Saxe
ICML
2024
What Needs to Go Right for an Induction Head? A Mechanistic Study of In-Context Learning Circuits and Their Formation
Aaditya K Singh, Ted Moskovitz, Felix Hill, Stephanie C.Y. Chan, Andrew M Saxe
ICMLW
2024
When Are Bias-Free ReLU Networks like Linear Networks?
Yedi Zhang, Andrew M Saxe, Peter E. Latham
ICML
2024
When Representations Align: Universality in Representation Learning Dynamics
Loek Van Rossem, Andrew M Saxe
ICML
2024
Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning
Jin Hwa Lee, Stefano Sarao Mannelli, Andrew M Saxe
ICLR
2023
On the Specialization of Neural Modules
Devon Jarvis, Richard Klein, Benjamin Rosman, Andrew M Saxe
ICLRW
2023
The RL Perceptron: Dynamics of Policy Learning in High Dimensions
Nishil Patel, Sebastian Lee, Stefano Sarao Mannelli, Sebastian Goldt, Andrew M Saxe
NeurIPS
2019
Dynamics of Stochastic Gradient Descent for Two-Layer Neural Networks in the Teacher-Student Setup
Sebastian Goldt, Madhu Advani, Andrew M Saxe, Florent Krzakala, Lenka Zdeborová
ICLR
2018
Hierarchical Subtask Discovery with Non-Negative Matrix Factorization
Adam C. Earle, Andrew M. Saxe, Benjamin Rosman
ICML
2017
Hierarchy Through Composition with Multitask LMDPs
Andrew M. Saxe, Adam C. Earle, Benjamin Rosman
NeurIPS
2016
Tensor Switching Networks
Chuan-Yung Tsai, Andrew M Saxe, David Cox
NeurIPS
2016
Tensor Switching Networks
Chuan-Yung Tsai, Andrew M Saxe, David Cox
ICLR
2014
Exact Solutions to the Nonlinear Dynamics of Learning in Deep Linear Neural Networks
Andrew M. Saxe, James L. McClelland, Surya Ganguli
ICML
2011
On Random Weights and Unsupervised Feature Learning
Andrew M. Saxe, Pang Wei Koh, Zhenghao Chen, Maneesh Bhand, Bipin Suresh, Andrew Y. Ng