ML Anthology
Qiu, Shikai
12 publications
ICML
2025
Customizing the Inductive Biases of SoftMax Attention Using Structured Matrices
Yilun Kuang, Noah Amsel, Sanae Lotfi, Shikai Qiu, Andres Potapczynski, Andrew Gordon Wilson
NeurIPS
2025
Hyperparameter Transfer Enables Consistent Gains of Matrix-Preconditioned Optimizers Across Scales
Shikai Qiu, Zixi Chen, Hoang Phan, Qi Lei, Andrew Gordon Wilson
ICML
2025
Position: Supervised Classifiers Answer the Wrong Questions for OOD Detection
Yucen Lily Li, Daohan Lu, Polina Kirichenko, Shikai Qiu, Tim G. J. Rudner, C. Bayan Bruss, Andrew Gordon Wilson
ICML
2025
Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks
Shikai Qiu, Lechao Xiao, Andrew Gordon Wilson, Jeffrey Pennington, Atish Agarwala
ICML
2024
Compute Better Spent: Replacing Dense Layers with Structured Matrices
Shikai Qiu, Andres Potapczynski, Marc Anton Finzi, Micah Goldblum, Andrew Gordon Wilson
NeurIPSW
2024
Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks
Shikai Qiu, Atish Agarwala, Jeffrey Pennington, Lechao Xiao
NeurIPS
2024
Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices
Andres Potapczynski, Shikai Qiu, Marc Finzi, Christopher Ferri, Zixi Chen, Micah Goldblum, C. Bayan Bruss, Christopher De Sa, Andrew Gordon Wilson
ICML
2024
Transferring Knowledge from Large Foundation Models to Small Downstream Models
Shikai Qiu, Boran Han, Danielle C. Maddix, Shuai Zhang, Bernie Wang, Andrew Gordon Wilson
ICML
2023
Function-Space Regularization in Neural Networks: A Probabilistic Perspective
Tim G. J. Rudner, Sanyam Kapoor, Shikai Qiu, Andrew Gordon Wilson
NeurIPS
2023
Large Language Models Are Zero-Shot Time Series Forecasters
Nate Gruver, Marc Finzi, Shikai Qiu, Andrew G Wilson
NeurIPS
2023
Should We Learn Most Likely Functions or Parameters?
Shikai Qiu, Tim G. J. Rudner, Sanyam Kapoor, Andrew G Wilson
ICML
2023
Simple and Fast Group Robustness by Automatic Feature Reweighting
Shikai Qiu, Andres Potapczynski, Pavel Izmailov, Andrew Gordon Wilson