ML Anthology
Mohtashami, Amirkeivan
9 publications
ICLR
2025
CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference
Amirkeivan Mohtashami, Matteo Pagliardini, Martin Jaggi
NeurIPS
2024
DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging
Matteo Pagliardini, Amirkeivan Mohtashami, Francois Fleuret, Martin Jaggi
NeurIPS
2024
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
Saleh Ashkboos, Amirkeivan Mohtashami, Maximilian L. Croci, Bo Li, Pashmina Cameron, Martin Jaggi, Dan Alistarh, Torsten Hoefler, James Hensman
NeurIPSW
2023
CoTFormer: More Tokens with Attention Make up for Less Depth
Amirkeivan Mohtashami, Matteo Pagliardini, Martin Jaggi
ICMLW
2023
Landmark Attention: Random-Access Infinite Context Length for Transformers
Amirkeivan Mohtashami, Martin Jaggi
NeurIPS
2023
Random-Access Infinite Context Length for Transformers
Amirkeivan Mohtashami, Martin Jaggi
ICML
2023
Special Properties of Gradient Descent with Large Learning Rates
Amirkeivan Mohtashami, Martin Jaggi, Sebastian U Stich
AISTATS
2022
Masked Training of Neural Networks with Partial Gradients
Amirkeivan Mohtashami, Martin Jaggi, Sebastian Stich
AISTATS
2021
Critical Parameters for Scalable Distributed Learning with Large Batches and Asynchronous Updates
Sebastian Stich, Amirkeivan Mohtashami, Martin Jaggi