Zhmoginov, Andrey

16 publications

ICLR 2025 How New Data Permeates LLM Knowledge and How to Dilute It Chen Sun, Renat Aksitov, Andrey Zhmoginov, Nolan Andrew Miller, Max Vladymyrov, Ulrich Rueckert, Been Kim, Mark Sandler
ICLR 2025 MELODI: Exploring Memory Compression for Long Contexts Yinpeng Chen, DeLesley Hutchins, Aren Jansen, Andrey Zhmoginov, David Racz, Jesper Sparre Andersen
TMLR 2024 Continual HyperTransformer: A Meta-Learner for Continual Few-Shot Learning Max Vladymyrov, Andrey Zhmoginov, Mark Sandler
NeurIPSW 2024 How New Data Pollutes LLM Knowledge and How to Dilute It Chen Sun, Renat Aksitov, Andrey Zhmoginov, Nolan Andrew Miller, Max Vladymyrov, Ulrich Rueckert, Been Kim, Mark Sandler
ICMLW 2024 Learning Fast and Slow: Representations for In-Context Weight Modulation Andrey Zhmoginov, Jihwan Lee, Max Vladymyrov, Mark Sandler
ICMLW 2024 Learning and Unlearning of Fabricated Knowledge in Language Models Chen Sun, Nolan Andrew Miller, Andrey Zhmoginov, Max Vladymyrov, Mark Sandler
ICMLW 2024 Projectable Models: One-Shot Generation of Small Specialized Transformers from Large Ones Andrey Zhmoginov, Jihwan Lee, Mark Sandler
CVPR 2023 Decentralized Learning with Multi-Headed Distillation Andrey Zhmoginov, Mark Sandler, Nolan Miller, Gus Kristiansen, Max Vladymyrov
ICML 2023 Transformers Learn In-Context by Gradient Descent Johannes von Oswald, Eyvind Niklasson, Ettore Randazzo, Joao Sacramento, Alexander Mordvintsev, Andrey Zhmoginov, Max Vladymyrov
CVPR 2022 Fine-Tuning Image Transformers Using Learnable Memory Mark Sandler, Andrey Zhmoginov, Max Vladymyrov, Andrew Jackson
ICML 2022 HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning Andrey Zhmoginov, Mark Sandler, Maksym Vladymyrov
CVPRW 2021 BasisNet: Two-Stage Model Synthesis for Efficient Inference Mingda Zhang, Chun-Te Chu, Andrey Zhmoginov, Andrew Howard, Brendan Jou, Yukun Zhu, Li Zhang, Rebecca Hwa, Adriana Kovashka
ICML 2021 Meta-Learning Bidirectional Update Rules Mark Sandler, Max Vladymyrov, Andrey Zhmoginov, Nolan Miller, Tom Madams, Andrew Jackson, Blaise Agüera y Arcas
ECML-PKDD 2020 Information-Bottleneck Approach to Salient Region Discovery Andrey Zhmoginov, Ian Fischer, Mark Sandler
ICLR 2019 K for the Price of 1: Parameter-Efficient Multi-Task and Transfer Learning Pramod Kaushik Mudrakarta, Mark Sandler, Andrey Zhmoginov, Andrew Howard
ICCVW 2019 Non-Discriminative Data or Weak Model? On the Relative Importance of Data and Model Resolution Mark Sandler, Jonathan Baccash, Andrey Zhmoginov, Andrew Howard