Aksenov, Yaroslav

2 publications

ICML 2025: Analyze Feature Flow to Enhance Interpretation and Steering in Language Models. Daniil Laptev, Nikita Balagansky, Yaroslav Aksenov, Daniil Gavrilov
ICLR 2025: Learn Your Reference Model for Real Good Alignment. Alexey Gorbatovski, Boris Shaposhnikov, Alexey Malakhov, Nikita Surnachev, Yaroslav Aksenov, Ian Maksimov, Nikita Balagansky, Daniil Gavrilov