Tyukin, Georgy

1 publication

ICMLW 2024
Attention Is All You Need but You Don’t Need All of It for Inference of Large Language Models
Georgy Tyukin, Gbetondji Jean-Sebastien Dovonon, Jean Kaddour, Pasquale Minervini