ML Anthology
Yao, Bowen
1 publication
NeurIPS 2024
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-Add-Free Attention
Tianyi Zhang, Jonah Yi, Bowen Yao, Zhaozhuo Xu, Anshumali Shrivastava