Wu, Banggu

5 publications

ICLR 2026 UltraMemV2: Memory Networks Scaling to 120B Parameters with Superior Long-Context Learning Zihao Huang, Yu Bao, Qiyang Min, Siyan Chen, Ran Guo, Hongzhi Huang, Defa Zhu, Banggu Wu, Yutao Zeng, Xun Zhou, Siyuan Qiao
ICLR 2025 Hyper-Connections Defa Zhu, Hongzhi Huang, Zihao Huang, Yutao Zeng, Yunyao Mao, Banggu Wu, Qiyang Min, Xun Zhou
ICML 2025 Over-Tokenized Transformer: Vocabulary Is Generally Worth Scaling Hongzhi Huang, Defa Zhu, Banggu Wu, Yutao Zeng, Ya Wang, Qiyang Min, Xun Zhou
CVPR 2020 ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks Qilong Wang, Banggu Wu, Pengfei Zhu, Peihua Li, Wangmeng Zuo, Qinghua Hu
CVPR 2020 What Deep CNNs Benefit from Global Covariance Pooling: An Optimization Perspective Qilong Wang, Li Zhang, Banggu Wu, Dongwei Ren, Peihua Li, Wangmeng Zuo, Qinghua Hu