Zhang, Sijun

2 publications

NeurIPS 2025. HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization. Zhijian Zhuo, Yutao Zeng, Ya Wang, Sijun Zhang, Xiaoqing Li, Jian Yang, Zhou Xun, Jinwen Ma
ICLR 2025. TC-MoE: Augmenting Mixture of Experts with Ternary Expert Choice. Shen Yan, Xingyan Bin, Sijun Zhang, Yisen Wang, Zhouchen Lin