Zhang, Minjia

19 publications

ICCV 2025. "InstantEdit: Text-Guided Few-Step Image Editing with Piecewise Rectified Flow." Yiming Gong, Zhen Zhu, Minjia Zhang.
AAAI 2024. "DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing." Conglong Li, Zhewei Yao, Xiaoxia Wu, Minjia Zhang, Connor Holmes, Cheng Li, Yuxiong He.
ICLR 2024. "Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs." Suyu Ge, Yunan Zhang, Liyuan Liu, Minjia Zhang, Jiawei Han, Jianfeng Gao.
NeurIPS 2024. "UltraEdit: Instruction-Based Fine-Grained Image Editing at Scale." Haozhe Zhao, Xiaojian Ma, Liang Chen, Shuzheng Si, Rujie Wu, Kaikai An, Peiyu Yu, Minjia Zhang, Qing Li, Baobao Chang.
ICLR 2023. "Maximizing Communication Efficiency for Large-Scale Training via 0/1 Adam." Yucheng Lu, Conglong Li, Minjia Zhang, Christopher De Sa, Yuxiong He.
NeurIPSW 2023. "DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery Through Sophisticated AI System Technologies." Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Rick L. Stevens, Venkatram Vishwanath, Arvind Ramanathan, Sam Foreman, Kyle Hippe, Prasanna Balaprakash, Yuxiong He.
NeurIPSW 2023. "Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs." Suyu Ge, Yunan Zhang, Liyuan Liu, Minjia Zhang, Jiawei Han, Jianfeng Gao.
AAAI 2022. "Adversarial Data Augmentation for Task-Specific Knowledge Distillation of Pre-Trained Transformers." Minjia Zhang, Niranjan Uma Naresh, Yuxiong He.
ICML 2022. "DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale." Samyam Rajbhandari, Conglong Li, Zhewei Yao, Minjia Zhang, Reza Yazdani Aminabadi, Ammar Ahmad Awan, Jeff Rasley, Yuxiong He.
NeurIPS 2022. "The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models." Conglong Li, Minjia Zhang, Yuxiong He.
NeurIPS 2022. "XTC: Extreme Compression for Pre-Trained Transformers Made Simple and Efficient." Xiaoxia Wu, Zhewei Yao, Minjia Zhang, Conglong Li, Yuxiong He.
NeurIPS 2022. "ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers." Zhewei Yao, Reza Yazdani Aminabadi, Minjia Zhang, Xiaoxia Wu, Conglong Li, Yuxiong He.
ICLR 2021. "DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation." Minjia Zhang, Menghao Li, Chi Wang, Mingqin Li.
NeurIPS 2021. "NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM." Connor Holmes, Minjia Zhang, Yuxiong He, Bo Wu.
NeurIPS 2020. "Accelerating Training of Transformer-Based Language Models with Progressive Layer Dropping." Minjia Zhang, Yuxiong He.
NeurIPS 2020. "AdaTune: Adaptive Tensor Program Compilation Made Efficient." Menghao Li, Minjia Zhang, Chi Wang, Mingqin Li.
NeurIPS 2020. "HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory." Jie Ren, Minjia Zhang, Dong Li.
ICLR 2018. "Learning Intrinsic Sparse Structures Within Long Short-Term Memory." Wei Wen, Yuxiong He, Samyam Rajbhandari, Minjia Zhang, Wenhan Wang, Fang Liu, Bin Hu, Yiran Chen, Hai Li.
NeurIPS 2018. "Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models." Minjia Zhang, Wenhan Wang, Xiaodong Liu, Jianfeng Gao, Yuxiong He.