Mo, Shentong

35 publications

CVPR 2025 Foley-Flow: Coordinated Video-to-Audio Generation with Masked Audio-Visual Alignment and Dynamic Conditional Flows Shentong Mo, Yibing Song
ICML 2025 GMAIL: Generative Modality Alignment for Generated Image Learning Shentong Mo, Sukmin Yun
AAAI 2025 Scaling Diffusion Mamba with Bidirectional SSMs for Efficient 3D Shape Generation Shentong Mo
AAAI 2025 The Dynamic Duo of Collaborative Masking and Target for Advanced Masked Autoencoder Learning Shentong Mo
ICLR 2025 pMoE: Prompting Diverse Experts Together Wins More in Visual Adaptation Shentong Mo, Xufang Luo, Dongsheng Li
NeurIPS 2024 Aligning Audio-Visual Joint Representations with an Agentic Workflow Shentong Mo, Yibing Song
ECCV 2024 Audio-Synchronized Visual Animation Lin Zhang, Shentong Mo, Yijing Zhang, Pedro Morgado
ECCV 2024 Audio-Visual Generalized Zero-Shot Learning the Easy Way Shentong Mo, Pedro Morgado
NeurIPS 2024 Connecting Joint-Embedding Predictive Architecture with Contrastive Self-Supervised Learning Shentong Mo, Shengbang Tong
NeurIPS 2024 Continual Audio-Visual Sound Separation Weiguo Pian, Yiyang Nan, Shijian Deng, Shentong Mo, Yunhui Guo, Yapeng Tian
ECCVW 2024 DailyMAE: Towards Pretraining Masked Autoencoders in One Day Jiantao Wu, Shentong Mo, Sara Atito, Zhenhua Feng, Josef Kittler, Muhammad Awais
ECCV 2024 Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation Shentong Mo, Enze Xie, Yue Wu, Junsong Chen, Matthias Niessner, Zhenguo Li
NeurIPSW 2024 Federated Self-Supervised Single-Cell Clustering of scRNA-Seq Data Shentong Mo
CVPRW 2024 MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers Tanvir Mahmud, Shentong Mo, Yapeng Tian, Diana Marculescu
NeurIPSW 2024 Masked Modeling for Single-Cell Clustering of scRNA-seq Data Shentong Mo
NeurIPSW 2024 Scaling Dense Representations for Single Cell Gene Expression with Transcriptome-Scale Context Nicholas Ho, Caleb Ellington, Jinyu Hou, Sohan Addagudi, Shentong Mo, Tianhua Tao, Dian Li, Yonghao Zhuang, Hongyi Wang, Xingyi Cheng, Le Song, Eric P. Xing
ICMLW 2024 Towards Efficient Large-Scale Language-3D Representation Learning Shentong Mo, Xiaogang Xu, Tongzhou Wang, Antonio Torralba, Shuang Li
CVPR 2024 Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions Through Masked Modeling Shentong Mo, Pedro Morgado
ICML 2023 A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition Shentong Mo, Pedro Morgado
ICCV 2023 Audio-Visual Class-Incremental Learning Weiguo Pian, Shentong Mo, Yunhui Guo, Yapeng Tian
CVPR 2023 Audio-Visual Grouping Network for Sound Localization from Mixtures Shentong Mo, Yapeng Tian
ICCV 2023 Class-Incremental Grouping Network for Continual Audio-Visual Learning Shentong Mo, Weiguo Pian, Yapeng Tian
NeurIPS 2023 DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation Shentong Mo, Enze Xie, Ruihang Chu, Lanqing Hong, Matthias Niessner, Zhenguo Li
NeurIPS 2023 DiffComplete: Diffusion-Based Generative 3D Shape Completion Ruihang Chu, Enze Xie, Shentong Mo, Zhenguo Li, Matthias Niessner, Chi-Wing Fu, Jiaya Jia
TMLR 2023 High-Modality Multimodal Transformer: Quantifying Modality & Interaction Heterogeneity for High-Modality Representation Learning Paul Pu Liang, Yiwei Lyu, Xiang Fan, Jeffrey Tsaw, Yudong Liu, Shentong Mo, Dani Yogatama, Louis-Philippe Morency, Russ Salakhutdinov
WACV 2023 Multi-Level Contrastive Learning for Self-Supervised Vision Transformers Shentong Mo, Zhun Sun, Chao Li
WACV 2023 Representation Disentanglement in Generative Models with Contrastive Learning Shentong Mo, Zhun Sun, Chao Li
NeurIPS 2023 Weakly-Supervised Audio-Visual Segmentation Shentong Mo, Bhiksha Raj
NeurIPS 2022 A Closer Look at Weakly-Supervised Audio-Visual Source Localization Shentong Mo, Pedro Morgado
ECCV 2022 Localizing Visual Sounds the Easy Way Shentong Mo, Pedro Morgado
NeurIPS 2022 Multi-Modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing Shentong Mo, Yapeng Tian
ECCV 2022 Unitail: Detecting, Reading, and Matching in Retail Scene Fangyi Chen, Han Zhang, Zaiwang Li, Jiachen Dou, Shentong Mo, Hao Chen, Yongxin Zhang, Uzair Ahmed, Chenchen Zhu, Marios Savvides
CVPRW 2021 EVA-GCN: Head Pose Estimation Based on Graph Convolutional Networks Miao Xin, Shentong Mo, Yuanze Lin
CVPRW 2021 Long-Term Head Pose Forecasting Conditioned on the Gaze-Guiding Prior Shentong Mo, Miao Xin
NeurIPSW 2021 Multi-Modal Self-Supervised Pre-Training for Large-Scale Genome Data Shentong Mo, Xi Fu, Chenyang Hong, Yizhen Chen, Yuxuan Zheng, Xiangru Tang, Yanyan Lan, Zhiqiang Shen, Eric Xing