Watanabe, Shinji

22 publications

NeurIPS 2025 AHa-Bench: Benchmarking Audio Hallucinations in Large Audio-Language Models Xize Cheng, Dongjie Fu, Chenyuhao Wen, Shannon Yu, Zehan Wang, Shengpeng Ji, Siddhant Arora, Tao Jin, Shinji Watanabe, Zhou Zhao
NeurIPS 2025 ARECHO: Autoregressive Evaluation via Chain-Based Hypothesis Optimization for Speech Multi-Metric Estimation Jiatong Shi, Yifan Cheng, Bo-Hao Su, Hye-jin Shim, Jinchuan Tian, Samuele Cornell, Yiwen Zhao, Siddhant Arora, Shinji Watanabe
ICLR 2025 Context-Aware Dynamic Pruning for Speech Foundation Models Masao Someki, Yifan Peng, Siddhant Arora, Markus Müller, Athanasios Mouchtaris, Grant Strimel, Jing Liu, Shinji Watanabe
TMLR 2025 Discrete Audio Tokens: More than a Survey! Pooneh Mousavi, Gallil Maimon, Adel Moumen, Darius Petermann, Jiatong Shi, Haibin Wu, Haici Yang, Anastasia Kuznetsova, Artem Ploujnikov, Ricard Marxer, Bhuvana Ramabhadran, Benjamin Elizalde, Loren Lugosch, Jinyu Li, Cem Subakan, Phil Woodland, Minje Kim, Hung-yi Lee, Shinji Watanabe, Yossi Adi, Mirco Ravanelli
ICLR 2025 Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks Chien-yu Huang, Wei-Chih Chen, Shu-wen Yang, Andy T. Liu, Chen-An Li, Yu-Xiang Lin, Wei-Cheng Tseng, Anuj Diwan, Yi-Jen Shih, Jiatong Shi, William Chen, Chih-Kai Yang, Xuanjun Chen, Chi-Yuan Hsiao, Puyuan Peng, Shih-Heng Wang, Chun-Yi Kuan, Ke-Han Lu, Kai-Wei Chang, Fabian Alejandro Ritter Gutierrez, Huang Kuan-Po, Siddhant Arora, You-Kuan Lin, CHUANG Ming To, Eunjung Yeo, Kalvin Chang, Chung-Ming Chien, Kwanghee Choi, Cheng-Hsiu Hsieh, Yi-Cheng Lin, Chee-En Yu, I-Hsiang Chiu, Heitor Guimarães, Jionghao Han, Tzu-Quan Lin, Tzu-Yuan Lin, Homu Chang, Ting-Wu Chang, Chun Wei Chen, Shou-Jen Chen, Yu-Hua Chen, Hsi-Chun Cheng, Kunal Dhawan, Jia-Lin Fang, Shi-Xin Fang, Kuan Yu Fang Chiang, Chi An Fu, Hsien-Fu Hsiao, Ching Yu Hsu, Shao-Syuan Huang, Lee Chen Wei, Hsi-Che Lin, Hsuan-Hao Lin, Hsuan-Ting Lin, Jian-Ren Lin, Ting-Chun Liu, Li-Chun Lu, Tsung-Min Pai, Ankita Pasad, Shih-Yun Shan Kuan, Suwon Shon, Yuxun Tang, Yun-Shao Tsai, Wei Jui Chiang, Tzu-Chieh Wei, Chengxi Wu, Dien-Ruei Wu, Chao-Han Huck Yang, Chieh-Chi Yang, Jia Qi Yip, Shao-Xiang Yuan, Haibin Wu, Karen Livescu, David Harwath, Shinji Watanabe, Hung-yi Lee
AAAI 2025 Enhancing Audiovisual Speech Recognition Through Bifocal Preference Optimization Yihan Wu, Yichen Lu, Yifan Peng, Xihua Wang, Ruihua Song, Shinji Watanabe
ICML 2025 OWLS: Scaling Laws for Multilingual Speech Recognition and Translation Models William Chen, Jinchuan Tian, Yifan Peng, Brian Yan, Chao-Han Huck Yang, Shinji Watanabe
TMLR 2025 On the Landscape of Spoken Language Models: A Comprehensive Survey Siddhant Arora, Kai-Wei Chang, Chung-Ming Chien, Yifan Peng, Haibin Wu, Yossi Adi, Emmanuel Dupoux, Hung-yi Lee, Karen Livescu, Shinji Watanabe
ICLR 2025 Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics Siddhant Arora, Zhiyun Lu, Chung-Cheng Chiu, Ruoming Pang, Shinji Watanabe
AAAI 2024 AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Yuexian Zou, Zhou Zhao, Shinji Watanabe
IJCAI 2024 Cross-Talk Reduction Zhong-Qiu Wang, Anurag Kumar, Shinji Watanabe
AAAI 2023 A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech Li-Wei Chen, Shinji Watanabe, Alexander Rudnicky
ICLR 2023 Bayes Risk CTC: Controllable CTC Alignment in Sequence-to-Sequence Tasks Jinchuan Tian, Brian Yan, Jianwei Yu, Chao Weng, Dong Yu, Shinji Watanabe
ICML 2023 Efficient Sequence Transduction by Jointly Predicting Tokens and Durations Hainan Xu, Fei Jia, Somshubra Majumdar, He Huang, Shinji Watanabe, Boris Ginsburg
IJCAI 2023 Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining Takaaki Saeki, Soumi Maiti, Xinjian Li, Shinji Watanabe, Shinnosuke Takamichi, Hiroshi Saruwatari
NeurIPS 2023 UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-Determined Training Mixtures Zhong-Qiu Wang, Shinji Watanabe
ICML 2022 Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding Yifan Peng, Siddharth Dalmia, Ian Lane, Shinji Watanabe
NeurIPS 2020 Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals Jing Shi, Xuankai Chang, Pengcheng Guo, Shinji Watanabe, Yusuke Fujita, Jiaming Xu, Bo Xu, Lei Xie
ICML 2017 Multichannel End-to-End Speech Recognition Tsubasa Ochiai, Shinji Watanabe, Takaaki Hori, John R. Hershey
IJCAI 2011 Fashion Coordinates Recommender System Using Photographs from Fashion Magazines Tomoharu Iwata, Shinji Watanabe, Hiroshi Sawada
IJCAI 2009 Topic Tracking Model for Analyzing Consumer Purchase Behavior Tomoharu Iwata, Shinji Watanabe, Takeshi Yamada, Naonori Ueda
NeurIPS 2002 Application of Variational Bayesian Approach to Speech Recognition Shinji Watanabe, Yasuhiro Minami, Atsushi Nakamura, Naonori Ueda