Recent Advances in Direct Speech-to-Text Translation

Xu, Chen; Ye, Rong; Dong, Qianqian; Zhao, Chengqi; Ko, Tom; Wang, Mingxuan; Xiao, Tong; Zhu, Jingbo

doi:10.24963/IJCAI.2023/761

IJCAI 2023 pp. 6796-6804

Abstract

Speech-to-text translation has attracted growing attention in recent years, and many studies have emerged rapidly. In this paper, we present a comprehensive survey of direct speech translation that aims to summarize the current state-of-the-art techniques. First, we categorize existing research into three directions based on the main challenges: modeling burden, data scarcity, and application issues. To tackle the modeling burden, two main structures have been proposed: the encoder-decoder framework (Transformer and its variants) and multitask frameworks. To address data scarcity, recent work resorts to a range of sophisticated techniques, such as data augmentation, pre-training, knowledge distillation, and multilingual modeling. We then analyze and summarize the application issues, which include real-time translation, segmentation, named entities, gender bias, and code-switching. Finally, we discuss some promising directions for future work.

Cite

Text

Xu et al. "Recent Advances in Direct Speech-to-Text Translation." International Joint Conference on Artificial Intelligence, 2023. doi:10.24963/IJCAI.2023/761

Markdown

[Xu et al. "Recent Advances in Direct Speech-to-Text Translation." International Joint Conference on Artificial Intelligence, 2023.](https://mlanthology.org/ijcai/2023/xu2023ijcai-recent/) doi:10.24963/IJCAI.2023/761

BibTeX

@inproceedings{xu2023ijcai-recent,
  title     = {{Recent Advances in Direct Speech-to-Text Translation}},
  author    = {Xu, Chen and Ye, Rong and Dong, Qianqian and Zhao, Chengqi and Ko, Tom and Wang, Mingxuan and Xiao, Tong and Zhu, Jingbo},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2023},
  pages     = {6796--6804},
  doi       = {10.24963/IJCAI.2023/761},
  url       = {https://mlanthology.org/ijcai/2023/xu2023ijcai-recent/}
}