Online Sequence Alignment for Real-Time Audio Transcription by Non-Experts

Abstract

Real-time transcription provides deaf and hard of hearing people visual access to spoken content, such as classroom instruction and other live events. Currently, the only reliable sources of real-time transcriptions are expensive, highly trained experts who are able to keep up with natural speaking rates. Automatic speech recognition is cheaper but produces too many errors in realistic settings. We introduce a new approach in which partial captions from multiple non-experts are combined to produce a high-quality transcription in real-time. We demonstrate the potential of this approach with data collected from 20 non-expert captionists.
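The core idea — that alignment lets incomplete captions from different workers complement one another — can be illustrated with a toy sketch. This is not the paper's online alignment algorithm; it is a minimal two-worker merge using Python's standard-library `difflib.SequenceMatcher`, where aligned (agreeing) words are kept once and each worker's unique words are interleaved in order:

```python
from difflib import SequenceMatcher

def merge_partial_captions(a, b):
    """Toy merge of two partial caption word lists (not the paper's
    algorithm): keep words both workers typed once, and interleave
    words only one worker caught, preserving order."""
    sm = SequenceMatcher(a=a, b=b, autojunk=False)
    merged = []
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        if op == "equal":
            merged.extend(a[i1:i2])   # both workers agree on these words
        else:
            merged.extend(a[i1:i2])   # words only worker A captured
            merged.extend(b[j1:j2])   # words only worker B captured
    return merged

# Each non-expert captures only part of the speech:
worker_a = "the quick fox over the dog".split()
worker_b = "quick brown fox jumps over lazy dog".split()
print(" ".join(merge_partial_captions(worker_a, worker_b)))
# → the quick brown fox jumps over the lazy dog
```

The full system must do this online, for more than two streams, and resolve disagreements rather than naively interleaving; the sketch only shows why alignment recovers content no single captionist produced.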

Cite

Text

Lasecki et al. "Online Sequence Alignment for Real-Time Audio Transcription by Non-Experts." AAAI Conference on Artificial Intelligence, 2012. doi:10.1609/AAAI.V26I1.8420

Markdown

[Lasecki et al. "Online Sequence Alignment for Real-Time Audio Transcription by Non-Experts." AAAI Conference on Artificial Intelligence, 2012.](https://mlanthology.org/aaai/2012/lasecki2012aaai-online/) doi:10.1609/AAAI.V26I1.8420

BibTeX

@inproceedings{lasecki2012aaai-online,
  title     = {{Online Sequence Alignment for Real-Time Audio Transcription by Non-Experts}},
  author    = {Lasecki, Walter S. and Miller, Christopher D. and Borrello, Donato and Bigham, Jeffrey P.},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2012},
  pages     = {2437--2438},
  doi       = {10.1609/AAAI.V26I1.8420},
  url       = {https://mlanthology.org/aaai/2012/lasecki2012aaai-online/}
}