Abstract
Spoken dialogue systems enable people to interact with machines using speech. Many such systems use automatic speech recognition and language understanding to interpret the input and decide how to respond. Unlike humans, many systems operate on complete sentences, waiting for a length of silence before attempting to process the input. In contrast, incremental spoken dialogue systems enable faster and more natural interaction by operating at a finer-grained level. In this work, we evaluate six speech recognizers and RASA for language understanding in an incremental spoken dialogue system. The results suggest that, for speech recognition, online/cloud models can be slower and less stable than local models, and we show that incremental language understanding can enable a system to make decisions earlier than waiting for the end of the utterance.
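The difference between end-of-utterance and incremental processing can be illustrated with a minimal sketch. It is not the system evaluated in the paper and does not use RASA: the keyword table, the function names, and the word-by-word granularity are all hypothetical and chosen only to show how an incremental component might commit to a decision before the utterance ends.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical keyword-to-intent table (an assumption for illustration only;
# a real system would use a trained language understanding model).
INTENT_KEYWORDS = {
    "weather": "ask_weather",
    "forecast": "ask_weather",
    "alarm": "set_alarm",
    "wake": "set_alarm",
}


def incremental_decision(words: Iterable[str]) -> Tuple[Optional[str], int]:
    """Consume words one at a time and decide as soon as a keyword appears.

    Returns the intent (or None) and the number of words consumed so far.
    """
    consumed = 0
    for word in words:
        consumed += 1
        intent = INTENT_KEYWORDS.get(word.lower())
        if intent is not None:
            return intent, consumed  # early decision, rest of utterance unseen
    return None, consumed


def end_of_utterance_decision(words: Iterable[str]) -> Tuple[Optional[str], int]:
    """Wait for the complete utterance before attempting any decision."""
    complete = list(words)  # stands in for waiting until silence is detected
    for word in complete:
        intent = INTENT_KEYWORDS.get(word.lower())
        if intent is not None:
            return intent, len(complete)
    return None, len(complete)


utterance = "what is the weather like in Boise tomorrow".split()
print(incremental_decision(utterance))
print(end_of_utterance_decision(utterance))
```

Under these assumptions, both functions reach the same intent, but the incremental version commits after the fourth word, while the end-of-utterance version must first receive all eight.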
Original language | American English |
---|---|
Title of host publication | Proceedings of the 27th Workshop on the Semantics and Pragmatics of Dialogue - Full Papers |
State | Published - 16 Aug 2023 |
Keywords
- automatic speech recognition
- incremental
- natural language understanding
EGS Disciplines
- Artificial Intelligence and Robotics