arxiv:2212.07850

Attention as a Guide for Simultaneous Speech Translation

Published on Dec 15, 2022
Authors: Sara Papi, Matteo Negri, Marco Turchi

Abstract

The study of the attention mechanism has sparked interest in many fields, such as language modeling and machine translation. Although its patterns have been exploited to perform different tasks, from neural network understanding to textual alignment, no previous work has analyzed the encoder-decoder attention behavior in speech translation (ST) or used it to improve ST on a specific task. In this paper, we fill this gap by proposing an attention-based policy (EDAtt) for simultaneous ST (SimulST) that is motivated by an analysis of the existing attention relations between audio input and textual output. Its goal is to leverage the encoder-decoder attention scores to guide inference in real time. Results on en->{de, es} show that the EDAtt policy achieves overall better results than the SimulST state of the art, especially in terms of computation-aware latency.
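The abstract only outlines the idea of using encoder-decoder attention scores to decide, in real time, whether to emit a translation token or wait for more audio. A minimal sketch of such an attention-guided emission rule might look like the following (Python; the function name `should_emit`, the threshold `alpha`, and the window `last_frames` are illustrative assumptions, not the paper's exact formulation or tuned hyperparameters):

```python
import numpy as np

def should_emit(attn_weights: np.ndarray, alpha: float = 0.1, last_frames: int = 2) -> bool:
    """Attention-guided emission decision (illustrative sketch, not the
    paper's exact formulation).

    attn_weights: encoder-decoder attention of the candidate target token
        over the audio frames received so far (assumed to sum to 1).
    alpha: threshold on the attention mass allowed on the newest frames.
    last_frames: number of trailing frames considered still unstable.

    Emit (True) when the token attends mainly to earlier, settled audio;
    wait (False) when its attention concentrates on the most recent
    frames, whose interpretation may still change as speech arrives.
    """
    tail_mass = attn_weights[-last_frames:].sum()
    return tail_mass < alpha

# Token grounded in earlier audio -> safe to emit now.
print(should_emit(np.array([0.5, 0.3, 0.15, 0.03, 0.02])))  # True
# Token focused on the newest frames -> wait for more speech.
print(should_emit(np.array([0.05, 0.05, 0.10, 0.40, 0.40])))  # False
```

The intuition behind a rule like this is that attention concentrated on the freshest audio signals insufficient acoustic context, so delaying emission in that case trades a little latency for more stable output.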
