arXiv:2305.05084

Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition

Published on May 8, 2023

Abstract

Conformer-based models have become the dominant end-to-end architecture for speech processing tasks. In this work, we propose a carefully redesigned Conformer with a new down-sampling schema. The proposed model, named Fast Conformer, is 2.8x faster than the original Conformer while preserving state-of-the-art accuracy on Automatic Speech Recognition benchmarks. We also replace the original Conformer's global attention with limited-context attention post-training to enable transcription of hour-long audio. We further improve long-form speech transcription by adding a global token. Fast Conformer combined with a Transformer decoder also outperforms the original Conformer in both accuracy and speed for Speech Translation and Spoken Language Understanding.
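The limited-context attention with a global token described above can be pictured with a small self-attention sketch. The snippet below is a minimal illustration, not the paper's implementation: the window size, the choice of a single global token at index 0, and the single-head, projection-free attention are assumptions made purely for brevity.

```python
# Minimal sketch of limited-context self-attention with one global token.
# Assumptions (not from the paper): window size, global token at index 0,
# single head, no learned projections.
import torch
import torch.nn.functional as F

def limited_context_attention(x, window=128):
    """x: (batch, time, dim); position 0 is treated as the global token."""
    b, t, d = x.shape
    q = k = v = x  # learned Q/K/V projections omitted for brevity

    # Banded mask: position i may attend to j only if |i - j| <= window.
    idx = torch.arange(t)
    band = (idx[None, :] - idx[:, None]).abs() <= window

    # The global token attends to, and is attended by, every position,
    # giving each frame a path to summary context beyond its local window.
    band[0, :] = True
    band[:, 0] = True

    scores = q @ k.transpose(-2, -1) / d ** 0.5           # (b, t, t)
    scores = scores.masked_fill(~band, float("-inf"))
    return F.softmax(scores, dim=-1) @ v                   # (b, t, d)

# Toy usage: 1 utterance, 1000 frames, 256-dim features.
out = limited_context_attention(torch.randn(1, 1000, 256), window=64)
print(out.shape)  # torch.Size([1, 1000, 256])
```

Because the attention mask, not the weights, is what changes, a model trained with global attention can be switched to this limited-context form post-training, which is what makes hour-long transcription tractable.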
