fronx/Fast-FullSubNet

speech enhancement

speech separation

noise suppression


fronx committed on Feb 9, 2024

Commit cbbcba4 · verified · 1 Parent(s): d346fa8

Update README.md

Files changed (1)

README.md +6 -4


@@ -11,22 +11,24 @@ tags:
 This is a pre-trained version of Fast FullSubNet, a real-time denoising model trained on the Deep Noise Suppression Challenge dataset of 2020 (DNS-INTERSPEECH-2020).
-# Instructions
+## How to run
 https://fullsubnet.readthedocs.io/en/latest/usage/getting_started.html
-# Code
+## Code
 https://github.com/Audio-WestlakeU/FullSubNet
 Note: The code doesn't support real-time streaming out of the box. See [issue-67](https://github.com/Audio-WestlakeU/FullSubNet/issues/67) for details.
-# Paper
+## Paper
 [Fast FullSubNet: Accelerate Full-band and Sub-band Fusion Model for Single-channel Speech Enhancement
 Xiang Hao, Xiaofei Li](https://arxiv.org/abs/2212.09019)
+> For many speech enhancement applications, a key feature is that system runs on a real-time, latency-sensitive, battery-powered platform, which strictly limits the algorithm latency and computational complexity. In this work, we propose a new architecture named Fast FullSubNet dedicated to accelerating the computation of FullSubNet. Specifically, Fast FullSubNet processes sub-band speech spectra in the mel-frequency domain by using cascaded linear-to-mel full-band, sub-band, and mel-to-linear full-band models such that frequencies involved in the sub-band computation are vastly reduced. After that, a down-sampling operation is proposed for the sub-band input sequence to further reduce the computational complexity along the time axis. Experimental results show that, compared to FullSubNet, Fast FullSubNet has only 13\% computational complexity and 16\% processing time, and achieves comparable or even better performance.
-# Performance
+## Performance
 |   | With Reverb |   |   |   | No Reverb |   |   |
 -- | -- | -- | -- | -- | -- | -- | --