---
language:
- "en"
thumbnail:
tags:
- audio-to-audio
- Speech Enhancement
- Voicebank-DEMAND
- UNIVERSE
- UNIVERSE++
- Diffusion
- pytorch
- open-universe
license: "apache-2.0"
datasets:
- Voicebank-DEMAND
metrics:
- SI-SNR
- PESQ
- SIG
- BAK
- OVRL
model-index:
- name: universe++
  results:
  - task:
      name: Speech Enhancement
      type: speech-enhancement
    dataset:
      name: Voicebank-DEMAND
      type: demand
      split: test-set
      args:
        language: en
    metrics:
    - name: DNSMOS SIG
      type: sig
      value: '3.493'
    - name: DNSMOS BAK
      type: bak
      value: '4.042'
    - name: DNSMOS OVRL
      type: ovrl
      value: '3.205'
    - name: PESQ
      type: pesq
      value: 3.017
    - name: SI-SDR
      type: si-sdr
      value: 18.629
---

# Open-UNIVERSE: Generative Speech Enhancement with Score-based Diffusion and Adversarial Training

This repository contains the configurations and weights for the [UNIVERSE++](tba) and [UNIVERSE](https://arxiv.org/abs/2206.03065) models implemented in [open-universe](https://github.com/line/open-universe).

The models were trained on the [Voicebank-DEMAND](https://datashare.ed.ac.uk/handle/10283/2791) dataset at 16 kHz. Performance on the test split of Voicebank-DEMAND is given in the following table.

| model      | si-sdr | pesq-wb | stoi-ext | lsd   | lps   | OVRL  | SIG   | BAK   |
|------------|--------|---------|----------|-------|-------|-------|-------|-------|
| UNIVERSE++ | 18.629 | 3.017   | 0.865    | 4.868 | 0.937 | 3.205 | 3.493 | 4.042 |
| UNIVERSE   | 17.594 | 2.834   | 0.845    | 6.318 | 0.920 | 3.156 | 3.455 | 4.013 |

## Usage

Start by installing `open-universe`. We use conda to simplify the installation.

```sh
git clone https://github.com/line/open-universe.git
cd open-universe
conda env create -f environment.yaml
conda activate open-universe
python -m pip install .
```

The models can then be used as follows.

```sh
# UNIVERSE++ (default model)
python -m open_universe.bin.enhance \
    --model line-corporation/open-universe:plusplus

# UNIVERSE
python -m open_universe.bin.enhance \
    --model line-corporation/open-universe:original
```

## Referencing open-universe and UNIVERSE++

If you use these models in your work, please consider citing the following paper.

```latex
@inproceedings{universepp,
  author={Scheibler, Robin and Fujita, Yusuke and Shirahata, Yuma and Komatsu, Tatsuya},
  title={Universal Score-based Speech Enhancement with High Content Preservation},
  booktitle={Proc. Interspeech 2024},
  month=sep,
  year=2024
}
```

## Referencing UNIVERSE

```latex
@misc{universe,
  author={Serr\`a, Joan and Pascual, Santiago and Pons, Jordi and Araz, R. Oguz and Scaini, Davide},
  title={Universal Speech Enhancement with Score-based Diffusion},
  howpublished={arXiv:2206.03065},
  month=sep,
  year=2022
}
```
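
## Note on the SI-SDR metric

The `si-sdr` column in the table above refers to the standard scale-invariant signal-to-distortion ratio. For reference, here is a minimal numpy sketch of that definition; it is not the evaluation script used to produce the numbers above, and it assumes equal-length, mono reference and estimate signals.

```python
import numpy as np

def si_sdr(estimate: np.ndarray, reference: np.ndarray) -> float:
    """Scale-invariant SDR in dB, using the usual zero-mean convention."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to obtain the scaled target.
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference
    noise = estimate - target
    return 10.0 * np.log10(np.dot(target, target) / np.dot(noise, noise))
```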