---
language:
- "en"
thumbnail:
tags:
- audio-to-audio
- Speech Enhancement
- Voicebank-DEMAND
- UNIVERSE
- UNIVERSE++
- Diffusion
- pytorch
- open-universe
license: "apache-2.0"
datasets:
- Voicebank-DEMAND
metrics:
- SI-SDR
- PESQ
- SIG
- BAK
- OVRL
model-index:
- name: universe++
  results:
  - task:
      name: Speech Enhancement
      type: speech-enhancement
    dataset:
      name: DEMAND
      type: demand
      split: test-set
      args:
        language: en
    metrics:
    - name: DNSMOS SIG
      type: sig
      value: '3.493'
    - name: DNSMOS BAK
      type: bak
      value: '4.042'
    - name: DNSMOS OVRL
      type: ovrl
      value: '3.205'
    - name: PESQ
      type: pesq
      value: 3.017
    - name: SI-SDR
      type: si-sdr
      value: 18.629
---
# open-universe: Generative Speech Enhancement with Score-based Diffusion and Adversarial Training
This repository contains the configurations and weights for the [UNIVERSE++](https://arxiv.org/abs/2406.12194) and
[UNIVERSE](https://arxiv.org/abs/2206.03065) models implemented in [open-universe](https://github.com/line/open-universe).
The models were trained on the [Voicebank-DEMAND](https://datashare.ed.ac.uk/handle/10283/2791) dataset at 16 kHz.
The performance on the test split of Voicebank-DEMAND is given in the following table.
| model      | SI-SDR (dB) | PESQ-WB | ESTOI | LSD   | LPS   | DNSMOS OVRL | DNSMOS SIG | DNSMOS BAK |
|------------|-------------|---------|-------|-------|-------|-------------|------------|------------|
| UNIVERSE++ | 18.624      | 3.017   | 0.864 | 4.867 | 0.937 | 3.200       | 3.489      | 4.040      |
| UNIVERSE   | 17.600      | 2.830   | 0.844 | 6.318 | 0.920 | 3.157       | 3.457      | 4.013      |
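For readers unfamiliar with the SI-SDR figures in the table, the following is a minimal sketch of scale-invariant SDR as it is commonly defined: the estimate is projected onto the clean reference, and the energy of that projection is compared with the energy of the residual. It is written in plain Python for illustration only. The function name `si_sdr` and the `eps` parameter are assumptions made here, and this is not necessarily the evaluation code used by open-universe to produce the reported numbers.

```python
import math


def si_sdr(reference, estimate, eps=1e-8):
    """Scale-invariant SDR in dB between two equal-length 1-D signals.

    Illustrative sketch only; `eps` guards against division by zero.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")

    # Remove the mean so that a constant offset does not affect the score.
    ref_mean = sum(reference) / len(reference)
    est_mean = sum(estimate) / len(estimate)
    ref = [r - ref_mean for r in reference]
    est = [e - est_mean for e in estimate]

    # Project the estimate onto the reference to find the optimal scale.
    dot = sum(r * e for r, e in zip(ref, est))
    ref_energy = sum(r * r for r in ref)
    alpha = dot / (ref_energy + eps)

    # Split the estimate into a scaled target and a residual.
    target = [alpha * r for r in ref]
    residual = [e - t for e, t in zip(est, target)]

    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


# Toy signals: the estimate is the reference plus a small error.
clean = [1.0, -1.0, 1.0, -1.0]
enhanced = [1.1, -0.9, 0.9, -1.1]
print(f"SI-SDR: {si_sdr(clean, enhanced):.2f} dB")
```

Because of the projection step, multiplying the estimate by any non-zero gain leaves the score unchanged, which is why the metric is called scale-invariant.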
## Usage
Start by installing `open-universe`.
We use conda to simplify the installation.
```sh
git clone https://github.com/line/open-universe.git
cd open-universe
conda env create -f environment.yaml
conda activate open-universe
python -m pip install .
```
Once installed, run the enhancement script on a folder of noisy recordings and select the model with `--model`.
```sh
# UNIVERSE++ (default model)
python -m open_universe.bin.enhance <input/folder> <output/folder> \
    --model line-corporation/open-universe:plusplus

# UNIVERSE
python -m open_universe.bin.enhance <input/folder> <output/folder> \
    --model line-corporation/open-universe:original
```
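To compare both models on the same recordings, the two commands above can be driven from a small Python script. The sketch below only assembles and runs the command line shown above through `subprocess`. The helper names `build_command` and `enhance_with_all_models`, the `MODELS` dictionary, and the per-model output folder layout are assumptions made for illustration and are not part of open-universe.

```python
import subprocess
import sys
from pathlib import Path

# Model identifiers copied from the commands above; the dictionary keys
# are hypothetical labels, reused here as output sub-folder names.
MODELS = {
    "universe++": "line-corporation/open-universe:plusplus",
    "universe": "line-corporation/open-universe:original",
}


def build_command(input_dir, output_dir, model="universe++"):
    """Return the argument list for one run of the enhancement script."""
    return [
        sys.executable, "-m", "open_universe.bin.enhance",
        str(input_dir), str(output_dir),
        "--model", MODELS[model],
    ]


def enhance_with_all_models(input_dir, output_root):
    """Run every model on input_dir, writing to output_root/<label>."""
    for label in MODELS:
        out_dir = Path(output_root) / label
        # Creating the folder up front is a precaution, in case the
        # script expects it to exist.
        out_dir.mkdir(parents=True, exist_ok=True)
        # check=True raises CalledProcessError if the script fails.
        subprocess.run(build_command(input_dir, out_dir, label), check=True)
```

With `open-universe` installed in the active environment, calling `enhance_with_all_models("noisy", "enhanced")` would run both models on the `noisy` folder and place the results in `enhanced/universe++` and `enhanced/universe`.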
## Referencing open-universe and UNIVERSE++
If you use these models in your work, please consider citing the following paper.
```latex
@inproceedings{universepp,
    author={Scheibler, Robin and Fujita, Yusuke and Shirahata, Yuma and Komatsu, Tatsuya},
    title={Universal Score-based Speech Enhancement with High Content Preservation},
    booktitle={Proc. Interspeech 2024},
    month=sep,
    year=2024
}
```
## Referencing UNIVERSE
```latex
@misc{universe,
    author={Serr{\`a}, Joan and Pascual, Santiago and Pons, Jordi and Araz, R. Oguz and Scaini, Davide},
    title={Universal Speech Enhancement with Score-based Diffusion},
    howpublished={arXiv:2206.03065},
    month=sep,
    year=2022
}
```