alibabasglab/MossFormer2_SR_48K

alibabasglab committed on Jan 21

Commit c5200a2 · verified · 1 Parent(s): 5f3864b

Update README.md

Files changed (1)

README.md +4 -2

@@ -3,7 +3,7 @@ license: apache-2.0
 ---
 # Introduction
-The MossFormer2_SR_48K model weights for 48 kHz speech super-resolution in [ClearerVoice-Studio](https://github.com/modelscope/ClearerVoice-Studio/tree/main) repo.
+The MossFormer2_SR_48K model weights for 48 kHz speech super-resolution [1] provdied in [ClearerVoice-Studio](https://github.com/modelscope/ClearerVoice-Studio/tree/main) repo.
 This model is trained on large scale datasets inclduing open-sourced and private data.
@@ -64,4 +64,6 @@ myClearVoice(input_path='samples/scp/audio_samples_sr.scp', online_write=True, o
 ```
 Model Limitations: The current speech super-resolution model is trained on a clean speech dataset and is designed to work with clean speech inputs. For speech super-resolution on noisy speech audio,
-we recommend using our 'MossFormer2_SE_48K' model for speech enhancement first, followed by 'MossFormer2_SR_48K' for speech super-resolution.
+we recommend using our 'MossFormer2_SE_48K' model for speech enhancement first, followed by 'MossFormer2_SR_48K' for speech super-resolution.
+[1] Shengkui Zhao, Kun Zhou, Zexu Pan, Yukun Ma, Chong Zhang, and Bin Ma, "HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution", ICASSP 2025.
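
The two-stage workflow recommended in the Model Limitations note (enhance first, then super-resolve) could be sketched roughly as follows. This is a minimal sketch and not part of the commit. It assumes the `clearvoice` package from ClearerVoice-Studio is installed and that `ClearVoice(task=..., model_names=[...])`, the call with `input_path`/`online_write`, and `.write(...)` behave as in that repository's README; the task strings, the helper names `default_factory` and `enhance_then_super_resolve`, and the output file names are illustrative assumptions.

```python
from pathlib import Path


def default_factory(task, model_name):
    # Assumption: `clearvoice` is installed and exposes ClearVoice with this
    # constructor signature, as shown in the ClearerVoice-Studio README.
    from clearvoice import ClearVoice
    return ClearVoice(task=task, model_names=[model_name])


def enhance_then_super_resolve(input_path, output_dir, factory=default_factory):
    """Denoise a recording first, then super-resolve the cleaned audio."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    enhanced_path = out_dir / "enhanced.wav"       # hypothetical file name
    final_path = out_dir / "enhanced_sr.wav"       # hypothetical file name

    # Stage 1: speech enhancement on the noisy input.
    se = factory("speech_enhancement", "MossFormer2_SE_48K")
    enhanced = se(input_path=str(input_path), online_write=False)
    se.write(enhanced, output_path=str(enhanced_path))

    # Stage 2: speech super-resolution on the enhanced file from stage 1.
    sr = factory("speech_super_resolution", "MossFormer2_SR_48K")
    restored = sr(input_path=str(enhanced_path), online_write=False)
    sr.write(restored, output_path=str(final_path))
    return final_path
```

Passing the model constructor in as `factory` keeps the ordering logic separate from the library, so the sequence can be checked with a stand-in object before any model weights are downloaded.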