update files
- app.py +3 -1
- requirements.txt +0 -1
app.py
CHANGED
@@ -11,7 +11,7 @@ DESCRIPTION='''
 OWSM is an Open Whisper-style Speech Model from [CMU WAVLab](https://www.wavlab.org/).
 It reproduces Whisper-style training using publicly available data and an open-source toolkit [ESPnet](https://github.com/espnet/espnet).
 
-OWSM v3 is trained on 180k hours of paired speech data. It supports various speech-to-text tasks:
+OWSM v3 has 889M parameters and is trained on 180k hours of paired speech data. It supports various speech-to-text tasks:
 - Speech recognition for 151 languages
 - Any-to-any language speech translation
 - Timestamp prediction
@@ -20,6 +20,8 @@ OWSM v3 is trained on 180k hours of paired speech data. It supports various spee
 
 For more details, please check out our [paper](https://arxiv.org/abs/2309.13876) (Peng et al., ASRU 2023).
 
+We also have a [Colab demo](https://colab.research.google.com/drive/1zKI3ZY_OtZd6YmVeED6Cxy1QwT1mqv9O?usp=sharing) where you can use a free GPU.
+
 ```
 @article{peng2023owsm,
 title={Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data},
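The hunk headers in this diff follow the usual unified-diff convention `@@ -old_start,old_len +new_start,new_len @@`, so the net line change per hunk can be read straight from them. The following is a minimal sketch of that reading, assuming standard unified-diff headers; the names `parse_hunk_header` and `net_line_change` are hypothetical helpers, not part of this repository.

```python
import re

# Matches headers such as "@@ -11,7 +11,7 @@ DESCRIPTION='''".
# A missing length (e.g. "@@ -3 +3 @@") defaults to 1 by convention.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (old_start, old_len, new_start, new_len) for a hunk header."""
    match = HUNK_RE.match(line)
    if match is None:
        raise ValueError(f"not a unified-diff hunk header: {line!r}")
    old_start, old_len, new_start, new_len = match.groups()
    return (
        int(old_start),
        int(old_len) if old_len is not None else 1,
        int(new_start),
        int(new_len) if new_len is not None else 1,
    )


def net_line_change(line):
    """Lines added minus lines removed, as implied by the header alone."""
    _, old_len, _, new_len = parse_hunk_header(line)
    return new_len - old_len
```

Applied to the two app.py hunk headers, the net changes are 0 and 2, which agrees with the `+3 -1` summary in the file list.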
requirements.txt
CHANGED
@@ -1,4 +1,3 @@
 torch==2.1.0
 torchaudio
 espnet @ git+https://github.com/espnet/espnet@d3254133c595ea8271072ee49a1b4ceb3ed4fd7a
-espnet_model_zoo
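The requirements file mixes three styles of entry: an exact pin (`torch==2.1.0`), an unpinned name (`torchaudio`), and a direct reference to a specific git commit (`espnet @ git+https://...@<commit>`). The sketch below classifies such lines using only the standard library. It is a deliberate simplification that handles just these three shapes rather than the full requirement grammar, and `parse_requirement` and `pinned_commit` are hypothetical names introduced for illustration.

```python
def pinned_commit(url):
    """Return the revision after the final '@' in a VCS URL, or None.

    Only the last path segment is inspected, so a user part such as
    'git@github.com' earlier in the URL is not mistaken for a revision.
    """
    last_segment = url.rsplit("/", 1)[-1]
    if "@" not in last_segment:
        return None
    return last_segment.rsplit("@", 1)[1]


def parse_requirement(line):
    """Classify one requirements.txt line as (name, kind, spec).

    kind is 'direct' for 'name @ url', 'pinned' for 'name==version',
    and 'unpinned' for a bare package name (spec is then None).
    """
    line = line.strip()
    if " @ " in line:
        name, url = line.split(" @ ", 1)
        return (name.strip(), "direct", url.strip())
    if "==" in line:
        name, version = line.split("==", 1)
        return (name.strip(), "pinned", version.strip())
    return (line, "unpinned", None)
```

For the `espnet` entry, `parse_requirement` yields a `direct` requirement, and `pinned_commit` on its URL returns the 40-character commit hash, which is what makes that dependency reproducible even though it is installed from git.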