bofenghuang committed on
Commit 02a26b0
1 Parent(s): b9072de
updt README.md
README.md CHANGED
@@ -1,15 +1,14 @@
1 |   ---
2 | - language:
3 | - - fr
4 |   license: apache-2.0
5 |   tags:
6 |   - automatic-speech-recognition
7 |   - hf-asr-leaderboard
8 |   - robust-speech-event
9 | - -
10 | - -
11 | - - facebook/voxpopuli
12 | - - gigant/african_accented_french
13 |   datasets:
14 |   - common_voice
15 |   - mozilla-foundation/common_voice_11_0
@@ -91,12 +90,27 @@ model-index:
91 |   - name: Test WER (+LM)
92 |   type: wer
93 |   value: 12.96
94 |   ---
95 |
96 |   # Fine-tuned wav2vec2-FR-7K-large model for ASR in French
97 |
98 | -
99 |
100 |
101 |   ## Usage
102 |
@@ -160,7 +174,6 @@ predicted_ids = torch.argmax(logits, dim=-1)
160 |   predicted_sentence = processor.batch_decode(predicted_ids)[0]
161 |   ```
162 |
163 | -
164 |   ## Evaluation
165 |
166 |   1. To evaluate on `mozilla-foundation/common_voice_11_0`
1 |   ---
2 |   license: apache-2.0
3 | + language: fr
4 | + library_name: transformers
5 | + thumbnail: null
6 |   tags:
7 |   - automatic-speech-recognition
8 |   - hf-asr-leaderboard
9 |   - robust-speech-event
10 | + - CTC
11 | + - Wav2vec2
12 |   datasets:
13 |   - common_voice
14 |   - mozilla-foundation/common_voice_11_0
90 |   - name: Test WER (+LM)
91 |   type: wer
92 |   value: 12.96
93 | + - task:
94 | + name: Automatic Speech Recognition
95 | + type: automatic-speech-recognition
96 | + dataset:
97 | + name: Fleurs
98 | + type: google/fleurs
99 | + args: fr_fr
100 | + metrics:
101 | + - name: Test WER
102 | + type: wer
103 | + value: 10.10
104 | + - name: Test WER (+LM)
105 | + type: wer
106 | + value: 8.84
107 |   ---
108 |
109 |   # Fine-tuned wav2vec2-FR-7K-large model for ASR in French
110 |
111 | + ![Model architecture](https://img.shields.io/badge/Model_Architecture-Wav2Vec2--CTC-lightgrey)
112 |
113 | + This model is a fine-tuned version of [LeBenchmark/wav2vec2-FR-7K-large](https://huggingface.co/LeBenchmark/wav2vec2-FR-7K-large), trained on a composite dataset comprising over 2,200 hours of French speech audio, using the train and validation splits of [Common Voice 11.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0), [Multilingual LibriSpeech](https://huggingface.co/datasets/facebook/multilingual_librispeech), [Voxpopuli](https://github.com/facebookresearch/voxpopuli), [Multilingual TEDx](http://www.openslr.org/100), [MediaSpeech](https://www.openslr.org/108), and [African Accented French](https://huggingface.co/datasets/gigant/african_accented_french). When using the model, make sure that your speech input is also sampled at 16 kHz.
114 |
115 |   ## Usage
116 |
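The paragraph added at new line 113 asks for speech input sampled at 16 kHz. A minimal sketch of checking that requirement before transcription might look like the following. It uses only Python's standard `wave` module, so it handles uncompressed PCM WAV files only. The helper names `wav_sample_rate`, `is_16khz`, and `write_silence`, and the demo file names, are hypothetical and not part of the model card.

```python
import os
import tempfile
import wave

EXPECTED_RATE = 16_000  # sampling rate the model card asks for (16 kHz)


def wav_sample_rate(path):
    """Return the sampling rate (Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_16khz(path):
    """Hypothetical helper: True if the file is already sampled at 16 kHz."""
    return wav_sample_rate(path) == EXPECTED_RATE


def write_silence(path, rate, seconds=1):
    """Write a mono, 16-bit WAV file of silence (demo input only)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(rate)
        wav_file.writeframes(b"\x00\x00" * rate * seconds)


# Demo: one file at the expected rate, one at 44.1 kHz that would need resampling.
demo_dir = tempfile.mkdtemp()
good_path = os.path.join(demo_dir, "sample_16k.wav")
bad_path = os.path.join(demo_dir, "sample_44k.wav")
write_silence(good_path, 16_000)
write_silence(bad_path, 44_100)

print(is_16khz(good_path))
print(is_16khz(bad_path))
```

A file that fails this check would need to be resampled to 16 kHz with an audio library of your choice before it is passed to the processor.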
174 |   predicted_sentence = processor.batch_decode(predicted_ids)[0]
175 |   ```
176 |
177 |   ## Evaluation
178 |
179 |   1. To evaluate on `mozilla-foundation/common_voice_11_0`