bofenghuang/asr-wav2vec2-ctc-french

@@ -1,15 +1,14 @@
 ---
-language:
-- fr
 license: apache-2.0
 tags:
 - automatic-speech-recognition
 - hf-asr-leaderboard
 - robust-speech-event
-- mozilla-foundation/common_voice_11_0
-- facebook/multilingual_librispeech
-- facebook/voxpopuli
-- gigant/african_accented_french
 datasets:
 - common_voice
 - mozilla-foundation/common_voice_11_0
@@ -91,12 +90,27 @@ model-index:
     - name: Test WER (+LM)
       type: wer
       value: 12.96
 ---
 # Fine-tuned wav2vec2-FR-7K-large model for ASR in French
-This model is a fine-tuned version of [LeBenchmark/wav2vec2-FR-7K-large](https://huggingface.co/LeBenchmark/wav2vec2-FR-7K-large) on French using the train and validation splits of [Common Voice 11.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0), [Multilingual LibriSpeech](https://huggingface.co/datasets/facebook/multilingual_librispeech), [VoxPopuli](https://github.com/facebookresearch/voxpopuli), [Multilingual TEDx](http://www.openslr.org/100), [MediaSpeech](https://www.openslr.org/108), and [African Accented French](https://huggingface.co/datasets/gigant/african_accented_french) on speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
 ## Usage
@@ -160,7 +174,6 @@ predicted_ids = torch.argmax(logits, dim=-1)
 predicted_sentence = processor.batch_decode(predicted_ids)[0]
 ```
 ## Evaluation
 1. To evaluate on `mozilla-foundation/common_voice_11_0`

 ---
 license: apache-2.0
+language: fr
+library_name: transformers
+thumbnail: null
 tags:
 - automatic-speech-recognition
 - hf-asr-leaderboard
 - robust-speech-event
+- CTC
+- Wav2vec2
 datasets:
 - common_voice
 - mozilla-foundation/common_voice_11_0
     - name: Test WER (+LM)
       type: wer
       value: 12.96
+  - task:
+      name: Automatic Speech Recognition
+      type: automatic-speech-recognition
+    dataset:
+      name: Fleurs
+      type: google/fleurs
+      args: fr_fr
+    metrics:
+    - name: Test WER
+      type: wer
+      value: 10.10
+    - name: Test WER (+LM)
+      type: wer
+      value: 8.84
 ---
 # Fine-tuned wav2vec2-FR-7K-large model for ASR in French
+![Model architecture](https://img.shields.io/badge/Model_Architecture-Wav2Vec2--CTC-lightgrey)
+This model is a fine-tuned version of [LeBenchmark/wav2vec2-FR-7K-large](https://huggingface.co/LeBenchmark/wav2vec2-FR-7K-large), trained on a composite dataset comprising over 2,200 hours of French speech audio, using the train and validation splits of [Common Voice 11.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0), [Multilingual LibriSpeech](https://huggingface.co/datasets/facebook/multilingual_librispeech), [VoxPopuli](https://github.com/facebookresearch/voxpopuli), [Multilingual TEDx](http://www.openslr.org/100), [MediaSpeech](https://www.openslr.org/108), and [African Accented French](https://huggingface.co/datasets/gigant/african_accented_french). When using the model, make sure that your speech input is sampled at 16 kHz.
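The sampling-rate requirement can be checked before any audio reaches the model. The following is a minimal sketch, assuming the input is an uncompressed WAV file; the helper names `wav_sample_rate` and `needs_resampling` are hypothetical and not part of this repository. It uses only Python's standard `wave` module.

```python
import wave

# Sampling rate the model card asks for (16 kHz).
EXPECTED_RATE_HZ = 16_000


def wav_sample_rate(path):
    """Return the sampling rate (in Hz) stored in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, expected_rate=EXPECTED_RATE_HZ):
    """Return True if the file's sampling rate differs from the expected rate."""
    return wav_sample_rate(path) != expected_rate
```

A file for which `needs_resampling` returns `True` would have to be resampled to 16 kHz, for example with an audio library such as torchaudio or librosa, before being passed to the processor.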
 ## Usage
 predicted_sentence = processor.batch_decode(predicted_ids)[0]
 ```
 ## Evaluation
 1. To evaluate on `mozilla-foundation/common_voice_11_0`