WER
README.md
CHANGED
@@ -22,12 +22,12 @@ model-index:
 metrics:
 - name: Test WER
   type: wer
-  value: 15.
+  value: 15.80
 ---
 
 # Wav2Vec2-Large-XLSR-53-German
 
-Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on German using
+Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on German using the [Common Voice](https://huggingface.co/datasets/common_voice) dataset.
 When using this model, make sure that your speech input is sampled at 16kHz.
 
 ## Usage
 
@@ -79,7 +79,7 @@ from datasets import load_dataset, load_metric
 from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
 import re
 
-test_dataset = load_dataset("common_voice", "de", split="test
+test_dataset = load_dataset("common_voice", "de", split="test")
 wer = load_metric("wer")
 
 processor = Wav2Vec2Processor.from_pretrained("marcel/wav2vec2-large-xlsr-53-german")
 
@@ -140,11 +140,10 @@ result = test_dataset.map(evaluate, batched=True, batch_size=8)
 print("WER: {:2f}".format(100 * wer.compute(predictions=result["pred_strings"], references=result["sentence"])))
 ```
 
-**Test Result**: 15.
+**Test Result**: 15.80 %
 
 
 ## Training
 
-The first
+The first 50% of the Common Voice `train`, and 12% of the `validation` datasets were used for training (30 epochs on first 12% and 3 epochs on the remainder).
 
-The script used for training can be found TODO
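
For readers unfamiliar with the metric this commit updates, here is a minimal pure-Python sketch of word error rate. It assumes the usual corpus-level definition (total word-level edit distance divided by total reference words). It is not the implementation behind `load_metric("wer")`, and the helper name `word_error_rate` and the sample sentences are hypothetical.

```python
def word_error_rate(references, predictions):
    """Corpus-level WER: total word edits / total reference words (assumed definition)."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, predictions):
        ref_words, hyp_words = ref.split(), hyp.split()
        # Word-level Levenshtein distance, keeping only the previous row.
        prev = list(range(len(hyp_words) + 1))
        for i, r in enumerate(ref_words, start=1):
            curr = [i]
            for j, h in enumerate(hyp_words, start=1):
                cost = 0 if r == h else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution or match
            prev = curr
        total_edits += prev[-1]
        total_words += len(ref_words)
    return total_edits / total_words


# Hypothetical sample data: one dropped word out of six reference words.
references = ["das ist ein test", "guten morgen"]
predictions = ["das ist test", "guten morgen"]
score = word_error_rate(references, predictions)

# "{:.2f}" rounds to two decimals; "{:2f}" (as in the README's print call)
# only sets a minimum width of 2 and keeps the default six decimals.
two_decimals = "{:.2f}".format(100 * score)
width_only = "{:2f}".format(100 * score)
print("WER:", two_decimals, "vs", width_only)
```

The last lines also show why a result reported as `15.80 %` suggests the `{:.2f}` format spec rather than the `{:2f}` that appears in the README's context line.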