Iskaj
/

xlsr300m_cv_8.0_nl

Automatic Speech Recognition

mozilla-foundation/common_voice_8_0

mozilla-foundation/common_voice_7_0

robust-speech-event

hf-asr-leaderboard

Inference Endpoints

Model card Files Files and versions Community

xlsr300m_cv_8.0_nl / README.md

anton-l's picture

anton-l HF staff

Upload README.md

f9a12bb over 2 years ago

|

history blame contribute delete

No virus

2.65 kB

	---
	language:
	- nl
	license: apache-2.0
	tags:
	- automatic-speech-recognition
	- mozilla-foundation/common_voice_8_0
	- mozilla-foundation/common_voice_7_0
	- nl
	- robust-speech-event
	- model_for_talk
	- hf-asr-leaderboard
	datasets:
	- mozilla-foundation/common_voice_8_0
	model-index:
	- name: XLS-R-300M - Dutch
	results:
	- task:
	name: Automatic Speech Recognition
	type: automatic-speech-recognition
	dataset:
	name: Common Voice 8 NL
	type: mozilla-foundation/common_voice_8_0
	args: nl
	metrics:
	- name: Test WER
	type: wer
	value: 46.94
	- name: Test CER
	type: cer
	value: 21.65
	- task:
	name: Automatic Speech Recognition
	type: automatic-speech-recognition
	dataset:
	name: Robust Speech Event - Dev Data
	type: speech-recognition-community-v2/dev_data
	args: nl
	metrics:
	- name: Test WER
	type: wer
	value: ???
	- name: Test CER
	type: cer
	value: ???
	- task:
	name: Automatic Speech Recognition
	type: automatic-speech-recognition
	dataset:
	name: Robust Speech Event - Test Data
	type: speech-recognition-community-v2/eval_data
	args: nl
	metrics:
	- name: Test WER
	type: wer
	value: 42.56
	---

	# xlsr300m_cv_8.0_nl


	#### Evaluation Commands
	1. To evaluate on `mozilla-foundation/common_voice_8_0` with split `test`

	```bash
	python eval.py --model_id Iskaj/xlsr300m_cv_8.0_nl --dataset mozilla-foundation/common_voice_8_0 --config nl --split test
	```

	2. To evaluate on `speech-recognition-community-v2/dev_data`

	```bash
	python eval.py --model_id Iskaj/xlsr300m_cv_8.0_nl --dataset speech-recognition-community-v2/dev_data --config nl --split validation --chunk_length_s 5.0 --stride_length_s 1.0
	```

	### Inference

	```python
	import torch
	from datasets import load_dataset
	from transformers import AutoModelForCTC, AutoProcessor
	import torchaudio.functional as F

	model_id = "Iskaj/xlsr300m_cv_8.0_nl"

	sample_iter = iter(load_dataset("mozilla-foundation/common_voice_8_0", "nl", split="test", streaming=True, use_auth_token=True))

	sample = next(sample_iter)
	resampled_audio = F.resample(torch.tensor(sample["audio"]["array"]), 48_000, 16_000).numpy()

	model = AutoModelForCTC.from_pretrained(model_id)
	processor = AutoProcessor.from_pretrained(model_id)

	inputs = processor(resampled_audio, sampling_rate=16_000, return_tensors="pt")
	with torch.no_grad():
	logits = model(**inputs).logits
	predicted_ids = torch.argmax(logits, dim=-1)
	transcription = processor.batch_decode(predicted_ids)

	transcription[0].lower()
	#'het kontine schip lag aangemeert in de aven'
	```