carlosdanielhernandezmena committed
Commit cc9dd3d · verified · 1 Parent(s): 1e2b0dd

Adding more information to the model card.

Files changed (1):
  1. README.md +92 -7

README.md CHANGED
# whisper-large-v3-ca-3catparla

- **Paper:** [3CatParla: A New Open-Source Corpus of Broadcast TV in Catalan for Automatic Speech Recognition](https://iberspeech.tech/)

## Table of Contents
<details>
<summary>Click to expand</summary>

- [Model description](#model-description)
- [Intended uses and limitations](#intended-uses-and-limitations)
- [How to use](#how-to-get-started-with-the-model)
- [Training](#training-details)
- [Evaluation](#evaluation)
- [Citation](#citation)
- [Additional information](#additional-information)

</details>

## Summary

"whisper-large-v3-ca-3catparla" is an acoustic model based on ["openai/whisper-large-v3"](https://huggingface.co/openai/whisper-large-v3), suitable for Automatic Speech Recognition in Catalan.

## Model Description

"whisper-large-v3-ca-3catparla" is an acoustic model suitable for Automatic Speech Recognition in Catalan. It is the result of fine-tuning ["openai/whisper-large-v3"](https://huggingface.co/openai/whisper-large-v3) with 710 hours of Catalan data released by [Projecte AINA](https://projecteaina.cat/) from Barcelona, Spain.

## Intended Uses and Limitations

This model can be used for Automatic Speech Recognition (ASR) in Catalan. It is intended to transcribe audio files in Catalan into plain text without punctuation.

## How to Get Started with the Model

### Installation

In order to use this model, you need to install [datasets](https://huggingface.co/docs/datasets/installation) and [transformers](https://huggingface.co/docs/transformers/installation).

Create a virtual environment:
```bash
python -m venv /path/to/venv
```
Activate the environment:
```bash
source /path/to/venv/bin/activate
```
Install the modules:
```bash
pip install datasets transformers
```

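As a quick, optional check that the installation worked, you can import both libraries and print their versions (a minimal sketch):

```python
# Optional sanity check: both libraries should import without errors.
import datasets
import transformers

print("datasets:", datasets.__version__)
print("transformers:", transformers.__version__)
```
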
### For Inference

In order to transcribe audio in Catalan using this model, you can follow this example:

```python
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor
# ... (the middle of this example is collapsed in the diff view; it computes the
# Word Error Rate reported below) ...
print(WER)
```
**Test Result**: 0.96 (WER)
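
Since most of the example above is collapsed in this diff, here is a minimal, self-contained transcription sketch. It assumes the model is published as `projecte-aina/whisper-large-v3-ca-3catparla`, that a local 16 kHz mono recording exists at the hypothetical path `audio.wav`, and that `soundfile` is installed (`pip install soundfile`); adjust these to your setup.

```python
import soundfile as sf  # assumed extra dependency, used only to read the WAV file
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

MODEL_NAME = "projecte-aina/whisper-large-v3-ca-3catparla"  # assumed repository id
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the processor (feature extractor + tokenizer) and the fine-tuned model.
processor = WhisperProcessor.from_pretrained(MODEL_NAME)
model = WhisperForConditionalGeneration.from_pretrained(MODEL_NAME).to(device)

# "audio.wav" is a placeholder path; the model expects 16 kHz mono audio.
speech, sampling_rate = sf.read("audio.wav")
inputs = processor(speech, sampling_rate=sampling_rate, return_tensors="pt")

# Generate a Catalan transcription and decode it to plain text.
predicted_ids = model.generate(
    inputs.input_features.to(device), language="ca", task="transcribe"
)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```

Passing `language="ca"` and `task="transcribe"` keeps Whisper in Catalan transcription mode rather than translating into English.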

## Training Details

### Training data

The specific dataset used to create the model is called ["3CatParla"](https://huggingface.co/datasets/projecte-aina/3catparla_asr).

### Training procedure

This model is the result of fine-tuning the model ["openai/whisper-large-v3"](https://huggingface.co/openai/whisper-large-v3) by following this [tutorial](https://huggingface.co/blog/fine-tune-whisper) provided by Hugging Face. A configuration sketch based on the hyperparameters is shown after the list below.

### Training Hyperparameters

* language: Catalan
* hours of training audio: 710
* learning rate: 1.95e-07
* sample rate: 16000 Hz
* train batch size: 32 (x4 GPUs)
* gradient accumulation steps: 1
* eval batch size: 32
* save total limit: 3
* max steps: 19842
* warmup steps: 1984
* eval steps: 3307
* save steps: 3307
* shuffle buffer size: 480

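As an illustration only, the hyperparameters above map roughly onto the `Seq2SeqTrainingArguments` used in the referenced fine-tuning tutorial as sketched below; the output directory and any flag not listed above (e.g. `fp16`, `predict_with_generate`) are assumptions rather than the exact settings of the original run.

```python
from transformers import Seq2SeqTrainingArguments

# Rough mapping of the listed hyperparameters onto the Hugging Face Trainer
# configuration style of the fine-tune-whisper tutorial.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v3-ca-3catparla",  # hypothetical path
    per_device_train_batch_size=32,  # the original run used 4 GPUs
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=1,
    learning_rate=1.95e-7,
    warmup_steps=1984,
    max_steps=19842,
    eval_strategy="steps",           # named evaluation_strategy in transformers < 4.41
    eval_steps=3307,
    save_steps=3307,
    save_total_limit=3,
    fp16=True,                       # assumption, common for Whisper fine-tuning
    predict_with_generate=True,
)
```

The sample rate and the shuffle buffer size belong to the data preparation side (the Whisper feature extractor and the streaming shuffle) rather than to `Seq2SeqTrainingArguments`.
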
## Citation

If this code contributes to your research, please cite the work:

```bibtex
@misc{mena2024whisperlarge3catparla,
      title={Acoustic Model in Catalan: whisper-large-v3-ca-3catparla.},
      ...
      year={2024}
}
```

## Additional Information

### Author

The fine-tuning process was performed during July 2024 in the [Language Technologies Unit](https://huggingface.co/BSC-LT) of the [Barcelona Supercomputing Center](https://www.bsc.es/) by [Carlos Daniel Hernández Mena](https://huggingface.co/carlosdanielhernandezmena).

### Contact

For further information, please send an email to <[email protected]>.

### Copyright

Copyright (c) 2024 by the Language Technologies Unit, Barcelona Supercomputing Center.

### License

[Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0)

### Funding

This work has been promoted and financed by the Generalitat de Catalunya through the [Aina project](https://projecteaina.cat/).

The training of the model was possible thanks to the compute time provided by the [Barcelona Supercomputing Center](https://www.bsc.es/) through MareNostrum 5.