	---
	base_model:
	- meta-llama/Llama-3.1-8B-Instruct
	license: llama3.1
	language:
	- gl
	metrics:
	- bleu
	- rouge
	model-index:
	- name: Llama-3.1-8B-Instruct-Galician
	  results:
	  - task:
	      type: text-generation
	    dataset:
	      name: alpaca_data_galician
	      type: alpaca_data_galician
	    metrics:
	    - name: bleu
	      type: bleu-4
	      value: 23.13
	    - name: rouge
	      type: rouge-l
	      value: 21.84
	pipeline_tag: text-generation
	library_name: transformers
	widget:
	- text: "Onde está o concello de Frades?"
	  output:
	    text: Frades é un concello da provincia da Coruña, pertencente á comarca de Ordes. Está situado a 15 quilómetros de Santiago de Compostela.
	---

	# Llama-3.1-8B-Instruct-Galician

	This model was produced by continued pretraining of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) on the [CorpusNós](https://zenodo.org/records/11655219) dataset.

	## Model Description

	- Developed by: [UDC Information Retrieval Lab (IRLab)](https://huggingface.co/irlab-udc)
	- Language(s) (NLP): Multilingual, adapted to Galician
	- License: llama3.1
	- Finetuned from model: [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
	- Repository: [Adapting Large Language Models for Underrepresented Languages](https://gitlab.irlab.org/eliseo.bao/xovetic-llms-underrepresented-languages)
	- Paper: _Coming soon_

	## How to Get Started with the Model

	```python
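	import torch
	import transformers

	model_id = "irlab-udc/Llama-3.1-8B-Instruct-Galician"

	# Build a text-generation pipeline. Weights are loaded in bfloat16, and
	# device_map="auto" places them on the available device(s).
	pipeline = transformers.pipeline(
	    "text-generation",
	    model=model_id,
	    model_kwargs={"torch_dtype": torch.bfloat16},
	    device_map="auto",
	)

	# Chat-style input: a system prompt followed by the user's question.
	messages = [
	    {"role": "system", "content": "You are a conversational AI that always responds in Galician."},
	    {"role": "user", "content": "Cal é a principal vantaxe de usar Scrum?"},
	]

	outputs = pipeline(messages, max_new_tokens=512)

	# generated_text holds the full conversation; the last message is the model's reply.
	print(outputs[0]["generated_text"][-1]["content"])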
	```

	#### Training Hyperparameters

	| Parameter                   | Value                                        |
	|-----------------------------|----------------------------------------------|
	| learning_rate               | 0.0001                                       |
	| train_batch_size            | 32                                           |
	| eval_batch_size             | 1                                            |
	| seed                        | 42                                           |
	| distributed_type            | multi-GPU                                    |
	| num_devices                 | 4                                            |
	| gradient_accumulation_steps | 2                                            |
	| total_train_batch_size      | 256                                          |
	| total_eval_batch_size       | 4                                            |
	| optimizer                   | Adam with betas=(0.9, 0.999), epsilon=1e-08  |
	| lr_scheduler_type           | cosine                                       |
	| lr_scheduler_warmup_ratio   | 0.1                                          |
	| num_epochs                  | 1.0                                          |
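The derived rows in the table follow from the per-device values, and the scheduler settings imply a particular learning-rate curve. The sketch below reproduces that arithmetic. It assumes a linear warmup followed by a half-cosine decay to zero, which is a common reading of `cosine` with a warmup ratio but is not confirmed by this card. The helper `lr_at` and the `total_steps=1000` used in the example are hypothetical and chosen only for illustration.

```python
import math

# Values copied from the hyperparameter table above.
train_batch_size = 32              # per-device training batch size
eval_batch_size = 1                # per-device evaluation batch size
num_devices = 4
gradient_accumulation_steps = 2
learning_rate = 1e-4
warmup_ratio = 0.1

# Effective batch per optimizer step: per-device batch x devices x accumulation steps.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
# Evaluation accumulates no gradients, so only the device count multiplies.
total_eval_batch_size = eval_batch_size * num_devices


def lr_at(step, total_steps, peak_lr=learning_rate, ratio=warmup_ratio):
    """Hypothetical helper: linear warmup, then half-cosine decay to zero (assumed shape)."""
    warmup_steps = int(total_steps * ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size, total_eval_batch_size)
# Illustrative schedule with a made-up total of 1000 steps.
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at(step, total_steps=1000))
```

Under these assumptions the learning rate peaks at 0.0001 once the first 10% of steps have elapsed and decays smoothly afterwards.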


	#### Training Results

	| Training Loss | Epoch  | Step | Validation Loss |
	|:-------------:|:------:|:----:|:---------------:|
	| 2.0606        | 0.1682 | 900  | 2.0613          |
	| 1.9898        | 0.3363 | 1800 | 1.9929          |
	| 1.9847        | 0.5045 | 2700 | 1.9613          |
	| 1.9577        | 0.6726 | 3600 | 1.9445          |
	| 1.9287        | 0.8408 | 4500 | 1.9368          |
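Two quantities can be read off the table with a little arithmetic: the approximate number of optimizer steps in one epoch (step divided by epoch fraction) and the validation perplexity (the exponential of the loss). The sketch below computes both. It assumes the reported losses are mean token-level cross-entropy in nats, which is typical for causal language model training but is not stated in this card.

```python
import math

# (training_loss, epoch, step, validation_loss) rows copied from the table above.
rows = [
    (2.0606, 0.1682, 900, 2.0613),
    (1.9898, 0.3363, 1800, 1.9929),
    (1.9847, 0.5045, 2700, 1.9613),
    (1.9577, 0.6726, 3600, 1.9445),
    (1.9287, 0.8408, 4500, 1.9368),
]

# Each row gives an independent estimate of the steps in one full epoch.
steps_per_epoch = [step / epoch for _, epoch, step, _ in rows]

# Perplexity = exp(loss), assuming the loss is cross-entropy measured in nats.
val_perplexity = [math.exp(val_loss) for *_, val_loss in rows]

for (_, epoch, step, _), spe, ppl in zip(rows, steps_per_epoch, val_perplexity):
    print(f"step {step:>4} (epoch {epoch:.4f}): ~{spe:.0f} steps/epoch, val perplexity {ppl:.2f}")
```

Under that assumption, validation perplexity falls from roughly 7.9 at step 900 to roughly 6.9 at step 4500, and every row points to about 5,350 steps per epoch.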

	## Environmental Impact

	Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

	- Hardware Type: 4 × NVIDIA A100 SXM4 80 GB (TDP of 400 W each)
	- Hours used: 60
	- Cloud Provider: Private infrastructure
	- Carbon Emitted: 10.37 kg CO₂ eq.
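As a rough sanity check on these figures, the sketch below estimates the GPU energy use and the carbon intensity implied by the reported emissions. It assumes all four GPUs drew their full TDP for the entire 60 hours and ignores CPU, cooling, and other datacenter overhead. The inputs actually given to the calculator are not stated in this card, so the result is only an approximation.

```python
# Figures copied from the list above.
num_gpus = 4
tdp_watts = 400            # per-GPU TDP
hours_used = 60
reported_kg_co2eq = 10.37

# Upper-bound GPU energy, assuming constant draw at TDP (an assumption, not a measurement).
energy_kwh = num_gpus * tdp_watts * hours_used / 1000

# Carbon intensity implied by the reported emissions under that assumption.
implied_kg_per_kwh = reported_kg_co2eq / energy_kwh

print(f"Estimated energy: {energy_kwh:.1f} kWh")
print(f"Implied carbon intensity: {implied_kg_per_kwh:.3f} kg CO2 eq. per kWh")
```

This gives about 96 kWh of GPU energy and an implied intensity of roughly 0.108 kg CO₂ eq. per kWh.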

	## Citation

	_Coming soon_