---
language:
- es
pipeline_tag: text-generation
tags:
- pretrained
license: apache-2.0
---

# Marco-LLM-ES-7B

## Introduction

Marco-LLM-ES is a series of enhanced language models specifically fine-tuned for the languages commonly used in Spain, including Catalan, Basque, Galician, and Spanish. This repository contains the 7B Marco-LLM-ES base language model.

Compared with state-of-the-art open-source language models, Marco-LLM-ES has undergone extensive continued pretraining on a dataset containing approximately 50 billion tokens, enhancing its capabilities in the targeted languages while maintaining competitiveness on general benchmarks.

For more details, please refer to our [Hugging Face page](https://huggingface.co/AIDC-AI/Marco-LLM-ES).

## Model Details

The Marco-LLM-ES series includes models of varying sizes, from 7B to 72B parameters, covering both base and instruction-tuned (Instruct) variants. The models are based on the Transformer architecture with SwiGLU activation, attention QKV bias, and grouped-query attention. Additionally, the models employ an improved tokenizer adapted to multiple languages.
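
The architectural attributes mentioned above (SwiGLU activation, QKV bias, grouped-query attention) can be inspected from the published model configuration. Below is a minimal, illustrative sketch using the Transformers `AutoConfig` API; the repository id `AIDC-AI/Marco-LLM-ES-7B` and the exact field names are assumptions, not details confirmed by this card.

```python
# Illustrative sketch: inspect architectural details from the model config.
# Assumption: the 7B base checkpoint is published as "AIDC-AI/Marco-LLM-ES-7B";
# replace the id if the actual repository name differs.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("AIDC-AI/Marco-LLM-ES-7B")

print(config.model_type)                                # architecture family of the checkpoint
print(config.hidden_size, config.num_hidden_layers)     # model width and depth
print(config.num_attention_heads,
      getattr(config, "num_key_value_heads", None))     # grouped-query attention uses fewer KV heads
```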

## Requirements

The Marco-LLM-ES models are compatible with the latest Hugging Face Transformers library. We recommend installing `transformers>=4.37.0` to ensure full functionality and to avoid potential errors such as:

```
KeyError: 'qwen2'
```
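
As a quick, optional sanity check (not an official instruction from this card), you can verify the installed version before loading the model, for example after running `pip install "transformers>=4.37.0"`:

```python
# Illustrative sketch: confirm the installed Transformers version is recent
# enough (>= 4.37.0) to recognize the model type and avoid the KeyError above.
from packaging import version
import transformers

if version.parse(transformers.__version__) < version.parse("4.37.0"):
    raise RuntimeError(
        f"transformers {transformers.__version__} is too old; please upgrade to >= 4.37.0"
    )
print(f"transformers {transformers.__version__} is OK")
```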
## Usage

It is not advised to use the base language models for direct text generation tasks. Instead, it is recommended to apply post-training methods such as Supervised Fine-tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), or continued pretraining to adapt the models for specific use cases.
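
If you nevertheless want to load the base model, for example as a starting point for SFT or for a quick completion-style sanity check, the following minimal sketch shows the standard Transformers loading pattern. The repository id `AIDC-AI/Marco-LLM-ES-7B` is an assumption inferred from the series name; note that a base model simply continues the prompt rather than following instructions.

```python
# Minimal, illustrative loading sketch (assumed repository id).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AIDC-AI/Marco-LLM-ES-7B"  # assumption; replace with the actual repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # use float16/float32 depending on your hardware
    device_map="auto",           # requires `accelerate`
)

# Base models produce raw continuations, not instruction-following answers.
prompt = "La capital de Galicia es"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```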

### Performance

The evaluation of Marco-LLM-ES models focuses on natural language understanding, general question answering, coding, mathematics, scientific knowledge, reasoning, and multilingual capability, with particular attention to the targeted languages: Catalan, Basque, Galician, and Spanish.

The datasets used for evaluation include:

**Spanish-specific tasks**: evaluations in Catalan, Basque, Galician, and Spanish on La Leaderboard (5-shot).

#### Marco-LLM-ES-7B performance

| Datasets | Marco-LLM-ES-7B |
| :---------------- | :-----------------: |
| **La Leaderboard (5-shot)** | |
| Spanish | **44.49** |
| Catalan | **39.45** |
| Basque | **28.66** |
| Galician | **24.04** |
| Average | **34.16** |

## Citation

If you find our work helpful, please give us a citation:

```
@article{marco_llm_es,
  title={Marco-LLM-ES Technical Report},
  year={2024}
}
```