StarscreamDeceptions committed
Commit 6cd5800 · verified · 1 Parent(s): 2e36ce0

Update README.md

Files changed (1)
  1. README.md +12 -10
README.md CHANGED
@@ -21,12 +21,6 @@ For more details, please refer to our [Hugging Face page](https://huggingface.co

Marco-LLM-ES series includes models of varying sizes, from 7B to 72B parameters, including both base and instruction-tuned (Instruct) models. The models are based on the Transformer architecture with SwiGLU activation, attention QKV bias, and group query attention. Additionally, the models employ an improved tokenizer adaptive to multiple languages.

- ## Requirements
-
- The Marco-LLM-ES models are compatible with the latest Hugging Face Transformers library. We recommend installing `transformers>=4.37.0` to ensure full functionality and avoid potential errors like:
- ```
- KeyError: 'qwen2'
- ```
## Usage

It is not advised to use the base language models for direct text generation tasks. Instead, it is recommended to apply post-training methods such as Supervised Fine-tuning (SFT), Reinforcement Learning with Human Feedback (RLHF), or continued pretraining to adapt the models for specific use cases.

@@ -44,7 +38,6 @@ The datasets used for evaluation include:

| Datasets | Marco-LLM-ES-7B |
| :---------------- | :-----------------: |
- | **Spanish** | |
| Spanish | **44.49** |
| Catalan | **39.45** |
| Basque | **28.66** |

@@ -54,11 +47,20 @@ The datasets used for evaluation include:
## Citation

If you find our work helpful, please give us a citation.
+ ```
+ @article{unique_identifier,
+
+ title={Marco-LLM: Bridging Languages via Massive Multilingual Training for Cross-Lingual Enhancement},
+
+ journal={arXiv},
+
+ volume={},

- @article{marco_llm_es,
+ number={2412.04003},

- title={Marco-LLM-ES Technical Report},
+ year={2024},

- year={2024}
+ url={https://arxiv.org/abs/2412.04003}

}
+ ```
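
For context on the `transformers>=4.37.0` requirement mentioned in the removed section above, here is a minimal loading sketch. The repository id `AIDC-AI/Marco-LLM-ES-7B` is an assumption for illustration, not a confirmed checkpoint name; as the Usage section notes, base models are not intended for direct text generation, so this only demonstrates loading and raw continuation.

```python
# Minimal sketch, assuming transformers>=4.37.0 (older versions do not register
# the Qwen2 architecture and fail with KeyError: 'qwen2').
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AIDC-AI/Marco-LLM-ES-7B"  # hypothetical repo id; substitute the actual checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Base models are not instruction-tuned; this only shows raw text continuation.
inputs = tokenizer("La capital de España es", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```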