yoonniverse committed
Commit 9e919d9 · Parent(s): cc35fb5
First model version
- README.md +114 -0
- config.json +37 -0
- generation_config.json +8 -0
- pytorch_model-00001-of-00008.bin +3 -0
- pytorch_model-00002-of-00008.bin +3 -0
- pytorch_model-00003-of-00008.bin +3 -0
- pytorch_model-00004-of-00008.bin +3 -0
- pytorch_model-00005-of-00008.bin +3 -0
- pytorch_model-00006-of-00008.bin +3 -0
- pytorch_model-00007-of-00008.bin +3 -0
- pytorch_model-00008-of-00008.bin +3 -0
- pytorch_model.bin.index.json +0 -0
- tokenizer.json +0 -0
- tokenizer.model +3 -0
- tokenizer_config.json +34 -0
README.md
ADDED
@@ -0,0 +1,114 @@
---
language:
- en
tags:
- upstage
- llama-2
- instruct
- instruction
pipeline_tag: text-generation
---
# SOLAR-0-70b-8bit model card

This is an 8-bit quantized version of [upstage/Llama-2-70b-instruct-v2](https://huggingface.co/upstage/Llama-2-70b-instruct-v2).

## Model Details

* **Developed by**: [Upstage](https://en.upstage.ai)
* **Backbone Model**: [LLaMA-2](https://github.com/facebookresearch/llama/tree/main)
* **Language(s)**: English
* **Library**: [HuggingFace Transformers](https://github.com/huggingface/transformers)
* **License**: Fine-tuned checkpoints are licensed under the Non-Commercial Creative Commons license ([CC BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/))
* **Where to send comments**: Feedback and comments on the model can be provided by opening an issue in the [Hugging Face community's model repository](https://huggingface.co/upstage/SOLAR-0-70b-8bit/discussions)
* **Contact**: For questions and comments about the model, please email [[email protected]](mailto:[email protected])

## Dataset Details

### Used Datasets
- Orca-style dataset
- Alpaca-style dataset
- No other datasets were used beyond those mentioned above
- No benchmark test sets or training sets were used

### Prompt Template
```
### System:
{System}

### User:
{User}

### Assistant:
{Assistant}
```
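As an illustration (this helper is not part of the original card; the function name and defaults are hypothetical), the template can be filled programmatically:

```python
def build_prompt(user: str, system: str = "") -> str:
    """Fill the prompt template above, leaving the Assistant section
    open so the model generates the reply."""
    parts = []
    if system:
        parts.append(f"### System:\n{system}\n")
    parts.append(f"### User:\n{user}\n")
    parts.append("### Assistant:\n")
    return "\n".join(parts)

prompt = build_prompt("Thomas is healthy, but he has to go to the hospital. What could be the reasons?")
```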

## Usage

- The following example was tested on an A100 80GB GPU
- Our model can handle 10k+ input tokens, thanks to the `rope_scaling` option

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("upstage/SOLAR-0-70b-8bit")
model = AutoModelForCausalLM.from_pretrained(
    "upstage/SOLAR-0-70b-8bit",
    device_map="auto",
    torch_dtype=torch.float16,
    load_in_8bit=True,
    rope_scaling={"type": "dynamic", "factor": 2},  # allows handling of longer inputs
)

prompt = "### User:\nThomas is healthy, but he has to go to the hospital. What could be the reasons?\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
inputs.pop("token_type_ids", None)  # drop token_type_ids only if the tokenizer returned them
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# Use a finite generation cap; max_new_tokens=float("inf") from the original
# snippet is not a valid integer value and can fail generation-config validation.
output = model.generate(**inputs, streamer=streamer, use_cache=True, max_new_tokens=4096)
output_text = tokenizer.decode(output[0], skip_special_tokens=True)
```
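As a quick long-context sanity check (a hypothetical snippet, not from the original card), a prompt of roughly 10k tokens tokenizes well past the 4096-token pretraining limit that `rope_scaling` compensates for:

```python
# Hypothetical check, reusing the tokenizer loaded above.
long_prompt = "### User:\n" + "The quick brown fox jumps over the lazy dog. " * 900 + "\n\n### Assistant:\n"
n_tokens = len(tokenizer(long_prompt).input_ids)
print(f"prompt length: {n_tokens} tokens")  # ~10k, above max_position_embeddings=4096
```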

## Hardware and Software

* **Hardware**: We utilized four nodes of eight A100 GPUs (A100x8 * 4) for training our model
* **Training Factors**: We fine-tuned this model using a combination of the [DeepSpeed library](https://github.com/microsoft/DeepSpeed) and the [HuggingFace Trainer](https://huggingface.co/docs/transformers/main_classes/trainer) / [HuggingFace Accelerate](https://huggingface.co/docs/accelerate/index)

## Evaluation Results

### Overview
- We conducted a performance evaluation on the tasks evaluated by the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
We evaluated our model on four benchmark datasets: `ARC-Challenge`, `HellaSwag`, `MMLU`, and `TruthfulQA`.
We used the [lm-evaluation-harness repository](https://github.com/EleutherAI/lm-evaluation-harness), specifically commit [b281b0921b636bc36ad05c0b0b0763bd6dd43463](https://github.com/EleutherAI/lm-evaluation-harness/tree/b281b0921b636bc36ad05c0b0b0763bd6dd43463).
- We used [MT-bench](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge), a set of challenging multi-turn open-ended questions, to evaluate the models
### Main Results
| Model | H4(Avg) | ARC | HellaSwag | MMLU | TruthfulQA | MT_Bench |
|-------|---------|-----|-----------|------|------------|----------|
| **[Llama-2-70b-instruct-v2](https://huggingface.co/upstage/Llama-2-70b-instruct-v2)** (***Ours***, ***Open LLM Leaderboard***) | **73** | **71.1** | **87.9** | **70.6** | **62.2** | **7.44063** |
| [Llama-2-70b-instruct](https://huggingface.co/upstage/Llama-2-70b-instruct) (Ours, Open LLM Leaderboard) | 72.3 | 70.9 | 87.5 | 69.8 | 61 | 7.24375 |
| [llama-65b-instruct](https://huggingface.co/upstage/llama-65b-instruct) (Ours, Open LLM Leaderboard) | 69.4 | 67.6 | 86.5 | 64.9 | 58.8 | |
| Llama-2-70b-hf | 67.3 | 67.3 | 87.3 | 69.8 | 44.9 | |
| [llama-30b-instruct-2048](https://huggingface.co/upstage/llama-30b-instruct-2048) (Ours, Open LLM Leaderboard) | 67.0 | 64.9 | 84.9 | 61.9 | 56.3 | |
| [llama-30b-instruct](https://huggingface.co/upstage/llama-30b-instruct) (Ours, Open LLM Leaderboard) | 65.2 | 62.5 | 86.2 | 59.4 | 52.8 | |
| llama-65b | 64.2 | 63.5 | 86.1 | 63.9 | 43.4 | |
| falcon-40b-instruct | 63.4 | 61.6 | 84.3 | 55.4 | 52.5 | |
98 |
+
### Scripts for H4 Score Reproduction
|
99 |
+
- Prepare evaluation environments:
|
100 |
+
```
|
101 |
+
# clone the repository
|
102 |
+
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
|
103 |
+
# check out the specific commit
|
104 |
+
git checkout b281b0921b636bc36ad05c0b0b0763bd6dd43463
|
105 |
+
# change to the repository directory
|
106 |
+
cd lm-evaluation-harness
|
107 |
+
```
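- Run a task (an illustrative invocation, not from the original card; the flag names follow the harness CLI at this commit and may differ in later versions):
```
# example: 25-shot ARC-Challenge, mirroring the Open LLM Leaderboard setting
python main.py \
    --model hf-causal-experimental \
    --model_args pretrained=upstage/Llama-2-70b-instruct-v2 \
    --tasks arc_challenge \
    --num_fewshot 25 \
    --batch_size 1
```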

## Contact Us

### About Upstage
- [Upstage](https://en.upstage.ai) is a company specializing in Large Language Models (LLMs) and AI. We can help you build private LLMs and related applications.
If you have a dataset for building domain-specific LLMs or LLM applications, please contact us: ► [click here to contact](https://www.upstage.ai/private-llm?utm_source=huggingface&utm_medium=link&utm_campaign=privatellm)
- As of August 1st, our 70B model holds the top spot on the Open LLM Leaderboard, making it the current leading performer globally.
config.json
ADDED
@@ -0,0 +1,37 @@
{
  "_name_or_path": "/data/project/public/checkpoints/Llama-2-70b-instruct-v2",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 8192,
  "initializer_range": 0.02,
  "intermediate_size": 28672,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 64,
  "num_hidden_layers": 80,
  "num_key_value_heads": 8,
  "pad_token_id": 0,
  "pretraining_tp": 1,
  "quantization_config": {
    "bnb_4bit_compute_dtype": "float32",
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": false,
    "llm_int8_enable_fp32_cpu_offload": false,
    "llm_int8_has_fp16_weight": false,
    "llm_int8_skip_modules": null,
    "llm_int8_threshold": 6.0,
    "load_in_4bit": false,
    "load_in_8bit": true
  },
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.31.0",
  "use_cache": false,
  "vocab_size": 32000
}
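For reference, the quantization settings recorded above can be inspected without downloading the weights; a minimal sketch (not part of the original repository):

```python
from transformers import AutoConfig

# Fetch only config.json and print the recorded bitsandbytes settings;
# expects "load_in_8bit": true as shown above.
config = AutoConfig.from_pretrained("upstage/SOLAR-0-70b-8bit")
print(config.quantization_config)
```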
generation_config.json
ADDED
@@ -0,0 +1,8 @@
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "pad_token_id": 0,
  "transformers_version": "4.31.0",
  "use_cache": false
}
pytorch_model-00001-of-00008.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:17a163853b236855e25b296a3bb2fd07e7c89514a7937cebfd7ec051179967dc
size 9940429183
pytorch_model-00002-of-00008.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:28829d4f35b35225531f0278b273e7a7498baf3a92f3cbfd6167343b6f16ebb7
size 9802209345
pytorch_model-00003-of-00008.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4727d3addcb6c8e0c1dcf95c40ba5c7964d3f74df9df313e3f9fef721759fd7b
size 9970014616
pytorch_model-00004-of-00008.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3c2109f2c8be2bb1782e4aaec0b7ade8030e3c8dc86aa748b8e0f0220d293832
size 9953276573
pytorch_model-00005-of-00008.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:525bf7db9e293b57d789990d464bcb7e5b812cd1be5f8b56f4d79c60c3bc16f9
size 9802161039
pytorch_model-00006-of-00008.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bd0727e3bc05dc7fa859b06a846ff4fff696fe2436b256808dbd26691e1affed
size 9886133756
pytorch_model-00007-of-00008.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:00e04e63b2ddc64fb2b1caebef7399b5da9e040ff3f8f5c9c646e20a99ab14b4
size 9651105687
pytorch_model-00008-of-00008.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2fcb41ca27a3c9169f86a60636958f4199f1907dc43ab3c21a15dd8cd6b5470a
size 524288938
pytorch_model.bin.index.json
ADDED
The diff for this file is too large to render.
See raw diff
tokenizer.json
ADDED
The diff for this file is too large to render.
See raw diff
tokenizer.model
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723
tokenizer_config.json
ADDED
@@ -0,0 +1,34 @@
{
  "add_bos_token": true,
  "add_eos_token": false,
  "bos_token": {
    "__type": "AddedToken",
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "clean_up_tokenization_spaces": false,
  "eos_token": {
    "__type": "AddedToken",
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "legacy": false,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": null,
  "sp_model_kwargs": {},
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": {
    "__type": "AddedToken",
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
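As a quick illustration of the settings above (a hypothetical snippet, not part of the original repository), `"add_bos_token": true` with `"add_eos_token": false` means encoded prompts start with `<s>` but do not end with `</s>`:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("upstage/SOLAR-0-70b-8bit")
ids = tokenizer("Hello").input_ids
assert ids[0] == tokenizer.bos_token_id   # add_bos_token: true → BOS prepended
assert ids[-1] != tokenizer.eos_token_id  # add_eos_token: false → no EOS appended
```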