---
language:
- ja
tags:
- merge
- mergekit
- lazymergekit
- SakanaAI/EvoLLM-JP-A-v1-7B
- stabilityai/japanese-stablelm-base-gamma-7b
base_model:
- SakanaAI/EvoLLM-JP-A-v1-7B
- stabilityai/japanese-stablelm-base-gamma-7b
---

# 🌲 Hinoki-Sak-Sta-slerp-7B

Hinoki-Sak-Sta-slerp-7B is a merge of the following models, created with [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing) by [Maxime Labonne](https://huggingface.co/mlabonne), which is powered by [MergeKit](https://github.com/arcee-ai/mergekit) from [Arcee AI](https://www.arcee.ai):

* [SakanaAI/EvoLLM-JP-A-v1-7B](https://huggingface.co/SakanaAI/EvoLLM-JP-A-v1-7B) (Base model)
* [stabilityai/japanese-stablelm-base-gamma-7b](https://huggingface.co/stabilityai/japanese-stablelm-base-gamma-7b)

## 💻 Configuration

```yaml
slices:
  - sources:
      - model: SakanaAI/EvoLLM-JP-A-v1-7B
        layer_range: [0, 32]
      - model: stabilityai/japanese-stablelm-base-gamma-7b
        layer_range: [0, 32]
merge_method: slerp
base_model: SakanaAI/EvoLLM-JP-A-v1-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
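
To make the `slerp` settings above more concrete, here is a minimal pure-Python sketch of what the configuration roughly asks for. It is an illustration under stated assumptions, not MergeKit's actual code: the helper names `layer_t` and `slerp` are hypothetical, the anchor values in each `value` list are assumed to be spread evenly over the layer depth and linearly interpolated, and `t = 0` is assumed to keep the base model's weights while `t = 1` takes the other model's.

```python
import math


def layer_t(anchors, layer_idx, num_layers):
    """Interpolation factor t for one layer (hypothetical helper).

    Assumption: the anchor values are spaced evenly from the first to the
    last layer, with linear interpolation between neighbouring anchors.
    """
    if num_layers == 1:
        return float(anchors[0])
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    i0 = int(pos)
    i1 = min(i0 + 1, len(anchors) - 1)
    frac = pos - i0
    return (1 - frac) * anchors[i0] + frac * anchors[i1]


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two weight directions.
    dot = sum(a * b for a, b in zip(v0, v1)) / max(n0 * n1, eps)
    dot = max(-1.0, min(1.0, dot))
    # Nearly colinear vectors: fall back to plain linear interpolation.
    if abs(dot) > 0.9995:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(dot)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


# Per-layer t for the two filters in the configuration (32 layers).
attn_t = [layer_t([0, 0.5, 0.3, 0.7, 1], i, 32) for i in range(32)]
mlp_t = [layer_t([1, 0.5, 0.7, 0.3, 0], i, 32) for i in range(32)]
print(attn_t[0], attn_t[-1])  # 0.0 1.0
print(mlp_t[0], mlp_t[-1])    # 1.0 0.0

# Toy example: halfway between two orthogonal unit vectors.
print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Under these assumptions, the self-attention weights move from the base model in the first layers toward `japanese-stablelm-base-gamma-7b` in the last layers, the MLP weights move in the opposite direction, and all remaining tensors use a constant `t` of 0.5.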

## 🤗 Usage with Hugging Face

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name = "AkimfromParis/Hinoki-Sak-Sta-slerp-7B"

# Load the tokenizer and the merged model in bfloat16, matching the merge dtype.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Use the EOS token as the padding token during generation.
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, pad_token_id=tokenizer.eos_token_id)

messages = [
    # System prompt: "You are a sincere and excellent Japanese assistant.
    # Please provide detailed information on the following topic."
    {"role": "system", "content": "あなたは誠実で優秀な日本人のアシスタントです。以下のトピックに関する詳細な情報を提供してください。"},
    # User prompt: "Who is Shohei Ohtani?"
    {"role": "user", "content": "大谷翔平選手は誰ですか?"},
]

# The last entry of the returned conversation is the assistant's reply.
print(pipe(messages, max_new_tokens=512)[0]["generated_text"][-1])
```

## 🔖 Citation

```
@misc{goddard2024arcee,
  title={Arcee's MergeKit: A Toolkit for Merging Large Language Models},
  author={Goddard, Charles and Siriwardhana, Shamane and Ehghaghi, Malikeh and Meyers, Luke and Karpukhin, Vlad and Benedict, Brian and McQuade, Mark and Solawetz, Jacob},
  journal={arXiv preprint arXiv:2403.13257},
  year={2024}
}
```

[arxiv.org/abs/2403.13257](https://arxiv.org/abs/2403.13257)