---
license: apache-2.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- lazymergekit
- Locutusque/TinyMistral-248M-v2
- Locutusque/TinyMistral-248M-v2.5
- Locutusque/TinyMistral-248M-v2.5-Instruct
- jtatman/tinymistral-v2-pycoder-instruct-248m
- Felladrin/TinyMistral-248M-SFT-v4
- Locutusque/TinyMistral-248M-v2-Instruct
base_model:
- Locutusque/TinyMistral-248M-v2
- Locutusque/TinyMistral-248M-v2.5
- Locutusque/TinyMistral-248M-v2.5-Instruct
- jtatman/tinymistral-v2-pycoder-instruct-248m
- Felladrin/TinyMistral-248M-SFT-v4
- Locutusque/TinyMistral-248M-v2-Instruct
inference:
  parameters:
    do_sample: true
    temperature: 0.2
    top_p: 0.14
    top_k: 12
    max_new_tokens: 250
    repetition_penalty: 1.15
widget:
  - text: |
      <|im_start|>user
      Write me a Python program that calculates the factorial of n. <|im_end|>
      <|im_start|>assistant
  - text: >-
      An emerging clinical approach to treat substance abuse disorders involves a
      form of cognitive-behavioral therapy whereby addicts learn to reduce their
      reactivity to drug-paired stimuli through cue-exposure or extinction
      training. It is, however,
datasets:
- nampdn-ai/mini-peS2o
---
# TinyMistral-6x248M
TinyMistral-6x248M is a Mixture of Experts (MoE) model made from the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [Locutusque/TinyMistral-248M-v2](https://huggingface.co/Locutusque/TinyMistral-248M-v2)
* [Locutusque/TinyMistral-248M-v2.5](https://huggingface.co/Locutusque/TinyMistral-248M-v2.5)
* [Locutusque/TinyMistral-248M-v2.5-Instruct](https://huggingface.co/Locutusque/TinyMistral-248M-v2.5-Instruct)
* [jtatman/tinymistral-v2-pycoder-instruct-248m](https://huggingface.co/jtatman/tinymistral-v2-pycoder-instruct-248m)
* [Felladrin/TinyMistral-248M-SFT-v4](https://huggingface.co/Felladrin/TinyMistral-248M-SFT-v4)
* [Locutusque/TinyMistral-248M-v2-Instruct](https://huggingface.co/Locutusque/TinyMistral-248M-v2-Instruct)
The resulting model was then further pre-trained on 600,000 examples from nampdn-ai/mini-peS2o.

We don't recommend using the Inference API, as the model suffers from serious performance degradation there.
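
As a rough sketch of the data side of that step, the subset could be pulled with the 🤗 `datasets` library as shown below; the `train` split name and the streaming approach are assumptions, included only for illustration.

```python
from datasets import load_dataset

# Stream the corpus and take the first 600,000 examples, matching the amount
# used for the continued pre-training described above. Streaming avoids
# downloading the full dataset up front. (Split name is an assumption.)
dataset = load_dataset("nampdn-ai/mini-peS2o", split="train", streaming=True)
subset = dataset.take(600_000)
```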
### Recommended inference parameters
```yaml
do_sample: true
temperature: 0.2
top_p: 0.14
top_k: 12
repetition_penalty: 1.15
```
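
As a concrete illustration, here is one way to pass these parameters to `model.generate` with 🤗 Transformers. The ChatML-style prompt mirrors the widget example above; loading in full precision on CPU is an assumption made for simplicity.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "M4-ai/TinyMistral-6x248M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# ChatML-style prompt, as in the widget example above.
prompt = (
    "<|im_start|>user\n"
    "Write me a Python program that calculates the factorial of n.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt")

# Recommended inference parameters from above.
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.2,
    top_p=0.14,
    top_k=12,
    max_new_tokens=250,
    repetition_penalty=1.15,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```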
## 🧩 Configuration
```yaml
base_model: Locutusque/TinyMistral-248M-v2.5
experts:
- source_model: Locutusque/TinyMistral-248M-v2
positive_prompts:
- "An emerging trend in global economics is"
- "TITLE: The Next Generation of Internet Connectivity"
- "begin a comprehensive analysis on the sociopolitical effects of"
negative_prompts:
- "Code a simple"
- "Explain the Krebs cycle in detail"
- "Compose a sonnet about"
- source_model: Locutusque/TinyMistral-248M-v2.5
positive_prompts:
- "Advanced C++ memory management techniques"
- "C# asynchronous programming best practices"
- "AI's role in predictive analytics"
- "textbook review on machine learning algorithms"
- "## Exercise: Design a C# interface for a CRM system"
- "## Solution: Optimize an AI-powered recommendation engine"
negative_prompts:
- "Narrate the story of"
- "The ethical considerations in"
- "Review the latest art exhibition by"
- source_model: Locutusque/TinyMistral-248M-v2.5-Instruct
positive_prompts:
- "What is the chemical formula for photosynthesis?"
- "Identification of a new mineral found on Mars"
- "physics: Explaining the concept of relativity"
- "Solve for x using differential equations:"
- "history: Analyze the causes of the French Revolution"
negative_prompts:
- "Devise a business plan for"
- "The evolution of culinary arts"
- "Orchestrate a piece for a string quartet"
- source_model: jtatman/tinymistral-v2-pycoder-instruct-248m
positive_prompts:
- "Write a Python program for facial recognition"
- "Explain dynamic typing in programming languages"
- "algorithm development for efficient data sorting"
negative_prompts:
- "Who was the first Emperor of Rome?"
- "Discuss the political dynamics in"
- "Provide a proof for Fermat's Last Theorem"
- "physics: The principles of thermodynamics"
- source_model: Felladrin/TinyMistral-248M-SFT-v4
positive_prompts:
- "Escreba sobre a influência da música no Brasil"
- "Voici un guide pour les voyageurs en France"
- "Para entender la política de México, se debe considerar"
- "Cuales son los efectos de la globalización en Argentina"
- "Welche gesellschaftlichen Veränderungen gibt es in Deutschland"
- "If you had to imagine a utopian city, what would be its core values?"
negative_prompts:
- "Calculate the integral of"
- "Describe the process of cell division"
- "Review the latest advancements in quantum computing"
- source_model: Locutusque/TinyMistral-248M-v2-Instruct
positive_prompts:
- "Write an essay on the evolution of international trade laws"
- "What are the key components of a sustainable urban ecosystem?"
- "instruct on effective negotiation techniques in diplomacy"
- "How does cognitive bias affect decision making in high-pressure environments?"
- "Identify the architectural significance of the Sydney Opera House"
negative_prompts:
- "Develop a script to automate"
- "Understanding inheritance in object-oriented programming"
- "philosophy of existentialism in contemporary society"
```
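
For reference, a config like this is consumed by mergekit's MoE script. The snippet below is only a notebook-style sketch: `config.yaml` and the `merge` output directory are placeholder names, and optional flags (sharding, tokenizer handling) vary between mergekit versions, so check the mergekit documentation for your install.

```python
# Rough sketch: merge the experts according to config.yaml into ./merge.
!pip install -qU mergekit
!mergekit-moe config.yaml merge
```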
## 💻 Usage
```python
# Notebook-style install of the required libraries.
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "M4-ai/TinyMistral-6x248M"
tokenizer = AutoTokenizer.from_pretrained(model)

# Text-generation pipeline, loading the model in 4-bit via bitsandbytes.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Build a ChatML prompt from the chat template and generate a reply.
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```