|
--- |
|
license: other |
|
license_name: qwen-research |
|
license_link: >- |
|
https://raw.githubusercontent.com/QwenLM/Qwen/refs/heads/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT |
|
base_model: |
|
- Qwen/Qwen2.5-3B-Instruct |
|
--- |
|
|
|
# Chirp-3b |
|
|
|
## Overview |
|
|
|
Chirp-3b is a high-performing 3B-parameter language model developed by the Ozone Research team. Fine-tuned from Qwen2.5-3B-Instruct, it was trained on 50 million tokens of data distilled from GPT-4o. This compact model delivers strong results for its size, outperforming expectations on benchmarks such as MMLU Pro and IFEval.
|
|
|
Chirp-3b is an open-source effort to push the limits of what small-scale LLMs can achieve, making it a valuable tool for researchers and enthusiasts alike. |
|
|
|
## Key Features |
|
|
|
- **Parameters**: 3 billion |
|
- **Training Data**: 50M tokens distilled from GPT-4o |
|
|
|
## Benchmarks |
|
|
|
Chirp-3b excels on rigorous evaluation datasets, showcasing its strength for a 3B model. |
|
|
|
### MMLU Pro |
|
|
|
| Subject | Average Accuracy | |
|
|---------------------|------------------| |
|
| Biology | 0.6234 | |
|
| Business | 0.5032 | |
|
| Chemistry | 0.3701 | |
|
| Computer Science | 0.4268 | |
|
| Economics | 0.5284 | |
|
| Engineering | 0.3013 | |
|
| Health | 0.3900 | |
|
| History | 0.3885 | |
|
| Law | 0.2252 | |
|
| Math | 0.5736 | |
|
| Other | 0.4145 | |
|
| Philosophy | 0.3687 | |
|
| Physics | 0.3995 | |
|
| Psychology | 0.5589 | |
|
| **Overall Average** | **0.4320** | |
|
|
|
- **Improvement**: 9 points above the base model's overall average.
|
|
|
### IFEval |
|
|
|
- **Score**: 72% |
|
- **Improvement**: 14% better than the base model. |
|
|
|
More benchmarks are in the works and will be shared soon! |
|
|
|
## Download |
|
|
|
Access Chirp-3b here: |
|
https://huggingface.co/ozone-research/Chirp-01 |
|
|
|
## Usage |
|
|
|
### Requirements |
|
|
|
- **GPU**: 8 GB of VRAM minimum (recommended)
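As a rough sanity check of that figure, the memory needed just to hold the model weights can be estimated from the parameter count and numeric precision. This is a sketch only; actual usage also depends on context length, the KV cache, and framework overhead:

```python
def estimate_weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Rough memory, in GiB, needed just to hold the model weights."""
    return n_params * bytes_per_param / 1024**3

N = 3e9  # ~3 billion parameters

fp16_gb = estimate_weight_memory_gb(N, 2)  # half precision (2 bytes/param)
int8_gb = estimate_weight_memory_gb(N, 1)  # 8-bit quantized (1 byte/param)

print(f"fp16 weights: ~{fp16_gb:.1f} GiB")  # ~5.6 GiB
print(f"int8 weights: ~{int8_gb:.1f} GiB")  # ~2.8 GiB
```

With fp16 weights at roughly 5.6 GiB, an 8 GB card leaves a couple of gigabytes for activations and the KV cache; 8-bit quantization roughly halves the weight footprint again.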
|
|
|
### Example |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ozone-research/Chirp-01"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Chirp-3b is instruction-tuned, so format the prompt with the chat template.
messages = [{"role": "user", "content": "What's the future of AI?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# max_new_tokens bounds the generated reply rather than the total sequence length.
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
|
|
|
## Future Work |
|
|
|
The Ozone Research team is exploring additional models, including 2B and larger variants. Keep an eye out for upcoming releases!
|
|
|
## Feedback |
|
|
|
We're eager for your input! Try Chirp-3b and let us know your thoughts, use cases, or ideas for improvement by opening an issue or discussion on the model repository.
|
|
|
## Acknowledgments |
|
|
|
A big thanks to the open-source community for driving projects like this forward. Chirp-3b is our contribution to making AI research more accessible. |