---
license: other
license_name: qwen-research
license_link: >-
https://raw.githubusercontent.com/QwenLM/Qwen/refs/heads/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT
base_model:
- Qwen/Qwen2.5-3B-Instruct
---
# Chirp-3b
## Overview
Chirp-3b is a 3B-parameter language model from the Ozone Research team. Fine-tuned from Qwen2.5 3B Instruct on 50 million tokens of data distilled from GPT-4o, this compact model delivers strong results for its size on benchmarks such as MMLU Pro and IFEval.
Chirp-3b is an open-source effort to push the limits of what small-scale LLMs can achieve, making it a valuable tool for researchers and enthusiasts alike.
## Key Features
- **Parameters**: 3 billion
- **Training Data**: 50M tokens distilled from GPT-4o
## Benchmarks
Chirp-3b excels on rigorous evaluation datasets, showcasing its strength for a 3B model.
### MMLU Pro
| Subject | Average Accuracy |
|---------------------|------------------|
| Biology | 0.6234 |
| Business | 0.5032 |
| Chemistry | 0.3701 |
| Computer Science | 0.4268 |
| Economics | 0.5284 |
| Engineering | 0.3013 |
| Health | 0.3900 |
| History | 0.3885 |
| Law | 0.2252 |
| Math | 0.5736 |
| Other | 0.4145 |
| Philosophy | 0.3687 |
| Physics | 0.3995 |
| Psychology | 0.5589 |
| **Overall Average** | **0.4320** |
- **Improvement**: about 9 percentage points above the base model's overall MMLU Pro average.
### IFEval
- **Score**: 72%
- **Improvement**: 14% better than the base model.
More benchmarks are in the works and will be shared soon!
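If you would like to check these numbers yourself, one option is EleutherAI's `lm-evaluation-harness`. The sketch below is only an assumption about how such a run could look, not the team's documented evaluation setup; the task names (`mmlu_pro`, `ifeval`) and settings may differ across harness versions.
```python
# Hedged sketch: scoring Chirp-3b with lm-evaluation-harness (pip install lm-eval).
# Task names and settings are assumptions and may vary by harness version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=ozone-research/Chirp-01,dtype=float16",
    tasks=["mmlu_pro", "ifeval"],
    batch_size=8,
)
print(results["results"])
```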
## Download
Access Chirp-3b here:
https://huggingface.co/ozone-research/Chirp-01
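The weights can also be fetched programmatically. A minimal sketch using the `huggingface_hub` library (install with `pip install huggingface_hub`; the repo id matches the link above):
```python
from huggingface_hub import snapshot_download

# Download all files from the model repository into the local Hugging Face cache
local_path = snapshot_download(repo_id="ozone-research/Chirp-01")
print(f"Model files available at: {local_path}")
```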
## Usage
### Requirements
- **GPU**: at least 8 GB of VRAM is recommended (a reduced-memory loading sketch follows the example below).
### Example
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ozone-research/Chirp-01"

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# The Qwen2.5-Instruct base expects the chat format, so build the prompt via the chat template
messages = [{"role": "user", "content": "What's the future of AI?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")

# Generate the reply and decode only the newly generated tokens
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
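If you are near the 8 GB VRAM floor noted above, loading the weights in half precision and letting `transformers` place them on the GPU helps. This is a hedged sketch, not part of the official release; the dtype and device choices are suggestions, and `device_map="auto"` assumes the `accelerate` package is installed:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ozone-research/Chirp-01"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Half-precision weights roughly halve memory use versus float32;
# device_map="auto" (via accelerate) places layers on the available GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize the benefits of small language models."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```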
## Future Work
The Ozone Research team is exploring additional models, including 2B and larger variants. Keep an eye out for upcoming releases!
## Feedback
We’re eager for your input! Try Chirp-3b and let us know your thoughts, use cases, or ideas for improvement. Open an issue here or contact us via [contact method—update as needed].
## Acknowledgments
A big thanks to the open-source community for driving projects like this forward. Chirp-3b is our contribution to making AI research more accessible.