---
license: other
license_name: qwen-research
license_link: >-
https://raw.githubusercontent.com/QwenLM/Qwen/refs/heads/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT
base_model:
- Qwen/Qwen2.5-3B-Instruct
---
# Chirp-3b
## Overview
Chirp-3b is a 3B-parameter language model from the Ozone Research team. Fine-tuned from Qwen2.5 3B Instruct on 50 million tokens of data distilled from GPT-4o, this compact model delivers strong results for its size on benchmarks such as MMLU Pro and IFEval.
Chirp-3b is an open-source effort to push the limits of what small-scale LLMs can achieve, making it a valuable tool for researchers and enthusiasts alike.
## Key Features
- **Parameters**: 3 billion
- **Training Data**: 50M tokens distilled from GPT-4o
## Benchmarks
Chirp-3b excels on rigorous evaluation datasets, showcasing its strength for a 3B model.
### MMLU Pro
| Subject | Average Accuracy |
|---------------------|------------------|
| Biology | 0.6234 |
| Business | 0.5032 |
| Chemistry | 0.3701 |
| Computer Science | 0.4268 |
| Economics | 0.5284 |
| Engineering | 0.3013 |
| Health | 0.3900 |
| History | 0.3885 |
| Law | 0.2252 |
| Math | 0.5736 |
| Other | 0.4145 |
| Philosophy | 0.3687 |
| Physics | 0.3995 |
| Psychology | 0.5589 |
| **Overall Average** | **0.4320** |
- **Improvement**: about 9 percentage points above the base model's overall MMLU Pro average.
### IFEval
- **Score**: 72%
- **Improvement**: 14% better than the base model.
More benchmarks are in the works and will be shared soon!
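If you would like to check these numbers yourself, one option is EleutherAI's `lm-evaluation-harness`. The sketch below is only an assumption about how such a run could look, not the team's documented evaluation setup; the task names (`mmlu_pro`, `ifeval`) and settings may differ across harness versions.
```python
# Hedged sketch: scoring Chirp-3b with lm-evaluation-harness (pip install lm-eval).
# Task names and settings are assumptions and may vary by harness version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=ozone-research/Chirp-01,dtype=float16",
    tasks=["mmlu_pro", "ifeval"],
    batch_size=8,
)
print(results["results"])
```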
## Download
Access Chirp-3b here:
https://huggingface.co/ozone-research/Chirp-01
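The weights can also be fetched programmatically. A minimal sketch using the `huggingface_hub` library (install with `pip install huggingface_hub`; the repo id matches the link above):
```python
from huggingface_hub import snapshot_download

# Download all files from the model repository into the local Hugging Face cache
local_path = snapshot_download(repo_id="ozone-research/Chirp-01")
print(f"Model files available at: {local_path}")
```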
## Usage
### Requirements
- **GPU**: at least 8 GB of VRAM is recommended (a reduced-memory loading sketch follows the example below).
### Example
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ozone-research/Chirp-01"

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# The Qwen2.5-Instruct base expects the chat format, so build the prompt via the chat template
messages = [{"role": "user", "content": "What's the future of AI?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")

# Generate the reply and decode only the newly generated tokens
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
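If you are near the 8 GB VRAM floor noted above, loading the weights in half precision and letting `transformers` place them on the GPU helps. This is a hedged sketch, not part of the official release; the dtype and device choices are suggestions, and `device_map="auto"` assumes the `accelerate` package is installed:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ozone-research/Chirp-01"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Half-precision weights roughly halve memory use versus float32;
# device_map="auto" (via accelerate) places layers on the available GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize the benefits of small language models."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```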
## Future Work
The Ozone Research team is exploring additional models, including 2B and larger variants. Keep an eye out for upcoming releases!
## Feedback
We’re eager for your input! Try Chirp-3b and let us know your thoughts, use cases, or ideas for improvement. Open an issue here or contact us via [contact method—update as needed].
## Acknowledgments
A big thanks to the open-source community for driving projects like this forward. Chirp-3b is our contribution to making AI research more accessible.