Qwen2.5-7B-CLIPPER

Qwen2.5-7B-CLIPPER is a fine-tuned version of https://huggingface.co/Qwen/Qwen2.5-7B-Instruct using supervised finetuning over chtmp223/CLIPPER dataset. Please check our paper for more details on the method.

📒 Model Details

Model Description

Language(s) (NLP): English
License: Apache-2.0
Finetuned from model: https://huggingface.co/Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)

Model Sources

Repository: Github repository.
Paper: https://arxiv.org/abs/2502.14854

💻 Training Details

Training Data

chtmp223/CLIPPER

Training Procedure

Configurations	Values
Hardware (Training and Inference)	8xA100s
Tracking	wandb
batch size	16
gradient_checkpointing	True
learning_rate	1.0e-6
lr_scheduler_type	cosine
max_length	131072
num_train_epochs	1
optim	adamw_torch

Software

Training code is adapted from https://github.com/Qihoo360/360-LLaMA-Factory/tree/1b5398f539c7d94a530f3f32b53553a3b1928314.

🤗 Inference

Inference is done with vLLM on 1 A100-80GB.

📜 Citation

@misc{pham2025clippercompressionenableslongcontext,
      title={CLIPPER: Compression enables long-context synthetic data generation}, 
      author={Chau Minh Pham and Yapei Chang and Mohit Iyyer},
      year={2025},
      eprint={2502.14854},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.14854}, 
}

chtmp223
/

Qwen2.5-7B-CLIPPER