---
base_model:
- Qwen/Qwen2.5-7B-Instruct
license: apache-2.0
language:
- en
datasets:
- chtmp223/CLIPPER
---

# Qwen2.5-7B-CLIPPER

Qwen2.5-7B-CLIPPER is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct), trained with supervised fine-tuning on the [chtmp223/CLIPPER](https://huggingface.co/datasets/chtmp223/CLIPPER) dataset. Please check [our paper](https://arxiv.org/abs/2502.14854) for more details on the method.

## 📒 Model Details

### Model Description

- **Language(s) (NLP):** English
- **License:** Apache-2.0
- **Finetuned from model:** [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)

### Model Sources

- **Repository:** [GitHub repository](https://github.com/chtmp223/CLIPPER)
- **Paper:** [https://arxiv.org/abs/2502.14854](https://arxiv.org/abs/2502.14854)

## 💻 Training Details

### Training Data

[chtmp223/CLIPPER](https://huggingface.co/datasets/chtmp223/CLIPPER)

### Training Procedure

| **Configurations**                | **Values**  |
|-----------------------------------|-------------|
| Hardware (Training and Inference) | 8x A100s    |
| Tracking                          | wandb       |
| batch size                        | 16          |
| gradient_checkpointing            | True        |
| learning_rate                     | 1.0e-6      |
| lr_scheduler_type                 | cosine      |
| max_length                        | 131072      |
| num_train_epochs                  | 1           |
| optim                             | adamw_torch |

A hedged training sketch that mirrors these hyperparameters is included at the end of this card.

#### Software

Training code is adapted from [360-LLaMA-Factory](https://github.com/Qihoo360/360-LLaMA-Factory/tree/1b5398f539c7d94a530f3f32b53553a3b1928314).

## 🤗 Inference

Inference is done with [vLLM](https://github.com/vllm-project/vllm) on a single A100-80GB; a minimal usage sketch is included at the end of this card.

## 📜 Citation

```
@misc{pham2025clippercompressionenableslongcontext,
      title={CLIPPER: Compression enables long-context synthetic data generation},
      author={Chau Minh Pham and Yapei Chang and Mohit Iyyer},
      year={2025},
      eprint={2502.14854},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.14854},
}
```
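
## 🛠️ Training configuration sketch

The sketch below is **not** the authors' 360-LLaMA-Factory setup; it is a minimal, illustrative supervised fine-tuning script using TRL's `SFTTrainer` with the hyperparameters from the table above. The per-device batch split, the `bf16` flag, and the dataset column layout are assumptions, and full 131072-token training in practice relies on the long-sequence support in 360-LLaMA-Factory rather than a vanilla trainer.

```python
# Illustrative SFT sketch with the card's hyperparameters (not the authors' exact code).
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

train_dataset = load_dataset("chtmp223/CLIPPER", split="train")

config = SFTConfig(
    output_dir="qwen2.5-7b-clipper",
    per_device_train_batch_size=2,  # assumed split: 8 GPUs x 2 = global batch size 16
    gradient_checkpointing=True,
    learning_rate=1.0e-6,
    lr_scheduler_type="cosine",
    num_train_epochs=1,
    optim="adamw_torch",
    max_seq_length=131072,          # renamed to `max_length` in newer TRL releases
    bf16=True,                      # assumption; precision is not stated on the card
    report_to="wandb",
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",
    args=config,
    train_dataset=train_dataset,    # assumes a chat-style column SFTTrainer can consume
)
trainer.train()
```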
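
## 🚀 Inference example

A minimal offline-inference sketch with vLLM. The repository id `chtmp223/Qwen2.5-7B-CLIPPER` and the claim-verification prompt are assumptions based on this card and the CLIPPER paper, and the sketch assumes the uploaded checkpoint's config permits a 131072-token window.

```python
# Minimal vLLM sketch (assumed repo id and prompt format).
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

MODEL = "chtmp223/Qwen2.5-7B-CLIPPER"  # assumed repository id for this card

tokenizer = AutoTokenizer.from_pretrained(MODEL)
llm = LLM(model=MODEL, max_model_len=131072)  # matches the training max_length

# Hypothetical long-context claim-verification prompt in the spirit of CLIPPER.
book_text = "..."  # full narrative text, up to ~131k tokens
claim = "..."      # statement to verify against the narrative
messages = [
    {"role": "user", "content": f"{book_text}\n\nClaim: {claim}\nIs this claim true or false?"}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

outputs = llm.generate([prompt], SamplingParams(temperature=0.0, max_tokens=512))
print(outputs[0].outputs[0].text)
```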