- ***Q8:*** Best suited to high-end modern devices (e.g. an RTX 3080). Responses are very high quality, but it is noticeably slower than Q4.
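The Q8-vs-Q4 trade-off above largely comes down to bits per weight: Q8 keeps more precision but roughly doubles the file size and memory traffic. A rough back-of-the-envelope sketch of GGUF file sizes, assuming llama.cpp's common Q8_0 (~8.5 bits/weight) and Q4_0 (~4.5 bits/weight) formats and roughly 3.09B parameters for a Qwen 2.5 3B base (approximate figures, not measurements of this repo's files):

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate quantized model file size in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

n = 3.09e9  # approximate parameter count of a Qwen 2.5 3B base model

q8 = gguf_size_gb(n, 8.5)  # Q8_0 in llama.cpp stores roughly 8.5 bits/weight
q4 = gguf_size_gb(n, 4.5)  # Q4_0 stores roughly 4.5 bits/weight

print(f"Q8_0 ~ {q8:.1f} GB, Q4_0 ~ {q4:.1f} GB")
```

Since inference on these models is largely memory-bandwidth bound, the roughly 2x larger Q8 file also explains the slower token generation compared with Q4.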
Parm v2 is based on Qwen 2.5 3B, with additional reasoning-focused training so its outputs resemble Qwen QwQ / o1-mini (only much smaller). We trained on [this](https://huggingface.co/datasets/gghfez/QwQ-LongCoT-130K-cleaned) dataset, along with Opus and Sonnet 3.5 datasets from Hugging Face.
This is a fairly heavy model to run. If you want on-device AI for phones, I'd recommend the 0.5B version of this model (coming soon).