New request. :(
Can we, the community, please have an f16 and/or f8 model?
The imatrix quant of this model is made to preserve the ideal qualities of the original model. The Q8 and f16 versions should have most of the same qualities and should be obtainable by directly quantizing or converting the Qwen2.5 model. If you want a larger model that functions the same way and offers better performance at the cost of more resources, I would recommend Reasoning-Rabbit or Thoth.
If this model has some unique quality you would like expanded on (the dataset has been distilled for QAT but is made to improve inference overall), I can create an 8-bit quant of it. We can also work with the community on custom state changes in small models. If you have a specific use case, I'd love to help.
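For anyone wondering what an 8-bit quant roughly does to the weights, here is a minimal sketch in the spirit of a Q8_0-style block quantization: each small block of weights shares one scale and is stored as signed 8-bit integers. The function names and the block size of 32 are assumptions for illustration; this is not llama.cpp's actual code and leaves out details such as how the scale itself is stored.

```python
BLOCK_SIZE = 32  # assumed block size for this illustration


def quantize_block(block):
    """Map a block of floats to (scale, list of ints in [-127, 127])."""
    amax = max((abs(x) for x in block), default=0.0)
    # One scale per block; fall back to 1.0 so an all-zero block divides safely.
    scale = amax / 127.0 if amax > 0 else 1.0
    quants = [max(-127, min(127, round(x / scale))) for x in block]
    return scale, quants


def dequantize_block(scale, quants):
    """Recover approximate floats from the scale and the 8-bit integers."""
    return [scale * q for q in quants]


def quantize_tensor(weights):
    """Split a flat list of weights into blocks and quantize each one."""
    return [
        quantize_block(weights[i:i + BLOCK_SIZE])
        for i in range(0, len(weights), BLOCK_SIZE)
    ]
```

The round trip loses at most half a scale step per weight, which is why a Q8 file usually behaves very close to the f16 one while being about half the size.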
This page is mainly for creating GGUF models for easy use in applications like GPT4ALL. This model is basically Qwen's Coder instruct model with a little nudging toward reasoning more effectively inside GPT4ALL, with or without its recursive "Analyzing" function or its code interpreter and JavaScript execution tool, which applies mathematical formulas to a problem before answering. It basically uses the tool capabilities to give the LLM a calculator and teaches it ideal reasoning with and without it. It is ideal for Ollama, BackyardAI, GPT4ALL and all other GGUF applications. I am not fully training a new model, simply dropping a second point in its reasoning matrix and trying to get a state change/phase transition out of it.
Getting complex reasoning even out of the new Gemini models was close to impossible when this was released in November, and getting these small models to work well is a bit of a pain. I think there are a few videos out on the process,
but the long context problems like
or when it uses tools (JavaScript_interpreter/code execution in GPT4ALL), like the haversine function to find the distance between two points on the map. I'm not sure how it gets the exact long/lat, but it calculates the haversine function; the result is not super accurate, but often more so than most websites.
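For reference, the haversine calculation mentioned here looks roughly like the sketch below. It is written in Python rather than the JavaScript the GPT4ALL tool runs, and the function name and the mean Earth radius of 6371 km are assumptions for illustration, not what the model actually emits.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Example call with approximate coordinates for Paris and London.
distance = haversine_km(48.8566, 2.3522, 51.5074, -0.1278)
```

The formula treats the Earth as a perfect sphere, which is one reason the result is only approximate; the quality of the coordinates the model supplies matters at least as much.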
If you are looking for similar CoT-type Qwen2.5 models in safetensors format, try Skywork/Skywork-o1-Open-PRM-Qwen-2.5-7B or rombodawg/Rombos-LLM-V2.5.1-Qwen-3b.
Both are great models that could fit your needs. I made "Replicant" because, with so much functionality and great results from the 1.5B Qwen model, I knew there was room for improvement in their 3B model. Both of these guys do something similar in a more traditional way.
Here is a Q8 version; it should do everything just a bit better. I'm still trying to nail down a prompt that will use the tools effectively, but here you go: https://huggingface.co/IntelligentEstate/Replicant_Operator_ed-Qw25-Q8_0-GGUF