Model fails to run on featherless.ai
I don't know if it's the same as for your benchmarks, but I suspect the model could be quantized without that being specified. Has anyone found out why?
I'm so sorry for this late response! This was an earlier version of my model that is still being updated, and I always make sure to release 4-bit medium and small GGUF quants for each model release, as these are the quants I use personally!
Thanks for your response. Having an fp8 version sounds like standard practice to me. If the model is quantized, it should be specified in the model name to avoid any confusion. Could you provide an fp8 version? People on featherless want to try your model :D
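For anyone who wants to check what precision a Hub checkpoint actually declares before loading it, here is a minimal sketch using `transformers` (the repo ID is just an example from this thread, and this only reads the repo's `config.json`, not the weights):

```python
# Minimal sketch: inspect a Hub checkpoint's declared precision without
# downloading the weights. Assumes `transformers` is installed; the repo ID
# is just an example from this thread.
from transformers import AutoConfig

repo_id = "netcat420/MFANN3bv0.21"  # example repo

config = AutoConfig.from_pretrained(repo_id)

# torch_dtype is the dtype the weights were saved in (e.g. float16/bfloat16).
print("declared dtype:", config.torch_dtype)

# Quantized checkpoints (bitsandbytes, GPTQ, AWQ, ...) usually carry a
# quantization_config block in config.json; full-precision ones do not.
print("quantization_config:", getattr(config, "quantization_config", None))
```

Note that `torch_dtype` only reflects the dtype the checkpoint was saved in; a serving provider can still quantize further on its side without this showing up in the config.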
Sorry again for the late response! I've been very busy with work (I work a kitchen job right now lmao). Yeah, would you like a GGUF version of the model, or a different quant?
Also, in the meantime, check out the merge of this model with its base: https://huggingface.co/netcat420/MFANN-Phigments12-slerp
And here is the latest version of the 3B model: https://huggingface.co/netcat420/MFANN3bv0.21
q4_k_X quants: https://huggingface.co/netcat420/MFANN3bv0.21-GGUF
Also, here is the v0.6 GGUF quant: https://huggingface.co/netcat420/MFANN3bv0.6-GGUF/tree/main. At the time I was only doing q4_k_m quants, but I'm about to add q8_0 quants as requested. Since phi-2 is no longer supported by llama.cpp, though, I have to use my own custom fork to keep the 3B phi-2 models going, hence the longer time it takes to create the quants.
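For context, the quant-creation workflow described here is roughly the standard llama.cpp one, sketched below from Python. The paths and model names are placeholders, the script and binary names have changed between llama.cpp versions, and per the post above the phi-2 models need the author's custom fork rather than mainline llama.cpp:

```python
# Rough sketch of the llama.cpp quantization workflow, driven from Python.
# Assumes a llama.cpp checkout has already been built in the current
# directory; script/binary names follow recent llama.cpp but have varied
# across versions. All paths below are placeholders.
import subprocess

HF_MODEL_DIR = "MFANN3bv0.6"          # local copy of the HF checkpoint (example)
F16_GGUF = "MFANN3bv0.6-f16.gguf"     # intermediate full-precision GGUF

# 1) Convert the Hugging Face checkpoint to a single f16 GGUF file.
subprocess.run(
    ["python", "convert_hf_to_gguf.py", HF_MODEL_DIR,
     "--outfile", F16_GGUF, "--outtype", "f16"],
    check=True,
)

# 2) Re-quantize the f16 GGUF to each target format (q4_k_m, q6_k, q8_0, ...).
for quant in ["Q4_K_M", "Q4_K_S", "Q6_K", "Q8_0"]:
    subprocess.run(
        ["./llama-quantize", F16_GGUF, f"MFANN3bv0.6-{quant}.gguf", quant],
        check=True,
    )
```

Step 2 explains why adding extra quant types after the fact is cheap once the f16 GGUF exists: each target format is just another pass of `llama-quantize` over the same intermediate file.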
OK, the q8_0 and q6_k quants are now live at this link: https://huggingface.co/netcat420/MFANN3bv0.6-GGUF/tree/main
And version 0.21 is getting q8_0 and q6_k quants uploaded as we speak! I'm going to start doing q4_k_m, q4_k_s, q6_k, and q8_0 quants in my future releases, and these new quants should be live at this link once uploaded: https://huggingface.co/netcat420/MFANN3bv0.21-GGUF
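For anyone who wants to try these quants locally once they're up, a minimal sketch using `llama-cpp-python` and `huggingface_hub` (the filename is a guess at the repo's naming convention, so check the actual file list on the model page; per the thread, the phi-2-based GGUFs may also need the author's llama.cpp fork if mainline support is missing):

```python
# Minimal sketch: download one GGUF quant from the Hub and run it locally
# with llama-cpp-python. The filename is a guess at the naming convention;
# check the model page for the real file names.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="netcat420/MFANN3bv0.21-GGUF",
    filename="MFANN3bv0.21.Q4_K_M.gguf",  # hypothetical filename
)

llm = Llama(model_path=model_path, n_ctx=2048)
out = llm("Q: Briefly, what is a GGUF quant? A:", max_tokens=64)
print(out["choices"][0]["text"])
```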