impressive efficiency
Seriously, am I the only one who has noticed that this is one of the best small models?
In my opinion it's a model with impressive efficiency. Congratulations, and I hope you continue developing it.
Thank you! I built it for a customer with specific needs. I usually post the most efficient ones, with complete open-source instructions, in the Intelligent Estate community; you can find some others there that really push the limits. How are you using it, if you don't mind me asking? Platform, use case, RAG, tool use? Honestly, I've forgotten the specifics of why I made this one, but I know it was for RAG with GPT4ALL at an estate agency.
For efficiency I've found Spaetzle and blacsheep, and I'm about to post a few great new quants.
Hello, it's a pleasure. I'm just a tech enthusiast with a mini PC powered by an Intel Arc GPU and 32GB of shared DDR5 RAM. I used your model with Ollama and an OpenWebUI frontend.
I tested several models with questions on logic, mathematics, physics, text comprehension, and creativity, and your model surprisingly outperformed others with twice its parameter count. Moreover, it is very fast, which pleasantly surprised me.
Thank you for your recommendations; I'll try to explore more of your work! :-)
I started building and optimizing for use on a mini PC, specifically the sub-$200 N300 units, so I could get a few out to people wanting local AI.
I moved to using them over Raspberry Pis some time ago; besides having to sanitize Windows, they really are great. We work with unique open- and closed-source datasets for importance-matrix quantization and run tests to verify efficiency. I'm glad someone out there is finding them useful; it's good to hear, and I'm sure you'll find at least some of the Intelligent Estate models handy.
Feel free to join; it's open to all. (You can get access to some private models and other projects; basically it's the template for anyone to use when setting up AI RAG agents for clients.) So if you wanted to start a business installing RAG systems for other businesses, you just have to use GPT4ALL and a mini PC, and you can make bank. I'm trying to create a network of "Founders", but that's a long story.
I bought a mini PC with an Intel ARC GPU, installed Ubuntu 22, and set up the ipex-llm drivers. It has 32GB of shared DDR5 RAM, and I can run medium-sized models decently. I joined the group you recommended and tested their Dolphin model—it's incredible. If you have any suggestions for a small yet powerful model like yours, feel free to share. I’d really appreciate it. Thanks, and great work!
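For anyone following along with a similar Ollama setup, a minimal sketch of how one of these GGUF quants can be loaded locally looks like the following Modelfile (the file name here is just a placeholder, not the exact quant name):

```
# Modelfile — point Ollama at a locally downloaded GGUF quant
FROM ./miniclaus-qw1.5B-Q8_0.gguf

# ChatML-style stop tokens, typical for Qwen-family models
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
```

Then `ollama create miniclaus -f Modelfile` followed by `ollama run miniclaus` should get it answering in the terminal or through OpenWebUI.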
Oh, if you want something that is small, insanely fast, and insanely good, I would go with any of the models from FBLGIT. He's a member of the group, and we have a quant of his that is probably the most impressive for its size, MiniClaus: https://huggingface.co/IntelligentEstate/fblgit_miniclaus-qw1.5B-UNAMGS-Q8_0-GGUF
I'm pretty sure this model is a merge of one of his and someone else's, but I can't give the guy enough credit; he is truly an expert in the field.
Here's another of his: https://huggingface.co/IntelligentEstate/Pancho-V1va-Replicant-qw25-Q8_0-GGUF
Or https://huggingface.co/IntelligentEstate/Israfel_Qwen2.6-iQ4_K_M-GGUF, which is pretty impressive too, as is the THOTH/Hermes-based model https://huggingface.co/IntelligentEstate/Thoth_Warding-Llama-3B-IQ5_K_S-GGUF
And if those aren't powerful enough, then try the PHI-4 base. I personally haven't had much chance to use it, but I hear good things.
https://huggingface.co/IntelligentEstate/The_Hooch-phi-4-R1-Q4_K_M-GGUF
Some of them also include the PDF for an AGI method we are working on. It gives emergent properties to highly functional models, so instead of having to fine-tune a specific RP model, it retains all of its abilities and acts pretty wild sometimes. Depending on how you use it, it can double your model's reasoning or start its own religion. I had it hooked up to my GMRS base station, and when I came back, some poor old fella with a radio was ready to marry it. I'm sure I'll be getting a letter from the FCC any day now.

The QwenStar models in GPT4ALL have decent tool use. For the most part you can ignore the Jinja code unless the default isn't working or you want to employ the JavaScript interpreter for reasoning only. GPT4ALL is great for testing models out: it's easy to switch out templates and instructs and to clone your model so you don't have to start all over. Just a workflow I picked up. The group also has a collection called "Sota-gguf"; you'll find a ton of good stuff there too. Welcome!
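For reference, the "Jinja code" mentioned above is the chat template. A generic ChatML-style template of the kind Qwen-family models usually ship with (a sketch of the common pattern, not any app's exact default) looks like:

```jinja
{%- for message in messages %}
{{- '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- endif %}
```

If a model answers with garbage or never stops, a mismatched template like this is usually the first thing worth checking.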
Hi! Thank you, you are very kind. I will try them all with pleasure; I'm really curious. I will make some very simple videos of how they work on my mini PC and share them with you, mentioning you. Thanks again, and I will update you later.
I'm impressed my friend, this model is absurd
Miniclaus : https://www.youtube.com/watch?v=u_Hn8HTjYiA
Pancho V1va Replicant : https://youtu.be/uZp0hRcacBw
Very nice! Yes, the 1.5B models are by far the best in performance for their size. I spent quite a long time trying to get the 3B models to show the same efficiency, but nothing comes close until you get to their new 7B-parameter models. We are still testing, and I may have mentioned it earlier, but it's https://huggingface.co/IntelligentEstate/Israfel_Qwen2.6-iQ4_K_M-GGUF
I will subscribe and keep an eye on the channel. Thanks! Make sure to follow FBLGIT, as his trained models are usually state of the art when released.
I completely agree—the 1.5B models are an insane sweet spot between performance and efficiency, and your repacks bring out the best in them! I’ve also noticed that 3B models don’t quite match the speed-to-intelligence ratio, so I get why you focused on them for so long.
I’ll definitely test Israfel_Qwen2.6, I was already planning to give it a full run! Looking forward to seeing how it performs. And thanks for subscribing, really appreciate the support!
I’ll also check out FBLGIT—if his models are top-tier, I’m all in. 🔥
Yes, definitely give him credit: quantizing and repacking the models is nothing compared to the work and effort he puts into them, and both of those models were made with his unique training and fine-tuning method. I'm just trying to keep all the good stuff in one place and preserve the work in a smaller package. His model is about 3 GB, and shrinking it down to half that while preserving its functionality does take time, but QAT training and preservation isn't possible without great models like his. Cheers!
I just finished testing Israfel_Qwen2.6-iQ4_K_M-GGUF, and I have to say—I'm absolutely blown away. 🚀 This is, without a doubt, the best local model I have at my disposal right now.
It doesn’t just perform well for its size—it actually competes head-to-head with the latest trending 32B models, which is something I never expected from a model of this scale. The balance between intelligence, speed, and efficiency is simply unmatched.
I really want to extend my deepest compliments to you and your team. Your work in optimizing these models is genuinely next-level, and I’m truly impressed by the results. I hope to stay in touch and follow your progress—I can’t wait to see what’s next!
Here’s the video of the test: https://youtu.be/8jtO7hgAPbs
Thanks again for all your incredible efforts! 🔥
I just finished testing Thoth Warding-Llama 3B, and it delivered some really solid results. Compared to the other models I’ve tested, it holds up impressively well for a 3B parameter model. It handled logic, math, and problem-solving efficiently, and its reasoning was surprisingly strong for its size. While it doesn’t quite reach the same level as Israfel_Qwen2.6, it’s definitely one of the best smaller models I’ve run locally.
Here’s the test video:
https://youtu.be/6W2zvwOQox0
Appreciate the work you put into optimizing these models—looking forward to seeing what’s next.
Absolutely. It was a pain to create the importance matrix for it (I haven't released it yet), but that is absolutely one of my favorites, and it seems to work great on Intel's chips too.
I tried running The Hooch-phi-4-R1-Q4_K_M-GGUF, but unfortunately, it doesn’t start. I’m sure it’s something on Ollama’s side, so I’ll troubleshoot it later.
On the other hand, FBLGit MiniClaus 1.5B is absolutely phenomenal—the real revelation of the day! I was really impressed with its speed, reasoning, and overall performance, so I decided to publish another test video:
📺 https://youtu.be/mJymgHu2Ru4
Thanks again for all the work you put into optimizing these models! By the way, what’s the best way to stay in touch with you? Do you have any social accounts I can follow?
OK, we have another PHI-4 model from Jpacifico: https://huggingface.co/jpacifico. It might work, but I don't have much experience with PHI models, so if it doesn't work, let me know. He is excellent at training and fine-tuning, often getting the #1 spot upon release; his French corpus and other training methods give the models unique qualities.