Tested Q6, uses 567Gb Ram

#2
by krustik - opened

Q5-K-M uses 502GB RAM, Q6-K almost 567GB RAM. Tested on CPU, on my ancient 10 years old enterprise Xeon motherboard by Gigabyte with 12 RAM slots, which on used market costs just 100$.
Tested and worked in oobabooga launcher (text generation webui).
LM Studio launcher can't work with such large models still, with any updates, it's hopeless (crashing at loading at 500Gb Ram mark).
My review, unfortunately quantization of this model is not preserving it's strength, it downgraded seriously.
It's very talkative, even too much, that on Q5 quality observed notorius problem of "internal dialogue" with itself (first observed in Falcon 180B quants), non-stop,so R1 need to be precisely tuned in chat parameters (slightly helped frequency penalty 0.05, mirostat 1 kinda works, but i suspect mini_p 0 in deterministic preset is the key).
Q6 quality which is much better in V3 model here not improved it at all.
In test of repairing logical problem in Chuck code both Q5 and Q6 of R1 model failed.
In test or creating Bach melody in Chuck code - Q5 produced many examples but broken code, Q6 also failed (example below).
So, in the end i think V3 model of same size in Q6 is much more better than R1 model in Q6, for now.

R1-Q6-K 567Gb

You

undefined type Midiout at line 2

AI
My apologies for the oversight. The error occurs because the MidiOut class isn't properly capitalized. Here's the corrected version:

// Setup MIDI output
MidiOut mout;
if(!mout.open(0)) me.exit();

// Rest of the code remains the same...
The class name should be MidiOut (capital 'M', lowercase 'idi', capital 'O'). This should resolve the undefined type error. The rest of the code remains valid ChucK syntax for MIDI output.

You
<compiled.code>:2:1: error: undefined type 'MidiOut'...
[2] MidiOut mout;
^
<compiled.code>:2:1: error: ...in declaration
[2] MidiOut mout;
^

AI
My apologies for the confusion earlier. The error occurs because the MidiOut class name should be capitalized differently. Here's the corrected version:

// Setup MIDI output
MidiOut mout;
if(!mout.open(0)) me.exit();

// Rest of the code...
The correct class name is MidiOut (note the capitalization). This should resolve the "undefined type" error. The rest of the MIDI code will then work as intended with any MIDI output device.

You
not working

AI
Let's try a different approach using the ChucK language's built-in MIDI functionality. Here's a corrected and simplified version of the Bach Minuet:

// Set up MIDI output
MidiOut mout;
if(!mout.open(0)) me.exit();

hi ,any plans to test 4_k_m and q2 quants?

hi ,any plans to test 4_k_m and q2 quants?

I will try Unsloth Q2K-XL today, the non-distilled version, mainly to test quality.
But the main problem is that this "reasoning" not working in current LLM launchers like oobabooga, it produces a "diarrhea" of text non-stop. In Deepseek web-ui, they've made this process in 2 portions, first it thinks for specific time and secondly making the decision on the previous process results, there's no such mechanism in current LLM launchers, i'm not sure if they will implement such in updates.

So, i've tested Q2-K-XL (using 253Gb RAM, it can be fitted in 256GB RAM PC with GPU offloading) it produces very logical reasoning (see example below) but the code examples (all three of them) are completely broken (also in Q6-K again same). It's possible to repair code by this same model writing it all errors but it would take 5-7 attempts. For really good reasoning AND coding model (or "super-model" which can make perfect code in a first try) i think we should wait this year. It's good maybe for other uses, for example reasoning can make very good texts of something.

You
write a Mozart song in Chuck, in full size, can be piano

AI
Okay, so the user asked to "write a Mozart song in Chuck, in full size, can be piano." Let me unpack this. First, Mozart's style is Classical, characterized by clarity, balance, and form. Chuck is ChucK, a programming language for real-time sound synthesis. The user wants a piano piece. So I need to generate ChucK code that emulates a Mozart-style composition.

First, structure. Mozart's piano pieces often have clear structures like sonata form, but maybe a simpler minuet or sonatina for brevity. Let's aim for a sonatina form with exposition, development, and recapitulation. Next, harmony: diatonic, with Alberti bass patterns. Melody: balanced phrases, maybe sequences.

In ChucK, I need to set up a piano sound. Use StifKarp or maybe.... (continued)

In LLM launchers kinda "Chatbox" extention for Ollama is supporting separate reasoning process, but i wasn't able to test it because Ollama don't support models in parts still, no way i would re-download all this again (if you want in Ollama you should get through Ollama) and merge impossible by big storage sizes.
1)AnythingLLM launcher can work even with oobabooga by local api, but there's no "continue" text function.
2)oobabooga updated and become better with reasoning models, new direct continue button.
3)LM Studio still not working, no matter any updates, for some reason it reducing total system 576GB Ram to just 566GB, kinda very big use, oobabooga don't do that and allowing using 567GB in Q6 R1.

UPDATE
It's not double checked yet but maybe i found solution to make in Q6-K a working code in 1st try (what model usually failed). You just need to use from previous chat a "reasoning thinking" part and put it in new chat right into the prompt. Such way it makes code without any errors right away in my test. Not yet tested in Q2.
The model need some tuning and special techniques, there's a new research paper "Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs" which finds that R1 lacks certain tuning and can be much much better. Paper: https://huggingface.co/papers/2501.18585

hi krustik thanks for reply, i read that someone to run R1 on SSD nvme 4.0(use new clean ssd or old with no impormant data ) on gaming computer with 96gb ram, 1tb ssd is cheaper than ram and have warranty from 2 to 5 Years,what you think about this idea,

Sign up or log in to comment