THE THREAD OF DOOM

#12
by jukofyork - opened

Just realised I deleted the old "thread of doom" as it was attached to the earliest alpha version of the control vectors :(

jukofyork pinned discussion

Okay, I was wondering if we crossed some sort of line.

Anyway.. the INCREDIBLY important thing I was saying before the thread disappeared was... I have a feeling it is going to be just like they say. They are going to be liberal with grants. I suspect they will target people who are using the space outside the purpose that was intended... somewhere out there, someone has all their RAW 8k videos of their cats...

Yeah, it's a pity it got deleted (I should have checked more carefully what was linked), but it was getting a bit out of hand with all that scrolling so perhaps not such a bad thing.

I'm just gonna keep up the models that people have downloaded the most and get rid of all the "experimental, but likely broken" stuff with 15 downloads as they really weren't serving much of a purpose.

Also, all the old versions of the control vectors were vastly inferior to the final version due to me figuring out how to get them working as I went along, so it's probably better to just keep up the final v3.0 ones to avoid a lot of the confusion.


It looks a lot more like I'm just uploading quality models that people like/use now at least... The creative-writer-v0.1-35b and creative-writer-v0.2-35b models will be going as soon as I get the v1.0 version uploaded, and possibly Dusk-Miqu-70B if they do set a hard limit (I still think Dark-Miqu-70B is worth keeping whatever though).


Also, if anybody really misses any I have uploaded, then I can in theory recreate them and upload a LoRA created from the delta using extract_lora.py, but I strongly suspect nobody will even notice that most of the models have gone... Of all that I have created I've only ever used Dark-Miqu-70B myself!
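For anyone wondering what "a LoRA created from the delta" means in practice, here is a minimal sketch of the general idea: take the difference between a tuned weight matrix and its base, then keep only the top singular directions. This is an illustration using NumPy on toy matrices, not the actual code in extract_lora.py; the function name, shapes and rank below are all assumptions.

```python
import numpy as np

def extract_low_rank_delta(w_base, w_tuned, rank):
    """Return (a, b) such that b @ a approximates w_tuned - w_base.

    Assumption: a plain truncated SVD of the weight delta. The real
    extract_lora.py script may differ in its details.
    """
    delta = w_tuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Keep the `rank` largest singular directions and split the
    # singular values evenly between the two factors.
    sqrt_s = np.sqrt(s[:rank])
    b = u[:, :rank] * sqrt_s          # shape (out_features, rank)
    a = sqrt_s[:, None] * vt[:rank]   # shape (rank, in_features)
    return a, b

# Toy data (hypothetical): a "fine-tune" whose change is exactly rank 4.
rng = np.random.default_rng(0)
w_base = rng.standard_normal((64, 48))
true_delta = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 48))
w_tuned = w_base + 0.01 * true_delta

a, b = extract_low_rank_delta(w_base, w_tuned, rank=4)
# Relative error of the rebuilt delta against the true one.
rel_err = np.linalg.norm((w_base + b @ a) - w_tuned) / np.linalg.norm(w_tuned - w_base)
print(a.shape, b.shape, rel_err)
```

The two factors are far smaller than the full matrix when the rank is low, which is why uploading an extracted LoRA costs much less storage than keeping the whole model up. A real fine-tune's delta is usually not exactly low rank, so the reconstruction would only be approximate.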

:( Damn there was some good info in that thread.

If you've still got Firefox tabs open somewhere, you'll be able to save some of the thread.

Unfortunately, I cleaned my browser tabs up about an hour ago.

And yeah, if people were using it as free cloud storage then it makes sense. I just think they could have gone about it better, rather than having us wake up and see the limit.

I'm curious, did your quota drop after deleting that? I wonder if all the PNG files attached there were "billed" to you.

@jukofyork I think you're good man. If they start enforcing it, you'll get an exemption for sure.

I come across your contributions randomly all over the place, even on GitHub repos like some fine-tuning tool lol

I should probably deduplicate my quants. Often, I was making one because I could not find what I was looking for, then it would turn out a few of us just happened to be making them at the same time. Then I started getting requests, so I just decided I would make a bunch. Need a Huggingverse quant global dedupe...

Dual Epyc home server@3t/s Q8_0.

Drops to 1.5t/s at longer context :(

I've been looking through the llama.cpp code and think it might be very hard for them to fix...

I'm away from home but when I get back I'll see if I can get this:

https://huggingface.co/deepseek-ai/DeepSeek-R1/blob/main/modeling_deepseek.py

Working with the bitsandbytes CPU-only backend:

https://github.com/bitsandbytes-foundation/bitsandbytes

I don't really know of any other viable CPU-only inference backends?

Dual Epyc home server@3t/s Q8_0.

Drops to 1.5t/s at longer context :(

Yeah, there was a thread about this in the llama.cpp GitHub that I read this afternoon. It seems to be a regression:

https://github.com/ggerganov/llama.cpp/issues/10981#issuecomment-2571427214

(See the discussion above this post)

https://www.reddit.com/r/LocalLLaMA/comments/1i7jpmb/deepseek_r1_goes_cormac_mccarthy/

Impressive!

@ChuckMcSneed You might want to give mistral.rs a try - they just announced support for v3 (and r1), and looking at the code it keeps the low rank attention matrices (and I assume the KV cache). I've never used it but it claims to support CPU only and loads GGUF.

It claims to support DeepSeek in safetensors only, and supports FP8. Will try it out tomorrow.

./mistralrs-server -i --isq Q4K plain -m /path/to/DeepSeek-R1:

Error: unknown variant `sigmoid`, expected `softmax` at line 60 column 27

Nice DeepSeek support. Complains about https://huggingface.co/deepseek-ai/DeepSeek-R1/blob/main/config.json. Did the dev at least try running it before claiming his program supports it?

Claims tested and verified? That's asking a bit much...

That's shit :/

What happens if you edit the config to "softmax"? If it runs and is a bit broken then it's maybe worth asking them about it, but if it's another dumb bug/oversight then I'd just give up as they obviously haven't even tried it.
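If you want to try that edit without hand-editing the file, something like the sketch below should do it. It backs up the config first and swaps any top-level string value that equals "sigmoid" for "softmax", so it doesn't need to know which key the parser is complaining about. The helper name is made up for illustration, and the path in the usage comment is just the placeholder from the command above.

```python
import json
import shutil
from pathlib import Path

def swap_config_value(config_path, old="sigmoid", new="softmax"):
    """Replace top-level string values equal to `old` with `new`.

    Hypothetical helper: writes a `.bak` copy next to the file before
    changing anything, and returns the list of keys that were changed.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    changed = [key for key, value in config.items() if value == old]
    if changed:
        # Keep the original so the experiment is easy to undo.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
        for key in changed:
            config[key] = new
        path.write_text(json.dumps(config, indent=2) + "\n")
    return changed

# Usage (placeholder path, adjust to wherever the model lives):
# print(swap_config_value("/path/to/DeepSeek-R1/config.json"))
```

Bear in mind this only gets past the parse error. The model was presumably trained with the original setting, so even if it loads, the output may well be degraded.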
