Sweaterdog
committed on
Update README.md
README.md
CHANGED
@@ -47,10 +47,10 @@ The model *may* experience bugs, such as not saying your name, getting previous
 
 # What models can I choose?
 
-There are going to be 3 model sizes avaliable, Regular, Mini
+There are going to be 2 *(maybe 3)* model sizes avaliable, Regular, Mini *(And Maybe large)*
 * Regular is a 7B parameter model, tuned from [Deepseek-R1 Distilled](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)
 * Mini is a 1.5B parameter model, also tuned from [Deepseek-R1 Distilled](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)
-*
+* Large *(Might)* be a 32b parameter model, again tuned from [Deepseek-R1 Distilled](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) *- This model may not exist,* ***ever***
 
 Out of all of the models, Teensy had the largest percent of parameters tuned, being 1/2 the models total size
 
@@ -70,11 +70,8 @@ A. No, if you are making a post about MindCraft, and using this model, you only
 
 ## Important notes and considerations
 
-The preview model of Andy-3.5
+The preview model of Andy-3.5, is Andy-3.5-teensy, a small model and tune with only 360 million parameters, it ***"understand Minecraft"***.
+I would not recommend Andy-3.5-teensy, I felt like making a joke, and a joke was made, *(The Andy-3.5-teensy model was a big hope, but it sucks, try out the q2_k model!)*
 
-The Base model of Andy-3.5-mini-preview was a distilled version of Deepseek-R1, which was a tuned model of Qwen-2.5-1.5b
 
-
-
-
-When the full versions of Andy-3.5 and Andy-3.5-preview release, they will both be trained on a context length of 128,000 to ensure proper usage during playing.
+When the full versions of Andy-3.5 and Andy-3.5-mini *(And possibly Andy-3.5-large)* release, they will both be trained on a context length of 32,000 to ensure proper usage during playing.