This model was converted to GGUF format from [`EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2`](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2).
Refer to the [original model card](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2) for more details on the model.

---
Model details:

An RP/storywriting specialist model: a full-parameter finetune of Qwen2.5-14B on a mixture of synthetic and natural data.

It uses the Celeste 70B 0.1 data mixture, greatly expanded to improve the versatility, creativity, and "flavor" of the resulting model.

Version notes for 0.2: now using the refined dataset from 32B 0.2, which brings major improvements in coherence, instruction following, and long-context comprehension over 14B v0.1.

Prompt format is ChatML.
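As a rough sketch (the messages here are placeholders, and llama.cpp normally applies this template automatically from the GGUF metadata, so manual formatting is only needed for raw completion-style calls), a single-turn ChatML prompt looks like this:

```text
<|im_start|>system
You are a creative storytelling assistant.<|im_end|>
<|im_start|>user
Write the opening scene of a heist story.<|im_end|>
<|im_start|>assistant
```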
Recommended sampler values:

- Temperature: 0.8
- Min-P: 0.05
- Top-A: 0.3
- Repetition Penalty: 1.03
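
A hedged sketch of applying these settings with llama.cpp's `llama-cli` (the GGUF filename below is a placeholder for whichever quant you downloaded; Top-A is not exposed by mainline llama.cpp, so it is omitted here and can be set in frontends such as SillyTavern instead):

```bash
# Placeholder filename; substitute the quant you actually downloaded.
llama-cli -m ./eva-qwen2.5-14b-v0.2-q4_k_m.gguf \
  --temp 0.8 \
  --min-p 0.05 \
  --repeat-penalty 1.03 \
  -cnv -p "You are a creative storytelling assistant."
```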
Recommended SillyTavern presets (via CalamitousFelicitousness):

- Context
- Instruct and System Prompt

Training data:

- Celeste 70B 0.1 data mixture minus the Opus Instruct subset. See that model's card for details.
- Kalomaze's Opus_Instruct_25k dataset, filtered for refusals.
- A subset (1k rows) of ChatGPT-4o-WritingPrompts by Gryphe.
- A subset (2k rows) of Sonnet3.5-Charcards-Roleplay by Gryphe.
- Synthstruct and SynthRP datasets by Epiculous.
- A subset from Dolphin-2.9.3, including a filtered version of not_samantha and a small subset of systemchat.

Training time and hardware:

3 hours on 8xH100 SXM, provided by FeatherlessAI.

The model was created by Kearm, Auri and Cahvay.

Special thanks:

- to Cahvay for his work investigating and reprocessing the corrupted dataset, removing the single biggest source of data poisoning;
- to FeatherlessAI for generously providing an 8xH100 SXM node for training this model;
- to Gryphe, Lemmy, Kalomaze, Nopm, Epiculous and CognitiveComputations for the data;
- and to Allura-org for support, feedback, beta-testing and quality control of EVA models.

---
## Use with llama.cpp