---
language:
- en
- fr
- de
- es
- it
- pt
- zh
- ja
- ru
- ko
license: other
license_name: mrl
license_link: https://mistral.ai/licenses/MRL-0.1.md
base_model:
- mistralai/Mistral-Large-Instruct-2411
---
# Writer-Large-2411-v2.1

EXL2 quant available here: [gghfez/Writer-Large-2411-v2.1-exl2-4.5bpw](https://huggingface.co/gghfez/Writer-Large-2411-v2.1-exl2-4.5bpw)

Creative-writing control vectors available here: [gghfez/Writer-Large-2411-v2.1-control-vectors](https://huggingface.co/gghfez/Writer-Large-2411-v2.1-control-vectors)

## Overview

This model is built on Mistral-Large-Instruct-2411 and optimized for creative writing. The base model excels at following instructions and keeping track of details over long contexts when used with the [new prompt template](https://huggingface.co/gghfez/Mistral-Large-Instruct-2411/blob/main/tokenizer_config.json#L6177).

### Key Improvements
- Reduced positivity bias
- Reduced AI tropes and repetitive language patterns in story generation
- Enhanced performance with longer context stories (multiple chapters) and roleplay sessions
- Improved steering capabilities for roleplay via [OOC] instructions
- Better handling of "group chat" scenarios

<img src="https://files.catbox.moe/hisiua.png" width="400"/>

## Usage

### Prompt Template
**The model works best with a system prompt in the Mistral-V7 format.**
If you omit `[SYSTEM_PROMPT] [/SYSTEM_PROMPT]`, the model:
- May not follow instructions properly at short contexts
- Can become repetitive at longer contexts

Example:
```
[SYSTEM_PROMPT]You are an award winning writer. Assist the user.[/SYSTEM_PROMPT][INST] Write the opening chapter of ... [/INST]
```
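
If you build prompts programmatically, the chat template bundled with the tokenizer renders this same format for you. A minimal sketch, assuming the `transformers` library and that the main model repo (the repo id below is an assumption based on the links above) ships the 2411 chat template:

```python
from transformers import AutoTokenizer

# Load the tokenizer; its chat_template encodes the Mistral-V7
# [SYSTEM_PROMPT]...[/SYSTEM_PROMPT][INST]...[/INST] format.
tokenizer = AutoTokenizer.from_pretrained("gghfez/Writer-Large-2411-v2.1")

messages = [
    {"role": "system", "content": "You are an award winning writer. Assist the user."},
    {"role": "user", "content": "Write the opening chapter of ..."},
]

# tokenize=False returns the rendered prompt string, so you can verify the
# [SYSTEM_PROMPT] wrapper is present before sending it to your backend.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```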

### SillyTavern Integration

#### With System Prompt:

Story String:
```
[SYSTEM_PROMPT] {{#if system}}{{system}}[/SYSTEM_PROMPT] [INST]
{{/if}}{{#if wiBefore}}{{wiBefore}}
{{/if}}{{#if description}}{{description}}
{{/if}}{{#if personality}}{{personality}}
{{/if}}{{#if scenario}}{{scenario}}
{{/if}}{{#if wiAfter}}{{wiAfter}}
{{/if}}{{#if persona}}{{persona}}
{{/if}}{{trim}}[/INST] Understood.</s>
```

#### Without System Prompt:

Story String:
```
[INST]{{#if system}}{{system}}
{{/if}}{{#if wiBefore}}{{wiBefore}}
{{/if}}{{#if description}}{{description}}
{{/if}}{{#if personality}}{{personality}}
{{/if}}{{#if scenario}}{{scenario}}
{{/if}}{{#if wiAfter}}{{wiAfter}}
{{/if}}{{#if persona}}{{persona}}
{{/if}}{{trim}}[/INST] Understood.</s>
```

For response steering, use `[OOC]` commands, e.g. (an API sketch follows these examples):
- `[OOC] Have them interrupted by a loud explosion in a nearby factory`
- `[OOC] Have her refuse to sell it and suggest another merchant instead`
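
Outside SillyTavern, an `[OOC]` steer is simply the most recent user turn in the conversation. A minimal sketch, assuming tabbyAPI's OpenAI-compatible endpoint; the base URL, API key, served model name, and conversation content below are placeholders for your own setup:

```python
from openai import OpenAI

# tabbyAPI serves an OpenAI-compatible API; URL and key are placeholders.
client = OpenAI(base_url="http://127.0.0.1:5000/v1", api_key="example-key")

response = client.chat.completions.create(
    model="Writer-Large-2411-v2.1",  # whatever name your backend serves
    messages=[
        {"role": "system", "content": "You are an award winning writer. Assist the user."},
        {"role": "user", "content": "The merchant turns the amulet over in her hands."},
        {"role": "assistant", "content": '"A curious piece," she murmurs, reaching for her scales.'},
        # The [OOC] instruction steers the next response without rewriting history.
        {"role": "user", "content": "[OOC] Have her refuse to sell it and suggest another merchant instead"},
    ],
)
print(response.choices[0].message.content)
```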

## Technical Details

### Training
- QLoRA training at 32768 context
- Merged with [gghfez/Mistral-Large-Instruct-2411](https://huggingface.co/gghfez/Mistral-Large-Instruct-2411) at bf16 (see the merge sketch after this list)
- [jukofyork/Creative writing control vectors](https://huggingface.co/jukofyork/creative-writing-control-vectors-v3.0) were applied during synthetic dataset generation
- Includes standard assistant instruct data for long-context stability
- Note: Performance on code tasks may be reduced compared to the base model
- Note: No attempt was made to remove 'Name-Slop', so you'll still encounter Lily and Elara if you don't specify character names
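
For reference, the adapter-merge step can be reproduced with `peft`'s `merge_and_unload`. A minimal sketch, assuming a trained QLoRA adapter on disk; the adapter and output paths are hypothetical, and this is not the exact script used for this release:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model in bf16, attach the QLoRA adapter, then fold the
# adapter weights into the base so the result is a plain bf16 checkpoint.
base = AutoModelForCausalLM.from_pretrained(
    "gghfez/Mistral-Large-Instruct-2411",
    torch_dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(base, "path/to/qlora-adapter")  # hypothetical path
merged = model.merge_and_unload()
merged.save_pretrained("Writer-Large-2411-v2.1-merged")  # hypothetical output dir
```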

### Context Length
- Base model: 131,072 tokens
- Training sample lengths: 1024-32728 tokens
- Training context window: 32768 tokens

## Testing Environments
Tested with exllamav2 4.5bpw on:
- [tabbyAPI](https://github.com/theroyallab/tabbyAPI) + [MikuPad](https://github.com/lmg-anon/mikupad)
- [tabbyAPI](https://github.com/theroyallab/tabbyAPI) + [SillyTavern](https://github.com/SillyTavern/SillyTavern)
- [exui](https://github.com/turboderp/exui)