TheBloke committed
Commit 571bac0
1 Parent(s): 1db4754

Update for Transformers GPTQ support
README.md CHANGED

```diff
@@ -9,17 +9,20 @@ license: bigcode-openrail-m
 ---
 
 <!-- header start -->
-<div style="width: 100%;">
-<img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
+<!-- 200823 -->
+<div style="width: auto; margin-left: auto; margin-right: auto">
+<img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
 </div>
 <div style="display: flex; justify-content: space-between; width: 100%;">
 <div style="display: flex; flex-direction: column; align-items: flex-start;">
-<p><a href="https://discord.gg/Jq4vkcDakD">Chat & support: my new Discord server</a></p>
+<p style="margin-top: 0.5em; margin-bottom: 0em;"><a href="https://discord.gg/theblokeai">Chat & support: TheBloke's Discord server</a></p>
 </div>
 <div style="display: flex; flex-direction: column; align-items: flex-end;">
-<p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
+<p style="margin-top: 0.5em; margin-bottom: 0em;"><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
 </div>
 </div>
+<div style="text-align:center; margin-top: 0em; margin-bottom: 0em"><p style="margin-top: 0.25em; margin-bottom: 0em;">TheBloke's LLM work is generously supported by a grant from <a href="https://a16z.com">andreessen horowitz (a16z)</a></p></div>
+<hr style="margin-top: 1.0em; margin-bottom: 1.0em;">
 <!-- header end -->
 
 # HuggingFaceH4's Starchat Beta GPTQ
@@ -120,11 +123,12 @@ It was created without group_size to lower VRAM requirements, and with --act-ord
 * Parameters: Groupsize = -1. Act Order / desc_act = True.
 
 <!-- footer start -->
+<!-- 200823 -->
 ## Discord
 
 For further support, and discussions on these models and AI in general, join us at:
 
-[TheBloke AI's Discord server](https://discord.gg/Jq4vkcDakD)
+[TheBloke AI's Discord server](https://discord.gg/theblokeai)
 
 ## Thanks, and how to contribute.
 
@@ -139,12 +143,15 @@ Donaters will get priority support on any and all AI/LLM/model questions and req
 * Patreon: https://patreon.com/TheBlokeAI
 * Ko-Fi: https://ko-fi.com/TheBlokeAI
 
-**Special thanks to**: Luke from CarbonQuill, Aemon Algiz, Dmitriy Samsonov.
-
-**Patreon special mentions**: Ajan Kanaga, Kalila, Derek Yates, Sean Connelly, Luke, Nathan LeClaire, Trenton Dambrowitz, Mano Prime, David Flickinger, vamX, Nikolai Manek, senxiiz, Khalefa Al-Ahmad, Illia Dulskyi, trip7s trip, Jonathan Leane, Talal Aujan, Artur Olbinski, Cory Kujawski, Joseph William Delisle, Pyrater, Oscar Rangel, Lone Striker, Luke Pendergrass, Eugene Pentland, Johann-Peter Hartmann.
+**Special thanks to**: Aemon Algiz.
+
+**Patreon special mentions**: Sam, theTransient, Jonathan Leane, Steven Wood, webtim, Johann-Peter Hartmann, Geoffrey Montalvo, Gabriel Tamborski, Willem Michiel, John Villwock, Derek Yates, Mesiah Bishop, Eugene Pentland, Pieter, Chadd, Stephen Murray, Daniel P. Andersen, terasurfer, Brandon Frisco, Thomas Belote, Sid, Nathan LeClaire, Magnesian, Alps Aficionado, Stanislav Ovsiannikov, Alex, Joseph William Delisle, Nikolai Manek, Michael Davis, Junyu Yang, K, J, Spencer Kim, Stefan Sabev, Olusegun Samson, transmissions 11, Michael Levine, Cory Kujawski, Rainer Wilmers, zynix, Kalila, Luke @flexchar, Ajan Kanaga, Mandus, vamX, Ai Maven, Mano Prime, Matthew Berman, subjectnull, Vitor Caleffi, Clay Pascal, biorpg, alfie_i, 阿明, Jeffrey Morgan, ya boyyy, Raymond Fosdick, knownsqashed, Olakabola, Leonard Tan, ReadyPlayerEmma, Enrico Ros, Dave, Talal Aujan, Illia Dulskyi, Sean Connelly, senxiiz, Artur Olbinski, Elle, Raven Klaugh, Fen Risland, Deep Realms, Imad Khwaja, Fred von Graf, Will Dee, usrbinkat, SuperWojo, Alexandros Triantafyllidis, Swaroop Kallakuri, Dan Guido, John Detwiler, Pedro Madruga, Iucharbius, Viktor Bowallius, Asp the Wyvern, Edmond Seymore, Trenton Dambrowitz, Space Cruiser, Spiking Neurons AB, Pyrater, LangChain4j, Tony Hughes, Kacper Wikieł, Rishabh Srivastava, David Ziegler, Luke Pendergrass, Andrey, Gabriel Puliatti, Lone Striker, Sebastain Graf, Pierre Kircher, Randy H, NimbleBox.ai, Vadim, danny, Deo Leter
+
 
 Thank you to all my generous patrons and donaters!
 
+And thank you again to a16z for their generous grant.
+
 <!-- footer end -->
 
 # Original model card: HuggingFaceH4's Starchat Beta
@@ -176,7 +183,7 @@ StarChat is a series of language models that are trained to act as helpful codin
 
 ## Intended uses & limitations
 
-The model was fine-tuned on a variant of the [`OpenAssistant/oasst1`](https://huggingface.co/datasets/OpenAssistant/oasst1) dataset, which contains a diverse range of dialogues in over 35 languages. As a result, the model can be used for chat and you can check out our [demo](https://huggingface.co/spaces/HuggingFaceH4/starchat-playground) to test its coding capabilities.
+The model was fine-tuned on a variant of the [`OpenAssistant/oasst1`](https://huggingface.co/datasets/OpenAssistant/oasst1) dataset, which contains a diverse range of dialogues in over 35 languages. As a result, the model can be used for chat and you can check out our [demo](https://huggingface.co/spaces/HuggingFaceH4/starchat-playground) to test its coding capabilities.
 
 Here's how you can run the model using the `pipeline()` function from 🤗 Transformers:
 
@@ -197,16 +204,16 @@ outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.2, top_
 
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 
-StarChat Alpha has not been aligned to human preferences with techniques like RLHF or deployed with in-the-loop filtering of responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so).
+StarChat Alpha has not been aligned to human preferences with techniques like RLHF or deployed with in-the-loop filtering of responses like ChatGPT, so the model can produce problematic outputs (especially when prompted to do so).
 Models trained primarily on code data will also have a more skewed demographic bias commensurate with the demographics of the GitHub community, for more on this see the [StarCoder dataset](https://huggingface.co/datasets/bigcode/starcoderdata) which is derived from The Stack.
 
 
-Since the base model was pretrained on a large corpus of code, it may produce code snippets that are syntactically valid but semantically incorrect.
-For example, it may produce code that does not compile or that produces incorrect results.
-It may also produce code that is vulnerable to security exploits.
+Since the base model was pretrained on a large corpus of code, it may produce code snippets that are syntactically valid but semantically incorrect.
+For example, it may produce code that does not compile or that produces incorrect results.
+It may also produce code that is vulnerable to security exploits.
 We have observed the model also has a tendency to produce false URLs which should be carefully inspected before clicking.
 
-StarChat Alpha was fine-tuned from the base model [StarCoder Base](https://huggingface.co/bigcode/starcoderbase), please refer to its model card's [Limitations Section](https://huggingface.co/bigcode/starcoderbase#limitations) for relevant information.
+StarChat Alpha was fine-tuned from the base model [StarCoder Base](https://huggingface.co/bigcode/starcoderbase), please refer to its model card's [Limitations Section](https://huggingface.co/bigcode/starcoderbase#limitations) for relevant information.
 In particular, the model was evaluated on some categories of gender biases, propensity for toxicity, and risk of suggesting code completions with known security flaws; these evaluations are reported in its [technical report](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view).
 
 ## Training and evaluation data
```
config.json CHANGED

```diff
@@ -1,39 +1,50 @@
 {
-  "_name_or_path": "data/starcoderplus-ift-v4.1",
-  "activation_function": "gelu",
-  "architectures": [
-    "GPTBigCodeForCausalLM"
-  ],
-  "attention_softmax_in_fp32": true,
-  "attn_pdrop": 0.1,
-  "bos_token_id": 0,
-  "embd_pdrop": 0.1,
-  "eos_token_id": 0,
-  "inference_runner": 0,
-  "initializer_range": 0.02,
-  "layer_norm_epsilon": 1e-05,
-  "max_batch_size": null,
-  "max_sequence_length": null,
-  "model_type": "gpt_bigcode",
-  "multi_query": true,
-  "n_embd": 6144,
-  "n_head": 48,
-  "n_inner": 24576,
-  "n_layer": 40,
-  "n_positions": 8192,
-  "pad_key_length": true,
-  "pre_allocate_kv_cache": false,
-  "resid_pdrop": 0.1,
-  "scale_attention_softmax_in_fp32": true,
-  "scale_attn_weights": true,
-  "summary_activation": null,
-  "summary_first_dropout": 0.1,
-  "summary_proj_to_labels": true,
-  "summary_type": "cls_index",
-  "summary_use_proj": true,
-  "torch_dtype": "bfloat16",
-  "transformers_version": "4.28.1",
-  "use_cache": true,
-  "validate_runner_input": true,
-  "vocab_size": 49156
+  "_name_or_path": "data/starcoderplus-ift-v4.1",
+  "activation_function": "gelu",
+  "architectures": [
+    "GPTBigCodeForCausalLM"
+  ],
+  "attention_softmax_in_fp32": true,
+  "attn_pdrop": 0.1,
+  "bos_token_id": 0,
+  "embd_pdrop": 0.1,
+  "eos_token_id": 0,
+  "inference_runner": 0,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "max_batch_size": null,
+  "max_sequence_length": null,
+  "model_type": "gpt_bigcode",
+  "multi_query": true,
+  "n_embd": 6144,
+  "n_head": 48,
+  "n_inner": 24576,
+  "n_layer": 40,
+  "n_positions": 8192,
+  "pad_key_length": true,
+  "pre_allocate_kv_cache": false,
+  "resid_pdrop": 0.1,
+  "scale_attention_softmax_in_fp32": true,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "torch_dtype": "bfloat16",
+  "transformers_version": "4.28.1",
+  "use_cache": true,
+  "validate_runner_input": true,
+  "vocab_size": 49156,
+  "quantization_config": {
+    "bits": 4,
+    "group_size": -1,
+    "damp_percent": 0.01,
+    "desc_act": true,
+    "sym": true,
+    "true_sequential": true,
+    "model_name_or_path": null,
+    "model_file_base_name": "model",
+    "quant_method": "gptq"
+  }
 }
```
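
The new `quantization_config` block is the substantive change in this commit: once it is embedded in config.json, recent versions of 🤗 Transformers (roughly 4.32+, with `optimum` and `auto-gptq` installed — the exact version requirements are an assumption, not stated in this commit) can load the GPTQ weights directly through `AutoModelForCausalLM.from_pretrained`, with no separate GPTQ loader. A minimal sketch of reading the block the way a loader would:

```python
import json

# The quantization_config block added to config.json in this commit.
config = json.loads("""
{
  "bits": 4,
  "group_size": -1,
  "damp_percent": 0.01,
  "desc_act": true,
  "sym": true,
  "true_sequential": true,
  "model_name_or_path": null,
  "model_file_base_name": "model",
  "quant_method": "gptq"
}
""")

# group_size -1 means a single quantization group per weight column
# (no grouping), which is what keeps VRAM requirements lower, and
# desc_act (act-order) trades speed for quantization accuracy.
assert config["quant_method"] == "gptq"
assert config["bits"] == 4 and config["group_size"] == -1
print(f"{config['bits']}-bit GPTQ, desc_act={config['desc_act']}")
```

With this block in place, `from_pretrained` on the repo should pick up the quantization settings automatically instead of requiring a hand-written `quantize_config`.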
gptq_model-4bit--1g.safetensors → model.safetensors RENAMED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:26ff39215164bf738aedd87975593dcefb572956520f3083b6ecf1e7fc3290a2
-size 8906687824
+oid sha256:2989a79643594d19ff6680912c8e1787e1c61456ceadee1098780b758a8e70e2
+size 8906687880
```
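
What actually changed inside the renamed file is just its Git LFS pointer: large `.safetensors` entries in the repo are three-line `key value` pointer files, and the diff swaps the `oid` (sha256 of the weights blob) and `size`. A small sketch for pulling those fields out so a downloaded `model.safetensors` can be verified against them (`parse_lfs_pointer` is a hypothetical helper, not part of any existing tool):

```python
def parse_lfs_pointer(text: str) -> dict:
    # Each line of a Git LFS pointer is "key value"; split on the
    # first space and collect the pairs into a dict.
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The new pointer contents from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:2989a79643594d19ff6680912c8e1787e1c61456ceadee1098780b758a8e70e2
size 8906687880"""

info = parse_lfs_pointer(pointer)
expected_sha256 = info["oid"].removeprefix("sha256:")
expected_size = int(info["size"])
print(expected_sha256[:8], expected_size)
```

Comparing a local file's byte count and sha256 digest against `expected_size` and `expected_sha256` confirms the ~8.9 GB download matches this commit.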
quantize_config.json CHANGED

```diff
@@ -6,5 +6,5 @@
   "sym": true,
   "true_sequential": true,
   "model_name_or_path": null,
-  "model_file_base_name": null
+  "model_file_base_name": "model"
 }
```
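
Setting `model_file_base_name` from `null` to `"model"` is what makes the file rename above consistent: GPTQ loaders such as AutoGPTQ derive the weights filename from this base name, and when it is null they fall back to an autogenerated name of the form seen in the old filename. A sketch of that naming rule (simplified; the fallback format is inferred from `gptq_model-4bit--1g.safetensors` in this commit, and the config dict is reconstructed from the two JSON files in the diff):

```python
# quantize_config.json after this commit, reconstructed excerpt.
quantize_config = {
    "bits": 4,
    "group_size": -1,
    "sym": True,
    "true_sequential": True,
    "model_name_or_path": None,
    "model_file_base_name": "model",
}

def weights_filename(cfg: dict, use_safetensors: bool = True) -> str:
    # Use the configured base name when present; otherwise fall back
    # to the old autogenerated gptq_model-<bits>bit-<group_size>g form.
    base = cfg["model_file_base_name"] or (
        f"gptq_model-{cfg['bits']}bit-{cfg['group_size']}g"
    )
    return base + (".safetensors" if use_safetensors else ".bin")

print(weights_filename(quantize_config))  # model.safetensors
```

With the old null setting the same function yields `gptq_model-4bit--1g.safetensors`, which is exactly why the commit renames the file and pins the base name together.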