smpanaro/gpt2-AutoGPTQ-4bit-128g



smpanaro committed on Feb 28

Commit 87a85b6 • 1 Parent(s): e059b48

Create README.md

Files changed (1)

README.md +21 -0

README.md ADDED

	@@ -0,0 +1,21 @@

+---
+license: mit
+datasets:
+- wikitext
+---
+[gpt2](https://huggingface.co/openai-community/gpt2) quantized to 4-bit using [AutoGPTQ](https://github.com/AutoGPTQ/AutoGPTQ).
+To use:
+```shell
+pip install auto-gptq
+```
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
+model_name = "smpanaro/gpt2-AutoGPTQ-4bit-128g"
+model = AutoGPTQForCausalLM.from_quantized(model_name)
+```
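
The README snippet stops once the model is loaded. As a minimal sketch of the next step, the helper below tokenizes a prompt and generates a continuation. It is not part of the commit and rests on several assumptions: `generate_text` is a hypothetical helper name, the tokenizer is taken from the base `openai-community/gpt2` repository linked in the README (the quantized repository may or may not ship its own tokenizer files), and a CUDA device is available as `cuda:0`.

```python
MODEL_NAME = "smpanaro/gpt2-AutoGPTQ-4bit-128g"  # quantized weights, from the README
TOKENIZER_NAME = "openai-community/gpt2"         # assumption: reuse the base gpt2 tokenizer


def generate_text(prompt, max_new_tokens=20, device="cuda:0"):
    """Hypothetical helper: load the quantized model and continue `prompt`."""
    # Imported lazily so that defining this helper does not require the
    # quantization packages to be installed yet.
    from auto_gptq import AutoGPTQForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_NAME)
    # Assumption: a CUDA device is present; pass a different `device` otherwise.
    model = AutoGPTQForCausalLM.from_quantized(MODEL_NAME, device=device)

    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) generated sequence back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `print(generate_text("The quick brown fox"))` should print the prompt followed by the generated continuation. The model and tokenizer are reloaded on every call to keep the sketch short; for repeated use, load them once and reuse them.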