lrl-modelcloud
committed on
Commit
•
dd045e7
1
Parent(s):
64e6765
Update README.md
Browse files
README.md
CHANGED
@@ -21,7 +21,7 @@ You can use [GPTQModel](https://github.com/ModelCloud/GPTQModel) for model infer
|
|
21 |
import torch
|
22 |
from transformers import AutoTokenizer, GenerationConfig
|
23 |
from gptqmodel import GPTQModel
|
24 |
-
model_name = "/
|
25 |
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
|
26 |
# `max_memory` should be set based on your devices
|
27 |
max_memory = {i: "75GB" for i in range(2)}
|
|
|
21 |
import torch
|
22 |
from transformers import AutoTokenizer, GenerationConfig
|
23 |
from gptqmodel import GPTQModel
|
24 |
+
model_name = "ModelCloud/DeepSeek-V2-Chat-0628-gptq-4bit"
|
25 |
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
|
26 |
# `max_memory` should be set based on your devices
|
27 |
max_memory = {i: "75GB" for i in range(2)}
|