---
license: mit
language:
- en
- zh
base_model:
- deepseek-ai/DeepSeek-R1
pipeline_tag: text-generation
library_name: transformers
---
# DeepSeek R1 AWQ
AWQ quantization of the DeepSeek R1 model.
Some of the model code was modified for this quant to fix an overflow issue when using float16.
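
For context on the float16 overflow mentioned above, here is a minimal sketch of why it can happen: the largest finite float16 value is 65504, so larger intermediate values cannot be represented. The helper names `fits_in_float16` and `clamp_to_float16` are hypothetical, and clamping is only one generic mitigation. This README does not say which code was changed, so this is not a description of the actual fix.

```python
import math
import struct

FLOAT16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def fits_in_float16(x: float) -> bool:
    """Return True if x is finite and can be packed as a finite float16."""
    if not math.isfinite(x):
        return False
    try:
        # "e" is the struct format code for IEEE 754 half precision;
        # packing raises OverflowError when the value is out of range.
        struct.pack("<e", x)
        return True
    except OverflowError:
        return False


def clamp_to_float16(x: float) -> float:
    """Clamp x into the finite float16 range (illustrative mitigation only)."""
    return max(-FLOAT16_MAX, min(FLOAT16_MAX, x))
```

For example, `fits_in_float16(70000.0)` is false, while the clamped value `clamp_to_float16(70000.0)` fits again.
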
Tested on vLLM with 8x H100 GPUs. Inference speed was 5 tokens/s with batch size 1 and short prompts.
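
A setup like the one described above might be loaded through vLLM's Python API roughly as follows. This is a sketch under assumptions, not the exact configuration used for the test: `MODEL_PATH` is a placeholder, the helper names `build_engine_args` and `generate_once` are hypothetical, and the choice of `trust_remote_code=True`, the sampling temperature, and the token limit are guesses to adjust for your environment.

```python
from typing import Any, Dict

# Placeholder (assumption): replace with the local path or Hub repo id of this quant.
MODEL_PATH = "path/to/DeepSeek-R1-AWQ"


def build_engine_args(model_path: str = MODEL_PATH, num_gpus: int = 8) -> Dict[str, Any]:
    """Keyword arguments for vllm.LLM; the values are assumptions, not verified settings."""
    return {
        "model": model_path,
        "quantization": "awq",             # load the AWQ-quantized weights
        "dtype": "float16",                # the README mentions running in float16
        "tensor_parallel_size": num_gpus,  # shard across 8 GPUs, as in the 8x H100 test
        "trust_remote_code": True,         # assumption: may be needed for custom model code
    }


def generate_once(prompt: str, max_tokens: int = 128) -> str:
    """Load the model with vLLM and generate one completion (batch size 1)."""
    # Imported lazily because it requires vLLM and suitable GPUs to be available.
    from vllm import LLM, SamplingParams

    llm = LLM(**build_engine_args())
    params = SamplingParams(temperature=0.6, max_tokens=max_tokens)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text
```

Calling `generate_once("Hello")` on a machine with vLLM installed and enough GPU memory would load the model and return one completion; nothing is loaded until that function is called.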