DeepSeek-R1-Distill-Llama-8B-q4f16_ft-MLC

Model Configuration
| Parameter | Value |
|---|---|
| Source Model | deepseek-ai/DeepSeek-R1-Distill-Llama-8B |
| Inference API | MLC_LLM |
| Quantization | q4f16_ft |
| Model Type | llama |
| Vocab Size | 128256 |
| Context Window Size | 131072 |
| Prefill Chunk Size | 8192 |
| Temperature | 0.6 |
| Repetition Penalty | 1.0 |
| top_p | 0.95 |
| pad_token_id | 0 |
| bos_token_id | 128000 |
| eos_token_id | 128001 |
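
To illustrate what the Temperature and top_p defaults above mean in practice, here is a minimal sketch of temperature scaling followed by nucleus (top-p) filtering in plain Python. It is a generic illustration, not MLC_LLM's actual sampler; the function names and the toy logits are hypothetical, and only the two numeric values come from the configuration.

```python
import math
import random

# Sampling values copied from the configuration above.
TEMPERATURE = 0.6
TOP_P = 0.95


def top_p_candidates(logits, temperature=TEMPERATURE, top_p=TOP_P):
    """Return (token_id, probability) pairs kept by nucleus filtering."""
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the most likely tokens until their cumulative mass reaches top_p.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, mass = [], 0.0
    for token_id, p in ranked:
        kept.append((token_id, p))
        mass += p
        if mass >= top_p:
            break
    # Renormalize so the kept probabilities sum to 1.
    return [(token_id, p / mass) for token_id, p in kept]


def sample_token(logits, rng=random):
    """Draw one token id from the filtered distribution."""
    candidates = top_p_candidates(logits)
    ids = [token_id for token_id, _ in candidates]
    weights = [p for _, p in candidates]
    return rng.choices(ids, weights=weights, k=1)[0]


# Hypothetical logits for a five-token toy vocabulary (not model output).
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
print(top_p_candidates(toy_logits))
print(sample_token(toy_logits, random.Random(0)))
```

With these toy logits, the two least likely tokens fall outside the 0.95 nucleus and are never sampled. A Repetition Penalty of 1.0 is conventionally a no-op, so the sketch does not model it.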

See jetson-ai-lab.com/models.html for benchmarks, examples, and containers to deploy local serving and inference for these quantized models.