neuralmagic/Meta-Llama-3-70B-Instruct-quantized.w4a16

Tags: Text Generation, text-generation-inference, Inference Endpoints, 4-bit precision


3 contributors

History: 6 commits

Latest commit: abhinavnmagic, "Upload tokenizer.json with huggingface_hub" (99e411f, verified, 4 months ago)

File                      Size       LFS  Last commit                                          Updated
.gitattributes            1.52 kB         initial commit                                       4 months ago
config.json               1.05 kB         Upload config.json with huggingface_hub              4 months ago
model.safetensors         39.8 GB    LFS  Upload model.safetensors with huggingface_hub        4 months ago
quantize_config.json      269 Bytes       Upload quantize_config.json with huggingface_hub     4 months ago
special_tokens_map.json   296 Bytes       Upload special_tokens_map.json with huggingface_hub  4 months ago
tokenizer.json            9.09 MB         Upload tokenizer.json with huggingface_hub           4 months ago
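
As a rough sanity check on the sizes listed above, the short Python sketch below totals the files and estimates the average number of bits stored per parameter in model.safetensors. It rests on two assumptions that the listing does not confirm: that the displayed sizes use decimal units (1 GB = 1e9 bytes), and that the model has roughly 70 billion parameters, a figure taken only from the "70B" in the repository name. The names FILE_SIZES_BYTES, ASSUMED_PARAM_COUNT, and bits_per_parameter are illustrative.

```python
# Sizes as displayed in the file listing.
# Assumption: decimal units (1 kB = 1e3, 1 MB = 1e6, 1 GB = 1e9 bytes).
FILE_SIZES_BYTES = {
    ".gitattributes": 1.52e3,
    "config.json": 1.05e3,
    "model.safetensors": 39.8e9,
    "quantize_config.json": 269,
    "special_tokens_map.json": 296,
    "tokenizer.json": 9.09e6,
}

# Assumption: about 70 billion parameters, inferred from "70B" in the model name.
ASSUMED_PARAM_COUNT = 70e9


def bits_per_parameter(weight_bytes: float, param_count: float) -> float:
    """Average number of stored bits per parameter for a weights file."""
    return weight_bytes * 8 / param_count


total_bytes = sum(FILE_SIZES_BYTES.values())
weights_bytes = FILE_SIZES_BYTES["model.safetensors"]
weights_share = weights_bytes / total_bytes
avg_bits = bits_per_parameter(weights_bytes, ASSUMED_PARAM_COUNT)

print(f"total size: {total_bytes / 1e9:.2f} GB")
print(f"share taken by model.safetensors: {weights_share:.4%}")
print(f"average bits per parameter: {avg_bits:.2f}")
```

Under these assumptions the estimate comes out at roughly 4.5 bits per parameter, which is in line with the "4-bit precision" tag. A value somewhat above 4 would be expected if the checkpoint also stores quantization metadata or keeps some tensors at higher precision, though the listing alone does not show how the file is laid out.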