neuralmagic/Meta-Llama-3.1-405B-Instruct-quantized.w8a16
Text Generation
Safetensors
8 languages
llama
int8
vllm
conversational
compressed-tensors
arxiv: 2210.17323
License: llama3.1
Commit History (main)
Updated compression_config to quantization_config (6e5b068, verified): mgoin committed on Oct 9
Upload tokenizer.json with huggingface_hub (150ad2b, verified): alexmarques committed on Sep 30
Update README.md (8302444, verified): alexmarques committed on Sep 30
Upload tokenizer_config.json with huggingface_hub (fcf51db, verified): alexmarques committed on Sep 27
Update README.md (e8aa908, verified): alexmarques committed on Aug 21
Update README.md (cc8251c, verified): alexmarques committed on Aug 19
Update README.md (70795f2, verified): alexmarques committed on Aug 19
Update README.md (88efeae, verified): alexmarques committed on Aug 19
Update README.md (517dbb1, verified): alexmarques committed on Aug 19
Update README.md (0316f72, verified): alexmarques committed on Aug 19
Create README.md (35c3e29, verified): alexmarques committed on Aug 19
Upload folder using huggingface_hub (ef8e2c3, verified): alexmarques committed on Aug 19
initial commit (391ddf8, verified): alexmarques committed on Aug 19