neuralmagic/Phi-3-medium-128k-instruct-quantized.w4a16

Tags: Text Generation, text-generation-inference, Inference Endpoints, compressed-tensors

3 contributors

History: 19 commits

Latest commit (mgoin): Updated compression_config to quantization_config (8670dfa, verified, 4 months ago)

| File | Size | Last commit | Updated |
|---|---|---|---|
| .gitattributes | 1.52 kB | initial commit | 7 months ago |
| README.md | 7.65 kB | Update README.md | 5 months ago |
| config.json | 4.24 kB | Updated compression_config to quantization_config | 4 months ago |
| configuration_phi3.py | 10.4 kB | Upload configuration_phi3.py with huggingface_hub | 7 months ago |
| generation_config.json | 172 Bytes | Upload generation_config.json with huggingface_hub | 7 months ago |
| model-00001-of-00002.safetensors | 4.93 GB (LFS) | Upload model-00001-of-00002.safetensors with huggingface_hub | 7 months ago |
| model-00002-of-00002.safetensors | 2.75 GB (LFS) | Upload model-00002-of-00002.safetensors with huggingface_hub | 7 months ago |
| model.safetensors.index.json | 49.8 kB | Upload model.safetensors.index.json with huggingface_hub | 7 months ago |
| modeling_phi3.py | 73.8 kB | Upload modeling_phi3.py with huggingface_hub | 7 months ago |
| recipe.yaml | 315 Bytes | Upload recipe.yaml with huggingface_hub | 7 months ago |
| special_tokens_map.json | 569 Bytes | Upload special_tokens_map.json with huggingface_hub | 7 months ago |
| tokenizer.json | 1.84 MB | Upload tokenizer.json with huggingface_hub | 7 months ago |
| tokenizer_config.json | 3.34 kB | Update tokenizer_config.json | 6 months ago |