Upload README.md with huggingface_hub
README.md
CHANGED
@@ -200,15 +200,14 @@ tags:
 </p>
 <!-- header end -->
 
-# Llama 3 8B -
+# Llama 3 8B - INT8
 
 - Model creator: [Meta Llama 3](https://huggingface.co/meta-llama)
 - Original model: [Llama 3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)
 
 ## Description
 
-This repo contains the Llama 3 8B model quantized to
-Note that FP8 is only supported by NVIDIA Ada, Hopper, and Blackwell GPU architectures.
+This repo contains the Llama 3 8B model quantized to INT8 by FriendliAI, significantly enhancing its inference efficiency while maintaining high accuracy.
 Check out [FriendliAI documentation](https://docs.friendli.ai/) for more details.
 
 ## License
@@ -269,7 +268,7 @@ docker run \
 -e FRIENDLI_CONTAINER_SECRET="YOUR CONTAINER SECRET" \
 registry.friendli.ai/trial \
 --web-server-port 8000 \
---hf-model-name FriendliAI/Meta-Llama-3-8B-
+--hf-model-name FriendliAI/Meta-Llama-3-8B-int8
 ```
 
 ---