Update README.md
README.md
CHANGED
@@ -61,15 +61,6 @@ TODO
 
 Moreover, due to its unique hybrid SSM architecture, Zamba2-7B-Instruct achieves extremely low inference latency and rapid generation with a significantly smaller memory footprint than comparable transformer-based models.
 
-<center>
-<img src="https://cdn-uploads.huggingface.co/production/uploads/65c05e75c084467acab2f84a/nHM8bX0y8SWa4zwMSbBi7.png" width="500" alt="Zamba architecture">
-</center>
-
-<center>
-<img src="https://cdn-uploads.huggingface.co/production/uploads/65c05e75c084467acab2f84a/qXG8aip6h77LHKjhWfjD5.png" width="500" alt="Zamba architecture">
-</center>
-
-
 
 Time to First Token (TTFT) | Output Generation
 :-------------------------:|:-------------------------:
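For context (not part of the diff itself), below is a minimal sketch of how one might measure the Time to First Token and output-generation speed referenced by the table headers above. It assumes the Hugging Face repo id `Zyphra/Zamba2-7B-Instruct`, a transformers build with Zamba2 support, and a CUDA GPU; the prompt and token counts are illustrative only.

```python
# Hypothetical timing sketch, not the benchmark used for the README plots.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zyphra/Zamba2-7B-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
).to("cuda")  # assumes a CUDA GPU is available

prompt = "Explain state-space models in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Time to First Token: wall-clock time to produce a single new token.
start = time.perf_counter()
model.generate(**inputs, max_new_tokens=1)
ttft = time.perf_counter() - start

# Output generation: tokens per second over a longer completion.
start = time.perf_counter()
out = model.generate(**inputs, max_new_tokens=256)
elapsed = time.perf_counter() - start
new_tokens = out.shape[1] - inputs["input_ids"].shape[1]

print(f"TTFT: {ttft:.3f}s, generation: {new_tokens / elapsed:.1f} tokens/s")
```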