Update README.md
README.md
CHANGED
@@ -143,6 +143,9 @@ For instruction training, we first trained the model with Supervised Fine-tuning
 </table>
 
 ## Interfacing with the Instruct Model
+Model weights were converted to be Hugging Face compatible, with custom modeling files included due to the lack of official support for Mamba2 attention layers.
+The attention layer implementation was incorporated from [#32027 PR](https://github.com/huggingface/transformers/pull/32027)
+
 > [!IMPORTANT]
 > To ensure optimal performance, please use the following template when interacting with the model:
 