|
This repository contains AWS Inferentia2- and neuronx-compatible checkpoints for [Mistral-Large-Instruct](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407). You can find detailed information about the base model on its [Model Card](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407).
|
|
|
This model has been exported to the neuron format using the `input_shapes` and compiler parameters detailed below.
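As an illustration, the settings listed on this card could be expressed as export arguments for `optimum-neuron`. This is a hedged sketch: the `from_pretrained` call and its keyword names follow the `optimum-neuron` export API as commonly documented, but the exact command used to produce these checkpoints is not stated on this card, and the call itself requires an Inferentia2 instance with `optimum[neuronx]` installed.

```python
# Export settings mirroring the compilation parameters listed on this card.
# The actual export call is commented out because it requires Inferentia2
# hardware and the optimum[neuronx] package; it is an assumed reproduction,
# not the command used by the authors.
export_kwargs = {
    "batch_size": 4,          # BATCH_SIZE
    "sequence_length": 4096,  # SEQUENCE_LENGTH
    "num_cores": 24,          # NUM_CORES
    "auto_cast_type": "bf16", # PRECISION
}

# from optimum.neuron import NeuronModelForCausalLM
# model = NeuronModelForCausalLM.from_pretrained(
#     "mistralai/Mistral-Large-Instruct-2407", export=True, **export_kwargs
# )
```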
|
|
|
It has been compiled to run on an inf2.48xlarge instance on AWS. Note that the inf2.48xlarge provides 24 Neuron cores, and this compilation uses all 24 of them.
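A rough back-of-envelope check suggests why an instance of this size is needed. The figures below are assumptions, not values from this card: Mistral-Large-Instruct-2407 is advertised as roughly 123B parameters, bf16 uses 2 bytes per parameter, and AWS lists 384 GB of total accelerator memory for inf2.48xlarge.

```python
# Back-of-envelope memory estimate; the parameter count and instance memory
# are assumptions stated in the lead-in, not values taken from this card.
PARAMS = 123e9            # ~123B parameters (Mistral Large 2407)
BYTES_PER_PARAM = 2       # bf16 = 16 bits = 2 bytes
ACCEL_MEMORY_GB = 384     # advertised accelerator memory of inf2.48xlarge
NUM_CORES = 24

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
per_core_gb = weights_gb / NUM_CORES  # weights are sharded across cores
print(f"weights alone: ~{weights_gb:.0f} GB of {ACCEL_MEMORY_GB} GB "
      f"(~{per_core_gb:.1f} GB per core)")
```

The weights alone take a large fraction of the accelerator memory before the KV cache and activations are counted, which is consistent with sharding the model across all 24 cores.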
|
---

The compilation parameters were:

```python
SEQUENCE_LENGTH = 4096
BATCH_SIZE = 4
NUM_CORES = 24
PRECISION = "bf16"
```

---
license: other
license_name: mrl
license_link: LICENSE
language:
- en
base_model:
- mistralai/Mistral-Large-Instruct-2407
pipeline_tag: text-generation
---