# MegaBeam-Mistral-7B-300k Model

MegaBeam-Mistral-7B-300k is a fine-tuned [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) language model that supports input contexts of up to 320k tokens. MegaBeam-Mistral-7B-300k can be deployed on a single AWS `g5.48xlarge` instance using serving frameworks such as [vLLM](https://github.com/vllm-project/vllm), a SageMaker [DJL](https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-models-frameworks-djl-serving.html) endpoint, and others. Similarities and differences between MegaBeam-Mistral-7B-300k and [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) are summarized below:

|Model|Max context length|rope_theta|prompt template|
|---|---|---|---|

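The prompt template column above inherits from the base model. As an illustrative sketch only (this helper is not part of the model card; the tokenizer's `apply_chat_template` is the authoritative implementation), Mistral-7B-Instruct's `[INST]`/`[/INST]` chat format can be rendered like this:

```python
def build_mistral_prompt(messages):
    """Render alternating user/assistant messages into the
    Mistral-7B-Instruct chat format (illustrative sketch only)."""
    prompt = "<s>"
    for msg in messages:
        if msg["role"] == "user":
            # User turns are wrapped in [INST] ... [/INST]
            prompt += f"[INST] {msg['content']} [/INST]"
        else:
            # Assistant turns are appended and closed with </s>
            prompt += f" {msg['content']}</s>"
    return prompt

print(build_mistral_prompt([{"role": "user", "content": "Hello"}]))
```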
### Deploy the model on a SageMaker Endpoint ###
To deploy MegaBeam-Mistral-7B-300k on a SageMaker endpoint, please follow this [SageMaker DJL deployment guide](https://docs.djl.ai/docs/demos/aws/sagemaker/large-model-inference/sample-llm/vllm_deploy_mistral_7b.html).

Run the following Python code in a SageMaker notebook, with each block running in a separate cell:

```python
import sagemaker
sagemaker_session = sagemaker.Session()
region = sagemaker_session.boto_region_name
role = sagemaker.get_execution_role()

%%writefile serving.properties
engine=Python
option.model_id=amazon/MegaBeam-Mistral-7B-300k
option.rolling_batch=vllm
option.tensor_parallel_degree=8
option.device_map=auto

%%sh
mkdir mymodel
mv serving.properties mymodel/
```
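From this point the linked DJL deployment guide takes over: the `mymodel/` directory is typically packaged and uploaded to S3 before the endpoint is created. A minimal sketch of the packaging step (the archive name is an assumption; follow the guide for the exact upload and deploy calls):

```shell
# Package the serving configuration into a tarball for S3 upload.
# Archive name is illustrative; the upload_data/deploy steps that
# follow are covered in the linked DJL deployment guide.
tar czvf mymodel.tar.gz mymodel/
```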