chenwuml committed
Commit ee03ba6
1 Parent(s): 58fc7b4

Update README.md

Files changed (1)
  1. README.md +2 -4
README.md CHANGED
@@ -5,7 +5,7 @@ inference: false
 
 # MegaBeam-Mistral-7B-300k Model
 
-MegaBeam-Mistral-7B-300k is a fine-tuned [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) language model that supports input contexts up to 320k tokens. MegaBeam-Mistral-7B-300k can be deployed on a single AWS `g5.48xlarge` instance using serving frameworks such as [vLLM](https://github.com/vllm-project/vllm), Sagemaker [Huggingface Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference) endpoint, and others. Similarities and differences beween MegaBeam-Mistral-7B-300k and [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) are summarized below:
+MegaBeam-Mistral-7B-300k is a fine-tuned [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) language model that supports input contexts up to 320k tokens. MegaBeam-Mistral-7B-300k can be deployed on a single AWS `g5.48xlarge` instance using serving frameworks such as [vLLM](https://github.com/vllm-project/vllm), a SageMaker [DJL](https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-models-frameworks-djl-serving.html) endpoint, and others. Similarities and differences between MegaBeam-Mistral-7B-300k and [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) are summarized below:
 
 
 |Model|Max context length| rope_theta| prompt template|
@@ -96,7 +96,7 @@ print(chat_completion)
 ### Deploy the model on a SageMaker Endpoint ###
 To deploy MegaBeam-Mistral-7B-300k on a SageMaker endpoint, please follow this [SageMaker DJL deployment guide](https://docs.djl.ai/docs/demos/aws/sagemaker/large-model-inference/sample-llm/vllm_deploy_mistral_7b.html).
 
-Run the following Python statements in a SageMaker notebook (with each block running in a separate cell)
+Run the following Python code in a SageMaker notebook (with each block running in a separate cell)
 
 ```python
 import sagemaker
@@ -106,7 +106,6 @@ sagemaker_session = sagemaker.Session()
 region = sagemaker_session.boto_region_name
 role = sagemaker.get_execution_role()
 
-# run the following statement in a notebook cell
 %%writefile serving.properties
 engine=Python
 option.model_id=amazon/MegaBeam-Mistral-7B-300k
@@ -116,7 +115,6 @@ option.rolling_batch=vllm
 option.tensor_parallel_degree=8
 option.device_map=auto
 
-# run the following statement in a notebook cell
 %%sh
 mkdir mymodel
 mv serving.properties mymodel/
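
The README's comparison table above lists a "prompt template" column. For context, the base model Mistral-7B-Instruct-v0.2 wraps each user turn in `[INST] ... [/INST]` markers, so a fine-tune of it is typically prompted the same way. The sketch below is illustrative only (the `build_prompt` helper is not part of this repo, and in practice the tokenizer's built-in chat template should be preferred):

```python
# Illustrative sketch (not from the repo): assemble a Mistral-Instruct-style
# prompt string by wrapping user turns in [INST] ... [/INST] markers.
def build_prompt(turns):
    """turns: list of (user_msg, assistant_msg_or_None) pairs.

    Completed assistant turns are closed with </s>; the final user turn is
    left open so the model generates the next assistant reply.
    """
    parts = ["<s>"]
    for user, assistant in turns:
        parts.append(f"[INST] {user} [/INST]")
        if assistant is not None:
            parts.append(f" {assistant}</s>")
    return "".join(parts)

prompt = build_prompt([("Summarize this document.", None)])
# prompt == "<s>[INST] Summarize this document. [/INST]"
```

With a serving framework such as vLLM, the resulting string would be sent as the raw prompt; the exact special-token handling can differ per framework, so treat this as a sketch rather than a guaranteed format.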