chenwuml committed
Commit ee03ba6
1 Parent(s): 58fc7b4

Update README.md

Files changed (1)
  1. README.md +2 -4
README.md CHANGED
@@ -5,7 +5,7 @@ inference: false
 
 # MegaBeam-Mistral-7B-300k Model
 
-MegaBeam-Mistral-7B-300k is a fine-tuned [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) language model that supports input contexts up to 320k tokens. MegaBeam-Mistral-7B-300k can be deployed on a single AWS `g5.48xlarge` instance using serving frameworks such as [vLLM](https://github.com/vllm-project/vllm), Sagemaker [Huggingface Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference) endpoint, and others. Similarities and differences beween MegaBeam-Mistral-7B-300k and [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) are summarized below:
+MegaBeam-Mistral-7B-300k is a fine-tuned [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) language model that supports input contexts up to 320k tokens. MegaBeam-Mistral-7B-300k can be deployed on a single AWS `g5.48xlarge` instance using serving frameworks such as [vLLM](https://github.com/vllm-project/vllm), a SageMaker [DJL](https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-models-frameworks-djl-serving.html) endpoint, and others. Similarities and differences between MegaBeam-Mistral-7B-300k and [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) are summarized below:
 
 
 |Model|Max context length| rope_theta| prompt template|
@@ -96,7 +96,7 @@ print(chat_completion)
 ### Deploy the model on a SageMaker Endpoint ###
 To deploy MegaBeam-Mistral-7B-300k on a SageMaker endpoint, please follow this [SageMaker DJL deployment guide](https://docs.djl.ai/docs/demos/aws/sagemaker/large-model-inference/sample-llm/vllm_deploy_mistral_7b.html).
 
-Run the following Python statements in a SageMaker notebook (with each block running in a separate cell)
+Run the following Python code in a SageMaker notebook (with each block running in a separate cell)
 
 ```python
 import sagemaker
@@ -106,7 +106,6 @@ sagemaker_session = sagemaker.Session()
 region = sagemaker_session.boto_region_name
 role = sagemaker.get_execution_role()
 
-# run the following statement in a notebook cell
 %%writefile serving.properties
 engine=Python
 option.model_id=amazon/MegaBeam-Mistral-7B-300k
@@ -116,7 +115,6 @@ option.rolling_batch=vllm
 option.tensor_parallel_degree=8
 option.device_map=auto
 
-# run the following statement in a notebook cell
 %%sh
 mkdir mymodel
 mv serving.properties mymodel/
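
The README's comparison table above lists a "prompt template" column. For context, the base model Mistral-7B-Instruct-v0.2 wraps each user turn in `[INST] ... [/INST]` markers, so a fine-tune of it is typically prompted the same way. The sketch below is illustrative only (the `build_prompt` helper is not part of this repo, and in practice the tokenizer's built-in chat template should be preferred):

```python
# Illustrative sketch (not from the repo): assemble a Mistral-Instruct-style
# prompt string by wrapping user turns in [INST] ... [/INST] markers.
def build_prompt(turns):
    """turns: list of (user_msg, assistant_msg_or_None) pairs.

    Completed assistant turns are closed with </s>; the final user turn is
    left open so the model generates the next assistant reply.
    """
    parts = ["<s>"]
    for user, assistant in turns:
        parts.append(f"[INST] {user} [/INST]")
        if assistant is not None:
            parts.append(f" {assistant}</s>")
    return "".join(parts)

prompt = build_prompt([("Summarize this document.", None)])
# prompt == "<s>[INST] Summarize this document. [/INST]"
```

With a serving framework such as vLLM, the resulting string would be sent as the raw prompt; the exact special-token handling can differ per framework, so treat this as a sketch rather than a guaranteed format.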