liltom-eth committed on
Commit 8ab4933 · 1 Parent(s): f03b7cc

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +36 -0
README.md CHANGED
@@ -8,6 +8,7 @@ inference: false
  # LLaVA Model Card

  ## Model details
+ This is a fork of the original [liuhaotian/llava-v1.5-7b](https://huggingface.co/liuhaotian/llava-v1.5-7b). This repo adds `code/inference.py` and `code/requirements.txt` to provide a custom inference script and environment for SageMaker deployment.

  **Model type:**
  LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
@@ -19,6 +20,41 @@ LLaVA-v1.5-7B was trained in September 2023.
  **Paper or resources for more information:**
  https://llava-vl.github.io/

+ ## How to Deploy on SageMaker
+
+ Following `deploy_llava.ipynb`, bundle the LLaVA model weights and code into a `model.tar.gz` and upload it to S3:
+
+ ```python
+ from sagemaker.s3 import S3Uploader
+
+ # upload model.tar.gz to S3 (`sess` is assumed to be an existing sagemaker.Session, e.g. from the notebook)
+ s3_model_uri = S3Uploader.upload(local_path="./model.tar.gz", desired_s3_uri=f"s3://{sess.default_bucket()}/llava-v1.5-7b")
+
+ print(f"model uploaded to: {s3_model_uri}")
+ ```
+ Then use `HuggingFaceModel` to deploy a real-time inference endpoint on SageMaker:
+
+ ```python
+ from sagemaker.huggingface.model import HuggingFaceModel
+
+ # create Hugging Face Model Class
+ huggingface_model = HuggingFaceModel(
+     model_data=s3_model_uri,  # path to your model and script
+     role=role,  # IAM role with permissions to create an endpoint
+     transformers_version="4.28.1",  # transformers version used
+     pytorch_version="2.0.0",  # pytorch version used
+     py_version="py310",  # python version used
+     model_server_workers=1,
+ )
+
+ # deploy the endpoint
+ predictor = huggingface_model.deploy(
+     initial_instance_count=1,
+     instance_type="ml.g5.xlarge",
+ )
+ ```
+
+
  ## License
  Llama 2 is licensed under the LLAMA 2 Community License,
  Copyright (c) Meta Platforms, Inc. All Rights Reserved.
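
The README added in this commit points to `deploy_llava.ipynb` for the bundling step but does not show it. As a rough sketch of what that step could look like, the snippet below packs a local directory into `model.tar.gz` using only the standard library. The function name `bundle_model` and the idea of a single local directory holding the weights plus `code/inference.py` and `code/requirements.txt` are assumptions for illustration; the notebook itself may do this differently.

```python
import tarfile
from pathlib import Path


def bundle_model(model_dir: str, output_path: str = "model.tar.gz") -> str:
    """Pack the files under model_dir into a gzipped tarball.

    Hypothetical helper, not taken from deploy_llava.ipynb. Files are stored
    relative to model_dir, so code/inference.py ends up under code/ at the
    archive root rather than nested under a parent folder.
    """
    root = Path(model_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"{model_dir} is not a directory")

    out = Path(output_path).resolve()
    # Collect files before opening the archive, and skip the archive itself
    # in case output_path lives inside model_dir.
    files = sorted(
        p for p in root.rglob("*") if p.is_file() and p.resolve() != out
    )

    with tarfile.open(output_path, "w:gz") as tar:
        for path in files:
            tar.add(path, arcname=path.relative_to(root).as_posix())
    return output_path
```

Calling `bundle_model("model")` would write `./model.tar.gz`, which is the local path the upload snippet in the README passes to `S3Uploader.upload`.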