Rename README.md to '{ "script_id": 1, "parameter_id": 1 }# Install vLLM from pip: …'

The full new filename is the vLLM deployment snippet pasted verbatim:

```sh
# Install vLLM from pip:
pip install vllm

# Load and run the model:
vllm serve "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"

# Call the server using curl:
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
```

Use Docker images:

```sh
# Deploy with docker on Linux:
docker run --runtime nvidia --gpus all \
	--name my_vllm_container \
	-v ~/.cache/huggingface:/root/.cache/huggingface \
	--env "HUGGING_FACE_HUB_TOKEN=<secret>" \
	-p 8000:8000 \
	--ipc=host \
	vllm/vllm-openai:latest \
	--model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

# Load and run the model:
docker exec -it my_vllm_container bash -c "vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"

# Call the server using curl:
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
```

Quick Links: Read the vLLM documentation
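The same chat completions endpoint can also be called from Python. A minimal sketch, equivalent to the curl call above, using the `openai` client — an assumption on my part, not part of the pasted instructions (`pip install openai`):

```python
# Minimal sketch: query the vLLM OpenAI-compatible server started above.
# Assumes `pip install openai` and a server running on localhost:8000.
from openai import OpenAI

# The API key can be any placeholder unless the server was started with --api-key.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```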
#1 by 9x25dillon - opened
- README.md +0 -3
- { /"script_id/": 1, /"parameter_id/": 1 }# Install vLLM from pip: pip install vllm Copy # Load and run the model: vllm serve /"deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/" Copy # Call the server using curl: curl -X POST /"http:/localhost:8000/v1/chat/completions/" // /t-H /"Content-Type: application/json/" // /t--data '{ /t/t/"model/": /"deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/", /t/t/"messages/": [ /t/t/t{ /t/t/t/t/"role/": /"user/", /t/t/t/t/"content/": /"What is the capital of France?/" /t/t/t} /t/t] /t}' Use Docker images Copy # Deploy with docker on Linux: docker run --runtime nvidia --gpus all // /t--name my_vllm_container // /t-v ~/.cache/huggingface:/root/.cache/huggingface // /t--env /"HUGGING_FACE_HUB_TOKEN=<secret>/" // /t-p 8000:8000 // /t--ipc=host // /tvllm/vllm-openai:latest // /t--model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B Copy # Load and run the model: docker exec -it my_vllm_container bash -c /"vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/" Copy # Call the server using curl: curl -X POST /"http:/localhost:8000/v1/chat/completions/" // /t-H /"Content-Type: application/json/" // /t--data '{ /t/t/"model/": /"deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/", /t/t/"messages/": [ /t/t/t{ /t/t/t/t/"role/": /"user/", /t/t/t/t/"content/": /"What is the capital of France?/" /t/t/t} /t/t] /t}' Quick Links Read the vLLM documentation +15 -0
README.md
DELETED

```diff
@@ -1,3 +0,0 @@
----
-license: mit
----
```
{ /"script_id/": 1, /"parameter_id/": 1 }# Install vLLM from pip: pip install vllm Copy # Load and run the model: vllm serve /"deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/" Copy # Call the server using curl: curl -X POST /"http:/localhost:8000/v1/chat/completions/" // /t-H /"Content-Type: application/json/" // /t--data '{ /t/t/"model/": /"deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/", /t/t/"messages/": [ /t/t/t{ /t/t/t/t/"role/": /"user/", /t/t/t/t/"content/": /"What is the capital of France?/" /t/t/t} /t/t] /t}' Use Docker images Copy # Deploy with docker on Linux: docker run --runtime nvidia --gpus all // /t--name my_vllm_container // /t-v ~/.cache/huggingface:/root/.cache/huggingface // /t--env /"HUGGING_FACE_HUB_TOKEN=<secret>/" // /t-p 8000:8000 // /t--ipc=host // /tvllm/vllm-openai:latest // /t--model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B Copy # Load and run the model: docker exec -it my_vllm_container bash -c /"vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/" Copy # Call the server using curl: curl -X POST /"http:/localhost:8000/v1/chat/completions/" // /t-H /"Content-Type: application/json/" // /t--data '{ /t/t/"model/": /"deepseek-ai/DeepSeek-R1-Distill-Qwen-32B/", /t/t/"messages/": [ /t/t/t{ /t/t/t/t/"role/": /"user/", /t/t/t/t/"content/": /"What is the capital of France?/" /t/t/t} /t/t] /t}' Quick Links Read the vLLM documentation
ADDED
@@ -0,0 +1,15 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
---
|
2 |
+
license: mit
|
3 |
+
---
|
4 |
+
from datasets import load_dataset
|
5 |
+
|
6 |
+
# Load a dataset from Hugging Face's Dataset Hub
|
7 |
+
dataset = load_dataset(pip install transformers datasets evaluate accelerate)
|
8 |
+
from transformers import AutoModelForSequenceClassification, AutoTokenizer
|
9 |
+
|
10 |
+
# Replace "9x25dillon/DS-R1-Distill-Qwen-32BSQL-INT" with your chosen model
|
11 |
+
model_name = "9x25dillon/DS-R1-Distill-Qwen-32BSQL-INT"
|
12 |
+
|
13 |
+
# Load the tokenizer and model
|
14 |
+
tokenizer = AutoTokenizer.from_pretrained(model_name)
|
15 |
+
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2) # Update num_labels for your task
|
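The committed snippet stops after loading the model. A minimal sketch of running the loaded classifier on one example, assuming the `"imdb"` placeholder dataset above; note that `AutoModelForSequenceClassification` attaches a newly initialized classification head to this checkpoint, so predictions are meaningless until the model is fine-tuned:

```python
# Minimal sketch: run the classifier loaded above on one dataset example.
# Assumes the snippet from the diff has executed; a 32B checkpoint needs
# substantial GPU memory, so this is illustrative rather than practical on CPU.
import torch

sample = dataset["train"][0]["text"]
inputs = tokenizer(sample, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_labels)

predicted_label = logits.argmax(dim=-1).item()
print(predicted_label)
```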