vllm-inference / Dockerfile

Commit History

feat(Dockerfile): install gcc
e8cd3e0

yusufs committed on

feat(runner.sh): use runner.sh to select the LLM at run time
69c6372

yusufs committed on

feat(/app/run-llama.sh): /app/run-llama.sh
cab183f

yusufs committed on

feat(/app/run-sailor.sh): /app/run-sailor.sh
6d92442

yusufs committed on

feat(llama3.2): use the llama model first to save cost, until we want to test sailor
92a4a4a

yusufs committed on

feat(sailorchat): using sailor chat model
0f3cd25

yusufs committed on

feat(llama3.2): run llama3.2 using bfloat16 with cache dtype fp8 with same model len
38d356a

yusufs committed on

feat(sailor-8B): using sailor-8b
811d851

yusufs committed on

feat(llama3.2): change model to llama3.2
b826155

yusufs committed on

feat(dep_sizes.txt): remove dep_sizes.txt during build; it is not needed
8e49b3b

yusufs committed on

feat(download_model.py): remove download_model.py during build; it causes a large image size
c360fd3

yusufs committed on

docs(Dockerfile): add comment about estimated image size after compile
8dc2050

yusufs committed on

feat(add-model): always download the model during build; it will be cached in subsequent builds
8679a35

yusufs committed on

feat(hf_token): set hf token during build
493a5f1

yusufs committed on

fix(hf_token): export HF_TOKEN during build
c6efe6a

yusufs committed on

feat(download-model): add download model at runtime
fc30f26

yusufs committed on

feat(run.sh): add script for running openai server
ded2af7

yusufs committed on

fix(python): fix absolute path of python script
d2e0be1

yusufs committed on

fix(cmd): fix 'error: failed to solve: dockerfile parse error on line 19: unknown instruction: "python3",'
de6b236

yusufs committed on

feat(openai): VLLM OpenAI compatible server
147b3a2

yusufs committed on

feat(t4-gpu): add t4 gpu capability
4998ce7

yusufs committed on

fix(expose-port): add EXPOSE in Dockerfile
6d19ece

yusufs committed on

fix(module): fix 'error module app'; it should be 'main'
6a914f2

yusufs committed on

feat(first-commit): follow examples and tutorials
ae7cfbb

yusufs committed on