srivatsavdamaraju committed
Commit e813431 · verified · 1 Parent(s): 4e729de

Upload 4 files

Files changed (4):
  1. Dockerfile +26 -2
  2. README.md +32 -10
  3. main.py +40 -0
  4. requirements.txt +4 -0
Dockerfile CHANGED
@@ -1,3 +1,27 @@
- FROM ollama/ollama
-
- COPY ./pull-llama3.sh /pull-llama3.sh
+ FROM python:3.10-slim
+
+ ENV DEBIAN_FRONTEND=noninteractive
+
+ RUN apt-get update && apt-get install -y \
+     curl \
+     procps \
+     && rm -rf /var/lib/apt/lists/*
+
+ RUN curl -fsSL https://ollama.com/install.sh | sh
+
+ RUN ollama start & \
+     sleep 5 && \
+     ollama run llama3.2:1b && \
+     kill $(pgrep ollama)
+
+ WORKDIR /app
+
+ COPY requirements.txt .
+
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY . /app
+
+ EXPOSE 8000
+
+ CMD ["sh", "-c", "ollama serve & uvicorn main:app --host 0.0.0.0 --port 8000 --reload"]
README.md CHANGED
@@ -1,10 +1,32 @@
- ---
- title: Vps
- emoji: 🐨
- colorFrom: red
- colorTo: pink
- sdk: docker
- pinned: false
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Dockerized FastAPI LLM Setup
+
+ This repository contains a FastAPI application packaged inside a Docker container for easy deployment and scalability. Follow the steps below to build and run the containerized FastAPI application.
+
+ ## Prerequisites
+
+ Ensure you have the following installed on your system before proceeding:
+
+ - Docker (https://docs.docker.com/get-docker/)
+
+ ## Steps to Build and Run the Dockerized FastAPI Application
+
+ Build the Docker Image
+ Run the following command to build the Docker image from the Dockerfile in your project directory. This will create a Docker image named `my-fastapi-app`:
+
+ `docker build -t my-fastapi-app .`
+
+ Run the Docker Container
+ Once the image is built, you can run the container and map it to port `8000` on your local machine. Use the following command:
+
+ `docker run -p 8000:8000 my-fastapi-app`
+
+ Explanation: `-p 8000:8000` maps port 8000 on your local machine to port 8000 inside the Docker container, making the FastAPI app accessible at `http://localhost:8000`.
+
+ Access the Application
+ After running the container, the FastAPI app should be accessible at:
+
+ `http://localhost:8000`
+
+ You can interact with the API and view the automatically generated documentation provided by FastAPI at:
+
+ `http://localhost:8000/docs`
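Once the container is running, the `/generate` endpoint defined in main.py (below) can be exercised with a short client script. This is a minimal sketch using `httpx` from requirements.txt; the prompt text is only illustrative.

```python
# Minimal client for the POST /generate endpoint (see main.py below).
# The prompt is illustrative; "generated_text" is the key the endpoint returns.
import httpx

payload = {
    "model": "llama3.2:1b",
    "prompt": "Explain what a Dockerfile is in one sentence.",
}
resp = httpx.post("http://localhost:8000/generate", json=payload, timeout=120.0)
resp.raise_for_status()
print(resp.json()["generated_text"])
```

The interactive docs at `http://localhost:8000/docs` expose the same request body schema (`model`, `prompt`).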
main.py ADDED
@@ -0,0 +1,40 @@
+ import ollama
+ from fastapi import FastAPI, HTTPException
+ from pydantic import BaseModel
+ from typing import List
+
+ app = FastAPI()
+
+ # Model for the API input
+ class PromptRequest(BaseModel):
+     model: str = "llama3.2:1b"
+     prompt: str
+
+ # Helper function to interact with ollama
+ async def generate_response(model: str, prompt: str) -> str:
+     try:
+         # Call ollama's chat function and stream the response
+         stream = ollama.chat(
+             model=model,
+             messages=[{'role': 'user', 'content': prompt}],
+             stream=True
+         )
+
+         response_text = ""
+         # Collect the streamed content
+         for chunk in stream:
+             response_text += chunk['message']['content']
+
+         return response_text
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=f"Error generating response: {e}")
+
+ @app.post("/generate")
+ async def generate_text(request: PromptRequest):
+     model = request.model
+     prompt = request.prompt
+
+     # Generate the response using the helper function
+     response = await generate_response(model, prompt)
+
+     return {"generated_text": response}
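One caveat worth noting: `generate_response` is declared `async`, but `ollama.chat(..., stream=True)` and the loop over its chunks are synchronous, so the event loop is blocked while the model generates. A possible refinement, sketched below under the assumption that the endpoint's behaviour should stay the same, is to offload the blocking call with the standard-library `asyncio.to_thread`; this is not part of the commit, and the `HTTPException` wrapping from main.py is omitted for brevity.

```python
# Optional refinement (not part of this commit): run the blocking ollama.chat
# call in a worker thread so the FastAPI event loop stays free while the
# model streams tokens. Same inputs/outputs as generate_response in main.py.
import asyncio
import ollama

def _chat_blocking(model: str, prompt: str) -> str:
    """Synchronous call to ollama.chat, collecting the streamed chunks."""
    stream = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return "".join(chunk["message"]["content"] for chunk in stream)

async def generate_response(model: str, prompt: str) -> str:
    # Offload the blocking generation to the default thread pool.
    return await asyncio.to_thread(_chat_blocking, model, prompt)
```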
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ fastapi
+ uvicorn
+ httpx
+ ollama