Files changed (1)
  1. README.md +47 -7
README.md CHANGED
@@ -32,22 +32,62 @@ tags:
32
  **Acknowledgement**:
33
  We sincerely thank our community members, [Mingyuan](https://huggingface.co/ThunderBeee) and [Zoey](https://huggingface.co/ZY6), for their extraordinary contributions to this quantization effort. Please explore [Octopus-v4](https://huggingface.co/NexaAIDev/Octopus-v4) for our original Hugging Face model.
34
 
 
35
 
36
- ## Run with [Ollama](https://github.com/ollama/ollama)
37
 
38
  ```bash
39
- ollama run NexaAIDev/octopus-v4-q4_k_m
40
  ```
41
 
42
- Input example:
43
 
44
- ```json
45
- Query: Tell me the result of derivative of x^3 when x is 2?
46
 
47
- Response: <nexa_4> ('Determine the derivative of the function f(x) = x^3 at the point where x equals 2, and interpret the result within the context of rate of change and tangent slope.')<nexa_end>
48
 
49
  ```
50
- Note that `<nexa_4>` represents the math gpt.
51
 
52
  ### Dataset and Benchmark
53
 
 
32
  **Acknowledgement**:
33
  We sincerely thank our community members, [Mingyuan](https://huggingface.co/ThunderBeee) and [Zoey](https://huggingface.co/ZY6), for their extraordinary contributions to this quantization effort. Please explore [Octopus-v4](https://huggingface.co/NexaAIDev/Octopus-v4) for our original Hugging Face model.
34
 
35
+ ## (Recommended) Run with [llama.cpp](https://github.com/ggerganov/llama.cpp)
36
 
37
+ 1. **Clone and compile:**
38
+
39
+ ```bash
40
+ git clone https://github.com/ggerganov/llama.cpp
41
+ cd llama.cpp
42
+
43
+ # Compile the source code:
44
+ make
45
+ ```
46
+
47
+ 2. **Prepare the Input Prompt File:**
48
+
49
+ Navigate to the `prompts` folder inside the `llama.cpp` directory and create a new file named `chat-with-octopus.txt`.
50
+
51
+ `chat-with-octopus.txt`:
52
+
53
+ ```bash
54
+ # Write "User" at the top of the file to set the identifier for input.
55
+ User:
56
+ ```
57
+
58
+ 3. **Execute the Model:**
59
+
60
+ Run the following command in the terminal:
61
 
62
  ```bash
63
+ ./main -m ./Octopus-v4-gguf/Octopus-v4-Q2_K.gguf -c 512 -b 2048 -n 256 -t 1 --repeat_penalty 1.0 --top_k 0 --top_p 1.0 --color -i -r "User:" -f prompts/chat-with-octopus.txt
64
  ```
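The prompt file from step 2 can also be created from the shell. This is a minimal sketch, assuming the current directory is the root of the `llama.cpp` checkout and that the `#` line in the listing above is an instruction rather than file content, so only the `User:` identifier is written:

```bash
# Assumption: run from the root of the llama.cpp checkout.
mkdir -p prompts

# Assumption: the file only needs the "User:" identifier,
# which matches the reverse prompt passed with -r "User:".
printf 'User:\n' > prompts/chat-with-octopus.txt

# Show what was written.
cat prompts/chat-with-octopus.txt
```

If the comment line is meant to be part of the file, add it above `User:` by hand.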
65
 
66
+ Example prompt to interact with the model:
67
+ ```bash
68
+ <|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>
69
+ ```
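To avoid retyping the template for every query, the prompt can be assembled from a query string. The following is a minimal sketch: `build_prompt` is a hypothetical helper name (it is not part of llama.cpp or Ollama), and the system text is copied from the example prompt above.

```bash
# build_prompt is a hypothetical helper, not part of llama.cpp or Ollama.
# It wraps a user query in the router template shown in the example prompt.
build_prompt() {
  local system='You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.'
  # %s keeps the query text literal, even if it contains % or backslashes.
  printf '<|system|>%s<|end|><|user|>%s<|end|><|assistant|>' "$system" "$1"
}

# Print the full prompt for one query, followed by a newline.
build_prompt 'Tell me the result of derivative of x^3 when x is 2?'
echo
```

The printed line can be pasted at the `User:` prompt in interactive mode, or passed as the quoted prompt argument to `ollama run`.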
70
 
71
+ ## Run with [Ollama](https://github.com/ollama/ollama)
72
+ 1. Create a `Modelfile` in your directory and include a `FROM` statement with the path to your local model:
73
+ ```bash
74
+ FROM ./path/to/octopus-v4-Q4_K_M
75
+ ```
76
 
77
+ 2. Use the following command to add the model to Ollama:
78
+ ```bash
79
+ ollama create octopus-v4-Q4_K_M -f Modelfile
80
+ ```
81
 
82
+ 3. Verify that the model has been successfully imported:
83
+ ```bash
84
+ ollama ls
85
+ ```
86
+
87
+ ### Run the model
88
+ ```bash
89
+ ollama run octopus-v4-Q4_K_M "<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>"
90
  ```
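The removed input example shows a reply of the form `<nexa_4> ('...')<nexa_end>`, where `<nexa_4>` represents the math gpt. Assuming the model still replies in that format, the function token and its argument can be separated with `sed`. This is only a sketch; the variable names are illustrative and the reply is hard-coded from that example rather than captured from a live run.

```bash
# Reply copied from the removed "Input example"; a real run may differ.
response="<nexa_4> ('Determine the derivative of the function f(x) = x^3 at the point where x equals 2, and interpret the result within the context of rate of change and tangent slope.')<nexa_end>"

# Function token: the leading <nexa_N> tag.
token=$(printf '%s' "$response" | sed -n 's/^[[:space:]]*\(<nexa_[0-9][0-9]*>\).*/\1/p')

# Argument: the text between (' and ')<nexa_end>.
arg=$(printf '%s' "$response" | sed -n "s/^[^(]*('\(.*\)')<nexa_end>.*/\1/p")

printf 'token: %s\n' "$token"
printf 'argument: %s\n' "$arg"
```

Both variables come back empty when the reply does not match the assumed format, so a caller can check for that before routing the query.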
 
91
 
92
  ### Dataset and Benchmark
93