Update README.md
README.md CHANGED
@@ -269,9 +269,14 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 
 2. Run server
 ```bash
-vllm serve openthaigpt/openthaigpt1.5-14b-instruct --tensor-parallel-size 2
+vllm serve openthaigpt/openthaigpt1.5-14b-instruct --tensor-parallel-size 4
 ```
-* Note, change ``--tensor-parallel-size 2`` to the amount of available GPU cards.
+* Note: change ``--tensor-parallel-size 4`` to the number of available GPUs.
+
+To enable the tool calling feature, add ``--enable-auto-tool-choice --tool-call-parser hermes`` to the command, e.g.,
+```bash
+vllm serve openthaigpt/openthaigpt1.5-14b-instruct --tensor-parallel-size 4 --enable-auto-tool-choice --tool-call-parser hermes
+```
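For context on what the added ``--enable-auto-tool-choice --tool-call-parser hermes`` flags do: the served model can then accept OpenAI-style ``tools`` definitions in chat requests and answer with parsed ``tool_calls``. A minimal sketch of such a request payload follows; the tool name ``get_weather``, its schema, and the default port 8000 are illustrative assumptions, not taken from the README.

```python
import json

# OpenAI-style chat completion payload for a vLLM server started with
# --enable-auto-tool-choice --tool-call-parser hermes.
# The get_weather tool below is a made-up example, not from the README.
payload = {
    "model": "openthaigpt/openthaigpt1.5-14b-instruct",
    "messages": [
        {"role": "user", "content": "What's the weather in Bangkok?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# Serialize the body; it could then be POSTed to the (assumed) endpoint
# http://localhost:8000/v1/chat/completions, e.g. via curl -d @-.
body = json.dumps(payload)
print(body[:60])
```

With auto tool choice enabled, the server decides per request whether to respond with plain text or with a tool call that the hermes parser extracts from the model output.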
 
 3. Run inference (CURL example)
 ```bash