Inference with deepseek v3 using sglang cannot STOP

#50
by peterchanjaon - opened
, gen throughput (token/s): 33.43, #queue-req: 0
[2025-01-08 20:17:32 TP0] Decode batch. #running-req: 2, #token: 27373, token usage: 0.09, gen throughput (token/s): 33.38, #queue-req: 0
[2025-01-08 20:17:34 TP0] Decode batch. #running-req: 2, #token: 27453, token usage: 0.09, gen throughput (token/s): 33.32, #queue-req: 0
[2025-01-08 20:17:37 TP0] Decode batch. #running-req: 2, #token: 27533, token usage: 0.09, gen throughput (token/s): 32.73, #queue-req: 0
[2025-01-08 20:17:39 TP0] Decode batch. #running-req: 2, #token: 27613, token usage: 0.09, gen throughput (token/s): 33.27, #queue-req: 0
[2025-01-08 20:17:41 TP0] Decode batch. #running-req: 2, #token: 27693, token usage: 0.09, gen throughput (token/s): 33.22, #queue-req: 0
[2025-01-08 20:17:44 TP0] Decode batch. #running-req: 2, #token: 27773, token usage: 0.09, gen throughput (token/s): 33.13, #queue-req: 0
[2025-01-08 20:17:46 TP0] Decode batch. #running-req: 2, #token: 27853, token usage: 0.09, gen throughput (token/s): 33.29, #queue-req: 0
[2025-01-08 20:17:49 TP0] Decode batch. #running-req: 2, #token: 27933, token usage: 0.09, gen throughput (token/s): 32.47, #queue-req: 0
[2025-01-08 20:17:51 TP0] Decode batch. #running-req: 2, #token: 28013, token usage: 0.09, gen throughput (token/s): 33.13, #queue-req: 0
[2025-01-08 20:17:54 TP0] Decode batch. #running-req: 2, #token: 28093, token usage: 0.09, gen throughput (token/s): 33.12, #queue-req: 0
[2025-01-08 20:17:56 TP0] Decode batch. #running-req: 2, #token: 28173, token usage: 0.09, gen throughput (token/s): 33.11, #queue-req: 0
[2025-01-08 20:17:58 TP0] Decode batch. #running-req: 2, #token: 28253, token usage: 0.09, gen throughput (token/s): 33.18, #queue-req: 0
[2025-01-08 20:18:01 TP0] Decode batch. #running-req: 2, #token: 28333, token usage: 0.09, gen throughput (token/s): 32.45, #queue-req: 0
[2025-01-08 20:18:03 TP0] Decode batch. #running-req: 2, #token: 28413, token usage: 0.09, gen throughput (token/s): 33.03, #queue-req: 0
[2025-01-08 20:18:06 TP0] Decode batch. #running-req: 2, #token: 28493, token usage: 0.09, gen throughput (token/s): 33.00, #queue-req: 0
[2025-01-08 20:18:08 TP0] Decode batch. #running-req: 2, #token: 28573, token usage: 0.09, gen throughput (token/s): 32.97, #queue-req: 0
[2025-01-08 20:18:10 TP0] Decode batch. #running-req: 2, #token: 28653, token usage: 0.09, gen throughput (token/s): 32.99, #queue-req: 0
[2025-01-08 20:18:13 TP0] Decode batch. #running-req: 2, #token: 28733, token usage: 0.09, gen throughput (token/s): 32.39, #queue-req: 0
[2025-01-08 20:18:15 TP0] Decode batch. #running-req: 2, #token: 28813, token usage: 0.10, gen throughput (token/s): 32.99, #queue-req: 0

what shoule i do, add stop token?

peterchanjaon changed discussion status to closed

Sign up or log in to comment