Requests get stuck when sending long prompts (already solved, but I still don't know why)
1
#18 opened 4 days ago by uv0xab
Are there any accuracy results compared to the original DeepSeek-R1?
#15 opened 5 days ago by traphix
Can anyone run this model with the SGLang framework?
2
#13 opened 6 days ago by muziyongshixin
Regarding the issue of inconsistent token calculation
#12 opened 12 days ago by liguoyu3564
Max-Batch-Size, max-num-sequence, and fp_cache fp8_e4m3
#11 opened 12 days ago by BenFogerty