Is your approach also feasible for 2x faster inference?
#2 · opened by ayyylemao
Hello,
Intriguing project.
For my purposes, I'm mainly interested in faster inference rather than fine-tuning.
Can Unsloth also speed up inference?
ayyylemao changed the discussion title from "Is your approach also feasible for 2.5x faster inference?" to "Is your approach also feasible for 2x faster inference?"
Yes, our open-source package has the fastest inference on a single GPU. It's 2x faster than Hugging Face!
See our Colab notebook here: https://colab.research.google.com/drive/1aqlNQi7MMJbynFDyOQteD2t0yVfjb9Zh?usp=sharing
GitHub page: https://github.com/unslothai/unsloth
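For reference, inference with Unsloth typically looks something like the minimal sketch below. It assumes the `FastLanguageModel` API described in the repo's README; the model name, sequence length, and prompt are placeholder choices, so treat the linked Colab notebook as the authoritative version.

```python
# Minimal sketch (placeholder model/settings, not from the linked notebook):
# load a model through Unsloth and switch it into its faster inference mode.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",  # placeholder; any supported model
    max_seq_length=2048,
    load_in_4bit=True,
)

FastLanguageModel.for_inference(model)  # enable Unsloth's faster generation path

inputs = tokenizer("Why is the sky blue?", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```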