Running inference outside of Triton
#6 opened by lbathen
Hi, do you have sample code showing how to run the model outside of the Triton environment?
Hi, in our existing code we use Triton only as an orchestrator that batches client-side requests into batch sizes of 2 or larger. Triton is therefore not strictly required; the inference backend itself uses NeMo Aligner/NeMo. We don't have sample code demonstrating this yet, although it should be straightforward to implement. If you need more help, could you clarify what kind of environment you're planning to run the model in?
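To illustrate the role Triton plays here, a minimal sketch of the request-batching pattern described above: accumulate incoming prompts into batches of 2 or more, then hand each batch to the backend in one call. The `fake_model` function and all names below are hypothetical stand-ins for illustration, not NeMo Aligner/NeMo APIs:

```python
def fake_model(batch):
    # Hypothetical stand-in for a single batched call into the
    # real inference backend (NeMo Aligner/NeMo in this discussion).
    return [f"response:{prompt}" for prompt in batch]

def batch_requests(requests, max_batch=4):
    """Group incoming prompts into batches of up to max_batch items.

    This mimics what Triton's orchestration provides: instead of one
    backend call per request, requests are grouped so the backend
    sees batch sizes of 2 or larger whenever enough requests arrive.
    """
    batches, current = [], []
    for request in requests:
        current.append(request)
        if len(current) == max_batch:
            batches.append(current)
            current = []
    if current:  # flush any leftover partial batch
        batches.append(current)
    return batches

# Run each batch through the (stand-in) backend.
results = []
for batch in batch_requests(["a", "b", "c", "d", "e"]):
    results.extend(fake_model(batch))
```

A real replacement would also need a timeout-based flush so a lone request is not stuck waiting for a batch to fill, but the grouping logic above is the core of what Triton handles for you.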
I had an old version of NeMo Aligner :) I see the newer code will help with this task. Thank you.
lbathen changed discussion status to closed