<!--
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# PyTriton remote mode
Remote mode lets you use PyTriton with a Triton Inference Server that is running remotely (at the moment
it must be deployed on the same machine, but it may be launched in a different container).

To bind a model in remote mode, use the `RemoteTriton` class instead of `Triton`.
The only difference is that `RemoteTriton` requires the Triton `url` argument in its constructor.
## Example of binding a model in remote mode
The example below assumes that the Triton Inference Server is running on the same machine (launched with PyTriton
in a separate Python script).

`RemoteTriton` binds a remote model to an existing Triton Inference Server.
When the `RemoteTriton` context is closed, the model is unloaded from the server.
<!--pytest.mark.skip-->
```python
import numpy as np

from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import RemoteTriton, TritonConfig

# Configuration enabling the response cache (1 MB).
# Note: RemoteTriton connects to an already running server, so this
# configuration takes effect in the script that launches the server.
triton_config = TritonConfig(
    cache_config=[f"local,size={1024 * 1024}"],  # 1MB
)


@batch
def _add_sub(**inputs):
    a_batch, b_batch = inputs.values()
    add_batch = a_batch + b_batch
    sub_batch = a_batch - b_batch
    return {"add": add_batch, "sub": sub_batch}


with RemoteTriton(url="localhost") as triton:
    triton.bind(
        model_name="AddSub",
        infer_func=_add_sub,
        inputs=[Tensor(shape=(1,), dtype=np.float32), Tensor(shape=(1,), dtype=np.float32)],
        outputs=[Tensor(shape=(1,), dtype=np.float32), Tensor(shape=(1,), dtype=np.float32)],
        config=ModelConfig(max_batch_size=8, response_cache=True),
    )
    triton.serve()
```