Building binary package from source
This guide provides an outline of the process for building the PyTriton binary package from source. It offers the flexibility to modify the PyTriton code and integrate it with various versions of the Triton Inference Server, including custom builds. Additionally, it allows you to incorporate hotfixes that have not yet been officially released.
Prerequisites
Before building the PyTriton binary package, ensure the following:
- Docker with the buildx plugin is installed on the system. For more information, refer to the Docker documentation.
- Access to the Docker daemon is available from the system or container.
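The two prerequisites above can be checked from a shell before starting a build. The following is a minimal sketch and not part of the PyTriton tooling: `check_prereqs` is a hypothetical helper name, and it relies only on the standard `docker buildx version` and `docker info` commands.

```shell
# Hypothetical helper: report whether Docker, the buildx plugin, and the
# Docker daemon are available. Prints one line per check.
check_prereqs() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker: not found"
        return 1
    fi
    echo "docker: found"

    # `docker buildx version` succeeds only when the buildx plugin is installed.
    if docker buildx version >/dev/null 2>&1; then
        echo "buildx: found"
    else
        echo "buildx: not found"
    fi

    # `docker info` has to contact the daemon, so it fails when the daemon
    # is not reachable from this system or container.
    if docker info >/dev/null 2>&1; then
        echo "daemon: reachable"
    else
        echo "daemon: not reachable"
    fi
}
```

Calling `check_prereqs` prints the status of each check; any "not found" or "not reachable" line points to a prerequisite that still needs attention.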
If you plan to build an arm64 wheel on an amd64 machine, we suggest using QEMU for emulation.
To enable QEMU on Ubuntu:
- Install the QEMU packages on your x86 machine:
sudo apt-get install qemu binfmt-support qemu-user-static
- Register the QEMU emulators for ARM architectures:
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
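One way to confirm that the emulators were registered is to run an arm64 container and inspect the machine type it reports. This is a sketch under assumptions: `verify_arm64_emulation` is a hypothetical helper name, and `ubuntu:22.04` is an arbitrary multi-architecture image chosen for illustration; any image that publishes an arm64 variant should serve the same purpose.

```shell
# Hypothetical helper: succeed only if an arm64 container can run on this
# machine and identifies itself as aarch64.
verify_arm64_emulation() {
    # --platform asks Docker for the arm64 variant of the image; on an amd64
    # host the container can only start if QEMU emulation is registered.
    machine="$(docker run --rm --platform linux/arm64 ubuntu:22.04 uname -m 2>/dev/null)" || return 1
    [ "$machine" = "aarch64" ]
}
```

For example, `verify_arm64_emulation && echo "arm64 emulation works"` prints the message only when the container runs and reports the expected architecture.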
Building PyTriton binary package
To build the wheel binary package, follow these steps from the root directory of the project:
make install-dev
make dist
Note: The default build creates a wheel for the x86_64 architecture. To build the wheel for aarch64, use:
make dist -e PLATFORM=linux/arm64
Platform names follow the Docker naming convention. The supported options are linux/amd64 and linux/arm64.
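Because only two platform values are supported, a small wrapper can reject anything else before the build starts. The sketch below is illustrative only: `dist_command` is a hypothetical helper that prints the `make` invocation for a given platform instead of running it.

```shell
# Hypothetical helper: print the make command for a supported platform,
# or fail with a message on standard error for anything else.
dist_command() {
    case "$1" in
        linux/amd64|linux/arm64)
            echo "make dist -e PLATFORM=$1"
            ;;
        *)
            echo "unsupported platform: $1" >&2
            return 1
            ;;
    esac
}
```

Running `dist_command linux/arm64` prints `make dist -e PLATFORM=linux/arm64`, while `dist_command linux/riscv64` returns a non-zero status.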
The wheel package will be located in the dist directory. To install the library, run the following pip command:
pip install dist/nvidia_pytriton-*-py3-none-*.whl
Note: The wheel name contains x86_64 or aarch64, depending on the selected platform.
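When wheels for both platforms end up in the same dist directory, the architecture in the file name can be used to select the right one. The following sketch assumes only what is stated above (the `nvidia_pytriton-*-py3-none-*.whl` naming and the x86_64 or aarch64 marker); `wheel_pattern` is a hypothetical helper name.

```shell
# Hypothetical helper: map a Docker platform name to a glob pattern that
# matches the wheel built for that platform.
wheel_pattern() {
    case "$1" in
        linux/amd64) arch="x86_64" ;;
        linux/arm64) arch="aarch64" ;;
        *) echo "unsupported platform: $1" >&2; return 1 ;;
    esac
    # Quoted so the pattern is printed as-is rather than expanded here.
    printf '%s\n' "dist/nvidia_pytriton-*-py3-none-*${arch}*.whl"
}
```

From the root directory of the project, `pip install $(wheel_pattern linux/arm64)` leaves the substitution unquoted so that the shell expands the pattern to the matching wheel file.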
Building for a specific Triton Inference Server version
Building for an unsupported OS or hardware platform is possible. PyTriton requires a Python backend and either an HTTP or gRPC endpoint. The build can be CPU-only, as inference is performed on Inference Handlers.
For more information on the Triton Inference Server build process, refer to the building section of Triton Inference Server documentation.
!!! warning "Untested Build"
The Triton Inference Server has only been rigorously tested on Ubuntu 20.04. Other OS and hardware platforms are not
officially supported. You can test the build by following the steps outlined in the
[Triton Inference Server testing guide](https://github.com/triton-inference-server/server/blob/main/docs/customization_guide/test.md).
By following the docker method steps, you can create a tritonserver:latest Docker image, which can then be used to build PyTriton with the following command:
make dist -e TRITONSERVER_IMAGE_VERSION=latest -e TRITONSERVER_IMAGE_NAME=tritonserver:latest