Inference Endpoints Version
Hugging Face Inference Endpoints comes with a default serving container which is used for all supported Transformers and Sentence-Transformers tasks and for custom inference handlers, and it implements batching. Below you will find information about the installed packages and the versions used.
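For a custom inference handler, the default container loads a handler.py from the root of the model repository. The following is a minimal sketch following the EndpointHandler convention; the text-classification task and payload handling are illustrative.

```python
# handler.py — minimal sketch of a custom inference handler.
# The EndpointHandler class name and the __init__/__call__ signatures follow the
# custom handler convention; the task and payload handling are illustrative.
from typing import Any, Dict, List

from transformers import pipeline


class EndpointHandler:
    def __init__(self, path: str = ""):
        # `path` is the local checkout of the model repository inside the container.
        self.pipeline = pipeline("text-classification", model=path)

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # Requests arrive as JSON with the usual {"inputs": ..., "parameters": ...} shape.
        inputs = data.pop("inputs", data)
        parameters = data.pop("parameters", None) or {}
        return self.pipeline(inputs, **parameters)
```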
You can always upgrade installed packages and add custom packages by adding a requirements.txt
file to your model repository. Read more in Add custom Dependencies.
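For example, a requirements.txt at the root of the model repository could upgrade an installed package and add a new one (the package names and versions below are purely illustrative):

```
# Upgrade a preinstalled package and add an extra dependency (illustrative).
diffusers>=0.27.0
einops==0.7.0
```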
Installed packages & versions
The Hugging Face Inference Runtime has separate versions for PyTorch and TensorFlow, for CPU and GPU, which are used based on the selected framework when creating an Inference Endpoint. The TensorFlow and PyTorch flavors are grouped together in the list below.
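The framework and accelerator are chosen when the endpoint is created, for example programmatically via huggingface_hub. The sketch below assumes a recent huggingface_hub client with create_inference_endpoint; the model, vendor, region, instance type, and instance size are illustrative and depend on what is available to your account.

```python
# Create an endpoint on a GPU instance using the PyTorch flavor of the runtime.
# Vendor, region, instance_type, and instance_size values are illustrative.
from huggingface_hub import create_inference_endpoint

endpoint = create_inference_endpoint(
    "my-endpoint",
    repository="distilbert-base-uncased-finetuned-sst-2-english",
    framework="pytorch",
    task="text-classification",
    accelerator="gpu",
    vendor="aws",
    region="us-east-1",
    instance_type="nvidia-t4",
    instance_size="x1",
)
endpoint.wait()  # block until the endpoint is running
print(endpoint.url)
```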
General
Python: 3.11
huggingface_hub: 0.20.3
pytorch: 2.2.0
transformers[sklearn,sentencepiece,audio,vision]: 4.38.2
diffusers: 0.26.3
accelerate: 0.27.2
sentence_transformers: 2.4.0
pandas: latest
peft: 0.9.0
tensorflow: latest
GPU
CUDA: 12.3
Optimized Container
text-generation-inference: 2.1.0
text-embeddings-inference: 1.2.0