Using with vLLM
#112
by
fpaupier
- opened
I'm trying to use this model with vLLM to serve embedding inference through an OpenAI-API-compatible server; see the vLLM docs here.
I get the error below, which in a nutshell says Only 'absolute' position_embedding_type is supported.
I never hit this with other embedding models like msmarco or BGE before. Any ideas on how to address this?
Only 'absolute' position_embedding_type is supported
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] Traceback (most recent call last):
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/engine/multiprocessing/engine.py", line 357, in run_mp_engine
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] engine = MQLLMEngine.from_engine_args(engine_args=engine_args,
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/engine/multiprocessing/engine.py", line 119, in from_engine_args
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] return cls(ipc_path=ipc_path,
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/engine/multiprocessing/engine.py", line 71, in __init__
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] self.engine = LLMEngine(*args, **kwargs)
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 273, in __init__
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] self.model_executor = executor_class(vllm_config=vllm_config, )
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/executor/executor_base.py", line 36, in __init__
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] self._init_executor()
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/executor/gpu_executor.py", line 35, in _init_executor
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] self.driver_worker.load_model()
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 155, in load_model
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] self.model_runner.load_model()
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 1096, in load_model
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] self.model = get_model(vllm_config=self.vllm_config)
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/__init__.py", line 12, in get_model
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] return loader.load_model(vllm_config=vllm_config)
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/loader.py", line 363, in load_model
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] model = _initialize_model(vllm_config=vllm_config)
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader/loader.py", line 116, in _initialize_model
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] return model_class(vllm_config=vllm_config, prefix=prefix)
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/bert.py", line 417, in __init__
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] self.model = self._build_model(vllm_config=vllm_config,
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/roberta.py", line 151, in _build_model
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] return BertModel(vllm_config=vllm_config,
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/bert.py", line 339, in __init__
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] self.embeddings = embedding_class(config)
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/roberta.py", line 44, in __init__
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] raise ValueError("Only 'absolute' position_embedding_type" +
embedding-1 | ERROR 02-04 09:22:29 engine.py:366] ValueError: Only 'absolute' position_embedding_type is supported
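To illustrate where the failure comes from, here is a minimal paraphrase of the check that vLLM's RoBERTa embedding layer performs at load time (the traceback points to `vllm/model_executor/models/roberta.py`); the `check_position_embedding` helper and the example configs are my own sketch, not vLLM's actual API:

```python
# Sketch of the load-time check vLLM performs (paraphrased from the
# traceback above): any config whose position_embedding_type is not
# 'absolute' is rejected before the model weights are even loaded.
def check_position_embedding(config: dict) -> None:
    pe_type = config.get("position_embedding_type", "absolute")
    if pe_type != "absolute":
        raise ValueError("Only 'absolute' position_embedding_type is supported")

# BERT-style configs such as msmarco or BGE use absolute position
# embeddings, so they pass the check and load fine:
check_position_embedding({"position_embedding_type": "absolute"})

# A config that declares a non-absolute scheme (e.g. rotary) triggers
# the same ValueError seen in the traceback:
try:
    check_position_embedding({"position_embedding_type": "rotary"})
except ValueError as e:
    print(e)
```

So the error is determined entirely by the `position_embedding_type` field in the model's `config.json`, which is why models that predate non-absolute schemes work while this one does not.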
On the vLLM GitHub, an issue on this has been opened, and it points to this being more of a model-support issue than a vLLM one, linked to the fact that trusting remote code is required, which is a no-go for production use.
Is this on the Jina team's roadmap?