vLLM raises an error when running Qwen2-Audio: Attempted to assign 68 = 68 multimodal tokens to 69 placeholders #103

yll3518974 · 2024-12-11T08:37:44Z

INFO 12-10 22:57:11 model_runner_base.py:120] Writing input of failed execution to /tmp/err_execute_model_input_20241210-225711.pkl...
INFO 12-10 22:57:11 model_runner_base.py:149] Completed writing input of failed execution to /tmp/err_execute_model_input_20241210-225711.pkl.
CRITICAL 12-10 22:57:11 launcher.py:99] MQLLMEngine is already dead, terminating server process
INFO: 10.96.215.30:41264 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR 12-10 22:57:11 engine.py:135] ValueError('Error in model execution (input dumped to /tmp/err_execute_model_input_20241210-225711.pkl): Attempted to assign 68 = 68 multimodal tokens to 69 placeholders')
ERROR 12-10 22:57:11 engine.py:135] Traceback (most recent call last):
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/worker/model_runner_base.py", line 116, in _wrapper
ERROR 12-10 22:57:11 engine.py:135] return func(*args, **kwargs)
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/worker/model_runner.py", line 1654, in execute_model
ERROR 12-10 22:57:11 engine.py:135] hidden_or_intermediate_states = model_executable(
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
ERROR 12-10 22:57:11 engine.py:135] return self._call_impl(*args, **kwargs)
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
ERROR 12-10 22:57:11 engine.py:135] return forward_call(*args, **kwargs)
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_audio.py", line 413, in forward
ERROR 12-10 22:57:11 engine.py:135] inputs_embeds = self.get_input_embeddings(input_ids,
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_audio.py", line 390, in get_input_embeddings
ERROR 12-10 22:57:11 engine.py:135] inputs_embeds = merge_multimodal_embeddings(
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/utils.py", line 407, in merge_multimodal_embeddings
ERROR 12-10 22:57:11 engine.py:135] return _merge_multimodal_embeddings(
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/utils.py", line 346, in _merge_multimodal_embeddings
ERROR 12-10 22:57:11 engine.py:135] raise ValueError(
ERROR 12-10 22:57:11 engine.py:135] ValueError: Attempted to assign 68 = 68 multimodal tokens to 69 placeholders
ERROR 12-10 22:57:11 engine.py:135]
ERROR 12-10 22:57:11 engine.py:135] The above exception was the direct cause of the following exception:
ERROR 12-10 22:57:11 engine.py:135]
ERROR 12-10 22:57:11 engine.py:135] Traceback (most recent call last):
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/engine.py", line 133, in start
ERROR 12-10 22:57:11 engine.py:135] self.run_engine_loop()
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/engine.py", line 196, in run_engine_loop
ERROR 12-10 22:57:11 engine.py:135] request_outputs = self.engine_step()
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/engine.py", line 214, in engine_step
ERROR 12-10 22:57:11 engine.py:135] raise e
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/engine.py", line 205, in engine_step
ERROR 12-10 22:57:11 engine.py:135] return self.engine.step()
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/llm_engine.py", line 1454, in step
ERROR 12-10 22:57:11 engine.py:135] outputs = self.model_executor.execute_model(
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/executor/gpu_executor.py", line 125, in execute_model
ERROR 12-10 22:57:11 engine.py:135] output = self.driver_worker.execute_model(execute_model_req)
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/worker/worker_base.py", line 343, in execute_model
ERROR 12-10 22:57:11 engine.py:135] output = self.model_runner.execute_model(
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 12-10 22:57:11 engine.py:135] return func(*args, **kwargs)
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/worker/model_runner_base.py", line 152, in _wrapper
ERROR 12-10 22:57:11 engine.py:135] raise type(err)(
ERROR 12-10 22:57:11 engine.py:135] ValueError: Error in model execution (input dumped to /tmp/err_execute_model_input_20241210-225711.pkl): Attempted to assign 68 = 68 multimodal tokens to 69 placeholders
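
For anyone reading the traceback: the final `ValueError` comes from a consistency check made while merging audio embeddings into the text embedding sequence. The number of audio placeholder positions in `input_ids` (69 here) has to equal the number of audio feature vectors produced by the encoder (68 here). The sketch below is a minimal pure-Python illustration of that kind of check, not vLLM's actual code. The function name, the list-based embeddings, and the placeholder id `99` are all hypothetical.

```python
from typing import List, Sequence


def merge_placeholder_embeddings(
    input_ids: Sequence[int],
    text_embeds: List[List[float]],
    audio_embeds: List[List[float]],
    placeholder_id: int,
) -> List[List[float]]:
    """Overwrite the embedding at each placeholder position with an audio embedding.

    Hypothetical helper for illustration only; it mimics the shape of the
    check that produces the error above, not vLLM's implementation.
    """
    # Positions in the prompt that are reserved for audio features.
    positions = [i for i, tok in enumerate(input_ids) if tok == placeholder_id]

    # The merge is only well defined when the counts agree exactly.
    if len(positions) != len(audio_embeds):
        raise ValueError(
            f"Attempted to assign {len(audio_embeds)} multimodal tokens "
            f"to {len(positions)} placeholders"
        )

    # Copy the text embeddings, then fill in the audio embeddings in order.
    merged = [list(row) for row in text_embeds]
    for pos, emb in zip(positions, audio_embeds):
        merged[pos] = list(emb)
    return merged


if __name__ == "__main__":
    ids = [1, 99, 99, 2]  # two placeholder slots (id 99 is made up)
    text = [[0.0], [0.0], [0.0], [0.0]]
    try:
        # Only one audio vector for two slots: same class of failure as above.
        merge_placeholder_embeddings(ids, text, [[1.0]], placeholder_id=99)
    except ValueError as err:
        print(err)
```

In the reported case the two counts differ by one, so the prompt appears to have been expanded to one more placeholder than the audio encoder emitted for this clip. The log alone does not show which side computed the length incorrectly.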

SixGoodX · 2025-01-19T04:53:51Z

Has this been resolved? This repository doesn't support deployment with vLLM, does it?
