vLLM reports an error when running Qwen2-Audio: Attempted to assign 68 = 68 multimodal tokens to 69 placeholders #103

Open
yll3518974 opened this issue Dec 11, 2024 · 1 comment

Comments

@yll3518974

INFO 12-10 22:57:11 model_runner_base.py:120] Writing input of failed execution to /tmp/err_execute_model_input_20241210-225711.pkl...
INFO 12-10 22:57:11 model_runner_base.py:149] Completed writing input of failed execution to /tmp/err_execute_model_input_20241210-225711.pkl.
CRITICAL 12-10 22:57:11 launcher.py:99] MQLLMEngine is already dead, terminating server process
INFO: 10.96.215.30:41264 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
ERROR 12-10 22:57:11 engine.py:135] ValueError('Error in model execution (input dumped to /tmp/err_execute_model_input_20241210-225711.pkl): Attempted to assign 68 = 68 multimodal tokens to 69 placeholders')
ERROR 12-10 22:57:11 engine.py:135] Traceback (most recent call last):
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/worker/model_runner_base.py", line 116, in _wrapper
ERROR 12-10 22:57:11 engine.py:135] return func(*args, **kwargs)
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/worker/model_runner.py", line 1654, in execute_model
ERROR 12-10 22:57:11 engine.py:135] hidden_or_intermediate_states = model_executable(
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
ERROR 12-10 22:57:11 engine.py:135] return self._call_impl(*args, **kwargs)
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
ERROR 12-10 22:57:11 engine.py:135] return forward_call(*args, **kwargs)
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_audio.py", line 413, in forward
ERROR 12-10 22:57:11 engine.py:135] inputs_embeds = self.get_input_embeddings(input_ids,
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/qwen2_audio.py", line 390, in get_input_embeddings
ERROR 12-10 22:57:11 engine.py:135] inputs_embeds = merge_multimodal_embeddings(
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/utils.py", line 407, in merge_multimodal_embeddings
ERROR 12-10 22:57:11 engine.py:135] return _merge_multimodal_embeddings(
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/utils.py", line 346, in _merge_multimodal_embeddings
ERROR 12-10 22:57:11 engine.py:135] raise ValueError(
ERROR 12-10 22:57:11 engine.py:135] ValueError: Attempted to assign 68 = 68 multimodal tokens to 69 placeholders
ERROR 12-10 22:57:11 engine.py:135]
ERROR 12-10 22:57:11 engine.py:135] The above exception was the direct cause of the following exception:
ERROR 12-10 22:57:11 engine.py:135]
ERROR 12-10 22:57:11 engine.py:135] Traceback (most recent call last):
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/engine.py", line 133, in start
ERROR 12-10 22:57:11 engine.py:135] self.run_engine_loop()
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/engine.py", line 196, in run_engine_loop
ERROR 12-10 22:57:11 engine.py:135] request_outputs = self.engine_step()
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/engine.py", line 214, in engine_step
ERROR 12-10 22:57:11 engine.py:135] raise e
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/multiprocessing/engine.py", line 205, in engine_step
ERROR 12-10 22:57:11 engine.py:135] return self.engine.step()
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/engine/llm_engine.py", line 1454, in step
ERROR 12-10 22:57:11 engine.py:135] outputs = self.model_executor.execute_model(
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/executor/gpu_executor.py", line 125, in execute_model
ERROR 12-10 22:57:11 engine.py:135] output = self.driver_worker.execute_model(execute_model_req)
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/worker/worker_base.py", line 343, in execute_model
ERROR 12-10 22:57:11 engine.py:135] output = self.model_runner.execute_model(
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
ERROR 12-10 22:57:11 engine.py:135] return func(*args, **kwargs)
ERROR 12-10 22:57:11 engine.py:135] ^^^^^^^^^^^^^^^^^^^^^
ERROR 12-10 22:57:11 engine.py:135] File "/usr/local/lib/python3.12/dist-packages/vllm/worker/model_runner_base.py", line 152, in _wrapper
ERROR 12-10 22:57:11 engine.py:135] raise type(err)(
ERROR 12-10 22:57:11 engine.py:135] ValueError: Error in model execution (input dumped to /tmp/err_execute_model_input_20241210-225711.pkl): Attempted to assign 68 = 68 multimodal tokens to 69 placeholders
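
For context, the traceback ends in `_merge_multimodal_embeddings`, and the message says that 68 audio embeddings were produced while the prompt contains 69 placeholder tokens. The following is a minimal sketch of that kind of consistency check in plain Python. It is not vLLM's actual implementation; the function name `merge_placeholder_embeddings`, its parameters, and the placeholder id used below are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def merge_placeholder_embeddings(
    input_ids: Sequence[int],
    text_embeds: Sequence[List[float]],
    mm_embeds: Sequence[List[float]],
    placeholder_id: int,
) -> List[List[float]]:
    """Replace the embedding at each placeholder position with a multimodal one.

    Hypothetical helper: it only illustrates why a count mismatch is fatal.
    """
    # Positions in the prompt that are reserved for multimodal (audio) features.
    positions = [i for i, tok in enumerate(input_ids) if tok == placeholder_id]

    # Every placeholder needs exactly one embedding row, so the counts must match.
    if len(mm_embeds) != len(positions):
        raise ValueError(
            f"Attempted to assign {len(mm_embeds)} multimodal tokens "
            f"to {len(positions)} placeholders"
        )

    merged = [list(row) for row in text_embeds]
    for pos, emb in zip(positions, mm_embeds):
        merged[pos] = list(emb)
    return merged


if __name__ == "__main__":
    ids = [1, 99, 99, 2]  # 99 is an assumed placeholder token id
    text = [[0.0], [0.0], [0.0], [0.0]]
    try:
        # One embedding for two placeholders: the same kind of mismatch as 68 vs 69.
        merge_placeholder_embeddings(ids, text, [[1.0]], placeholder_id=99)
    except ValueError as err:
        print(err)
```

Under this reading, the mismatch means the audio feature length computed when building the prompt and the length produced by the audio encoder differ by one for this input.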

@SixGoodX

Has this been resolved? I don't think this repository supports vLLM deployment.
