[Model] Refactoring of MiniCPM-V and add MiniCPM-o-2.6 support for vLLM #12069

HwwwwwwwH · 2025-01-15T07:12:18Z

This PR aims to adapt and support all the features of MiniCPM-V and MiniCPM-o. It is designed to be compatible with various modalities (image, video, audio), different model versions (2.0, 2.5, 2.6, o), and diverse input types (raw, embeddings), while maintaining support for LoRA. This may require significant effort.

Below is the roadmap for this PR:

This PR is still in development. Once I complete the audio support, I will request a merge. I'll get this work done ASAP.

FIX #12162

github-actions · 2025-01-15T07:12:28Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add ready label to the PR
Enable auto-merge.

🚀

ywang96 · 2025-01-15T09:19:34Z

Really appreciate your effort planned on this PR!

> Support for audio outputs (using hidden states).
> Streaming multimodal inputs (may be complex; consider starting a new PR for this feature in the future).

It would be great if you could share some design decisions for these two items as an RFC (or two separate RFCs) first before we proceed with implementation. We (the vLLM team) are also thinking about how we want to support multimodal output and a streaming/realtime API on vLLM, so it's probably the best time for us to discuss these items!

HwwwwwwwH · 2025-01-15T13:58:07Z

> Really appreciate your effort planned on this PR!
>
> Support for audio outputs (using hidden states).
> Streaming multimodal inputs (may be complex; consider starting a new PR for this feature in the future).
>
> It would be great if you could share some design decisions for these two items as an RFC (or two separate RFCs) first before we proceed with implementation. We (the vLLM team) are also thinking about how we want to support multimodal output and a streaming/realtime API on vLLM, so it's probably the best time for us to discuss these items!

Thank you for the suggestion! I'll start these two RFCs tomorrow.

HwwwwwwwH · 2025-01-15T14:00:04Z

@DarkLight1337 I think I might need some help verifying LoRA support. Should I make any changes for it?

DarkLight1337 · 2025-01-15T14:01:25Z

@jeejeelee can help with this. Please keep in mind though that currently LoRA is only supported for the language part of multi-modal models.

mergify · 2025-01-22T14:32:58Z

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @HwwwwwwwH.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

This was referenced Jan 15, 2025

[RFC]: Multi-modality Support on vLLM #4194

Open

[RFC]: Merge input processor and input mapper for multi-modal models #10114

Open

DarkLight1337 self-assigned this Jan 15, 2025

DarkLight1337 mentioned this pull request Jan 17, 2025

[New Model]: openbmb/MiniCPM-o-2_6 #12162

Open

1 task

HwwwwwwwH mentioned this pull request Jan 18, 2025

[BUG] <title>Error when calling a vLLM deployment: TypeError: Unknown image model type: minicpmo OpenBMB/MiniCPM-o#742

Open

2 tasks

linyinli mentioned this pull request Jan 19, 2025

How to install Minicpm-0 with gpustack error, how to install Minicpm-0 gpustack/gpustack#1046

Open

HwwwwwwwH and others added 19 commits January 22, 2025 14:28

refactor for images

f78ad12

Signed-off-by: hzh <[email protected]>

supprot image embedding for minicpmv

95230b9

Signed-off-by: hzh <[email protected]>

[Bugfix][SpecDecode] Adjust Eagle model architecture to align with in…

42ffb1b

…tended design (vllm-project#11672) Signed-off-by: Sungjae Lee <[email protected]> Signed-off-by: hzh <[email protected]>

[Bugfix] fused_experts_impl wrong compute type for float32 (vllm-proj…

43ff2e9

…ect#11921) Signed-off-by: shaochangxu.scx <[email protected]> Co-authored-by: shaochangxu.scx <[email protected]> Signed-off-by: hzh <[email protected]>

[CI/Build] Move model-specific multi-modal processing tests (vllm-pro…

0ec9974

…ject#11934) Signed-off-by: DarkLight1337 <[email protected]> Signed-off-by: hzh <[email protected]>

[Doc] Basic guide for writing unit tests for new models (vllm-project…

b4a9094

…#11951) Signed-off-by: DarkLight1337 <[email protected]> Signed-off-by: hzh <[email protected]>

[Bugfix] Fix RobertaModel loading (vllm-project#11940)

ac29198

Signed-off-by: NickLucche <[email protected]> Signed-off-by: hzh <[email protected]>

[Model] Add cogagent model support vLLM (vllm-project#11742)

286107f

Signed-off-by: Isotr0py <[email protected]> Co-authored-by: Isotr0py <[email protected]> Signed-off-by: hzh <[email protected]>

[V1] Avoid sending text prompt to core engine (vllm-project#11963)

535e120

Signed-off-by: Roger Wang <[email protected]> Signed-off-by: hzh <[email protected]>

[CI/Build] Add markdown linter (vllm-project#11857)

925562b

Signed-off-by: Rafael Vasquez <[email protected]> Signed-off-by: hzh <[email protected]>

[Model] Initialize support for Deepseek-VL2 models (vllm-project#11578)

936b306

Signed-off-by: Isotr0py <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: hzh <[email protected]>

[Hardware][TPU] workaround fix for MoE on TPU (vllm-project#11764)

eac7811

Signed-off-by: hzh <[email protected]>

[V1][Core][1/n] Logging and Metrics (vllm-project#11962)

e251866

Signed-off-by: [email protected] <[email protected]> Signed-off-by: hzh <[email protected]>

[Model] Support GGUF models newly added in transformers 4.46.0 (vll…

e46c06b

…m-project#9685) Signed-off-by: Isotr0py <[email protected]> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: hzh <[email protected]>

[V1] [2/n] Logging and Metrics - OutputProcessor Abstraction (vllm-…

d12c0de

…project#11973) Signed-off-by: [email protected] <[email protected]> Signed-off-by: hzh <[email protected]>

[MISC] fix typo in kv transfer send recv test (vllm-project#11983)

e459c90

Signed-off-by: hzh <[email protected]>

[Bug] Fix usage of .transpose() and .view() consecutively. (vllm-…

93a78ba

…project#11979) Signed-off-by: hzh <[email protected]>

[CI][Spec Decode] fix: broken test for EAGLE model (vllm-project#11972)

dd2f627

Signed-off-by: Sungjae Lee <[email protected]> Signed-off-by: hzh <[email protected]>

WoosukKwon and others added 24 commits January 22, 2025 14:28

[Docs] Add Sky Computing Lab to project intro (vllm-project#12019)

0ca468e

Signed-off-by: Woosuk Kwon <[email protected]> Signed-off-by: hzh <[email protected]>

[HPU][Bugfix] set_forward_context and CI test execution (vllm-project…

6bec0d0

…#12014) Signed-off-by: Konrad Zawora <[email protected]> Signed-off-by: hzh <[email protected]>

[Doc] Update Quantization Hardware Support Documentation (vllm-projec…

0badf14

…t#12025) Signed-off-by: tjtanaa <[email protected]> Co-authored-by: tjtanaa <[email protected]> Signed-off-by: hzh <[email protected]>

[HPU][misc] add comments for explanation (vllm-project#12034)

c6a5060

Signed-off-by: youkaichao <[email protected]> Signed-off-by: hzh <[email protected]>

[Bugfix] Fix various bugs in multi-modal processor (vllm-project#12031)

055a2b7

Signed-off-by: DarkLight1337 <[email protected]>

[Kernel] Revert the API change of Attention.forward (vllm-project#12038)

941a5d5

Signed-off-by: Chen Zhang <[email protected]> Signed-off-by: hzh <[email protected]>

[Platform] Add output for Attention Backend (vllm-project#11981)

3a05c49

Signed-off-by: wangxiyuan <[email protected]> Signed-off-by: hzh <[email protected]>

[Bugfix][Kernel] Give unique name to BlockSparseFlashAttention (vllm-…

87a687b

…project#12040) Signed-off-by: Chen Zhang <[email protected]> Signed-off-by: hzh <[email protected]>

Explain where the engine args go when using Docker (vllm-project#12041)

3183e6a

Signed-off-by: Harry Mellor <[email protected]> Signed-off-by: hzh <[email protected]>

[Doc]: Update the Json Example of the Engine Arguments document (vl…

cc9cde5

…lm-project#12045) Signed-off-by: hzh <[email protected]>

[Misc] Merge bitsandbytes_stacked_params_mapping and packed_modules_m…

58d45cd

…apping (vllm-project#11924) Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: hzh <[email protected]>

[Kernel] Support MulAndSilu (vllm-project#11624)

bb13b8a

Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: hzh <[email protected]>

[HPU][Bugfix] Don't use /dev/accel/accel0 for HPU autodetection in se…

1bba3f6

…tup.py (vllm-project#12046) Signed-off-by: Konrad Zawora <[email protected]> Signed-off-by: hzh <[email protected]>

[Platform] move current_memory_usage() into platform (vllm-project#11369

ef22c6c

) Signed-off-by: Shanshan Shen <[email protected]> Signed-off-by: hzh <[email protected]>

[V1][BugFix] Fix edge case in VLM scheduling (vllm-project#12065)

94adbff

Signed-off-by: Woosuk Kwon <[email protected]> Signed-off-by: hzh <[email protected]>

[Misc] Add multipstep chunked-prefill support for FlashInfer (vllm-pr…

654f5d7

…oject#10467) Signed-off-by: hzh <[email protected]>

[core] Turn off GPU communication overlap for Ray executor (vllm-proj…

8146c68

…ect#12051) Signed-off-by: Rui Qiao <[email protected]> Signed-off-by: hzh <[email protected]>

[core] platform agnostic executor via collective_rpc (vllm-project#11256

59e5cf4

) Signed-off-by: youkaichao <[email protected]> Signed-off-by: hzh <[email protected]>

merge main

920038b

Signed-off-by: hzh <[email protected]>

video embedding supports

6f6d2eb

Signed-off-by: hzh <[email protected]>

update support for minicpmo on images and videos

364bca1

Signed-off-by: hzh <[email protected]>

audio language

c2d8dbb

Signed-off-by: hzh <[email protected]>

audio embedding inputs

1ba77eb

Signed-off-by: hzh <[email protected]>

format

1c6f7d8

Signed-off-by: hzh <[email protected]>

HwwwwwwwH force-pushed the minicpmv-refactor branch from 063c349 to 1c6f7d8 Compare January 22, 2025 14:32

mergify bot added the documentation and ci/build labels Jan 22, 2025

mergify bot added the frontend and needs-rebase labels Jan 22, 2025
