Accelerating GLM-4V inference with vLLM #705

Open
elesun2018 opened this issue Jan 17, 2025 · 2 comments

@elesun2018

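Accelerating GLM-4V inference with vLLM

The code is as follows:

[Image: code screenshot]

How can I run batch inference, processing 4 images in a single pass?

Single-image inference runs at 38 tokens/s, but when I loop over a folder and run inference on each image, the speed drops to 8 tokens/s.
Has anyone else run into this problem?
Thanks!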

@zRzRzRzRzRzRzR
Member

This model only supports a single image.

@zRzRzRzRzRzRzR self-assigned this on Jan 20, 2025
@elesun2018
Author

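Do you mean that it does not support running inference on multiple image-text conversations at the same time?

Also:
Single-image inference runs at 38 tokens/s, but when I loop over a folder and run inference on each image, the speed drops to 8 tokens/s.
What might be the cause of this?
Thanks!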
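
For anyone comparing the two numbers above, here is a minimal sketch of how several single-image requests could be submitted together and timed. It is not the code from the screenshot and it does not diagnose the slowdown. It assumes that vLLM's `LLM.generate` accepts a list of requests, each a dict with a `prompt` and a `multi_modal_data` entry holding one image. Whether that works for GLM-4V in a particular vLLM version is not confirmed in this thread. `build_requests` and `measure_throughput` are hypothetical helper names, and the `generate` callable is injected so that only generation time is measured, excluding image loading and engine start-up.

```python
import time
from typing import Any, Callable, Sequence


def build_requests(images: Sequence[Any], prompt: str) -> list[dict]:
    """Build one request per image, so every prompt carries exactly one image."""
    return [{"prompt": prompt, "multi_modal_data": {"image": img}} for img in images]


def measure_throughput(
    generate: Callable[[list[dict]], Sequence[Any]],
    requests: list[dict],
) -> float:
    """Time only the generate call and return generated tokens per second.

    Each output is assumed to expose `outputs[0].token_ids`, as vLLM's
    RequestOutput objects do.
    """
    start = time.perf_counter()
    outputs = generate(requests)
    elapsed = time.perf_counter() - start
    total_tokens = sum(len(out.outputs[0].token_ids) for out in outputs)
    return total_tokens / elapsed if elapsed > 0 else float("inf")


# Hypothetical usage with vLLM (not run here; the model name, the prompt
# format, and `image_paths` are assumptions, not taken from this issue):
#
#   from vllm import LLM, SamplingParams
#   from PIL import Image
#
#   llm = LLM(model="THUDM/glm-4v-9b", trust_remote_code=True)  # create once
#   images = [Image.open(p).convert("RGB") for p in image_paths]
#   reqs = build_requests(images, prompt)
#   params = SamplingParams(max_tokens=256)
#   print(measure_throughput(lambda r: llm.generate(r, params), reqs))
```

Timing a one-element request list and a four-element one with the same `measure_throughput` call gives figures that are directly comparable, since file loading and model initialization are outside the timed region in both cases.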
