InternLM / lmdeploy Public


Issues: InternLM/lmdeploy

[Docs] inference DeepSeek-V3 with LMDeploy

#2960 opened Dec 26, 2024 by haswelliris

Open 14


328 Open 1,284 Closed


Issues list

[Bug] OOM in jetson but not in x86

#3006 opened Jan 9, 2025 by quanfeifan

3 tasks

[Bug] NPU Out of Memory while using Qwen2-VL-72B-Instruct to infer

#3005 opened Jan 9, 2025 by 00drdelius

3 tasks done

[Bug] CUDA error: an illegal memory access was encountered. Vicuna results wrong

#3004 opened Jan 9, 2025 by AllentDan

3 tasks

[Bug] After deploying llava-hf/llava-v1.6-vicuna-13b-hf with lmdeploy 0.6.4, OpenAI API calls return an empty string

#3003 opened Jan 9, 2025 by bang123-box

3 tasks

[Bug] RuntimeError: Triton Error [CUDA]: an illegal memory access was encountered

#2999 opened Jan 8, 2025 by YSShannon

1 of 3 tasks

[Bug] Quantizing MLLM models on transformers>=4.47 raises an error: cannot import name 'shard_checkpoint' from 'transformers.modeling_utils'

#2997 opened Jan 8, 2025 by zhulinJulia24

3 tasks

[Bug] internvl2_8b, 4 3090 cards, CUDA OOM error

#2993 opened Jan 7, 2025 by zhaowenZhou

3 tasks done

[Bug] Over time, the lmdeploy process's CPU usage climbs and is not released

#2990 opened Jan 6, 2025 by akai-shuuichi

3 tasks done

[Bug] There is no runtime.txt in requirements/ (label: awaiting response)

#2985 opened Jan 3, 2025 by 86kkd

3 tasks done

Inference results are truncated when using the lmdeploy APIClient interface on Ascend

#2969 opened Dec 28, 2024 by winni0

3 tasks done

[Feature] W8A8 support for turbomind engine

#2962 opened Dec 27, 2024 by binhtranmcs

[Docs] inference DeepSeek-V3 with LMDeploy

#2960 opened Dec 26, 2024 by haswelliris

[Feature] I have a piece of code and don't know how to use LMDeploy to accelerate it

#2958 opened Dec 26, 2024 by CallmeZhangChenchen

[Feature] Support Medusa decode (Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads)

#2950 opened Dec 25, 2024 by yimuu

[Bug] generation profile hangs on Mixtral-8x7B-Instruct-v0.1 with pytorch backend

#2948 opened Dec 24, 2024 by zhulinJulia24

3 tasks

Failed to start the service on 910B

#2945 opened Dec 24, 2024 by SefaZeng

3 tasks done

[Feature] Control over Prefix Cache Capacity and Guaranteed Caching

#2942 opened Dec 23, 2024 by ZhuQian0909

[Bug] How to output hidden layers during inference

#2937 opened Dec 22, 2024 by owerbai

3 tasks

[Bug] deploy InternVL2.5 8B, Aborted (core dumped)

#2932 opened Dec 20, 2024 by rover5056

[Bug] How can I use my own dataset for lmdeploy AWQ quantization? Does the calibration dataset support only text question-answering datasets?

#2931 opened Dec 20, 2024 by lzk9508

3 tasks

[Feature] pipeline.get_logits() support tensor on gpu as input rather than list input

#2926 opened Dec 19, 2024 by BradZhone

[Feature] Support deploy model on specific device id

#2925 opened Dec 19, 2024 by BradZhone

After deployment, inference latency increases as requests are sent continuously

#2924 opened Dec 18, 2024 by nzomi

[Feature] how to apply a custom calibration data set during awq quant?

#2923 opened Dec 18, 2024 by lzk9508

[Bug] Model started via server; streaming calls return an error

#2920 opened Dec 18, 2024 by pangr

3 tasks
