RuntimeError: EncoderDecoder: VisionMambaSeg: shape '[-1, 14, 14, 192]' is invalid for input of size 37824 #108
Comments
Same issue. Have you fixed it yet?
No, I can't apply the pretrained weights to the segmentation model.
Is VisionMambaSeg a pretrained model? I loaded the pretrained Vim-small+ checkpoint (26M, 81.6, 95.4) from https://huggingface.co/hustvl/Vim-small-midclstok. However, there are many mismatches; for example, the downloaded checkpoint has no checkpoint['meta'] entry, which mmcv expects.
I tried to fine-tune the segmentation model using the pretrained Vim-T, but encountered the following issue while executing
bash scripts/ft_vim_tiny_upernet.sh
The error propagates through multiple functions and ends with:
RuntimeError: EncoderDecoder: VisionMambaSeg: shape '[-1, 14, 14, 192]' is invalid for input of size 37824
The pretrained weight I used was
vim_t_midclstok_76p1acc.pth
which seems to be the correct one. If it were not, I would expect an error while loading, such as
size mismatch for norm_f.weight: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([384])
but I didn't get this error.
So I guess there might be an issue with the model settings, but I’m not sure. 37824 = (14*14 + 1) * 192, and the "+1" is the part that leads to the error. If the "+1" is the middle class token, should I just drop it for the segmentation model?
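To make the arithmetic concrete, here is a minimal pure-Python sketch that checks the token count and removes one extra token. It is not taken from the Vim code base: the helper names `diagnose_tokens` and `drop_middle_token` are hypothetical, and the position of the class token (index `num_patches // 2`) is an assumption based on the "midclstok" checkpoint name, so it should be verified against the backbone's forward pass before relying on it.

```python
def diagnose_tokens(numel, grid_h, grid_w, embed_dim, batch=1):
    """Return how many tokens exist beyond the grid_h * grid_w patch tokens."""
    if numel % (batch * embed_dim) != 0:
        raise ValueError("element count is not a multiple of batch * embed_dim")
    num_tokens = numel // (batch * embed_dim)
    return num_tokens - grid_h * grid_w


def drop_middle_token(tokens, num_patches):
    """Remove the token at index num_patches // 2.

    ASSUMPTION: the class token was inserted in the middle of the
    patch sequence; this is not confirmed by the error message itself.
    """
    if len(tokens) != num_patches + 1:
        raise ValueError("expected exactly one extra token")
    mid = num_patches // 2
    return tokens[:mid] + tokens[mid + 1:]


# Numbers from the error message: 37824 elements, 14 x 14 grid, width 192.
extra = diagnose_tokens(37824, 14, 14, 192)
print(extra)  # 1 extra token, since 37824 = 197 * 192

# Toy sequence standing in for the 197 token embeddings.
tokens = [f"patch{i}" for i in range(196)]
tokens.insert(196 // 2, "cls")
patches = drop_middle_token(tokens, 196)
print(len(patches))  # 196, which fits a 14 x 14 grid
```

On a real tensor `x` of shape `(B, 197, 192)`, the same idea would be something like `torch.cat((x[:, :mid], x[:, mid + 1:]), dim=1)` before the reshape to `[-1, 14, 14, 192]`, again assuming the extra token really sits at `mid`.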
Has anyone encountered this problem, or successfully fine-tuned a segmentation model?
Thank you very much!