Changing the image resolution for text-image orientation classification #3334

Open

tianwenzhe opened this issue Dec 24, 2024 · 1 comment

tianwenzhe commented Dec 24, 2024

I am training an orientation classifier for images that contain text, using the pretrained weights text_image_orientation_pretrained.pdparams.

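https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/training/single_label_classification/finetune.md#faq
describes how to raise the resolution from 224 to 320.
Q1: Why are some parameters changed from 224 to 320, while resize_short is changed from 256 to 366?
Q2: If I want an even higher resolution, how should resize_short be calculated?
Q3: After raising the resolution, can the text_image_orientation_pretrained.pdparams pretrained weights still be used?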

TingquanGao self-assigned this

Collaborator

TingquanGao commented Dec 25, 2024

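The preprocessing first resizes the image while preserving its aspect ratio and then crops it, so the short side after resizing must be larger than the crop size. ImageNet training conventionally uses 256 and 224 (a ratio of 224/256 = 0.875), so resolution changes are usually scaled from that pair, for example up to 366 and 320. RandCropImage, which is used for training, only needs the post-crop size (such as 224 or 320); by default it uses the 0.875 ratio to compute the resize size (giving 256 or 366). For evaluation or inference you therefore need to specify 256 together with 224, or 366 together with 320.
If you need a higher resolution, just make sure resize_short is greater than or equal to the crop size; there is no strict formula.
Yes, you can still use that pretrained weight when fine-tuning after changing the resolution.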
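
For anyone who wants to keep the conventional 0.875 ratio at other resolutions, here is a minimal sketch of the arithmetic. The helper name `suggest_resize_short` is hypothetical and is not part of PaddleClas, and rounding up is an assumption chosen because it reproduces the 320 to 366 pair mentioned above. As noted in the reply, any value greater than or equal to the crop size is also acceptable.

```python
def suggest_resize_short(crop_size: int) -> int:
    """Return a resize_short value that keeps crop_size / resize_short near 0.875.

    Hypothetical helper, not a PaddleClas API. 0.875 equals 7/8, so
    crop_size / 0.875 is crop_size * 8 / 7; integer arithmetic is used to
    round up without floating-point error (rounding up is an assumption).
    """
    if crop_size <= 0:
        raise ValueError("crop_size must be a positive integer")
    return (crop_size * 8 + 6) // 7  # ceil(crop_size * 8 / 7)


if __name__ == "__main__":
    # Print the (crop size, resize_short) pairs for a few resolutions.
    for crop in (224, 320, 448):
        print(crop, suggest_resize_short(crop))
```

The resulting pair would then be used for the evaluation or inference settings, with the short-side resize set to the larger number and the crop set to the smaller one.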
