How to run without training #556

bkbilly-intrack · 2022-02-03T09:25:03Z

bkbilly-intrack
Feb 3, 2022

I was looking for a way to run segmentation_models_pytorch with pretrained models, but I couldn't find any instructions.
My goal is to run it on a live stream using OpenCV's VideoCapture.
Do you have an example of how to run live segmentation?

I tried the following, but I got an error:

import cv2
import segmentation_models_pytorch as smp
import torch


model = smp.Unet(
    encoder_name="resnet34",        # choose encoder, e.g. mobilenet_v2 or efficientnet-b7
    encoder_weights="imagenet",     # use `imagenet` pre-trained weights for encoder initialization
    in_channels=3,                  # model input channels (1 for gray-scale images, 3 for RGB, etc.)
    classes=3,                      # model output channels (number of classes in your dataset)
)

DEVICE = 'cuda'
vid = cv2.VideoCapture(0)

while(True):
    ret, frame = vid.read()

    x_tensor = torch.from_numpy(frame).to(DEVICE).unsqueeze(0)
    pr_mask = model.predict(x_tensor)
    pr_mask = (pr_mask.squeeze().cpu().numpy().round())

    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

vid.release()
cv2.destroyAllWindows()

RuntimeError: Given groups=1, weight of size [64, 3, 7, 7], expected input[1, 480, 640, 3] to have 1 channels, but got 480 channels instead
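
The error message points at a layout mismatch: OpenCV returns frames as (height, width, channels), while the first convolution expects (batch, channels, height, width). A minimal sketch of the conversion, using NumPy only, could look like the following. The helper name `frame_to_batch` is hypothetical, and the BGR-to-RGB flip and ImageNet mean/std normalization are assumptions about what an ImageNet-pretrained encoder typically expects, not something stated in this thread.

```python
import numpy as np

# Assumption: the encoder expects RGB input scaled to [0, 1] and normalized
# with the usual ImageNet statistics.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def frame_to_batch(frame_bgr: np.ndarray) -> np.ndarray:
    """Convert an OpenCV frame (H, W, 3), uint8, BGR into a (1, 3, H, W) float32 batch."""
    rgb = frame_bgr[:, :, ::-1].astype(np.float32) / 255.0  # BGR -> RGB, scale to [0, 1]
    rgb = (rgb - IMAGENET_MEAN) / IMAGENET_STD               # per-channel normalization
    chw = np.transpose(rgb, (2, 0, 1))                       # HWC -> CHW
    return np.ascontiguousarray(chw[np.newaxis, ...])        # add the batch dimension


# A dummy 480x640 frame stands in for the result of vid.read().
dummy_frame = np.zeros((480, 640, 3), dtype=np.uint8)
batch = frame_to_batch(dummy_frame)
print(batch.shape)  # (1, 3, 480, 640)
```

The resulting array can be wrapped with `torch.from_numpy(batch).to(DEVICE)` before it is passed to the model.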

bkbilly-intrack · 2022-02-03T12:05:08Z

bkbilly-intrack
Feb 3, 2022
Author

I got it working, but it seems that the pretrained models don't recognise the objects correctly.
This is the code I used:

import cv2
import segmentation_models_pytorch as smp
import torch


DEVICE = 'cuda'
model = smp.Unet(
    encoder_name="resnet50",        # choose encoder, e.g. mobilenet_v2 or efficientnet-b7
    encoder_weights="imagenet",     # use `imagenet` pre-trained weights for encoder initialization
    in_channels=3,                  # model input channels (1 for gray-scale images, 3 for RGB, etc.)
    classes=3,                      # model output channels (number of classes in your dataset)
)
model.to(DEVICE)

vid = cv2.VideoCapture(0)

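while True:
    ret, frame = vid.read()
    if not ret:  # stop if the camera did not return a frame
        break

    # OpenCV frames are (H, W, C); the model expects a float (N, C, H, W) tensor.
    x_tensor = torch.from_numpy(frame.transpose((2, 0, 1))).to(DEVICE).unsqueeze(0).float()
    pr_mask = model.predict(x_tensor)
    # Back to (H, W, C) so OpenCV can display the 3-channel output.
    pr_mask = (pr_mask.squeeze().cpu().numpy()).transpose((1, 2, 0))

    print(pr_mask.shape)
    cv2.imshow('frame', frame)
    cv2.imshow('mask', pr_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break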

vid.release()
cv2.destroyAllWindows()
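
The snippet above hands the raw three-channel model output straight to `cv2.imshow`. For a multi-class model, a more readable display is usually obtained by taking the per-pixel argmax over the class channel and mapping each class index to a color. The sketch below shows that step with NumPy only. The helper name `logits_to_color_mask` and the three-color palette are hypothetical and chosen purely for illustration. This only changes how the output is displayed; it does not make the predictions of an untrained model meaningful.

```python
import numpy as np


def logits_to_color_mask(logits: np.ndarray) -> np.ndarray:
    """Turn raw (C, H, W) model output into an (H, W, 3) uint8 image for display."""
    class_map = np.argmax(logits, axis=0)  # (H, W): most likely class per pixel
    # Hypothetical palette: one BGR color per class, for illustration only.
    palette = np.array([[0, 0, 0], [0, 255, 0], [0, 0, 255]], dtype=np.uint8)
    return palette[class_map]


# Fake output for a 3-class model on a 4x4 image: class 1 wins in the top half.
fake_logits = np.zeros((3, 4, 4), dtype=np.float32)
fake_logits[1, :2, :] = 5.0
color = logits_to_color_mask(fake_logits)
print(color.shape)  # (4, 4, 3)
```

In the loop, `pr_mask.squeeze().cpu().numpy()` (before the transpose) would be the `logits` argument, and the returned image can be shown with `cv2.imshow`.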

qubvel · 2022-02-03T12:10:01Z

qubvel
Feb 3, 2022
Maintainer

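Hi, only the backbones (encoders) of these models are pretrained. The models as a whole are not trained to segment any particular objects, so they need to be trained on your own data before the predicted masks are meaningful.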

1 reply

bkbilly-intrack Feb 3, 2022
Author

Thanks for clarifying that!
