Can't reproduce the results on cityscapes #35

agenthong · 2021-08-18T09:45:39Z

Hi, it's a great work!

I follow the instruction to train and evaluate on cithscapes, while got following result which is slightly different from the paper.

Hence, how can I achieve the SOTA?

agenthong · 2021-08-18T09:48:30Z

By the way, the code train the model with resolutin of 192 × 512. But the paper is 128 × 416. Is it the same?

agenthong · 2021-08-19T14:44:27Z

@JamieWatson683

JamieWatson683 · 2021-08-20T06:44:04Z

Hi, thanks for your interest in the project!

Okay interesting - have you pulled the latest changes? There was a bug a few weeks ago which meant that the teacher network + pose network were not frozen, which would lead to a deterioration in accuracy - especially for the cityscapes dataset.

If that is not the issue then please let me know and I will do some further investigation.

And about the resolution - we train at 192x512 as per SfmLearner data preparation (i.e. just crop out the ego car), but at test time we crop our predictions to match the 128x416 region of other works (this crops out some of the top and sides of the image). If you take a look at line 340 of evaluate_depth.py it should be more clear what is happening.

Thanks a lot

agenthong · 2021-08-20T09:43:09Z

Hi, thanks for your interest in the project!

Okay interesting - have you pulled the latest changes? There was a bug a few weeks ago which meant that the teacher network + pose network were not frozen, which would lead to a deterioration in accuracy - especially for the cityscapes dataset.

If that is not the issue then please let me know and I will do some further investigation.

And about the resolution - we train at 192x512 as per SfmLearner data preparation (i.e. just crop out the ego car), but at test time we crop our predictions to match the 128x416 region of other works (this crops out some of the top and sides of the image). If you take a look at line 340 of evaluate_depth.py it should be more clear what is happening.

Thanks a lot

Thanks for replying! Could you show more details about the changes you mentioned? I'm not sure if I pull this change, but I just clone this code two days ago so I guess it's not the reason.

agenthong · 2021-08-21T03:16:31Z

@JamieWatson683
Hi! I figure it out that I've pulled the changes and it's not the reason network doesn't achieve SOTA.

agenthong · 2021-08-21T13:43:32Z

By the way, should I add --freeze_teacher_epoch 15 when I train for the KITTI?

longyangqi · 2021-08-25T14:35:41Z

Hi, thanks for your interest in the project!

Okay interesting - have you pulled the latest changes? There was a bug a few weeks ago which meant that the teacher network + pose network were not frozen, which would lead to a deterioration in accuracy - especially for the cityscapes dataset.

If that is not the issue then please let me know and I will do some further investigation.

And about the resolution - we train at 192x512 as per SfmLearner data preparation (i.e. just crop out the ego car), but at test time we crop our predictions to match the 128x416 region of other works (this crops out some of the top and sides of the image). If you take a look at line 340 of evaluate_depth.py it should be more clear what is happening.

Thanks a lot

Thanks for the awesome works!

Here, do you mean the changes in

manydepth/manydepth/trainer.py

Lines 213 to 220 in 893b7a9

    
           # here we reinitialise our optimizer to ensure there are no updates to the 
        
           # teacher and pose networks 
        
           self.parameters_to_train = [] 
        
           self.parameters_to_train += list(self.models["encoder"].parameters()) 
        
           self.parameters_to_train += list(self.models["depth"].parameters()) 
        
           self.model_optimizer = optim.Adam(self.parameters_to_train, self.opt.learning_rate) 
        
           self.model_lr_scheduler = optim.lr_scheduler.StepLR( 
        
               self.model_optimizer, self.opt.scheduler_step_size, 0.1)

As we set train_teacher_and_pose =False after 15 epochs

manydepth/manydepth/trainer.py

Line 210 in 893b7a9

self.train_teacher_and_pose = False

the forward operation of Teacher and Pose net will run with torch.no_grad. Do we really need these changes to fix the bug?

Thanks!

agenthong · 2021-08-26T01:30:36Z

Hi, thanks for your interest in the project!
Okay interesting - have you pulled the latest changes? There was a bug a few weeks ago which meant that the teacher network + pose network were not frozen, which would lead to a deterioration in accuracy - especially for the cityscapes dataset.
If that is not the issue then please let me know and I will do some further investigation.
And about the resolution - we train at 192x512 as per SfmLearner data preparation (i.e. just crop out the ego car), but at test time we crop our predictions to match the 128x416 region of other works (this crops out some of the top and sides of the image). If you take a look at line 340 of evaluate_depth.py it should be more clear what is happening.
Thanks a lot

Thanks for the awesome works!

Here, do you mean the changes in

manydepth/manydepth/trainer.py

Lines 213 to 220 in 893b7a9

# here we reinitialise our optimizer to ensure there are no updates to the

# teacher and pose networks

self.parameters_to_train = []

self.parameters_to_train += list(self.models["encoder"].parameters())

self.parameters_to_train += list(self.models["depth"].parameters())

self.model_optimizer = optim.Adam(self.parameters_to_train, self.opt.learning_rate)

self.model_lr_scheduler = optim.lr_scheduler.StepLR(

self.model_optimizer, self.opt.scheduler_step_size, 0.1)

As we set train_teacher_and_pose =False after 15 epochs

manydepth/manydepth/trainer.py

Line 210 in 893b7a9

self.train_teacher_and_pose = False

the forward operation of Teacher and Pose net will run with torch.no_grad. Do we really need these changes to fix the bug?
Thanks!

The code has been fixed. And your changes is correct.

Now I can see freezing teacher and pose networks! after 5 epochs.

By the way, can you reproduce the results for cityscape?

longyangqi · 2021-08-26T01:51:37Z

The code has been fixed. And your changes is correct.

Now I can see freezing teacher and pose networks! after 5 epochs.

By the way, can you reproduce the results for cityscape?

Sorry, I did not try it on Cityscapes, I got slightly worse results on KITTI (absrel 0.101 vs 0.098 in paper)

So I'm curious about the freezing mode. And I think it's not really a bug which needs to be fixed.
I wonder why we need these changes. Does chese changes matter much on your experiments?

agenthong · 2021-08-26T07:46:11Z

The code has been fixed. And your changes is correct.
Now I can see freezing teacher and pose networks! after 5 epochs.
By the way, can you reproduce the results for cityscape?

Sorry, I did not try it on Cityscapes, I got slightly worse results on KITTI (absrel 0.101 vs 0.098 in paper)

So I'm curious about the freezing mode. And I think it's not really a bug which needs to be fixed.
I wonder why we need these changes. Does chese changes matter much on your experiments?

Actually I've done the same experiments and if you train for 30 epochs you can get a better result which is nearly the same as paper.

As for freezing, my experiments show that it has less impact on KITTI but is important for Cityscape. If you train for KITTI, it doesn't matter.

longyangqi · 2021-08-27T02:19:12Z

Actually I've done the same experiments and if you train for 30 epochs you can get a better result which is nearly the same as paper.

As for freezing, my experiments show that it has less impact on KITTI but is important for Cityscape. If you train for KITTI, it doesn't matter.

Thanks for your reply! It makes sense, I'll have a try for more epochs.

agenthong · 2021-08-27T13:48:12Z

Actually I've done the same experiments and if you train for 30 epochs you can get a better result which is nearly the same as paper.
As for freezing, my experiments show that it has less impact on KITTI but is important for Cityscape. If you train for KITTI, it doesn't matter.

Thanks for your reply! It makes sense, I'll have a try for more epochs.

Hi!

I contacted with Jamie and we figured it out that the teacher isn't frozen. So the code has bug and he will fix it later.

DemingWu · 2021-12-09T07:47:49Z

@JamieWatson683 @agenthong, hello, may i ask you about the setup on cityscapes. especially about STEREO_SCALE_FACTOR in evaluate_depth.py 31 line. Because i use setup as (0.22x2262)/(pred_dispx2048) to get pred_depth.

fengziyue · 2021-12-25T07:09:55Z

Actually I've done the same experiments and if you train for 30 epochs you can get a better result which is nearly the same as paper.
As for freezing, my experiments show that it has less impact on KITTI but is important for Cityscape. If you train for KITTI, it doesn't matter.

Thanks for your reply! It makes sense, I'll have a try for more epochs.

Hi!

I contacted with Jamie and we figured it out that the teacher isn't frozen. So the code has bug and he will fix it later.

@JamieWatson683 @agenthong Could you share what exactly is the bug you mentioned? I tried to train CityScape and only get AbsRel=0.137

agenthong · 2021-12-28T07:24:40Z

Actually I've done the same experiments and if you train for 30 epochs you can get a better result which is nearly the same as paper.
As for freezing, my experiments show that it has less impact on KITTI but is important for Cityscape. If you train for KITTI, it doesn't matter.

Thanks for your reply! It makes sense, I'll have a try for more epochs.

Hi!
I contacted with Jamie and we figured it out that the teacher isn't frozen. So the code has bug and he will fix it later.

@JamieWatson683 @agenthong Could you share what exactly is the bug you mentioned? I tried to train CityScape and only get AbsRel=0.137

Unlikely, this is the best result I get till now. So I can't reproduce the result on cityscapes.

JamieWatson683 · 2021-12-30T10:46:41Z

Hi - really sorry for the delay in getting back on this, I was looking at it, but then had CVPR submission and some vacation to take. I will get back to this in early January, and hopefully push a fix!

fengziyue · 2021-12-30T18:12:02Z

Hi - really sorry for the delay in getting back on this, I was looking at it, but then had CVPR submission and some vacation to take. I will get back to this in early January, and hopefully push a fix!

@JamieWatson683 currently do you have any ideas about this issue? Any insights are greatly appreciated

JamieWatson683 · 2022-01-07T11:24:07Z

Hi - I have now had a chance to look into the Cityscapes issue a bit more closely.

I have been able to reproduce the results in the paper, but it requires a few tweaks:

batch size of 8
earlier freezing of the teacher + pose networks
using evaluation batch norm statistics for the teacher + pose networks after they are frozen

Even with this changes, it appears that Cityscapes scores are very sensitive to initialisation.

For the paper the main experimentation was done on KITTI so I didn't realise quite how much the scores can vary on Cityscapes - I would guess that it is because of the far greater number of moving objects, something which Monodepth2 (our teacher network) can struggle with. I presume that for Cityscapes, using a teacher network which can better handle moving objects would be highly beneficial (e.g. using optical flow etc), and would allow ManyDepth to perform even better.

Regardless, I've made some changes to the code and will push shortly which will:

allow for freezing the teacher based on step number rather than epoch number
change the batch norm as described above
allow random seeding to set via the command line

I hope that this will allow you to reach similar scores to the paper.

Thanks a lot,
Jamie

agenthong · 2022-01-07T15:17:29Z

@JamieWatson683

Thanks for helping us reproduce the results! I'll try this later.

BTW, do you think the teacher network is still necessary for indoor scenes? In my opinion, the mono network is mainly for moving objects, which disappear in the indoor environment. As for untextured area, I assume it could perform better with other methods.

Thanks!

JamieWatson683 · 2022-01-11T15:24:09Z

Hi - I have just pushed an update now with the things I mentioned above. Annoyingly the random seed setting doesn't remove all randomness, but seems to give more stable results.

I have managed to obtain scores similar to those in the paper (albeit with some variation) using the below commands:

batch_size 8
pytorch_random_seed 1
freeze_teacher_step 14000

You can optionally add the --save_intermediate_results to save out the model at each logging step (every 2000 steps). This will allow you to do some digging into how the scores behave over time - it might be interesting to see how the teacher network is scoring, as the performance of this is what leads to good/bad scores of MnayDepth on Cityscapes.

I'll update the readme with these tips if you can also reach reasonable scores.

@agenthong - for indoor scenes, yes I'd imagine you won't need the teacher network as it is mainly used for moving objects. It might help in untextured regions, but certainly worth experimenting with! My concern with indoor data would be:

the pose network may not work very well for arbitrary camera motion - do you have posed images? Or can you run a SLAM system to estimate poses?
just using the previous frame for building the cost volume isn't likely to work all that well, you may want to implement a keyframe selection algorithm such as from DeepVideoMVS.

Either way, let me know how you get on!

longyangqi · 2022-04-03T08:16:50Z

@agenthong @fengziyue @DemingWu

Thanks for helping us reproduce the results! I'll try this later.

Have you achieved the results described in the paper and figured out the reason of poor results?

ZhuYingJessica · 2022-05-16T10:11:23Z

@agenthong Hi, how do u evaluate on Cityscapes. I didn't find relevant evaluation code in this project. Looking forward to your reply. Thanks a lot!

JamieWatson683 · 2022-05-16T10:25:51Z

Hi @ZhuYingJessica - details on how to evaluate on cityscapes are in the readme.

Let me know if you have any other questions!

ZhuYingJessica · 2022-05-16T14:11:50Z

Hi! Thanks for your reminding! I wonder why Min_depth and Max_depth are still 1e-3 and 80 during evaluating Cityscapes dataset. It seems the Max_depth is larger in Cityscapes. Looking forward to your reply.

Hi @ZhuYingJessica - details on how to evaluate on cityscapes are in the readme.

Let me know if you have any other questions!

wangjiyuan9 · 2023-02-28T13:33:08Z

Hi! Thanks for your reminding! I wonder why Min_depth and Max_depth are still 1e-3 and 80 during evaluating Cityscapes dataset. It seems the Max_depth is larger in Cityscapes. Looking forward to your reply.

Hi @ZhuYingJessica - details on how to evaluate on cityscapes are in the readme.
Let me know if you have any other questions!

Hello！ I also have the same problem with the MAX and MIN depth. Actually, I wonder:
1.The unit of the num in the ".npy" file. Are they in meter?
2. Why always use 80m as the MAX depth? I checked the MAX value of your cityscape gt_depth is 473.5748014025198
And the MAX value of Kitti dataset gt_depth is about 82. So I think the "80" is ok for Kitti but not adapted to Cityscape.

WangXuCh · 2023-09-07T07:02:20Z

Hi! Thanks for your reminding! I wonder why Min_depth and Max_depth are still 1e-3 and 80 during evaluating Cityscapes dataset. It seems the Max_depth is larger in Cityscapes. Looking forward to your reply.

Hi @ZhuYingJessica - details on how to evaluate on cityscapes are in the readme.
Let me know if you have any other questions!

Hello！ I also have the same problem with the MAX and MIN depth. Actually, I wonder: 1.The unit of the num in the ".npy" file. Are they in meter? 2. Why always use 80m as the MAX depth? I checked the MAX value of your cityscape gt_depth is 473.5748014025198 And the MAX value of Kitti dataset gt_depth is about 82. So I think the "80" is ok for Kitti but not adapted to Cityscape.

I guess this is because the depth predicted by the network is inaccurate beyond 80, so only depths within 80 are calculated.

hutingz · 2024-03-07T02:22:00Z

Hi, it's a great work!

I follow the instruction to train and evaluate on cithscapes, while got following result which is slightly different from the paper.

Hence, how can I achieve the SOTA?

Hello, have you reproduced Cityscapes? How was its data processed during training? I followed README.md to perform Cityscapes preprocessing and obtained image. jpg and camera parameter. txt files. However, during training, the output was all black (the input. jpg was a 24 bit black image). Did my data not process correctly？

Cresynia · 2024-08-17T09:24:18Z

By the way, the code train the model with resolutin of 192 × 512. But the paper is 128 × 416. Is it the same?

Hello, I noticed that in evaluate_depth.py, the resolution being processed is 512×1664, not 128×416 as stated in the paper. I’m not entirely clear on this. Could you please explain the reason behind this difference?

Ecalpal mentioned this issue Jun 27, 2022

The results of cityscapes #52

Closed

fengziyue mentioned this issue Aug 18, 2022

Training from scratch getting worse results AutoAILab/DynamicDepth#7

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Can't reproduce the results on cityscapes #35

Can't reproduce the results on cityscapes #35

agenthong commented Aug 18, 2021

agenthong commented Aug 18, 2021 •

edited

Loading

agenthong commented Aug 19, 2021

JamieWatson683 commented Aug 20, 2021

agenthong commented Aug 20, 2021

agenthong commented Aug 21, 2021

agenthong commented Aug 21, 2021

longyangqi commented Aug 25, 2021

agenthong commented Aug 26, 2021 •

edited

Loading

longyangqi commented Aug 26, 2021

agenthong commented Aug 26, 2021

longyangqi commented Aug 27, 2021

agenthong commented Aug 27, 2021

DemingWu commented Dec 9, 2021 •

edited

Loading

fengziyue commented Dec 25, 2021

agenthong commented Dec 28, 2021

JamieWatson683 commented Dec 30, 2021

fengziyue commented Dec 30, 2021

JamieWatson683 commented Jan 7, 2022 •

edited

Loading

agenthong commented Jan 7, 2022

JamieWatson683 commented Jan 11, 2022

longyangqi commented Apr 3, 2022 •

edited

Loading

ZhuYingJessica commented May 16, 2022

JamieWatson683 commented May 16, 2022

ZhuYingJessica commented May 16, 2022

wangjiyuan9 commented Feb 28, 2023

WangXuCh commented Sep 7, 2023

hutingz commented Mar 7, 2024

Cresynia commented Aug 17, 2024

Can't reproduce the results on cityscapes #35

Can't reproduce the results on cityscapes #35

Comments

agenthong commented Aug 18, 2021

agenthong commented Aug 18, 2021 • edited Loading

agenthong commented Aug 19, 2021

JamieWatson683 commented Aug 20, 2021

agenthong commented Aug 20, 2021

agenthong commented Aug 21, 2021

agenthong commented Aug 21, 2021

longyangqi commented Aug 25, 2021

agenthong commented Aug 26, 2021 • edited Loading

longyangqi commented Aug 26, 2021

agenthong commented Aug 26, 2021

longyangqi commented Aug 27, 2021

agenthong commented Aug 27, 2021

DemingWu commented Dec 9, 2021 • edited Loading

fengziyue commented Dec 25, 2021

agenthong commented Dec 28, 2021

JamieWatson683 commented Dec 30, 2021

fengziyue commented Dec 30, 2021

JamieWatson683 commented Jan 7, 2022 • edited Loading

agenthong commented Jan 7, 2022

JamieWatson683 commented Jan 11, 2022

longyangqi commented Apr 3, 2022 • edited Loading

ZhuYingJessica commented May 16, 2022

JamieWatson683 commented May 16, 2022

ZhuYingJessica commented May 16, 2022

wangjiyuan9 commented Feb 28, 2023

WangXuCh commented Sep 7, 2023

hutingz commented Mar 7, 2024

Cresynia commented Aug 17, 2024

agenthong commented Aug 18, 2021 •

edited

Loading

agenthong commented Aug 26, 2021 •

edited

Loading

DemingWu commented Dec 9, 2021 •

edited

Loading

JamieWatson683 commented Jan 7, 2022 •

edited

Loading

longyangqi commented Apr 3, 2022 •

edited

Loading