code_multi train waymo dataset reproduce problem #54

Open
blackmrb opened this issue May 6, 2024 · 1 comment

blackmrb commented May 6, 2024

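First, thank you to the authors for generously open-sourcing this repo. I have tried several methods, and Neuralsim gives the best results so far.

I ran 18 experiments on Waymo (the scenes were picked from the 81 dynamic scenes provided by the repo and cover both low-speed and high-speed driving). In 7 of them the loss did not converge.
Config used: all_occ.with_normals.240201.yaml
Segmentation model used: https://github.com/open-mmlab/mmsegmentation/tree/main/configs/mask2former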

image
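Questions:

  1. Loss not converging
    1. Why does the loss fail to converge when the ego vehicle drives fast? Which hyperparameters should be tuned for high-speed ego scenes? In the default config scene, segment-9653249092, the ego vehicle drives slowly, and in my experiments this scene gives the best result, but our use case cares more about high-speed scenes.
  2. How can poor reconstruction quality be improved? (videos below)
    1. Lane markings are blurry
    2. An extra cloud-like or droplet-like blob appears in the sky. What might cause it, and how could it be fixed?
    3. The reconstruction is completely blurred; see the seg-364414 video
    4. Distant regions flicker during the first few frames

Detailed reproduction results:

Training crashed partway through because the loss did not converge (7 scenes); ego speed is around 70 km/h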

    # - segment-1758724094753801109_1251_037_1271_037_with_camera_labels # 64km/h
    # - segment-3490810581309970603_11125_000_11145_000_with_camera_labels # 71km/h
    # - segment-3591015878717398163_1381_280_1401_280_with_camera_labels # 68km/h
    # - segment-4468278022208380281_455_820_475_820_with_camera_labels # 70km/h, good case
    # - segment-4537254579383578009_3820_000_3840_000_with_camera_labels # 68km/h, good case
    # - segment-10072231702153043603_5725_000_5745_000_with_camera_labels # 40 -> 70km/h, wide open road, only one vehicle ahead
    # - segment-11454085070345530663_1905_000_1925_000_with_camera_labels # 70km/h, good case

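Training completed (11 scenes)

Good results (4 scenes)

segment-9653249092275997647_980_000_1000_000_with_camera_labels, 0, 190 # Intersection with many pedestrians. This scene gives the best result; it is the default scene of code_multi/configs/exps/fg_neus=permuto/all_occ.with_normals.240201.yaml, and the ego vehicle drives slowly.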

seg965324_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4

119178, low-speed driving, 17 km/h

seg119178_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4

365758, low-speed driving, 20 km/h

seg365758_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4

189139, low-speed driving, 25 km/h

seg189139_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4

Blurry lane markings (2 scenes)

segment-15053781258223091665_3192_117_3212_117_with_camera_labels # 20 -> 75 km/h. Problem: lane markings are blurry
image

https://github.com/PJLab-ADG/neuralsim/assets/165770555/cfbf155e-fbaf-4220-8e2a-5248b649ff35
seg188749, fine overall but lane markings are blurry, 36 km/h

seg188749_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4

Extra blob in the sky (3 scenes)

segment-14369250836076988112_7249_040_7269_040_with_camera_labels # 56 km/h. Problem: extra blob in the sky
image
https://github.com/PJLab-ADG/neuralsim/assets/165770555/f4fb8402-2f6f-44e9-b3c5-a96bbb0ed5c0

177369, an extra patch appears in the sky, 17 km/h

seg177369_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4

391943, low-speed driving at night; the result is acceptable, but a patch of raindrop-like artifacts appears in the sky, 15 km/h
image

Completely blurred (1 scene)

364414, low-speed driving; the street is completely blurred and the result is very poor, 0 -> 50 km/h

seg364414_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4

Distant regions flicker during the first few frames (1 scene)

416406, 0 -> 25 km/h
https://github.com/PJLab-ADG/neuralsim/assets/165770555/ed86c126-da63-4b0c-b70b-a6cc0c9222d4

Below is a loss comparison between the 7 failed scenes and one successful scene (seg965324, green).

pixel loss

image

image

lidar loss

image

Below are the detailed logs for each experiment.

loss NAN

segment-1758724094753801109_1251_037_1271_037_with_camera_labels

image

segment-10072231702153043603_5725_000_5745_000_with_camera_labels

train-20240505234622385.log

segment-11454085070345530663_1905_000_1925_000_with_camera_labels

train-20240505234659815.log

segment-4537254579383578009_3820_000_3840_000_with_camera_labels

train-20240505234419432.log

2024-05-05T17:52:59.168990620Z
 61%|██████    | 9174/15000 [1:51:42<49:52,  1.95it/s, loss_total=1.9]
2024-05-05T17:52:59.672168504Z
 61%|██████    | 9174/15000 [1:51:42<49:52,  1.95it/s, loss_total=1.95]
2024-05-05T17:52:59.672575273Z
 61%|██████    | 9175/15000 [1:51:43<49:34,  1.96it/s, loss_total=1.95]
2024-05-05T17:53:00.187376121Z
 61%|██████    | 9175/15000 [1:51:43<49:34,  1.96it/s, loss_total=1.55]
2024-05-05T17:53:00.187871303Z
 61%|██████    | 9176/15000 [1:51:43<49:41,  1.95it/s, loss_total=1.55]
2024-05-05T17:53:00.854840512Z
 61%|██████    | 9176/15000 [1:51:43<49:41,  1.95it/s, loss_total=1.77]
2024-05-05T17:53:00.855162754Z
 61%|██████    | 9177/15000 [1:51:44<54:12,  1.79it/s, loss_total=1.77]
2024-05-05T17:53:01.381251068Z
 61%|██████    | 9177/15000 [1:51:44<54:12,  1.79it/s, loss_total=1.84]
2024-05-05T17:53:01.381580806Z
 61%|██████    | 9178/15000 [1:51:45<53:16,  1.82it/s, loss_total=1.84]
2024-05-05T17:53:01.893998791Z
 61%|██████    | 9178/15000 [1:51:45<53:16,  1.82it/s, loss_total=1.67]
2024-05-05T17:53:01.894265751Z
 61%|██████    | 9179/15000 [1:51:45<52:12,  1.86it/s, loss_total=1.67]
2024-05-05T17:53:02.518480330Z
 61%|██████    | 9179/15000 [1:51:45<52:12,  1.86it/s, loss_total=1.52]
2024-05-05T17:53:02.518743279Z
 61%|██████    | 9180/15000 [1:51:46<54:42,  1.77it/s, loss_total=1.52]
2024-05-05T17:53:03.074412687Z
 61%|██████    | 9180/15000 [1:51:46<54:42,  1.77it/s, loss_total=1.29]
2024-05-05T17:53:03.074830105Z
 61%|██████    | 9181/15000 [1:51:46<54:27,  1.78it/s, loss_total=1.29]
2024-05-05T17:53:03.582773677Z
 61%|██████    | 9181/15000 [1:51:46<54:27,  1.78it/s, loss_total=1.35]
2024-05-05T17:53:03.583124855Z
 61%|██████    | 9182/15000 [1:51:47<52:54,  1.83it/s, loss_total=1.35]
2024-05-05T17:53:04.045558308Z
 61%|██████    | 9182/15000 [1:51:47<52:54,  1.83it/s, loss_total=1.27]
2024-05-05T17:53:04.045775159Z
 61%|██████    | 9183/15000 [1:51:47<50:29,  1.92it/s, loss_total=1.27]
2024-05-05T17:53:04.609443972Z
 61%|██████    | 9183/15000 [1:51:47<50:29,  1.92it/s, loss_total=1.95]
2024-05-05T17:53:04.609845332Z
 61%|██████    | 9184/15000 [1:51:48<51:43,  1.87it/s, loss_total=1.95]
2024-05-05T17:53:04.788085968Z
 61%|██████    | 9184/15000 [1:51:48<51:43,  1.87it/s, loss_total=1.76]
 61%|██████    | 9184/15000 [1:51:48<1:10:48,  1.37it/s, loss_total=1.76]
2024-05-05T17:53:04.788116308Z
  0%|          | 0/1 [2:07:17<?, ?it/s]
2024-05-05T17:53:04.788133860Z Error occurred in exp: logs/waymo/code_multi/fg_neus=permuto/all_occ.with_normals.24020/seg453725
2024-05-05T17:53:04.802094313Z Traceback (most recent call last):
2024-05-05T17:53:04.802125179Z   File "dataio/autonomous_driving/waymo/train_multi_and_eval_multiple.py", line 30, in <module>
2024-05-05T17:53:04.802129404Z     train_main(sce_args)
2024-05-05T17:53:04.802132444Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 1537, in main_function
2024-05-05T17:53:04.802135738Z     raise e
2024-05-05T17:53:04.802138554Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 1529, in main_function
2024-05-05T17:53:04.802141435Z     train_step()
2024-05-05T17:53:04.802144030Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 561, in <lambda>
2024-05-05T17:53:04.802147167Z     return lambda *args, **kwargs: _ProfileWrap(fn=arg)(*args, **kwargs)
2024-05-05T17:53:04.802150041Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 501, in __call__
2024-05-05T17:53:04.802153010Z     ret = self.fn(*args, **kwargs)
2024-05-05T17:53:04.802155911Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 1345, in train_step
2024-05-05T17:53:04.802158779Z     ret, losses = trainer('pixel', sample, ground_truth, local_it, logger=logger)
2024-05-05T17:53:04.802161624Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
2024-05-05T17:53:04.802182312Z     return forward_call(*input, **kwargs)
2024-05-05T17:53:04.802185404Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 561, in <lambda>
2024-05-05T17:53:04.802188365Z     return lambda *args, **kwargs: _ProfileWrap(fn=arg)(*args, **kwargs)
2024-05-05T17:53:04.802191166Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 501, in __call__
2024-05-05T17:53:04.802194122Z     ret = self.fn(*args, **kwargs)
2024-05-05T17:53:04.802196831Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 344, in forward
2024-05-05T17:53:04.802199701Z     ret, losses = self.train_step_pixel(sample, ground_truth, it, logger=logger)
2024-05-05T17:53:04.802202496Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 500, in train_step_pixel
2024-05-05T17:53:04.802205371Z     ret = self.renderer.render(
2024-05-05T17:53:04.802208018Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 937, in render
2024-05-05T17:53:04.802211058Z     ret = self(*rays, scene=scene, observer=observer, **kwargs)
2024-05-05T17:53:04.802213869Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
2024-05-05T17:53:04.802216850Z     return forward_call(*input, **kwargs)
2024-05-05T17:53:04.802220840Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 96, in forward
2024-05-05T17:53:04.802223668Z     return self.ray_query(*args, **kwargs)
2024-05-05T17:53:04.802226364Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 561, in <lambda>
2024-05-05T17:53:04.802229416Z     return lambda *args, **kwargs: _ProfileWrap(fn=arg)(*args, **kwargs)
2024-05-05T17:53:04.802232338Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 501, in __call__
2024-05-05T17:53:04.802235185Z     ret = self.fn(*args, **kwargs)
2024-05-05T17:53:04.802237827Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 627, in ray_query
2024-05-05T17:53:04.802240628Z     batched_query_shared(model, group)
2024-05-05T17:53:04.802243280Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 263, in batched_query_shared
2024-05-05T17:53:04.802246252Z     raw_ret: dict = model.batched_ray_query(
2024-05-05T17:53:04.802248936Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/models/fields_conditional_dynamic/neus/renderer_mixin.py", line 288, in batched_ray_query
2024-05-05T17:53:04.802252145Z     details['accel'] = self.accel.debug_stats()
2024-05-05T17:53:04.802254880Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
2024-05-05T17:53:04.802260619Z     return func(*args, **kwargs)
2024-05-05T17:53:04.802263383Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/models/accelerations/occgrid_accel/batched_dynamic.py", line 253, in debug_stats
2024-05-05T17:53:04.802266720Z     **tensor_statistics(num_occupied_per_nonempty_ins, 'per_ins.nonempty.num_occupied', metrics=['mean', 'min', 'max', 'std']), 
2024-05-05T17:53:04.802269593Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/utils.py", line 797, in tensor_statistics
2024-05-05T17:53:04.802272508Z     return {f"{prefix}{'.' if prefix and not prefix.endswith('.') else ''}{key}": metric_fn[key](data).item() for key in metrics if key in metric_fn}
2024-05-05T17:53:04.802275665Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/utils.py", line 797, in <dictcomp>
2024-05-05T17:53:04.802278848Z     return {f"{prefix}{'.' if prefix and not prefix.endswith('.') else ''}{key}": metric_fn[key](data).item() for key in metrics if key in metric_fn}
2024-05-05T17:53:04.802282326Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/utils.py", line 785, in <lambda>
2024-05-05T17:53:04.802285378Z     "min": lambda x: x.min(),
2024-05-05T17:53:04.802288186Z RuntimeError: min(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.
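
The traceback above ends in a `min()` reduction over an empty tensor inside a debug-statistics helper, which suggests that no instance had any occupied cells at that step. The crash itself is a reporting failure that sits on top of the underlying problem. Below is a minimal plain-Python sketch of a statistics helper that tolerates empty input; `safe_statistics` is a hypothetical name and this is not nr3d_lib's implementation, only an illustration of the guard.

```python
import math
from typing import Dict, Sequence


def safe_statistics(values: Sequence[float], prefix: str = "") -> Dict[str, float]:
    """Return mean/min/max/std for `values`, or NaN for every metric if empty.

    Hypothetical helper for illustration; it mirrors the key naming seen in
    the traceback ("<prefix>.<metric>") but is not the library's code.
    """
    sep = "." if prefix and not prefix.endswith(".") else ""
    keys = ("mean", "min", "max", "std")
    if len(values) == 0:
        # An empty input has no meaningful reduction, so report NaN
        # instead of raising inside a logging path.
        return {f"{prefix}{sep}{k}": math.nan for k in keys}
    mean = sum(values) / len(values)
    # Population variance (divide by N); a library may use N - 1 instead.
    var = sum((v - mean) ** 2 for v in values) / len(values)
    stats = {
        "mean": mean,
        "min": min(values),
        "max": max(values),
        "std": math.sqrt(var),
    }
    return {f"{prefix}{sep}{k}": stats[k] for k in keys}
```

A guard like this would only keep the logging code from raising; an empty set of occupied cells would still indicate that the reconstruction has collapsed.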

train-20240505233823947.log

AssertionError: Occupancy grid becomes empty during training.

scenario_id=segment-3490810581309970603_11125_000_11145_000_with_camera_labels

34: 2024-05-05T17:40:15.446258349Z
 95%|█████████▌| 14267/15000 [1:52:54<05:26,  2.24it/s, loss_total=0.326]
35: 2024-05-05T17:40:15.446497280Z
 95%|█████████▌| 14268/15000 [1:52:54<05:16,  2.32it/s, loss_total=0.326]
36: 2024-05-05T17:40:15.809401574Z
 95%|█████████▌| 14268/15000 [1:52:54<05:16,  2.32it/s, loss_total=0.26]
37: 2024-05-05T17:40:15.809610367Z
 95%|█████████▌| 14269/15000 [1:52:55<05:00,  2.43it/s, loss_total=0.26]
38: 2024-05-05T17:40:16.275019572Z
 95%|█████████▌| 14269/15000 [1:52:55<05:00,  2.43it/s, loss_total=0.24]
39: 2024-05-05T17:40:16.275154072Z
 95%|█████████▌| 14270/15000 [1:52:55<05:12,  2.34it/s, loss_total=0.24]
40: 2024-05-05T17:40:16.699921008Z
 95%|█████████▌| 14270/15000 [1:52:55<05:12,  2.34it/s, loss_total=0.372]
41: 2024-05-05T17:40:16.700209054Z
 95%|█████████▌| 14271/15000 [1:52:56<05:11,  2.34it/s, loss_total=0.372]
42: 2024-05-05T17:40:17.101061717Z
 95%|█████████▌| 14271/15000 [1:52:56<05:11,  2.34it/s, loss_total=0.239]
43: 2024-05-05T17:40:17.101251008Z
 95%|█████████▌| 14272/15000 [1:52:56<05:05,  2.39it/s, loss_total=0.239]
44: 2024-05-05T17:40:17.465482350Z
 95%|█████████▌| 14272/15000 [1:52:56<05:05,  2.39it/s, loss_total=0.436]
 95%|█████████▌| 14272/15000 [1:52:56<05:45,  2.11it/s, loss_total=0.436]
45: 2024-05-05T17:40:17.465515750Z
  0%|          | 0/1 [1:57:24<?, ?it/s]
46: 2024-05-05T17:40:17.465537233Z Error occurred in exp: logs/waymo/code_multi/fg_neus=permuto/all_occ.with_normals.24020/seg349081
47: 2024-05-05T17:40:17.468884262Z Traceback (most recent call last):
48: 2024-05-05T17:40:17.468894599Z   File "dataio/autonomous_driving/waymo/train_multi_and_eval_multiple.py", line 30, in <module>
49: 2024-05-05T17:40:17.468897375Z     train_main(sce_args)
50: 2024-05-05T17:40:17.468899457Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 1537, in main_function
51: 2024-05-05T17:40:17.468901977Z     raise e
52: 2024-05-05T17:40:17.468903926Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 1529, in main_function
53: 2024-05-05T17:40:17.468906147Z     train_step()
54: 2024-05-05T17:40:17.468908124Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 561, in <lambda>
55: 2024-05-05T17:40:17.468910394Z     return lambda *args, **kwargs: _ProfileWrap(fn=arg)(*args, **kwargs)
56: 2024-05-05T17:40:17.468912465Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 501, in __call__
57: 2024-05-05T17:40:17.468914765Z     ret = self.fn(*args, **kwargs)
58: 2024-05-05T17:40:17.468916721Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 1400, in train_step
59: 2024-05-05T17:40:17.468918774Z     ret, losses = trainer('lidar', sample, ground_truth, local_it, logger=logger)
60: 2024-05-05T17:40:17.468920915Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
61: 2024-05-05T17:40:17.468923227Z     return forward_call(*input, **kwargs)
62: 2024-05-05T17:40:17.468925196Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 561, in <lambda>
63: 2024-05-05T17:40:17.468927319Z     return lambda *args, **kwargs: _ProfileWrap(fn=arg)(*args, **kwargs)
64: 2024-05-05T17:40:17.468929319Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 501, in __call__
65: 2024-05-05T17:40:17.468931591Z     ret = self.fn(*args, **kwargs)
66: 2024-05-05T17:40:17.468933531Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 348, in forward
67: 2024-05-05T17:40:17.468935587Z     ret, losses = self.train_step_lidar(sample, ground_truth, it, logger=logger)
68: 2024-05-05T17:40:17.468937575Z   File "/home/rongbo.ma/neuralsim/code_multi/tools/train.py", line 765, in train_step_lidar
69: 2024-05-05T17:40:17.468939672Z     ret = self.renderer.render(
70: 2024-05-05T17:40:17.468949806Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 937, in render
71: 2024-05-05T17:40:17.468952085Z     ret = self(*rays, scene=scene, observer=observer, **kwargs)
72: 2024-05-05T17:40:17.468954154Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
73: 2024-05-05T17:40:17.468956293Z     return forward_call(*input, **kwargs)
74: 2024-05-05T17:40:17.468958702Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 96, in forward
75: 2024-05-05T17:40:17.468960834Z     return self.ray_query(*args, **kwargs)
76: 2024-05-05T17:40:17.468962792Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 561, in <lambda>
77: 2024-05-05T17:40:17.468964964Z     return lambda *args, **kwargs: _ProfileWrap(fn=arg)(*args, **kwargs)
78: 2024-05-05T17:40:17.468966995Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/profile.py", line 501, in __call__
79: 2024-05-05T17:40:17.468969282Z     ret = self.fn(*args, **kwargs)
80: 2024-05-05T17:40:17.468971206Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 627, in ray_query
81: 2024-05-05T17:40:17.468973299Z     batched_query_shared(model, group)
82: 2024-05-05T17:40:17.468975245Z   File "/home/rongbo.ma/neuralsim/app/renderers/buffer_compose_renderer.py", line 259, in batched_query_shared
83: 2024-05-05T17:40:17.468977324Z     model.set_condition(batched_infos)
84: 2024-05-05T17:40:17.468979240Z   File "/home/rongbo.ma/neuralsim/app/models/shared/batched_neus.py", line 404, in set_condition
85: 2024-05-05T17:40:17.468981351Z     super().set_condition(z=z_ins_per_batch, ins_inds_per_batch=ins_inds_per_batch)
86: 2024-05-05T17:40:17.468983453Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/models/fields_conditional/neus/renderer_mixin.py", line 105, in set_condition
87: 2024-05-05T17:40:17.468985675Z     self.accel.cur_batch__step(self.it, self.query_sdf)
88: 2024-05-05T17:40:17.468987914Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
89: 2024-05-05T17:40:17.468990095Z     return func(*args, **kwargs)
90: 2024-05-05T17:40:17.468992076Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/models/accelerations/occgrid_accel/batched.py", line 149, in cur_batch__step
91: 2024-05-05T17:40:17.468994266Z     updated = self.occ.step(cur_it, val_query_fn_normalized_x_bi, within_bi=self.ins_inds_per_batch, logger=logger)
92: 2024-05-05T17:40:17.468996555Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
93: 2024-05-05T17:40:17.468998669Z     return func(*args, **kwargs)
94: 2024-05-05T17:40:17.469000564Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/models/accelerations/occgrid/ema_batched.py", line 145, in step
95: 2024-05-05T17:40:17.469002721Z     self._step(cur_it, val_query_fn_normalized_x_bi, within_bi=within_bi, 
96: 2024-05-05T17:40:17.469007031Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
97: 2024-05-05T17:40:17.469009352Z     return func(*args, **kwargs)
98: 2024-05-05T17:40:17.469011551Z   File "/home/rongbo.ma/anaconda3/envs/nr3d_new/lib/python3.8/site-packages/nr3d_lib/models/accelerations/occgrid/ema_batched.py", line 193, in _step
99: 2024-05-05T17:40:17.469013680Z     assert idx_nonempty.numel() > 0, "Occupancy grid becomes empty during training. Your model/algorithm/training setting might be incorrect. Please check."
100: 2024-05-05T17:40:17.469016117Z AssertionError: Occupancy grid becomes empty during training. Your model/algorithm/training setting might be incorrect. Please check
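
With 18 runs to compare, it can help to pull the last reported iteration and the final exception out of each log automatically. The sketch below assumes the logs look like the excerpts above (tqdm progress lines containing `loss_total=...` and a final `SomethingError: message` line); `summarize_log` and both regular expressions are hypothetical helpers written for this thread, not part of neuralsim.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches tqdm progress such as "9184/15000 [1:51:48<51:43, 1.87it/s, loss_total=1.76]".
PROGRESS_RE = re.compile(r"(\d+)/(\d+) \[.*?loss_total=([-+0-9.eE]+|nan|inf)")
# Matches a final exception line such as "RuntimeError: min(): Expected ...".
ERROR_RE = re.compile(r"\b(\w+(?:Error|Exception)): (.*)")


def summarize_log(
    lines: Iterable[str],
) -> Tuple[Optional[Tuple[int, int, float]], Optional[str]]:
    """Return ((iteration, total, loss), error) from the last matching lines."""
    last_progress = None
    error = None
    for line in lines:
        progress = PROGRESS_RE.search(line)
        if progress:
            last_progress = (
                int(progress.group(1)),
                int(progress.group(2)),
                float(progress.group(3)),
            )
        failure = ERROR_RE.search(line)
        if failure:
            error = f"{failure.group(1)}: {failure.group(2)}"
    return last_progress, error
```

Applied to the two logs above, this kind of summary would show that one run stopped at iteration 9184 of 15000 and the other at 14272 of 15000, each with a finite `loss_total` on the last reported step.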

blackmrb commented May 7, 2024

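A follow-up on the 7 failed cases: I split each 20 s scene into two 10 s halves and retrained.
I suspect the ego speed is too high, so the ego vehicle travels too far and the scene becomes too large.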

    # - segment-1758724094753801109_1251_037_1271_037_with_camera_labels # 64km/h
    # - segment-3490810581309970603_11125_000_11145_000_with_camera_labels # 71km/h
    # - segment-3591015878717398163_1381_280_1401_280_with_camera_labels # 68km/h
    # - segment-4468278022208380281_455_820_475_820_with_camera_labels # 70km/h, good case
    # - segment-4537254579383578009_3820_000_3840_000_with_camera_labels # 68km/h, good case
    # - segment-10072231702153043603_5725_000_5745_000_with_camera_labels # 40 -> 70km/h, wide open road, only one vehicle ahead
    # - segment-11454085070345530663_1905_000_1925_000_with_camera_labels # 70km/h, good case
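
The scene-size hypothesis above can be checked with simple arithmetic: at a constant speed, the distance covered in a clip is speed times duration. The sketch below assumes constant speed and roughly 200 frames per 20 s segment (10 Hz); `travel_distance_m` and `split_frames` are hypothetical helpers for illustration, not neuralsim functions or config options.

```python
from typing import List, Tuple


def travel_distance_m(speed_kmh: float, duration_s: float) -> float:
    """Distance in metres covered at a constant speed (km/h) over duration_s."""
    return speed_kmh / 3.6 * duration_s


def split_frames(start: int, stop: int, parts: int = 2) -> List[Tuple[int, int]]:
    """Split the inclusive frame range [start, stop] into contiguous chunks."""
    count = stop - start + 1
    bounds = [start + (count * i) // parts for i in range(parts + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(parts)]


if __name__ == "__main__":
    # About 389 m at 70 km/h over 20 s, versus about 94 m at 17 km/h.
    print(round(travel_distance_m(70, 20)), round(travel_distance_m(17, 20)))
    # Two halves of an assumed 200-frame clip: [(0, 99), (100, 199)]
    print(split_frames(0, 199))
```

Under these assumptions, a 70 km/h scene spans roughly four times the distance of the 17 km/h scene, and halving the clip halves that extent, which is consistent with the 10 s halves training to completion more often.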

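Conclusions:

  1. Overall the rendering is good, but almost every scene has an extra cloud-like blob in the sky
  2. Tunnels are very blurry
  3. Two scenes have no ground, for reasons I do not understand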

0-10s

The distant mountain suddenly disappears, then a cloud-like blob pops in and moves toward the ego vehicle

seg100722_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4

The ground is missing; training failed

seg114540_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4
seg349081_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4

The overpass suddenly disappears

seg359101_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4

10-20s

In every scene an extra blob appears in the sky and rushes toward the ego vehicle.
It is especially severe in this one.

seg349081_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4

Under the overpass, very blurry.

seg359101_iter15.0k_eval_ds.4.0_camera_FRONT_1l.mp4
