IndexError: tuple index out of range #70

Open
JanineCHEN opened this issue Sep 15, 2020 · 4 comments

Comments


JanineCHEN commented Sep 15, 2020

Hey, I am a student trying to reproduce the training process with my own dataset. I got the following error right after the first epoch of training finished:

Traceback (most recent call last):
  File "train.py", line 227, in <module>
    main()
  File "train.py", line 224, in main
    run(config)
  File "train.py", line 184, in run
    metrics = train(x, y)
  File "/home/projects/BIGGAN/train_fns.py", line 41, in train
    x[counter], y[counter], train_G=False, 
IndexError: tuple index out of range
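For context, this failure pattern can be reproduced outside the repo. If the training loop splits each incoming batch into `num_D_steps * num_D_accumulations` fixed-size chunks and indexes them one by one (the chunking below is a pure-Python stand-in modeled on the `x[counter]` pattern in `train_fns.py`, not the repo's actual code), a smaller residual batch at the end of an epoch yields fewer chunks than the loop expects:

```python
def split_into_chunks(batch, chunk_size):
    """Split a batch into fixed-size chunks, like torch.split."""
    return [batch[i:i + chunk_size] for i in range(0, len(batch), chunk_size)]

num_D_steps, num_D_accumulations = 1, 16
per_step_batch = 8                               # hypothetical per-accumulation size
expected_chunks = num_D_steps * num_D_accumulations

full_batch = list(range(per_step_batch * expected_chunks))  # 128 samples -> 16 chunks
residual_batch = list(range(88))                            # short last batch -> 11 chunks

print(len(split_into_chunks(full_batch, per_step_batch)))   # 16: every index is valid

chunks = tuple(split_into_chunks(residual_batch, per_step_batch))
try:
    for counter in range(expected_chunks):
        chunks[counter]        # same access pattern as x[counter] in train_fns.py
except IndexError as e:
    print("IndexError:", e)    # fails once counter exceeds the chunk count
```

The loop fails as soon as `counter` reaches the number of chunks the residual batch actually produced, which matches the traceback above.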

I launch training with sh scripts/launch_BigGAN_bs256x8.sh on my own dataset; the dataset was converted to HDF5 format without any errors. The content of the launch_BigGAN_bs256x8.sh I used:

#!/bin/bash
python train.py \
--dataset I128_hdf5 --parallel --shuffle  --num_workers 8 --batch_size 128 --load_in_mem  \
--num_G_accumulations 16 --num_D_accumulations 16 \
--num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 \
--G_attn 64 --D_attn 64 \
--G_nl inplace_relu --D_nl inplace_relu \
--SN_eps 1e-6 --BN_eps 1e-5 --adam_eps 1e-6 \
--G_ortho 0.0 \
--G_shared \
--G_init ortho --D_init ortho \
--hier --dim_z 120 --shared_dim 128 \
--G_eval_mode \
--which_best FID \
--G_ch 32 --D_ch 32 \
--ema --use_ema --ema_start 20000 \
--test_every 200 --save_every 100 --num_best_copies 5 --num_save_copies 2 --seed 0 \
--use_multiepoch_sampler

I am not sure whether this has something to do with the size of my dataset or the number of classes. If so, how should I adjust the parameters? Or is there another reason this error occurs, and how can I tackle it? Any help would be very much appreciated. Thanks in advance.

@Baran-phys

What was the solution?

@JanineCHEN
Author

> What was the solution?

Hi, it was the residual (incomplete) last batch that caused the problem. You can either pass drop_last=True when constructing the DataLoader, or increase the number of epochs so the last batch is never used.
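For reference, here is a minimal sketch of what drop_last does, using a pure-Python stand-in for the batching loop so the effect is visible without PyTorch (the dataset size and batch size are made up for illustration; the flag's semantics mirror torch.utils.data.DataLoader's drop_last):

```python
def batches(dataset, batch_size, drop_last=False):
    """Yield batches; mimics DataLoader's drop_last flag."""
    for i in range(0, len(dataset), batch_size):
        batch = dataset[i:i + batch_size]
        if drop_last and len(batch) < batch_size:
            break  # discard the residual (incomplete) final batch
        yield batch

data = list(range(100))                       # 100 samples, batch size 16
kept = list(batches(data, 16))
print([len(b) for b in kept])                 # [16, 16, 16, 16, 16, 16, 4]
dropped = list(batches(data, 16, drop_last=True))
print([len(b) for b in dropped])              # [16, 16, 16, 16, 16, 16]
```

With drop_last=True every batch has exactly batch_size samples, so the per-accumulation indexing in the training step never runs past the end. In this repo that would mean passing drop_last=True wherever the DataLoader is built (the exact location varies; check where the data loaders are constructed in utils.py).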

@JanineCHEN JanineCHEN reopened this Sep 23, 2020
@Baran-phys

Well, neither of those solved this error on my side. I get this error whenever I set num_G_accumulations or num_D_accumulations to more than 2.

@datduong

I use drop_last and it works. I am using 4 GPUs with batch size 52.
