
the grad calculation takes up a lot of memory #14

Open
PanXiebit opened this issue May 10, 2024 · 1 comment

Comments

@PanXiebit

```python
# Double-backward gradient used by the path length regularizer.
grad, = autograd.grad(
    outputs=(fake_img * noise).sum(), inputs=latents, create_graph=True
)
```

This gradient calculation is memory-inefficient, and because it requires a double backward (`create_graph=True`), it cannot use FlashAttention. Consequently, when training with the reg_loss, the batch_size has to be reduced.
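For context, here is a minimal self-contained sketch of the pattern the snippet comes from, assuming a StyleGAN2-style path length regularizer; the `path_length_penalty` name, the `generator` call, and the tensor shapes are illustrative, not the repository's actual code:

```python
import math

import torch
from torch import autograd


def path_length_penalty(fake_img: torch.Tensor, latents: torch.Tensor) -> torch.Tensor:
    """Per-sample path lengths ||J^T y|| for a batch of generated images."""
    # Random projection, scaled by the image resolution (as in StyleGAN2)
    # so its expected magnitude is resolution-independent.
    noise = torch.randn_like(fake_img) / math.sqrt(
        fake_img.shape[2] * fake_img.shape[3]
    )
    # create_graph=True keeps the backward graph alive so the penalty itself
    # can be backpropagated; this second-order graph is what consumes the
    # extra memory the issue describes.
    grad, = autograd.grad(
        outputs=(fake_img * noise).sum(), inputs=latents, create_graph=True
    )
    return grad.pow(2).sum(dim=-1).sqrt()


# Hypothetical usage:
# latents = torch.randn(4, 512, requires_grad=True)
# fake_img = generator(latents)
# penalty = path_length_penalty(fake_img, latents).mean()
```

A common mitigation, used in the StyleGAN2 reference code, is lazy regularization: apply the penalty only every N steps, optionally on a smaller sub-batch, which amortizes the double-backward cost and lowers peak memory on regularization steps.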

@Pakase

Pakase commented Aug 9, 2024

Hi, I wonder how large the extra memory cost is. Could you give me a rough number?
