Debug Hessian Spectra (missing negative eigenvalues) #14

dgcnz · 2024-04-25T14:16:04Z

Resources:

Original issue: Package Max Hessian Spectra computation #25
Relevant snippet from [Park et al. 2022] on the Hessian computation:

"For Hessian max eigenvalue spectrum (Park & Kim, 2021), 10% of the training dataset is used. We also use power iteration with a batch size of 16 to produce the top-5 largest eigenvalues. To this end, we use the implementation of Yao et al. (2020). We modify the algorithm to calculate the eigenvalues with respect to L2 regularized NLL on augmented training datasets. In the strict sense, the weight decay is not L2 regularization, but we neglect the difference."

The missing negative eigenvalues could be caused by either the model training or the Hessian computation.
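As one hedged illustration of how the Hessian computation alone could hide negative eigenvalues: the quoted setup takes the Hessian of the L2-regularized NLL. Assuming the regularizer has the form $\frac{\lambda}{2}\lVert\theta\rVert_2^2$ (the symbols $\theta$, $\lambda$, $H$ and $\mu_i$ are notation introduced here, not taken from the paper), the regularizer adds a constant shift to the whole spectrum:

```latex
\nabla_\theta^2\!\left[\mathcal{L}_{\mathrm{NLL}}(\theta) + \frac{\lambda}{2}\lVert\theta\rVert_2^2\right]
  = H + \lambda I,
\qquad
\mu_i(H + \lambda I) = \mu_i(H) + \lambda .
```

Under this assumption, any eigenvalue with $-\lambda < \mu_i(H) < 0$ is reported as positive. Whether this matters here depends on the weight-decay value actually used in the computation.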

Thus, we need to validate:

that the training for our toy comparison follows the instructions in [Park et al. 2022]
that the parameters of our Hessian computation match those in the paper
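For reference when comparing parameters, the quoted procedure (power iteration for the top-k eigenvalues) can be sketched as below. This is a minimal illustration on a small explicit symmetric matrix, not the implementation of Yao et al. (2020); in the real setting the matrix-vector product would presumably be a Hessian-vector product computed by autograd over mini-batches. The names `top_eigenvalues`, `matvec`, `iters` and `seed` are hypothetical.

```python
import math
import random


def matvec(matrix, vec):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(vec):
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]


def top_eigenvalues(matrix, k, iters=500, seed=0):
    """Return k eigenvalues of a symmetric matrix, largest magnitude first.

    Power iteration with deflation: every iterate is kept orthogonal to the
    eigenvectors that were already found.
    """
    rng = random.Random(seed)
    n = len(matrix)
    eigvals, eigvecs = [], []
    for _ in range(k):
        v = normalize([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(iters):
            w = matvec(matrix, v)
            # Project out previously found eigenvectors (deflation).
            for u in eigvecs:
                c = dot(w, u)
                w = [wi - c * ui for wi, ui in zip(w, u)]
            v = normalize(w)
        # Rayleigh quotient of the converged unit vector.
        eigvals.append(dot(v, matvec(matrix, v)))
        eigvecs.append(v)
    return eigvals


# Toy symmetric matrix with exact eigenvalues 3, -1 and 0.5.
toy = [[1.0, 2.0, 0.0],
       [2.0, 1.0, 0.0],
       [0.0, 0.0, 0.5]]

print(top_eigenvalues(toy, k=1))  # approximately [3.0]
print(top_eigenvalues(toy, k=2))  # approximately [3.0, -1.0]
```

Power iteration orders eigenvalues by magnitude, so in this toy case the negative eigenvalue only appears once `k` is at least 2. The number of requested eigenvalues and the iteration count are therefore among the parameters worth comparing against the paper's setup.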

dgcnz added the bug and experiments labels Apr 25, 2024

dgcnz self-assigned this Apr 25, 2024

dgcnz mentioned this issue Apr 25, 2024

Test hypothesis on toy problem #27

Closed

4 tasks
