Debug Hessian Spectra (missing negative eigenvalues) #14

Open
2 tasks
dgcnz opened this issue Apr 25, 2024 · 0 comments
Comments

@dgcnz
Owner

dgcnz commented Apr 25, 2024

Resources:

  • Original issue: Package Max Hessian Spectra computation #25
  • Relevant passage from [Park et al. 2022] on the Hessian computation:

    "For Hessian max eigenvalue spectrum (Park & Kim, 2021), 10% of the training dataset is used. We also use power iteration with a batch size of 16 to produce the top-5 largest eigenvalues. To this end, we use the implementation of Yao et al. (2020). We modify the algorithm to calculate the eigenvalues with respect to L2 regularized NLL on augmented training datasets. In the strict sense, the weight decay is not L2 regularization, but we neglect the difference."

The missing negative eigenvalues could be caused by either the model training or the Hessian computation.
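
One way the Hessian computation alone could hide negative eigenvalues, offered as a hypothesis to check rather than a diagnosis: the quoted passage computes eigenvalues of the L2-regularized NLL. Assuming the regularizer is written as $\tfrac{\lambda_{\text{wd}}}{2}\lVert w\rVert_2^2$ (the symbols $\lambda_{\text{wd}}$, $H$ and $\mu_i$ are notation introduced here, not taken from the paper), the regularized Hessian is a shifted copy of the unregularized one:

$$
\nabla_w^2\left(\mathcal{L}_{\text{NLL}}(w) + \tfrac{\lambda_{\text{wd}}}{2}\lVert w\rVert_2^2\right) = H + \lambda_{\text{wd}} I,
\qquad
\mu_i\left(H + \lambda_{\text{wd}} I\right) = \mu_i(H) + \lambda_{\text{wd}}
$$

Under that assumption, an eigenvalue $\mu_i(H)$ still shows up as negative only if $\mu_i(H) < -\lambda_{\text{wd}}$, so a regularization coefficient larger than the one used in [Park et al. 2022] would push small negative eigenvalues above zero.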

Thus, we need to validate:

  • that the training for our toy comparison follows the instructions in [Park et al. 2022]
  • that the parameters of our Hessian computation match those described in [Park et al. 2022]
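
For the second item, a quick sanity check is to run top-k power iteration on a small matrix whose spectrum is known. The sketch below is not the Yao et al. (2020) implementation; it is a plain-Python toy that assumes power iteration with deflation against previously found eigenvectors. The names `top_k_eigenvalues` and `H` are hypothetical, and `H` stands in for a real Hessian, which would be accessed through Hessian-vector products rather than stored explicitly. It illustrates that power iteration returns the eigenvalues of largest magnitude, so a negative eigenvalue only appears if its magnitude ranks within the top k.

```python
import math
import random


def matvec(A, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def top_k_eigenvalues(A, k, iters=500, seed=0):
    """Top-k eigenvalues (by magnitude) of a symmetric matrix A.

    Power iteration with deflation: after each multiplication, the
    components along already-found eigenvectors are projected out.
    """
    rng = random.Random(seed)
    values, vectors = [], []
    for _ in range(k):
        v = normalize([rng.gauss(0.0, 1.0) for _ in A])
        for _ in range(iters):
            w = matvec(A, v)
            for u in vectors:
                c = dot(w, u)
                w = [wi - c * ui for wi, ui in zip(w, u)]
            v = normalize(w)
        # Rayleigh quotient keeps the sign of the eigenvalue.
        values.append(dot(v, matvec(A, v)))
        vectors.append(v)
    return values


# Toy symmetric matrix with known eigenvalues 5, 4, 0.1 and -0.5
# (each 2x2 block [[a, b], [b, a]] has eigenvalues a + b and a - b).
H = [
    [4.5, 0.5, 0.0, 0.0],
    [0.5, 4.5, 0.0, 0.0],
    [0.0, 0.0, -0.2, 0.3],
    [0.0, 0.0, 0.3, -0.2],
]

top2 = top_k_eigenvalues(H, k=2)  # negative eigenvalue is not reached
top3 = top_k_eigenvalues(H, k=3)  # third entry is the negative one
print(top2)
print(top3)
```

With `k=2` the result is close to 5 and 4, and the eigenvalue -0.5 is never reported; it only appears once `k` reaches 3. If our spectra are built from the top-5 eigenvalues per batch, negative eigenvalues that are small in magnitude relative to the largest positive ones would be missed in the same way.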
@dgcnz added the bug and experiments labels on Apr 25, 2024
@dgcnz self-assigned this on Apr 25, 2024