Relevant snippet in [Park et al. 2022] regarding the Hessian computation:
"For Hessian max eigenvalue spectrum (Park & Kim, 2021), 10% of the training dataset is used. We also use power iteration with a batch size of 16 to produce the top-5 largest eigenvalues. To this end, we use the implementation of Yao et al. (2020). We modify the algorithm to calculate the eigenvalues with respect to L2 regularized NLL on augmented training datasets. In the strict sense, the weight decay is not L2 regularization, but we neglect the difference."
The missing negative eigenvalues can be caused by either the model training or the Hessian computation.
Thus, we need to validate:
- that the training for our toy comparison replicates the [Park et al. 2022] instructions
- that the parameters of our Hessian computation are the same
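
To help check the Hessian side, here is a minimal sketch of power iteration with deflation, the general technique the quoted snippet refers to. It is not the Yao et al. (2020) implementation: `top_k_eigenvalues`, `toy_hvp`, and the matrix `A` are hypothetical names for illustration, and the Hessian-vector product is replaced by a plain matrix-vector product on a small symmetric matrix. In a real run, `hvp` would presumably come from automatic differentiation over mini-batches of the regularized loss.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def orthogonalize(v, basis):
    # Remove the components of v along already-found eigenvectors (deflation).
    for u in basis:
        c = dot(v, u)
        v = [x - c * y for x, y in zip(v, u)]
    return v


def top_k_eigenvalues(hvp, dim, k, iters=500, tol=1e-10, seed=0):
    """Estimate the k eigenvalues of largest magnitude of a symmetric operator.

    hvp: callable mapping a vector to (operator @ vector); here a stand-in
         for a Hessian-vector product.
    """
    rng = random.Random(seed)
    eigvals, eigvecs = [], []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        lam = None
        for _ in range(iters):
            v = normalize(orthogonalize(v, eigvecs))
            w = hvp(v)
            new_lam = dot(v, w)  # Rayleigh quotient, since v has unit norm
            converged = lam is not None and abs(new_lam - lam) <= tol * abs(new_lam)
            lam = new_lam
            if converged:
                break
            v = w
        eigvals.append(lam)
        eigvecs.append(normalize(orthogonalize(v, eigvecs)))
    return eigvals


# Toy symmetric matrix (an assumption for illustration, not a real Hessian).
# Its eigenvalues are -5, 3 and -1 by construction.
A = [[1.0, 2.0, 0.0],
     [2.0, 1.0, 0.0],
     [0.0, 0.0, -5.0]]


def toy_hvp(v):
    return [dot(row, v) for row in A]


print(top_k_eigenvalues(toy_hvp, dim=3, k=3))
```

Plain power iteration orders eigenvalues by absolute value, so a negative eigenvalue only shows up among the top-k if its magnitude is large enough. When comparing our setup against [Park et al. 2022], the number of requested eigenvalues, the batch size, the data fraction, and the loss (including the L2 term) are therefore all worth checking.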
Resources: