Directly compute the gradient using the given update rule (this is the difference compared to teacher distillation, on-policy distillation, and entropy-regularised distillation)
Update the network parameters accordingly (see the sketch after this list)
Test whether policy distillation works using the available teacher policy
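The first two steps could look like this in PyTorch; a minimal sketch, assuming the update rule has already produced one gradient tensor per parameter (computing those tensors is exactly the N-distill rule from the paper and is left as a placeholder here). The function name `apply_direct_gradients` is hypothetical:

```python
import torch

def apply_direct_gradients(policy, grads, optimizer):
    """Assign externally computed gradients and step the optimizer.

    grads: iterable of tensors, one per parameter of `policy`, produced by
    the distillation update rule outside of autograd.
    """
    optimizer.zero_grad()
    for param, grad in zip(policy.parameters(), grads):
        # Bypass loss.backward(): write the update-rule gradient directly
        # into .grad so the optimizer applies it on step().
        param.grad = grad.detach().clone().to(param.device)
    optimizer.step()
```

Once the `.grad` fields are assigned, the optimizer treats them exactly as if they had come from `loss.backward()`, so any standard optimizer (SGD, Adam, ...) applies the update unchanged.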
Script exits with:

```
terminate called after throwing an instance of 'std::runtime_error'
  what():  invalid argument 13: ldc should be at least max(1, m=0), but have 0 at /pytorch/aten/src/TH/generic/THBlas.cpp:334
[1]    24572 abort (core dumped)  python run_n_distill.py
```
-> need to check the computed gradients
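One way to act on that note: the `m=0` in the BLAS error indicates a zero-size dimension reaching a matrix multiply, so a quick diagnostic pass over the per-parameter gradients can localize the problem. A sketch, assuming `policy` and the list of update-rule gradients `grads` come from the training script; `check_gradients` is a hypothetical helper:

```python
import torch

def check_gradients(policy, grads):
    """Print shape/norm diagnostics for each computed gradient.

    The `m=0` in the BLAS error points at a zero-size dimension, so empty
    tensors are the first thing to look for.
    """
    for (name, param), grad in zip(policy.named_parameters(), grads):
        if grad.numel() == 0:
            print(f"{name}: EMPTY gradient, shape {tuple(grad.shape)}")  # likely culprit
        elif grad.shape != param.shape:
            print(f"{name}: shape mismatch {tuple(grad.shape)} vs param {tuple(param.shape)}")
        elif torch.isnan(grad).any():
            print(f"{name}: contains NaN")
        else:
            print(f"{name}: shape {tuple(grad.shape)}, norm {grad.norm().item():.4g}")
```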
Adding N-distill according to https://arxiv.org/abs/1902.02186