hyperband and non-linear runtime scale #13
Documenting appropriately is a no-brainer - I already did that today.
I had some further thoughts about it: if we simply apply the inverse of the complexity to each budget, we should end up with the same budget across all brackets again. The inverse would then be g := f^(-1) = exp. Assume we are using R = 2 and eta = 2; the bracket layout would look like this:
In this example, the usual hyperband algorithm always takes a lot longer to run (theoretically) in the last bracket compared to the first ones.
Now the sum of the budgets is (approximately) equal across all brackets again. The only issue I can see here is that the user specifies R = 2, but the algorithm then uses a budget of 7.39 for the last bracket stage. Is this misleading? But at least the budgets are scaled correctly.
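To make the argument above concrete, here is a minimal Python sketch. It rests on several assumptions that are not stated in the thread: the bracket layout follows the standard formulas from the Hyperband paper, the runtime complexity is f(budget) = log(budget) (inferred from the stated inverse g = exp), and the helper names `hyperband_brackets` and `bracket_cost` are hypothetical. It is not the package's implementation.

```python
import math


def hyperband_brackets(R, eta):
    """Standard Hyperband layout (assumed): one list of (n_configs, budget) stages per bracket."""
    # Small epsilon guards against floating-point error in the logarithm ratio.
    s_max = int(math.floor(math.log(R) / math.log(eta) + 1e-9))
    brackets = []
    for s in range(s_max, -1, -1):
        n = math.ceil((s_max + 1) / (s + 1) * eta ** s)  # initial number of configs
        r = R * eta ** (-s)                               # initial budget per config
        stages = [(math.floor(n * eta ** (-i)), r * eta ** i) for i in range(s + 1)]
        brackets.append(stages)
    return brackets


def bracket_cost(stages, runtime=lambda b: b, transform=lambda b: b):
    """Total runtime of one bracket: sum over stages of n_configs * runtime(transform(budget))."""
    return sum(n * runtime(transform(b)) for n, b in stages)


brackets = hyperband_brackets(R=2, eta=2)

# Runtime linear in the budget: the usual Hyperband assumption.
linear = [bracket_cost(st) for st in brackets]

# Assumed non-linear runtime f = log, budgets left untouched.
log_runtime = [bracket_cost(st, runtime=math.log) for st in brackets]

# Same runtime f = log, but every budget is passed through g = f^(-1) = exp first.
rescaled = [bracket_cost(st, runtime=math.log, transform=math.exp) for st in brackets]

for name, costs in [("linear", linear), ("log", log_runtime), ("log + exp", rescaled)]:
    print(name, [round(c, 3) for c in costs])
```

Under these assumptions the linear case gives a cost of 4 for both brackets, the log case gives roughly 0.69 versus 1.39 (the last bracket takes about twice as long), and applying exp to the budgets brings both brackets back to roughly 4. The largest transformed budget is exp(2), about 7.39, which is the value the user never specified.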
should work with the param trafo.
From a theoretical perspective, I think this issue is not a problem. The only assumption the hyperband paper makes is that the algorithm's performance converges as the budget parameter goes to infinity. "Adapting for convergence rates" is another topic, and is also given as an outlook in the original hyperband paper.
I am pretty sure all of the HB formulas don't hold anymore if the tuned algo does not scale linearly in runtime with eta.
This is especially true if we use the subsampling trick. We at least need to document this, but maybe also adapt a bit.
This is a more complicated issue and needs to be discussed in the team.
Of course, it would be great if people already post observations and thoughts here.