-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DevelopmentML possible inconsistencies between implementation and documentation #533
Comments
Hi @cdietrich21, thanks for the feedback. This is great and your commentary makes sense. Definitely need to update the docs to not reference Munich and be a bit clearer on how it works. Being a lesser used estimator, this one hasn't received a whole lotta love. I think we're muddying what "response" means in the docs. The triangle data itself is both the independent variables and the response (lagged of course). I believe when we prep the independent variables, converting to incremental basis should impact the "response" as well. That said, your last bullet suggests that we're converting to incrementals regardless of the end-user selection, and I can confirm that the code is wrong in doing this. I don't believe we have any test cases for non-incremental fits, but we probably should if we claim to support toggling it. I also agree it should clearer how it impacts the autoregressive parameter - which itself doesn't have test cases to ensure accuracy. That is- autoregressive may have rough edges even beyond the transparency you're looking for. |
Thanks @jbogaardt. Seems to be a very useful/flexible implementation and a great feature of the package so I'm happy to help improve it. It's possible my second and fourth bullet points are related if this is where it was attempted to convert to incremental. But I think 1) it's not enforcing that the last column is indeed the response and 2) it occurs regardless of the fit_incrementals argument. Agreed that the response could potentially be the same as some of the independent variables (if we're modeling cumulative If the user wanted to make predictor columns incremental - I think you could pretty easily update the |
You seem to have a pretty good sense for the issues and ways to improve it. I for one would absolutely love contributions on this. I think we have to just be careful that the dependent estimators ( |
Happy to contribute on this and suggest updates to the code. Do I need to be added as a contributor to push a branch for this? |
I had a chance to spend more time with this and get a better feel for what the code is doing. Going to restate my suggestions since I think I originally may have misunderstood some elements of what the code was doing. Happy to help suggest code once again
None of these would impact TweedieGLM or BarnettZehnwirth if implemented correctly |
Thanks for this excellent implementation of using machine learning methods on reserving data. During review I think I've noticed a couple of things in the code that may be out of step with the documentation, but I may be missing something with many of these.
Just a note that the docstring for the fit method references the Munich adjustment (I think held over from the
DevelopmentBase
class).The documentation states that the
fit_incrementals
argument controls "whether the response variable should be converted to an incremental basis for fitting", but I don't believey_ml
is ever converted to incremental values before fitting the ML model of choice in line 182, which I feel is out of line with what the documentation statesI notice that the
fit_incrementals
argument also controls whether or not autoregressive values are converted to incremental before fitting. While I agree it could sometimes make since to convert these, I don't think many users would infer this is happening from the documentation. Perhaps it'd be better to pass this alongside theautoregressive
parameter?On a side note I believe this merge is odd, it seems like it converts the last column to incremental regardless of if its the response or if fit_incrementals is used
The text was updated successfully, but these errors were encountered: