Method summary#852
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #852 +/- ##
==========================================
+ Coverage 86.94% 87.22% +0.28%
==========================================
Files 86 86
Lines 4994 5097 +103
Branches 644 649 +5
==========================================
+ Hits 4342 4446 +104
Misses 462 462
+ Partials 190 189 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| return self.X_.sum('development') | ||
|
|
||
| @property | ||
| def summary(self): |
There was a problem hiding this comment.
I think it should be summary_ with the _ to denote that the summary includes estimated values.
| ldf = X.ldf_.to_frame(implicit_axis=True,keepdims=True).reset_index().melt(id_vars=dev_ids,var_name = 'column',value_name='ldf') | ||
| dev_factor_index = X.ldf_.key_labels + ['development','column'] | ||
| ldf = ldf[dev_factor_index + ['ldf']] | ||
| ldf.set_index(dev_factor_index,inplace=True) |
There was a problem hiding this comment.
Is the LDFs useful? I think the CDFs are fine.
There was a problem hiding this comment.
doesn't have to be useful. we are trying to save the user extra work. manipulating ldf separately is not trivial.
There was a problem hiding this comment.
When would the user ever need LDFs in the summary?
There was a problem hiding this comment.
easy example, if they want to calculate the expected emergence over the next calendar period
There was a problem hiding this comment.
I still think it's not needed since in my mind summary_ is a summary, it's the end of the workflow and it should tell me everything I want from the model/method. If I want the expected emergence it shouldn't come out of the summary. But we can keep it if you think it's useful.
| cdf = cdf[dev_factor_index + ['cdf']] | ||
| cdf.set_index(dev_factor_index,inplace=True) | ||
| #assemble full summary. start from ultimate, as some methods (e.g. BF) return an ultimate without any actual amount | ||
| output = ultimate.drop(columns=['development','valuation']) |
There was a problem hiding this comment.
I would also add the ibnr_ in there.
There was a problem hiding this comment.
this actually came up a couple of weeks earlier. bugbot saw my friedland code and made the comment that ult - paid isn't IBNR. i think that's a valid point. ibnr_ isn't guaranteed to be IBNR.
There was a problem hiding this comment.
Oh ya that's another thing that bugs me, a better name would be unmerged or something like that. But this will be deprecate and move to a new name.
There was a problem hiding this comment.
also, unlike ldf, "ibnr" can just be df["ultimate"] - df["latest"], whenever the user wants.
There was a problem hiding this comment.
It depends, because most actuaries care about the "ibnr", or the "unemerged".
I do think we should rename the column from ibnr_ to 'unemerged_or something like that. Similar in the Triangle class,developmentshould bevaluation, because development` are actual ages - but this is a separate topic.
There was a problem hiding this comment.
I agree that actuaries care about IBNR. I'm only not including it because I don't have a good name that generalizes between IBNR and unpaid. my reservation with unemerged is that no one uses it.
| #columns for melt | ||
| ids = X.key_labels + ['origin','development','valuation'] | ||
| #create dataframe for amount | ||
| amount = X.incr_to_cum().latest_diagonal.to_frame(implicit_axis=True,keepdims=True).reset_index().melt(id_vars=ids,var_name = 'column',value_name='actual') |
There was a problem hiding this comment.
Can we call "Actual" "Latest" instead? Let's match the MackChainladder method. Here's an example.
| the known data | ||
| summary_: | ||
| summary of the model | ||
| mack_summary_: |
There was a problem hiding this comment.
I understand the intend here, but I don't think we should change this right away as it will break the workflow. If we want to go this route, we need to throw a warning and deprecate appropriately. I think it's best to just leave as summary_.
There was a problem hiding this comment.
i need clearer instructions then. these methods are inherited. i can't make mack have a different name than chainladder. we either go back to summary for this new dataframe, or change the old summary_ from mack to something else.
There was a problem hiding this comment.
what if i add st err to the summary_ just for mack, essentially merging the old summary_
There was a problem hiding this comment.
Ya that might be good...
So we have summary_ for everything, and std_err_ with actual numbers from Mack, and NaNs for deterministic models.
|
@henrydingliu I am good with the I don't like change of |
|
closing due to finding this. will rework and resubmit. |
Summary of Changes
adding a summary property/attribute to ibnr methods,
a private function to assemble the exhibit in methodbase
a property in methodbase that points to the function above
an attribute to the predicted triangle (only needed for chainladder and benktander), again using the function above
added summary to the attribute section of the docstring of various methods
added a test across the various methods
Related GitHub Issue(s)
#839
Additional Context for Reviewers
i went through various approaches:
also streamlined some tests while i was in there, so i didn't have to write a bunch of identical tests for each method
uv run pytest) and documentation changes (uv run jb build docs --builder=custom --custom-builder=doctest)Note
Medium Risk
Renaming MackChainladder’s
summary_tomack_summary_is a breaking API change for existing callers; the new join/melt summary logic affects all listed IBNR methods.Overview
Adds a shared
summary_exhibit on IBNR estimators: a pandas table of latest, ldf, cdf, and ultimate per origin/column, built inMethodBase._get_summaryfrom melted triangle frames and left joins (so ultimates without a reported amount still appear).predicton Chainladder and Benktander attachessummary_on the returned triangle; docstrings for Cape Cod, BF, Expected Loss, etc. document the attribute. Benktander now requiressample_weightonpredict(raisesValueErrorif missing).MackChainladder: the old origin table (Latest / IBNR / Ultimate / Mack Std Err) is renamed tomack_summary_;summary_is the new generic exhibit. Tests and notebooks/gallery updated accordingly.Predict tests are parametrized across several estimators to assert
summary_matcheslatest_diagonal/ultimate_and enforcessample_weightwhere required.Reviewed by Cursor Bugbot for commit 1c3db5c. Bugbot is set up for automated code reviews on this repo. Configure here.