
Tweedie

tatos

Member
I'm a bit unsure about the issue around nil claims when you use a gamma distribution to model the error structure of claims severity.

Normally I aggregate my data, and I've got a LOT of it: I split the data into as many cells as possible, where "possible" means each cell still holds a statistically meaningful amount of data. So, for instance, over the last 5 years I might have 500 policyholders who each contributed at least a month of exposure while their risk profile matched that of a certain cell. Essentially, I see this as one risk (assuming the effects are consistent over time), because it's all data relating to the same type of profile, just spread over time. So I usually DO have claims data for this cell; almost always, in fact. When I don't, it's usually because I've used one variable too many and therefore have too little data or too few policyholders per cell. Are these the cases that the nil-claims issue refers to when modelling severity?
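To make the cell-level view concrete, here is a minimal sketch of that kind of aggregation. The rating factors (age band, region), the record layout and all figures are made up for illustration; the point is only that a cell with exposure but no claims has a well-defined frequency of zero, while its average severity is undefined.

```python
from collections import defaultdict

# Hypothetical policy-month records:
# (age_band, region, exposure_years, n_claims, claim_amount)
records = [
    ("17-25", "urban", 1 / 12, 0, 0.0),
    ("17-25", "urban", 1 / 12, 1, 1200.0),
    ("17-25", "urban", 1 / 12, 0, 0.0),
    ("26-40", "rural", 1 / 12, 0, 0.0),
    ("26-40", "rural", 1 / 12, 0, 0.0),
]

# Aggregate exposure, claim counts and claim amounts per cell
cells = defaultdict(lambda: {"exposure": 0.0, "claims": 0, "amount": 0.0})
for age_band, region, exposure, n_claims, amount in records:
    cell = cells[(age_band, region)]
    cell["exposure"] += exposure
    cell["claims"] += n_claims
    cell["amount"] += amount

summary = {}
for key, cell in cells.items():
    frequency = cell["claims"] / cell["exposure"]  # claims per year of exposure
    # Average severity is undefined when the cell has no claims
    severity = cell["amount"] / cell["claims"] if cell["claims"] else None
    summary[key] = (frequency, severity)

for key, (frequency, severity) in summary.items():
    print(key, f"frequency={frequency:.2f}", f"severity={severity}")
```

In this toy data the ("26-40", "rural") cell is the nil-claims case at cell level: it still carries exposure, but it gives a Gamma severity model nothing to fit.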

I ask because I read about the Tweedie distribution, and the way it talked about nil claims suggested that others who model severity remove the very large number of nil claims relating to individual policies. For instance, there was a graph with a very large spike at zero (presumably relating to all the individuals or months of exposure that did not result in a claim) followed by a wide, heavy-tailed spread of claims. It sounded like that spike was completely removed for the Gamma modelling, but that the Tweedie distribution took care of it. I feel like I'm missing something fundamental here.
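A rough sketch of the kind of policy-level data that graph seems to describe, assuming each policy's total cost is a Poisson number of Gamma-distributed claims (the compound Poisson-Gamma case usually associated with Tweedie). The claim rate, Gamma parameters and function name are illustrative assumptions, not values from any source.

```python
import math
import random

def simulate_policy_costs(n_policies, claim_rate, shape, scale, seed=0):
    """Total claim cost per policy: a Poisson number of Gamma-distributed claims."""
    rng = random.Random(seed)
    costs = []
    for _ in range(n_policies):
        # Poisson draw: count exponential inter-arrival times within one exposure unit
        n_claims, t = 0, rng.expovariate(claim_rate)
        while t < 1.0:
            n_claims += 1
            t += rng.expovariate(claim_rate)
        # gammavariate(alpha, beta) has mean alpha * beta
        costs.append(sum(rng.gammavariate(shape, scale) for _ in range(n_claims)))
    return costs

costs = simulate_policy_costs(n_policies=10_000, claim_rate=0.1, shape=2.0, scale=500.0)
positive = [c for c in costs if c > 0]

share_zero = 1 - len(positive) / len(costs)         # the spike at zero
mean_cost_all = sum(costs) / len(costs)             # mean cost per policy, zeros included
mean_cost_positive = sum(positive) / len(positive)  # mean cost per policy that claimed

print(f"share of policies with no claim: {share_zero:.3f}")
print(f"theoretical share exp(-rate):    {math.exp(-0.1):.3f}")
print(f"mean cost, all policies:         {mean_cost_all:.1f}")
print(f"mean cost, policies with claims: {mean_cost_positive:.1f}")
```

With a claim rate of 0.1, roughly 90% of simulated policies cost exactly zero, which is the spike; a Gamma fit only makes sense on the `positive` values, whereas a distribution with a point mass at zero can be fitted to `costs` as a whole.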

I understand that you need to treat severity and frequency consistently, so normally, if I ever do have a cell with zero claims in it (which I remove because the Gamma distribution can't cope with it), I ALSO remove that cell from my frequency modelling.

Hoping someone can provide some clarification.
 