ASET Sep 2020 Q2ii

Brett Kim

Member
I am struggling to fully pin down the concept behind the Generalized Pareto Distribution that is being used in Question 2 ii of the September 2020 exam.

My immediate answer to this question was to calculate 1 - G(70) ?= 1 - P(X <=70) = P(X > 70)
But the marking scheme says it is incorrect to use G(70). Why is this? And what is G(70) = 1 - (1 + (70/45))^-3 = 0.9400838 actually telling us then if it is not P(X <= 70)?

I get that this is probably related to where it says "it is easy to make the mistake of calculating a probability for the threshold exceedance instead of for the underlying distance." But I don't understand what the difference is between "threshold exceedance" and "underlying distance".
 
Hi Brett

Your immediate answer of using 1 - G(70) is the easiest trap to fall into when working with the GPD. For threshold exceedances (in this case the threshold is 50), we are creating a new variable (Y, say) where Y = X - 50 | X > 50. In other words, the GPD models the excess of X above 50, given that the excess is positive. G is the distribution function of Y, not of X, so G(70) is a statement about an excess of 70 over the threshold rather than about a distance of 70.

This leads to the relationship:

P(X > 70) = P(X > 50) * P(Y > 20)

Or alternatively,

P(X > 70) = P(X > 50) * (1 - G(20))
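
To spell out where this comes from, here is a short derivation. It assumes the GPD form quoted earlier in the thread, G(y) = 1 - (1 + y/45)^-3, and uses only the definition of conditional probability:

```latex
\begin{align*}
P(X > 70) &= P(X > 70,\ X > 50) \\
          &= P(X > 50)\, P(X > 70 \mid X > 50) \\
          &= P(X > 50)\, P(X - 50 > 20 \mid X > 50) \\
          &= P(X > 50)\, P(Y > 20) \\
          &= P(X > 50)\, \bigl(1 - G(20)\bigr),
\end{align*}
\qquad\text{where}\qquad
1 - G(20) = \left(1 + \tfrac{20}{45}\right)^{-3} \approx 0.332 .
```

The first line holds because any throw over 70 is automatically over 50, so the two events coincide.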

Hope that helps

Dave
 
Hi Dave, thanks for the quick response.
Yes I think the important piece I was not thinking about was the fact that the GPD models the conditional distribution above some threshold.
So that means by calculating G(70) what I actually did was calculate P(X - 50 <= 70 | X > 50) = P(X <= 120 | X > 50)?
The probability that someone throws between 50 and 120m, given that they throw over 50m?
 
If that is the case, and 1 - G(70) = 1 - 0.9400838 ~= 0.0599, meaning the probability of throwing between 50 and 120m given someone throws over 50m is approximately 6%.
How does this have a higher probability than someone throwing over 70m given that they throw over 50m?
Or am I getting my conditional and unconditional probabilities confused here?
 
You've written some of that the wrong way round but I think I know what you meant to say. Let's look at the 3 probabilities to make sure you know what they all mean.

1 - G(70) is the probability of throwing more than 120m, given the throw is over 50m, and this is c. 6%.

The probability of throwing more than 70m given the throw was over 50m (i.e. 1 - G(20)) is c. 33%, as given in the marking schedule. Logically this is higher than the probability above, since a throw of more than 120m is also a throw of more than 70m.

The probability of throwing over 70m unconditionally is c. 1.7%, and is given by the product of the above probability and the probability of throwing over 50m.
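
For anyone who wants to check these figures numerically, here is a minimal Python sketch. The scale (45) and shape (3) are taken from the G(70) formula quoted earlier in the thread. The function names are just illustrative, and the value used for P(X > 50) is a placeholder (it is not stated in this thread), so substitute the figure from the exam question.

```python
def gpd_cdf(y, scale=45.0, shape=3.0):
    """CDF of the excess Y = X - threshold | X > threshold.

    Uses the form quoted in the thread: G(y) = 1 - (1 + y/scale)^(-shape).
    """
    return 1.0 - (1.0 + y / scale) ** (-shape)


def unconditional_tail(x, threshold, p_over_threshold):
    """P(X > x) for x above the threshold: P(X > threshold) * (1 - G(x - threshold))."""
    return p_over_threshold * (1.0 - gpd_cdf(x - threshold))


threshold = 50.0

# 1 - G(70): P(X > 120 | X > 50), c. 6%
p_over_120_given_over_50 = 1.0 - gpd_cdf(70.0)

# 1 - G(20): P(X > 70 | X > 50), c. 33%
p_over_70_given_over_50 = 1.0 - gpd_cdf(70.0 - threshold)

# PLACEHOLDER, not from the thread: replace with P(X > 50) from the question.
p_over_50 = 0.05
p_over_70 = unconditional_tail(70.0, threshold, p_over_50)

print(f"P(X > 120 | X > 50) = {p_over_120_given_over_50:.4f}")
print(f"P(X > 70  | X > 50) = {p_over_70_given_over_50:.4f}")
print(f"P(X > 70) with placeholder P(X > 50) = {p_over_70:.4f}")
```

The key point is the argument passed to gpd_cdf: it is always the excess over the threshold (20 for a 70m throw), never the distance itself.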
 