
chisq.test function in R

Naitik Shah

An insurer believes that the distribution of the number of claims on a particular type of policy is binomial with parameters n = 3 and p. A random sample of the number of claims on 153 policies revealed the following results:
Number of Claims | 0 | 1 | 2 | 3 |
Number of Policies | 60 | 75 | 16 | 2 |
(a) Show that the method of moments estimate for p is 0.246.
(b) Carry out a goodness of fit test for the specified binomial model for the number of claims on each policy, ensuring that the expected frequencies are greater than 5.
(c) Use the CDF of a chi squared distribution to find the correct p-value.

Can anyone help me with parts (b) and (c) of the above question?

R Code for part (a) is as follows:
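obs <- c(60, 75, 16, 2)        # observed number of policies
x <- c(0, 1, 2, 3)             # number of claims per policy
n <- 3                         # binomial parameter n
mu <- sum(x * obs) / sum(obs)  # sample mean number of claims
p <- mu / n                    # method of moments: set n * p equal to the sample mean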
 
The built-in chisq.test function doesn't work well here, because it doesn't reduce the degrees of freedom for parameters you have estimated when fitting a model. So you simply use R as a calculator and do what you would do on paper.
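To see the degrees-of-freedom issue, here is a minimal sketch using the data from the question. The names p_hat, probs, obs_c and probs_c are just illustrative, and merging the 2-claim and 3-claim groups is one way of keeping the expected frequencies above 5.

```r
# Data from the question
obs <- c(60, 75, 16, 2)
x <- 0:3
n <- 3

# Method of moments estimate of p, then the fitted binomial probabilities
p_hat <- sum(x * obs) / (n * sum(obs))
probs <- dbinom(x, size = n, prob = p_hat)

# Merge the last two groups so that every expected frequency exceeds 5
obs_c <- c(obs[1:2], sum(obs[3:4]))
probs_c <- c(probs[1:2], sum(probs[3:4]))

# chisq.test takes the probabilities as given and uses df = groups - 1
res <- chisq.test(obs_c, p = probs_c, rescale.p = TRUE)
res$parameter   # degrees of freedom reported by chisq.test

# One parameter (p) was estimated, so one more degree of freedom is lost
df_correct <- length(obs_c) - 1 - 1
p_correct <- pchisq(unname(res$statistic), df = df_correct, lower.tail = FALSE)
```

The test statistic from chisq.test is still usable. It is only the degrees of freedom, and therefore the p-value, that need recalculating.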
 
Hi John,

Thanks for your prompt response.

Would it be possible for you to help me with part (c), and to explain how to carry out the manual procedure in R?

For the latter, do I just substitute the values into the formula?

Best,
Naitik Shah
 
Yes. If you store the observed frequencies in "obs" and the expected frequencies in "exptd", then recall that the test statistic is:
\(\sum \frac{(obs-exptd)^2}{exptd} \sim \chi^2\)
The degrees of freedom are the number of groups minus 1, minus the number of parameters estimated from the data.
If you calculate this in R and store the result in "stat", then recall that we reject H0: "model is a good fit" for large values of the statistic.
So the p-value is just the probability that a chi-square random variable is greater than stat.
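Putting those steps together, a sketch of the whole calculation might look like this. It assumes the 2-claim and 3-claim groups are combined to keep the expected frequencies above 5; the names obs_c and exptd_c are only illustrative.

```r
# Observed data and method of moments estimate from part (a)
obs <- c(60, 75, 16, 2)
x <- 0:3
n <- 3
p <- sum(x * obs) / (n * sum(obs))

# Expected frequencies under Bin(3, p)
exptd <- sum(obs) * dbinom(x, size = n, prob = p)

# The last expected frequency is below 5, so combine the last two groups
obs_c <- c(obs[1:2], sum(obs[3:4]))
exptd_c <- c(exptd[1:2], sum(exptd[3:4]))

# Chi-square test statistic
stat <- sum((obs_c - exptd_c)^2 / exptd_c)

# Degrees of freedom: groups - 1 - number of estimated parameters
df <- length(obs_c) - 1 - 1

# Part (c): p-value from the chi-square CDF, P(chi-square > stat)
p_value <- 1 - pchisq(stat, df)
```

With these data the statistic comes out at roughly 3.47 on 1 degree of freedom, giving a p-value of about 0.06, so there is insufficient evidence at the 5% level to reject the binomial model.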
 