Graduation reading material questions

ALEX_AK · Feb 3, 2008

Hi, I have a few questions on the reading material for Chapter 12:

1) Page 3 line 6, Dx~Binomial(Ex, qx)
In Chapter 10 page 3, D~Binomial(N, qx). In Chapter 12 the N is replaced by Ex. This is the first time I have seen Ex replacing N and I don't really understand the reason why. If each life contributes a full year, then N, the number of lives, can be replaced by Ex, since

N x (1full year) = Ex

However, not all lives will contribute a full year. Also, the initial exposed to risk counts a death as contributing exposure of length (x+1)-(x+ai), where (x+ai) is the age at entry.
What is the rationale for using Ex instead of N?

2) Page 31 line 14,

"If there is heterogeneity within the age group.......variance will be (respectively) smaller or greater than we would expect if our underlying model were correct"

What does this sentence mean? What is the variance referring to?

3) Page 35 example line 5,
" the centre of the distribution...., ie the variance is greater than predicted by the binomial model"

What does this sentence mean? What is this variance referring to?

4) Page 38 example,
I don't really understand why the answer is p=2P(P<=6). Why do we multiply by 2? Is it because P(P>=14)=P(P<=6)=0.0577? Can anyone explain?
A similar working appears in the Page 39 example, which also uses 2x(p-value). I don't really get this point.

5) Page 40 line 14,

"Here the deviation has (approximate) distribution:"
Why is it only approximate?

In line 11 and 12,
"Consider the hypothesis,
dx~Normal(E...."
The normal distribution was used as an approximate asymptotic distribution for the Poisson model in the second-to-last line of page 2. Hence there was already an approximation in page 40 lines 11 and 12.

Why is there a further approximation in page 40 line14?

Can someone answer my questions? Hope it's not too many.

Robert Chadburn · Feb 16, 2008

Hi Alex
(1) You don't need the idea of Ex when all your lives are exposed for the full potential year of age, from birthday x to birthday x+1. You just use N, the total number of lives observed for this whole year. You can call this Ex if you like; in these circumstances it means exactly the same thing. But note that any of the N lives that die during the year of age are counted with the value "1" in N (and in Ex), regardless of the fact that they actually cease "exposure" at the age at which they die.

The concept of Ex is brought in to deal with the (more usual) case in which not all lives are observed over the whole year of age. The rationale for calculating a person's contribution to Ex is that it is the same as it would have been to the count of "N", APART from any parts of the year for which the person's death would have been missed from the investigation.

Eg, for a person who is not under observation until age x+s (0<s<1), the contribution to Ex is reduced by s of a year. For a person who left observation ALIVE at age x+b (0<b<1), then the contribution to Ex is reduced by 1-b of a year. (But if the person DIES at x+b, there is no deduction from Ex, as we discussed for N above. The logic is consistent, because, for a living person leaving at x+b, if he dies after that age then that death would not be included in the observed number of deaths. However, a dead person at x+b is already dead - there are no missed observations of deaths.)
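The contribution rule above can be sketched in a few lines of Python. This is only an illustration of the logic described in this post: the function name `initial_exposure` and its parameters are made up for the example, with `entry_offset` playing the role of s and `exit_offset` the role of b.

```python
def initial_exposure(entry_offset=0.0, exit_offset=1.0, died=False):
    """Contribution of one life to the initial exposed to risk Ex at age x.

    entry_offset: s, the fraction of the year of age at which observation starts
    exit_offset:  b, the fraction of the year of age at which observation ends
    died:         True if observation ended because the life died
    (All names here are hypothetical, chosen for this sketch.)
    """
    if not 0.0 <= entry_offset <= exit_offset <= 1.0:
        raise ValueError("need 0 <= entry_offset <= exit_offset <= 1")
    # A death is credited right up to age x+1: once the life has died,
    # there are no later deaths that the investigation could have missed.
    end = 1.0 if died else exit_offset
    return end - entry_offset


full_year = initial_exposure()                        # observed for the whole year
late_entrant = initial_exposure(entry_offset=0.25)    # reduced by s
withdrawal = initial_exposure(exit_offset=0.5)        # leaves alive: reduced by 1-b
death = initial_exposure(exit_offset=0.5, died=True)  # dies at x+b: no deduction
print(full_year, late_entrant, withdrawal, death)
```

With these inputs the withdrawal at x+0.5 contributes half a year, while the death at x+0.5 still contributes the full year.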

(2) The "variance" referred to is just the square of the zx value at a particular age. So it is saying that the zx value will tend to be smaller than expected if there is heterogeneity within an age group, and larger than expected if deaths in that age group are not independent.
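A small numerical illustration of the two effects, using made-up figures (100 lives with an average mortality rate of 0.02; none of these numbers come from the reading material). Non-independence is modelled here, as an assumption, by 50 lives each holding two policies, so that every death is counted twice.

```python
def var_independent(qs):
    """Variance of the total number of deaths when lives die independently,
    life i having mortality rate qs[i]."""
    return sum(q * (1 - q) for q in qs)


n, q = 100, 0.02  # hypothetical group: expected deaths = 2 in every case below

# Underlying model: 100 independent lives, all with the same rate q.
homogeneous = var_independent([q] * n)

# Heterogeneity: same average rate, but half the lives have 0.01 and half 0.03.
heterogeneous = var_independent([0.01] * 50 + [0.03] * 50)

# Non-independence: 50 lives with two policies each, so deaths = 2 x Binomial(50, q)
# and the variance is multiplied by 2 squared.
duplicates = 2**2 * var_independent([q] * 50)

print(homogeneous, heterogeneous, duplicates)
```

The heterogeneous group has a slightly smaller variance than the homogeneous model predicts, and the duplicated policies give a variance twice as large.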

(3) Here we are looking at the whole sample of 20 zx values. These should have the stated unit normal distribution (with a variance of 1), if Ho is correct. But the observed spread seems wider (fatter-tailed) than expected from the standard normal distribution, which implies that the actual variance of the zx variables is greater than 1.

(4) The expected value of P is 10. Observed values of P should be increasingly unlikely further away from 10. So to get a value of 14 (or more) is just as likely (or unlikely!) as getting a value of 6 (or less). If the observed value is so far away from 10 (either above it or below it) to have a probability of occurrence less than 5%, then we reject Ho.
The above logic is what makes this a 2-tailed test, ie we will reject Ho if the observed value is too far above, or too far below, the mean to be reasonably likely to occur by chance. So, the p-value of a 2-tailed test is the TOTAL probability of getting a value at least as extreme as the one actually observed. In this case, values that are at least as extreme as 6 are those that are 6 or lower, and those that are 14 or greater. So we sum the probabilities of these two events together, and by symmetry they are each equal to 0.0577, so the total probability is double this.
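As a quick check of the doubling, here is a sketch that assumes P ~ Binomial(20, 0.5). That distribution is not stated in this thread; it is inferred from the expected value of 10 and the 20 zx values mentioned in (3), and it does reproduce the 0.0577 figure.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n, p = 20, 0.5  # assumed: 20 age groups, each deviation equally likely + or -

lower_tail = binom_cdf(6, n, p)       # P(P <= 6)
upper_tail = 1 - binom_cdf(13, n, p)  # P(P >= 14)
p_value = lower_tail + upper_tail     # two-tailed p-value

print(round(lower_tail, 4), round(upper_tail, 4), round(p_value, 4))
# prints 0.0577 0.0577 0.1153
```

The two tails are equal because Binomial(20, 0.5) is symmetric about 10, which is why the sum can be written as 2P(P<=6).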

(5) You've actually answered your own question. In lines 11 and 12 it talks about "the hypothesis" that dx is normally distributed bla bla. As you say, it is this distribution that is approximate (hence the use of the word "hypothesis"). So, in line 14, we are reiterating the fact that the statement is only true asymptotically.

Hope that helps!
Good luck!

Robert Chadburn · Feb 16, 2008

Hi again - I've just changed the above slightly - second paragraph - should have been a 1-b not b - it's right now! Hope that didn't confuse if you read the first version!
