Hi Alex
(1) You don't need the idea of Ex when all your lives are exposed for the full potential year of age - from the x birthday to the x+1 birthday. You just use N - the total number of lives observed for this whole year. You can call this Ex if you like - in these circumstances it means exactly the same thing. But note that any of the N lives that die during the year of age are counted with the value "1" in N (and in Ex), regardless of the fact that they actually cease "exposure" at the age at which they die.
The concept of Ex is brought in to deal with the (more usual) case in which not all lives are observed over the whole year of age. The rationale for calculating a person's contribution to Ex is that it is the same as it would have been to the count of "N", APART from any parts of the year for which the person's death would have been missed from the investigation.
Eg, for a person who is not under observation until age x+s (0<s<1), the contribution to Ex is reduced by s of a year. For a person who left observation ALIVE at age x+b (0<b<1), the contribution to Ex is reduced by 1-b of a year. (But if the person DIES at x+b, there is no deduction from Ex, as we discussed for N above. The logic is consistent: if a living person leaves at x+b and dies after that age, that death would not be included in the observed number of deaths. However, a person who dies at x+b is already dead - there are no missed observations of deaths.)
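To make the rule in (1) concrete, here is a minimal Python sketch of one person's contribution to Ex over the year of age x to x+1. The function name and its arguments (entry for s, exit_ for b, died) are just my own labels for illustration, not notation from the course notes.

```python
def initial_exposure(entry=0.0, exit_=1.0, died=False):
    """Contribution to Ex for one life over the year of age [x, x+1].

    entry : s, the fraction of the year at which observation starts
    exit_ : b, the fraction of the year at which observation stops
    died  : True if observation stopped because the person died
    (All three names are hypothetical, chosen for this sketch.)
    """
    if not (0.0 <= entry <= exit_ <= 1.0):
        raise ValueError("need 0 <= entry <= exit_ <= 1")
    # A death is exposed to the end of the year of age (no deduction);
    # a life leaving alive at b loses the remaining 1-b of the year.
    end = 1.0 if died else exit_
    return end - entry


print(initial_exposure())                       # observed all year
print(initial_exposure(entry=0.25))             # joined at x+0.25
print(initial_exposure(exit_=0.5))              # left alive at x+0.5
print(initial_exposure(exit_=0.5, died=True))   # died at x+0.5
```

The last two calls show the point made above: leaving alive at x+0.5 costs half a year of exposure, whereas dying at x+0.5 costs nothing.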
(2) The "variance" referred to is just the square of the zx value at a particular age. So it is saying that the zx value will tend to be smaller than expected if there is heterogeneity within an age group, and larger than expected if deaths in that age group are not independent.
(3) Here we are looking at the whole sample of 20 zx values. These should have the stated unit normal distribution (with a variance of 1), if Ho is correct. But the observed spread seems wider - fatter tailed - than expected from the standard normal distribution, which implies that the actual variance of the zx variables is greater than 1.
(4) The expected value of P is 10. Observed values of P should be increasingly unlikely the further they are from 10. So getting a value of 14 (or more) is just as likely (or unlikely!) as getting a value of 6 (or less). If the observed value is so far away from 10 (either above it or below it) as to have a probability of occurrence of less than 5%, then we reject Ho.
The above logic is what makes this a 2-tailed test - ie we will reject Ho if the observed value is too far above, or too far below, the mean to be reasonably likely to have occurred by chance. So, the p-value of a 2-tailed test is the TOTAL probability of getting a value at least as extreme as the one actually observed. In this case, values that are at least as extreme as 6 are those that are 6 or lower, and those that are 14 or greater. So we sum the probabilities of these two events together, and by symmetry they are each equal to 0.0577, so the total probability is double this (about 0.115).
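If you want to check the 0.0577 figure yourself, here is a short Python sketch. It assumes P has a Binomial(20, 0.5) distribution under Ho, which is my reading of the 20 zx values and the expected value of 10 mentioned above. The function name is just for illustration.

```python
from math import comb


def two_tailed_p(n, observed):
    """Two-tailed p-value for a Binomial(n, 0.5) count.

    Sums the probabilities of every outcome that is at least as far
    from the mean n/2 as the observed value (in either direction).
    """
    mean = n / 2
    distance = abs(observed - mean)
    favourable = sum(comb(n, k) for k in range(n + 1)
                     if abs(k - mean) >= distance)
    return favourable / 2 ** n


# One tail on its own: P(P <= 6) for n = 20
lower_tail = sum(comb(20, k) for k in range(7)) / 2 ** 20
p_value = two_tailed_p(20, 6)

print(round(lower_tail, 4))  # 0.0577
print(round(p_value, 4))     # 0.1153
```

Calling two_tailed_p(20, 14) gives the same answer as two_tailed_p(20, 6), which is the symmetry argument used above.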
(5) You've actually answered your own question. In lines 11 and 12 it talks about "the hypothesis" that dx is normally distributed bla bla. As you say, it is this distribution that is approximate (hence the use of the word "hypothesis"). So, in line 14, we are reiterating the fact that the statement is only true asymptotically.
Hope that helps!
Good luck!
Last edited by a moderator: Feb 16, 2008