Hi All,
Just had a query re: the covariates and factors. The practice question has covariates of the size of a house (4 categories), the area of house (urban or rural) and the # of claims made. When writing out the linear predictor, for the factor - area, only beta is used (instead of beta-i). I was just wondering why that was. I thought that area had two parameters, urban and rural, so shouldn't it have a subscript... any help?
Thanks.
I agree it's a little confusing - you could put a j subscript and that would be marked correct also.
Each time you add a new factor to the model you lose one of its parameters, because we are unable to estimate it separately from the others.
For example suppose we had size and area and the constants for each of the categories were:
Area Rural: 80
Area Urban: 120
Size 1: 50
Size 2: 100
Size 3: 200
Size 4: 500
Then we would get the following totals for each combination of area and size:
R1: 130 U1: 170
R2: 180 U2: 220
R3: 280 U3: 320
R4: 580 U4: 620
Now if you gave this data to someone and asked them to estimate the constants they wouldn't be able to individually identify all of them.
R1-R4 would tell you that the sizes go up by 50, 100, 300
As would U1-U4
Comparing R1 and U1 would tell you that Urban is 40 more than Rural.
As would R2 and U2, etc.
But we wouldn't be able to work out exactly how R and 1 were split to make the 130 and so on (the equations are linearly dependent).
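To see the linear dependence in action, here is a minimal Python sketch using the constants from the example. The helper `totals` and the shift `c = 25` are made up purely for illustration: moving any amount `c` from the area constants to the size constants leaves every total unchanged, so the data alone can't pin down the individual values.

```python
def totals(area, size):
    """Add the area and size constants for every combination, e.g. 'R1'."""
    return {f"{a}{s}": area[a] + size[s] for a in area for s in size}

area = {"R": 80, "U": 120}
size = {1: 50, 2: 100, 3: 200, 4: 500}
original = totals(area, size)

# Move an arbitrary amount c from the area constants to the size constants.
c = 25  # hypothetical shift, any value gives the same result
shifted = totals(
    {k: v - c for k, v in area.items()},
    {k: v + c for k, v in size.items()},
)

print(original == shifted)  # True: two different sets of constants, same data
```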
So what we do is set one of the parameters equal to zero (effectively we absorb it into the other constants). So suppose we set Rural to zero. Then we would get:
Area Rural: 0
Area Urban: 40
Size 1: 130
Size 2: 180
Size 3: 280
Size 4: 580
This would give exactly the same answers as before:
R1: 130 U1: 170
R2: 180 U2: 220
R3: 280 U3: 320
R4: 580 U4: 620
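The same check can be done in a few lines of Python. This is only a sketch of the arithmetic above (the variable names are mine, not from the course notes): subtract the Rural constant from both area values, add it to every size value, and confirm the eight totals are unchanged.

```python
area = {"R": 80, "U": 120}
size = {1: 50, 2: 100, 3: 200, 4: 500}

# Absorb the Rural constant into the size constants.
baseline = area["R"]
area_new = {k: v - baseline for k, v in area.items()}
size_new = {k: v + baseline for k, v in size.items()}

# Every area/size combination should give the same total as before.
same_totals = all(
    area[a] + size[s] == area_new[a] + size_new[s]
    for a in area
    for s in size
)

print(area_new)     # {'R': 0, 'U': 40}
print(size_new)     # {1: 130, 2: 180, 3: 280, 4: 580}
print(same_totals)  # True
```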
Hence, we only have one parameter for area - which is called beta and is included if you are in an urban area but not if you are in a rural area. The subscript would help us identify that but would perhaps mislead us into thinking there are 2 parameters there. Hence it was omitted. But once again please be assured you would get the marks if you did put it on.
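Written out for just the two factors in my example, the linear predictor might look like the sketch below. The symbols alpha_i for the size constants and the indicator x are my own labels for illustration, and I've left out any other terms from the practice question.

```latex
\eta_i = \alpha_i + \beta x, \qquad i = 1, 2, 3, 4, \qquad
x = \begin{cases} 1 & \text{urban} \\ 0 & \text{rural} \end{cases}
```

With the numbers above, alpha_1 to alpha_4 are 130, 180, 280, 580 and beta = 40, so Urban size 1 gives 130 + 40 = 170 as before.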
Clear as mud?