Hello
In answer to the first question, the root should be -1/3. This was an error in the 2019 Course Notes and the correction appeared in the 2019 corrections document. It is correct in the 2020 Course Notes.
As has been pointed out, we are generally interested in the absolute value of the root; however, the correct root is still -1/3.
In terms of the discussion on differencing, I think the easiest way to see how it removes unit roots (roots of 1) is the idea mentioned in a previous post about factoring out (1-B).
Short answer
Say the characteristic polynomial (for the terms of the process) of some time series \( X_t \) is given by:
\( 1 - \alpha_1 \lambda - ... - \alpha_p \lambda^p \)
This is a polynomial of degree \(p \) that has \( p\) roots, say \( r_1, ..., r_p \). This means we can factorise it (up to a constant multiple, which doesn't change the roots) to:
\( (r_1 - \lambda )(r_2 - \lambda)...(r_p - \lambda) \)
Say we have a unit root (say \( r_1 \) for ease of reference). Then we can write this as:
\( (1 - \lambda )(r_2 - \lambda)...(r_p - \lambda) \)
For our purposes we can think of this as:
\( (1 - \lambda ) * something \)
Where we have factored out the \( (1 - \lambda ) \). Now, this \( something \) turns out to be the characteristic polynomial (for the terms of the process) of the differenced series:
\( X_t - X_{t-1} \)
To see this let's go back to writing out the equation with \( B \) instead of \( \lambda \):
\( (1 - \alpha_1 B - ... - \alpha_p B^p) X_t = RHS \)
where the RHS contains any constants and error terms.
Now, we know that we can write the brackets as:
\( (1 - B ) something \)
So we have:
\( ((1 - B ) something) X_t = RHS \)
However:
\( (1 - B ) X_t = X_t - X_{t-1} \)
So we have:
\( (something) (X_t - X_{t-1}) = RHS \)
This is effectively an equation for the differenced series. Let:
\( Y_t = X_t - X_{t-1} \)
Then we have:
\( (something) Y_t = RHS \)
So we've shown that taking out the unit root, i.e. factoring out the \( (1-B) \), leaves us with an equation for the differenced series whose characteristic polynomial is \(something \). It's possible that this differenced series still has unit roots and needs differencing again; if so, the process is the same.
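If it helps to see this with numbers, here is a minimal Python sketch using NumPy. The AR(2) model \( X_t = 1.5 X_{t-1} - 0.5 X_{t-2} + e_t \) is just a made-up example (it isn't from the Course Notes), chosen so that its characteristic polynomial has a unit root. Dividing out \( (1 - \lambda) \) should leave the polynomial of the differenced series.

```python
import numpy as np

# Made-up AR(2) example: X_t = 1.5 X_{t-1} - 0.5 X_{t-2} + e_t
alphas = [1.5, -0.5]

# Characteristic polynomial 1 - 1.5*lam + 0.5*lam^2, with coefficients
# listed from the highest power down (the order np.roots and np.polydiv expect).
char_poly = np.array([-a for a in reversed(alphas)] + [1.0])

# The roots of the characteristic polynomial; one of them should be 1.
roots = np.sort(np.roots(char_poly).real)

# Factor out (1 - lam), i.e. the polynomial -lam + 1 in the same ordering.
something, remainder = np.polydiv(char_poly, np.array([-1.0, 1.0]))

print("roots:", roots)
print("something:", something)
print("remainder:", remainder)
```

Here `something` comes out as \( 1 - 0.5 \lambda \) with zero remainder, which is the characteristic polynomial of \( Y_t = 0.5 Y_{t-1} + e_t \), an AR(1) model for the differenced series.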
Longer answer
If you still want to see all the algebra for \( Y_t \) really fall into place, I've set it out below.
Say we have the following time series model:
\( X_t = \mu + \alpha_1 (X_{t-1} - \mu) + ... + \alpha_p (X_{t-p} - \mu) + \beta_1 e_{t-1} + ...+ \beta_q e_{t-q} + e_t \quad \quad \quad \) \( (1) \)
Then, rearranging by putting the past terms of the process on the LHS:
\( X_t - \alpha_1 X_{t-1} - ... - \alpha_p X_{t-p} = \mu (1 - \alpha_1 - ... - \alpha_p) + \beta_1 e_{t-1} + ...+ \beta_q e_{t-q} + e_t \)
In terms of B:
\((1 - \alpha_1 B - ... - \alpha_p B^p)X_t = RHS \)
The characteristic polynomial is then:
\( 1 - \alpha_1 \lambda - ... - \alpha_p \lambda^p \)
This is a polynomial of degree \(p \) that has \( p\) roots, say \( r_1, ..., r_p \). This means we can factorise it (up to a constant multiple, which doesn't change the roots) to:
\( (r_1 - \lambda )(r_2 - \lambda)...(r_p - \lambda) \)
Say we have a unit root (say \( r_1 \) for ease of reference). Now, one of the really key points here is that if we have a unit root then the following must hold:
\( (1 - \alpha_1 - ... - \alpha_p) = 0 \)
Why? Well, just plug \( \lambda = 1 \) into the characteristic polynomial:
\( 1 - \alpha_1 * 1 - ... - \alpha_p * 1 = 0 \)
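As a quick numerical check of this condition, here is a small Python sketch. The coefficients 0.7, 0.5 and -0.2 are made up for illustration (they sum to 1), and `char_poly_at` is a helper name I've invented, not something from the Course Notes.

```python
import numpy as np

def char_poly_at(alphas, lam):
    """Evaluate 1 - alpha_1*lam - ... - alpha_p*lam^p at the point lam."""
    powers = lam ** np.arange(1, len(alphas) + 1)
    return 1.0 - float(np.dot(alphas, powers))

# Made-up AR(3) coefficients that sum to 1.
alphas = [0.7, 0.5, -0.2]

# Plugging in lam = 1 gives 1 - (alpha_1 + ... + alpha_p).
value_at_one = char_poly_at(alphas, 1.0)
one_minus_sum = 1.0 - sum(alphas)

print(value_at_one, one_minus_sum)
```

Both numbers should be zero (up to floating-point rounding), so \( \lambda = 1 \) is a root exactly when the \( \alpha \)'s sum to 1.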
Now why does this matter? Let's go back to the original time series model, (1):
\( X_t = \mu + \alpha_1 (X_{t-1} - \mu) + ... + \alpha_p (X_{t-p} - \mu) + \beta_1 e_{t-1} + ...+ \beta_q e_{t-q} + e_t \quad \quad \quad \) \( (1) \)
Firstly, if we expand out the brackets on the RHS, we get a term that is \( \mu (1 - \alpha_1 - ... - \alpha_p) \). This is 0, so we can effectively ignore \( \mu \) from now on.
Subtracting \( X_{t-1} \) from both sides and doing some rearranging:
\( X_t - X_{t-1} = (\alpha_1 - 1) * ( X_{t-1} - X_{t-2}) + (\alpha_2 + \alpha_1 - 1) * (X_{t-2} - X_{t-3}) + (\alpha_3 + \alpha_2 + \alpha_1 - 1) * (X_{t-3} - X_{t-4}) + ... + \)
\( (\alpha_{p-1} + \alpha_{p-2} + ... + \alpha_1 - 1) ( X_{t-p+1} - X_{t-p}) + (\alpha_{p} + \alpha_{p-1} + ... + \alpha_1 - 1) X_{t-p} + \beta_1 e_{t-1} + ...+ \beta_q e_{t-q} + e_t \)
Now, the last \( X_{t-p} \) term is actually 0 because the coefficient is \( \alpha_{p} + \alpha_{p-1} + ... + \alpha_1 - 1 = 0 \). So, we have:
\( X_t - X_{t-1} = (\alpha_1 - 1) * ( X_{t-1} - X_{t-2}) + (\alpha_2 + \alpha_1 - 1) * (X_{t-2} - X_{t-3}) + (\alpha_3 + \alpha_2 + \alpha_1 - 1) * (X_{t-3} - X_{t-4}) + ... + \)
\( (\alpha_{p-1} + \alpha_{p-2} + ... + \alpha_1 - 1) ( X_{t-p+1} - X_{t-p}) + \beta_1 e_{t-1} + ...+ \beta_q e_{t-q} + e_t \)
If we let:
\( Y_t = X_t - X_{t-1} \)
Then we have:
\( Y_t = (\alpha_1 - 1) * Y_{t-1} + (\alpha_2 + \alpha_1 - 1) * Y_{t-2} + (\alpha_3 + \alpha_2 + \alpha_1 - 1) * Y_{t-3} + ... + \)
\( (\alpha_{p-1} + \alpha_{p-2} + ... + \alpha_1 - 1) Y_{t-p+1} + \beta_1 e_{t-1} + ...+ \beta_q e_{t-q} + e_t \)
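To see this equation for \( Y_t \) hold on actual numbers, here is a minimal Python sketch. It assumes a made-up AR(3) model with coefficients 0.7, 0.5 and -0.2 (which sum to 1) and, for brevity, no moving average terms. The coefficient of \( Y_{t-k} \) is built as the running sum \( \alpha_1 + ... + \alpha_k - 1 \).

```python
import numpy as np

# Made-up AR(3) coefficients summing to 1, so lam = 1 is a root.
alphas = np.array([0.7, 0.5, -0.2])
p = len(alphas)

# Coefficient of Y_{t-k} is (alpha_1 + ... + alpha_k - 1) for k = 1, ..., p-1.
y_coeffs = np.cumsum(alphas)[:-1] - 1.0   # approximately [-0.3, 0.2]

# Simulate X_t = alpha_1 X_{t-1} + ... + alpha_p X_{t-p} + e_t from a fixed seed.
rng = np.random.default_rng(0)
n = 200
e = rng.standard_normal(n)
x = np.zeros(n)
for t in range(p, n):
    x[t] = alphas @ x[t - p:t][::-1] + e[t]

# Differenced series: y[i] = x[i + 1] - x[i], so Y_t is stored in y[t - 1].
y = np.diff(x)

# Compare Y_t with (alpha_1 - 1) Y_{t-1} + ... + e_t at every time point.
max_gap = 0.0
for t in range(p, n):
    past_y = y[t - p:t - 1][::-1]          # Y_{t-1}, ..., Y_{t-p+1}
    predicted = y_coeffs @ past_y + e[t]
    max_gap = max(max_gap, abs(y[t - 1] - predicted))

print("Y coefficients:", y_coeffs)
print("largest discrepancy:", max_gap)
```

The largest discrepancy should be zero apart from floating-point rounding, which is the numerical version of saying the \( X_{t-p} \) term has dropped out.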
Let's find the characteristic polynomial (for the terms of the process) for \( Y_t \).
So, putting past terms of the process on the LHS:
\( Y_t + (1 - \alpha_1) * Y_{t-1} + (1 - \alpha_2 - \alpha_1) * Y_{t-2} + ... + (1 - \alpha_{p-1} - ... - \alpha_1) Y_{t-p+1} = \beta_1 e_{t-1} + ...+ \beta_q e_{t-q} + e_t \)
The characteristic polynomial is:
\( 1 + (1 - \alpha_1) \lambda + ... + (1 - \alpha_{p-1} - ... - \alpha_1) \lambda^{p-1} \)
If we multiply this by \( 1 - \lambda \) we get:
\( 1 + (1 - \alpha_1) \lambda + ... + (1 - \alpha_{p-1} - ... - \alpha_1) \lambda^{p-1} - \lambda - (1 - \alpha_1) \lambda^2 - ... - (1 - \alpha_{p-1} - ... - \alpha_1) \lambda^p \)
\( = 1 - \alpha_1 \lambda - \alpha_2 \lambda^2 - ... - \alpha_p \lambda^p \)
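This is the characteristic polynomial for \( X_t \). (For the \( \lambda^p \) term we've used \( 1 - \alpha_1 - ... - \alpha_{p-1} = \alpha_p \), which is just the unit root condition again.) Remember that we could represent it as: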
\( (r_1 - \lambda )(r_2 - \lambda)...(r_p - \lambda) \)
So, we've shown that the characteristic polynomial for the differenced series, \( Y_t \), is equal (up to a constant multiple) to:
\( (r_2 - \lambda)...(r_p - \lambda) \)
assuming \( r_1 \) was the unit root, as stated earlier.
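If you'd like to check the multiplication numerically, here is a short Python sketch. The AR(3) coefficients 0.7, 0.5 and -0.2 are made up for illustration (they sum to 1, so there is a unit root). It multiplies the polynomial for \( Y_t \) by \( 1 - \lambda \) and compares the roots of the two polynomials.

```python
import numpy as np

# Made-up AR(3) coefficients summing to 1.
alphas = np.array([0.7, 0.5, -0.2])

# Polynomial for X_t in ascending powers: 1 - a1*lam - a2*lam^2 - a3*lam^3.
x_poly = np.concatenate(([1.0], -alphas))

# Polynomial for Y_t in ascending powers: 1 + (1 - a1)*lam + (1 - a1 - a2)*lam^2.
y_poly = np.concatenate(([1.0], 1.0 - np.cumsum(alphas)[:-1]))

# Multiplying polynomials is a convolution of their coefficient sequences.
product = np.convolve([1.0, -1.0], y_poly)

# np.roots expects coefficients from the highest power down, hence the reversal.
x_roots = np.sort(np.roots(x_poly[::-1]).real)
y_roots = np.sort(np.roots(y_poly[::-1]).real)

print("product:", product)
print("X polynomial:", x_poly)
print("roots for X:", x_roots)
print("roots for Y:", y_roots)
```

The product should match the polynomial for \( X_t \), and the roots for \( X_t \) should be the roots for \( Y_t \) plus the extra root of 1.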
Hope this helps
Andy