Can you provide an explanation of how to get from the first Normal equation to equation 1.1 please? And the reasoning behind doing this?
Are you looking for the derivation that writes the Normal distribution pdf in exponential family form? See the pic below. We do this because a GLM can only model responses Y whose pdf (or pmf) can be written in the given exponential family form.
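In case it helps to see the algebra written out, here is a sketch of the derivation. I am assuming equation 1.1 is the usual exponential dispersion form with canonical parameter \theta, dispersion parameter \phi, and functions a, b, c; those symbols are my assumption, so match them to whatever notation your notes use.

```latex
% Assumed target form (taken here to be "equation 1.1"):
%   f(y;\theta,\phi) = \exp\{ (y\theta - b(\theta)) / a(\phi) + c(y,\phi) \}
\begin{align*}
f(y;\mu,\sigma^2)
  &= \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\left\{-\frac{(y-\mu)^2}{2\sigma^2}\right\} \\
  % move the normalising constant inside the exponential
  &= \exp\left\{-\frac{(y-\mu)^2}{2\sigma^2}
     - \frac{1}{2}\log(2\pi\sigma^2)\right\} \\
  % expand the square: (y-\mu)^2 = y^2 - 2y\mu + \mu^2
  &= \exp\left\{-\frac{y^2 - 2y\mu + \mu^2}{2\sigma^2}
     - \frac{1}{2}\log(2\pi\sigma^2)\right\} \\
  % collect the terms that involve \mu over the common factor \sigma^2
  &= \exp\left\{\frac{y\mu - \mu^2/2}{\sigma^2}
     - \frac{y^2}{2\sigma^2}
     - \frac{1}{2}\log(2\pi\sigma^2)\right\}
\end{align*}
% Matching terms with the assumed form:
%   \theta = \mu, \quad b(\theta) = \theta^2/2, \quad a(\phi) = \phi = \sigma^2,
%   c(y,\phi) = -\frac{y^2}{2\phi} - \frac{1}{2}\log(2\pi\phi)
```

With that identification, b'(\theta) = \theta = \mu gives the mean of Y and a(\phi) b''(\theta) = \sigma^2 gives its variance. That is a large part of why the form is useful: once a distribution is written this way, the mean and variance can be read off from b and a.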