Could someone explain why we never consider k=j in the forward integral equations? Say we take a simple 3-state model {1,2,3} and want the forward integral equation for p_{12}(t). The way I think about it, we can go 1->1, then make the transition from 1 to 2, then take the probability of staying in state 2, and we multiply these together and integrate. Similarly we can go 1->3, make the transition 3->2, and stay in 2. Why can we not take the probability of 1->2 at time (t-w), then the transition from 2 to 2, then the holding probability as before? The only reason I can think of for excluding 1->2 is that the transition from state 2 to 2 would be 0, but I cannot see why that would be.
Forward integral equations focus on the last jump that occurs over the period. In the scenario you suggest, a transition from 2 to 2 is not a jump; it is just staying in the same place. So the last jump (given that you start off in state 1) would actually have happened at some earlier point, and that is the jump to focus on. Focussing on a jump (i.e. a change of state) therefore means we exclude k=j.
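To make this concrete, here is a sketch of the equation for your 3-state example. It assumes a time-homogeneous process, and the notation is my own choice rather than anything fixed by the question: \mu_{kj} is the transition rate from k to j, \lambda_2 is the total rate out of state 2, and w is the time spent in state 2 after the last jump.

```latex
% Assumed notation: \mu_{kj} = transition rate k -> j,
% \lambda_2 = \mu_{21} + \mu_{23} = total rate of leaving state 2,
% w = time elapsed since the last jump (which must land in state 2).
p_{12}(t) = \int_0^t \Bigl[\, p_{11}(t-w)\,\mu_{12} + p_{13}(t-w)\,\mu_{32} \Bigr]
            e^{-\lambda_2 w}\, dw ,
\qquad \lambda_2 = \mu_{21} + \mu_{23}.
```

The sum inside the brackets runs over k=1 and k=3 only. A path that is already in state 2 at time (t-w) and simply stays there has not been left out: its last jump into state 2 happened earlier, so it is picked up by the same integral at a larger value of w. Adding a k=2 term would count those paths twice.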