Given an initial probability distribution for a stochastic process, one can use the transition matrix to compute a change in the probability distribution. If one tries to go backwards, however, one finds that the inverse matrix has entries outside the interval [0,1] and so cannot be interpreted as a probability matrix.
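A quick numerical sketch shows the problem; the two-state transition matrix and distribution below are purely illustrative:

```python
import numpy as np

# Hypothetical two-state transition matrix; column j holds the
# probabilities of moving out of state j, so each column sums to 1.
P = np.array([[0.9, 0.2],
              [0.1, 0.8]])

p = np.array([0.3, 0.7])       # initial distribution
p_prime = P @ p                # evolved distribution p' = P p

# Running time backwards with the matrix inverse works algebraically...
P_inv = np.linalg.inv(P)
print(np.allclose(P_inv @ p_prime, p))   # True

# ...but P_inv has negative entries and entries greater than 1,
# so it cannot be read as a probability matrix.
print(P_inv)
```

Any column-stochastic matrix with an inverse that is not itself a permutation matrix exhibits the same failure.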

Bayes' Theorem provides a solution to this problem. Suppose we start with a simple transition diagram.

Focusing on the states marked in red, we see that both initial states contribute to the final state s'_{0}. We can view the transition matrix as mixing events from the two sources. Suppose N_{0} events come from s_{0} and N_{1} come from s_{1}, making a total for s'_{0} of N'_{0} = N_{0} + N_{1}. Bayes' theorem tells us that the probability that s_{0} was responsible for some event in s'_{0} is P'_{0,0} = N_{0}/N'_{0} = P_{0,0}p_{0}/p'_{0}, and likewise the probability that s_{1} was responsible is P'_{1,0} = N_{1}/N'_{0} = P_{0,1}p_{1}/p'_{0}. Doing this for all the paths in the diagram gives a set of inverse probabilities P'_{j,k}. Factoring out the components of p and p' into separate diagonal matrices gives the matrix expression P' = D(p)P^{T}D(p')^{-1}, which holds for any number of states n.

D(p) is a diagonal matrix whose non-zero entries, along the diagonal, are the components of p. We can show how this works with a sample calculation. A check shows that P' has the properties of a probability matrix: the sum of each column is 1, and we see that p = P'p'.
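The sample calculation and check can be sketched numerically. The matrix and distribution below are illustrative, and the sketch assumes the matrix expression P' = D(p)P^{T}D(p')^{-1} with a column-stochastic P:

```python
import numpy as np

# Illustrative column-stochastic transition matrix and initial distribution.
P = np.array([[0.9, 0.2],
              [0.1, 0.8]])
p = np.array([0.3, 0.7])
p_prime = P @ p                     # p' = P p

# Assumed matrix expression: P' = D(p) P^T D(p')^-1,
# where np.diag(v) builds the diagonal matrix D(v).
P_prime = np.diag(p) @ P.T @ np.linalg.inv(np.diag(p_prime))

print(P_prime)
print(P_prime.sum(axis=0))          # each column sums to 1
print(np.allclose(P_prime @ p_prime, p))   # True: p = P' p'
```

Unlike the plain matrix inverse, every entry of P' lies in [0,1], so it is a genuine probability matrix for the reversed process.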

Given two sets of data for p and p' (collected as the columns of matrices p and p'), we see from the argument above that P = p'p^{-1}, and a check of the column vectors in p and p' shows p' = Pp. One needs to be careful about the inverse probabilities P', since they depend on both p and p': the P' for the first columns of p and p' is not the same as that for the second columns. The inverse probabilities are not determined by P alone.
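This can be sketched with hypothetical data; the helper `inverse_matrix` below is an assumed name, and the sketch again assumes the form P' = D(p)P^{T}D(p')^{-1}:

```python
import numpy as np

# Hypothetical data: each column of p is one observed initial distribution,
# and the matching column of p_prime is its evolved distribution.
P = np.array([[0.9, 0.2],
              [0.1, 0.8]])          # used only to generate consistent data
p = np.array([[0.3, 0.6],
              [0.7, 0.4]])
p_prime = P @ p

# Recover the transition matrix from the two data sets: P = p' p^-1.
P_rec = p_prime @ np.linalg.inv(p)
print(np.allclose(P_rec, P))        # True

def inverse_matrix(P, p_col, pp_col):
    """Assumed form of the inverse probabilities: P' = D(p) P^T D(p')^-1."""
    return np.diag(p_col) @ P.T @ np.linalg.inv(np.diag(pp_col))

# P' depends on which data columns are used, so P alone does not fix it.
P1 = inverse_matrix(P_rec, p[:, 0], p_prime[:, 0])
P2 = inverse_matrix(P_rec, p[:, 1], p_prime[:, 1])
print(np.allclose(P1, P2))          # False
```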