Information Geometry (Part 15)

January 14, 2016

John Baez and Blake Pollard

Lately we've been thinking about open Markov processes. These are random processes where something can hop randomly from one state to another (that's the 'Markov process' part) but also enter or leave the system (that's the 'open' part).

The ultimate goal is to understand the nonequilibrium thermodynamics of open systems — systems where energy and maybe matter flows in and out — well enough to understand in detail how life works. That's a difficult job! But one has to start somewhere, and this is one place to start.

We have a few papers on this subject:

Blake Pollard, A Second Law for open Markov processes. (Blog article here.)
John Baez, Brendan Fong and Blake Pollard, A compositional framework for Markov processes. (Blog article here.)
Blake Pollard, Open Markov processes: A compositional perspective on non-equilibrium steady states in biology. (Blog article NOT YET.)

However, right now we just want to show you three closely connected results about how relative entropy changes in open Markov processes.

Definitions

An open Markov process is a triple $(X,B,H)$ where $X$ is a finite set of states, $B \subseteq X$ is the subset of boundary states, and $H: \mathbb{R}^X \to \mathbb{R}^X$ is an infinitesimal stochastic operator, meaning a linear operator with

$$ H_{ij} \geq 0, \ \ i \neq j $$

and

$$ \sum_i H_{ij} = 0 $$
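To make this definition concrete, here's a minimal Python sketch that checks the two conditions for a matrix stored as a list of rows. The function name `is_infinitesimal_stochastic`, the tolerance `tol` and the 3-state matrix are illustrative choices made for this example, not something taken from the papers above.

```python
def is_infinitesimal_stochastic(H, tol=1e-12):
    """Return True if H[i][j] >= 0 for i != j and every column sums to zero.

    H is a square matrix given as a list of rows; tol absorbs rounding error.
    """
    n = len(H)
    off_diagonal_ok = all(
        H[i][j] >= 0 for i in range(n) for j in range(n) if i != j
    )
    columns_ok = all(
        abs(sum(H[i][j] for i in range(n))) <= tol for j in range(n)
    )
    return off_diagonal_ok and columns_ok


# Hypothetical 3-state example: H[i][j] is the rate of hopping from j to i.
H = [
    [-1.0,  2.0,  0.0],
    [ 1.0, -3.0,  0.5],
    [ 0.0,  1.0, -0.5],
]

print(is_infinitesimal_stochastic(H))
```

Note that it's the columns, not the rows, that must sum to zero: column $j$ collects all the ways population can leave state $j.$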

For each $i \in X$ we introduce a population $p_i \in [0,\infty).$ We call the resulting function $p : X \to [0,\infty)$ the population distribution. Populations evolve in time according to the open master equation:

$$ \displaystyle{ \frac{dp_i}{dt} = \sum_j H_{ij}p_j}, \; i \in X-B $$

$$ p_i(t) = b_i(t), \; i \in B $$

So, the populations $p_i$ obey a linear differential equation at states $i$ that are not in the boundary, but they are specified 'by the user' to be chosen functions $b_i$ at the boundary states.
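If you'd like to see the open master equation in action, here's a rough sketch that integrates it with the forward Euler method. Everything about the setup is an assumption made for illustration: the function `simulate_open`, the step size `dt`, the 3-state chain `H_chain`, and the choice to hold the two end states at constant populations 1 and 3. Forward Euler is just the simplest scheme to write down, not a recommendation.

```python
def simulate_open(H, p0, boundary, t_end, dt=1e-3):
    """Forward-Euler integration of the open master equation.

    H        : infinitesimal stochastic matrix as a list of rows
    p0       : initial populations
    boundary : dict mapping each boundary state i to a function b_i(t)
    Returns the populations at time t_end.
    """
    n = len(p0)
    p = list(p0)
    for i, b in boundary.items():
        p[i] = b(0.0)  # boundary populations are prescribed from the start
    steps = int(round(t_end / dt))
    for k in range(steps):
        t_next = (k + 1) * dt
        new_p = list(p)
        for i in range(n):
            if i in boundary:
                # boundary states: set 'by the user'
                new_p[i] = boundary[i](t_next)
            else:
                # interior states: dp_i/dt = sum_j H_ij p_j
                new_p[i] = p[i] + dt * sum(H[i][j] * p[j] for j in range(n))
        p = new_p
    return p


# Hypothetical chain 0 <-> 1 <-> 2 with all hopping rates equal to 1.
H_chain = [
    [-1.0,  1.0,  0.0],
    [ 1.0, -2.0,  1.0],
    [ 0.0,  1.0, -1.0],
]
boundary = {0: lambda t: 1.0, 2: lambda t: 3.0}
p_final = simulate_open(H_chain, [1.0, 0.0, 3.0], boundary, t_end=10.0)
print(p_final)
```

In this example the interior state obeys $dp_1/dt = p_0 - 2p_1 + p_2 = 4 - 2p_1,$ so its population should relax toward 2 while population keeps flowing through the system from state 2 to state 0.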

The off-diagonal entries $H_{ij}, \ i \neq j$ are the rates at which population transitions from the $j$th to the $i$th state. A steady state distribution is a population distribution which is constant in time:

$$\displaystyle{ \frac{dp_i}{dt} = 0 } \quad \textrm{ for all } i \in X $$

A closed Markov process, or continuous-time discrete-state Markov chain, is an open Markov process whose boundary is empty. For a closed Markov process, the open master equation becomes the usual master equation:

$$\displaystyle{ \frac{dp}{dt} = Hp } $$

In a closed Markov process the total population is conserved:

$$\displaystyle{ \frac{d}{dt} \sum_{i \in X} p_i = \sum_{i,j} H_{ij}p_j = 0 } $$

This lets us normalize the initial total population to 1 and have it stay equal to 1. If we do this, we can talk about probabilities instead of populations. In an open Markov process, population can flow in and out at the boundary states.
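The conservation law for closed processes is easy to check by hand or by machine. Here's a tiny sketch, using a made-up 3-state matrix `H` and a made-up distribution `p`, that computes $dp_i/dt$ for each state and then adds these up.

```python
# Hypothetical closed process: columns of H sum to zero.
H = [
    [-1.0,  2.0,  0.0],
    [ 1.0, -3.0,  0.5],
    [ 0.0,  1.0, -0.5],
]
p = [0.2, 0.5, 0.3]
n = len(p)

# Master equation: dp_i/dt = sum_j H_ij p_j
dp_dt = [sum(H[i][j] * p[j] for j in range(n)) for i in range(n)]

# Rate of change of the total population: should vanish up to rounding.
total_rate = sum(dp_dt)

print(dp_dt)
print(total_rate)
```

The individual populations change, but their sum does not, because each column of `H` sums to zero.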

A steady-state distribution in a closed Markov process is typically called an equilibrium. We say an equilibrium $q \in [0,\infty)^X$ of a Markov process is detailed balanced if

$$ H_{ij}q_j = H_{ji}q_i \ \ \text{for all} \ \ i,j \in X. $$
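Here's a small sketch of how one might test detailed balance numerically. The helper name `is_detailed_balanced`, the tolerance and the example data are assumptions for illustration. For the matrix below, solving $H_{ij} q_j = H_{ji} q_i$ pair by pair gives $q$ proportional to $(2,1,2).$

```python
def is_detailed_balanced(H, q, tol=1e-12):
    """Check H[i][j] * q[j] == H[j][i] * q[i] for all pairs i, j."""
    n = len(q)
    return all(
        abs(H[i][j] * q[j] - H[j][i] * q[i]) <= tol
        for i in range(n)
        for j in range(n)
    )


# Hypothetical 3-state process.
H = [
    [-1.0,  2.0,  0.0],
    [ 1.0, -3.0,  0.5],
    [ 0.0,  1.0, -0.5],
]

q = [0.4, 0.2, 0.4]          # proportional to (2, 1, 2)
uniform = [1 / 3, 1 / 3, 1 / 3]

print(is_detailed_balanced(H, q))
print(is_detailed_balanced(H, uniform))
```

The first check should succeed and the second should fail, since the flow from state 0 to state 1 under the uniform distribution is only half the flow back.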

Given two population distributions

$$ p, q : X \to [0,\infty) $$

we can define the relative entropy

$$ \displaystyle{ I(p,q) = \sum_i p_i \ln \left( \frac{p_i}{q_i} \right)} $$

There are some nice results about how this changes with time. When $q$ is a detailed balanced equilibrium solution of the master equation, the relative entropy can be seen as the 'free energy' of $p$. For a precise statement, see Section 4 of Relative entropy in biological systems.

The Second Law of Thermodynamics implies that the free energy of a closed system tends to decrease with time, so for closed Markov processes we expect $I(p,q)$ to be nonincreasing. And this is true! But for open Markov processes, free energy can flow in from outside.
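As a small numerical companion to the definition of $I(p,q)$ above, here's a sketch of a function computing it. Two conventions in the code are our assumptions rather than part of the definition: a term with $p_i = 0$ contributes zero, and the result is infinite if some $p_i > 0$ has $q_i = 0.$

```python
import math


def relative_entropy(p, q):
    """I(p, q) = sum_i p_i ln(p_i / q_i) for population distributions p, q.

    Assumed conventions: terms with p_i == 0 contribute 0, and the result
    is +infinity when p_i > 0 but q_i == 0.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue
        if q_i == 0:
            return math.inf
        total += p_i * math.log(p_i / q_i)
    return total


p = [0.5, 0.5]
q = [0.25, 0.75]
print(relative_entropy(p, q))   # equals 0.5 * ln(4/3)
print(relative_entropy(p, p))   # 0.0
```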

Results

Theorem 1. Consider an open Markov process with $X$ as its set of states and $B$ as the set of boundary states. Suppose $p(t)$ and $q(t)$ obey the open master equation, and let the quantities

$$ \displaystyle{ \frac{Dp_i}{Dt} = \frac{dp_i}{dt} - \sum_{j \in X} H_{ij}p_j } $$

$$\displaystyle{ \frac{Dq_i}{Dt} = \frac{dq_i}{dt} - \sum_{j \in X} H_{ij}q_j } $$

measure how much the time derivatives of $p_i$ and $q_i$ fail to obey the master equation. Then we have

$$ \begin{array}{ccl} \displaystyle{ \frac{d}{dt} I(p(t),q(t)) } &=& \displaystyle{ \sum_{i, j \in X} H_{ij} p_j \left( \ln(\frac{p_i}{q_i}) - \frac{p_i q_j}{p_j q_i} \right)} \\ \\ && + \;\;\; \displaystyle{ \sum_{i \in B} \frac{\partial I}{\partial p_i} \frac{Dp_i}{Dt} + \frac{\partial I}{\partial q_i} \frac{Dq_i}{Dt} } \end{array} $$

This result separates the change in relative entropy into two parts: an 'internal' part and a 'boundary' part.

It turns out the 'internal' part is always less than or equal to zero. So, from Theorem 1 we can deduce a version of the Second Law of Thermodynamics for open Markov processes:

Theorem 2. Given the conditions of Theorem 1, we have

$$ \displaystyle{ \frac{d}{dt} I(p(t),q(t)) \; \le \; \sum_{i \in B} \frac{\partial I}{\partial p_i} \frac{Dp_i}{Dt} + \frac{\partial I}{\partial q_i} \frac{Dq_i}{Dt} } $$

Intuitively, this says that free energy can only increase if it comes in from the boundary!
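A quick sanity check of the claim that the 'internal' part is never positive: the sketch below evaluates that double sum for many randomly chosen positive $p$ and $q.$ The matrix `H`, the sampling range, the seed and the name `internal_term` are all arbitrary choices for illustration, and random sampling is of course evidence, not proof.

```python
import math
import random


def internal_term(H, p, q):
    """sum_{i,j} H_ij p_j ( ln(p_i/q_i) - p_i q_j / (p_j q_i) )"""
    n = len(p)
    return sum(
        H[i][j] * p[j] * (math.log(p[i] / q[i]) - (p[i] * q[j]) / (p[j] * q[i]))
        for i in range(n)
        for j in range(n)
    )


# Hypothetical 3-state infinitesimal stochastic matrix.
H = [
    [-1.0,  2.0,  0.0],
    [ 1.0, -3.0,  0.5],
    [ 0.0,  1.0, -0.5],
]

rng = random.Random(0)  # fixed seed so the run is reproducible
samples = []
for _ in range(1000):
    p = [rng.uniform(0.1, 2.0) for _ in range(3)]
    q = [rng.uniform(0.1, 2.0) for _ in range(3)]
    samples.append(internal_term(H, p, q))

largest = max(samples)
print(largest)  # should not exceed zero, up to rounding
```

When $p = q$ the internal term vanishes, which matches the intuition that there is no relative entropy left to lose.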

There is another nice result that holds when $q$ is an equilibrium solution of the master equation. This idea seems to go back to Schnakenberg:

Theorem 3. Given the conditions of Theorem 1, suppose also that $q$ is an equilibrium solution of the master equation. Then we have

$$ \displaystyle{ \frac{d}{dt} I(p(t),q) = \frac{1}{2} \sum_{i,j \in X} J_{ij} A_{ij} \; + \; \sum_{i \in B} \frac{\partial I}{\partial p_i} \frac{Dp_i}{Dt} } $$

where

$$ J_{ij} = H_{ij}p_j - H_{ji}p_i $$

is the flux from $j$ to $i,$ while

$$ \displaystyle{ A_{ij} = \ln \left( \frac{p_i q_j}{p_j q_i} \right) } $$

is the conjugate thermodynamic force.

The flux $J_{ij}$ has a nice meaning: it's the net flow of population from $j$ to $i.$ The thermodynamic force is a bit subtler, but this theorem reveals its meaning: it's how much free energy increase is caused by a flow from $j$ to $i.$ We should probably include a minus sign, since free energy wants to decrease.
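To get a feel for fluxes and forces, here's a sketch that computes $J_{ij}$ and $A_{ij}$ for one made-up example and compares $\frac{1}{2} \sum_{i,j} J_{ij} A_{ij}$ with the 'internal' sum from Theorem 1. The matrix `H` and the distributions `p` and `q` are assumptions chosen so that `q` is an equilibrium of `H`.

```python
import math

# Hypothetical 3-state process with equilibrium q (one can check H q = 0).
H = [
    [-1.0,  2.0,  0.0],
    [ 1.0, -3.0,  0.5],
    [ 0.0,  1.0, -0.5],
]
q = [0.4, 0.2, 0.4]
p = [0.5, 0.3, 0.2]
n = len(p)

# Flux from j to i: J_ij = H_ij p_j - H_ji p_i
J = [[H[i][j] * p[j] - H[j][i] * p[i] for j in range(n)] for i in range(n)]

# Thermodynamic force: A_ij = ln( p_i q_j / (p_j q_i) )
A = [
    [math.log((p[i] * q[j]) / (p[j] * q[i])) for j in range(n)]
    for i in range(n)
]

half_JA = 0.5 * sum(J[i][j] * A[i][j] for i in range(n) for j in range(n))

# 'Internal' part of Theorem 1, for comparison.
internal = sum(
    H[i][j] * p[j] * (math.log(p[i] / q[i]) - (p[i] * q[j]) / (p[j] * q[i]))
    for i in range(n)
    for j in range(n)
)

print(half_JA, internal)
```

The two numbers should agree up to rounding, and both should be negative: in this example every nonzero flux runs against its conjugate force.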

Proofs

Proof of Theorem 1. We begin by taking the time derivative of the relative entropy:

$$ \begin{array}{ccl} \displaystyle{ \frac{d}{dt} I(p(t),q(t)) } &=& \displaystyle{ \sum_{i \in X} \frac{\partial I}{\partial p_i} \frac{dp_i}{dt} + \frac{\partial I}{\partial q_i} \frac{dq_i}{dt} } \end{array} $$

We can separate this into a sum over states $i \in X - B,$ for which the time derivatives of $p_i$ and $q_i$ are given by the master equation, and boundary states $i \in B,$ for which they are not:

$$ \begin{array}{ccl} \displaystyle{ \frac{d}{dt} I(p(t),q(t)) } &=& \displaystyle{ \sum_{i \in X-B, \; j \in X} \frac{\partial I}{\partial p_i} H_{ij} p_j + \frac{\partial I}{\partial q_i} H_{ij} q_j }\\ \\ && + \; \; \; \displaystyle{ \sum_{i \in B} \frac{\partial I}{\partial p_i} \frac{dp_i}{dt} + \frac{\partial I}{\partial q_i} \frac{dq_i}{dt}} \end{array} $$

For boundary states we have

$$\displaystyle{ \frac{dp_i}{dt} = \frac{Dp_i}{Dt} + \sum_{j \in X} H_{ij}p_j } $$

and similarly for the time derivative of $q_i.$ We thus obtain

$$ \begin{array}{ccl} \displaystyle{ \frac{d}{dt} I(p(t),q(t)) } &=& \displaystyle{ \sum_{i,j \in X} \frac{\partial I}{\partial p_i} H_{ij} p_j + \frac{\partial I}{\partial q_i} H_{ij} q_j }\\ \\ && + \; \; \displaystyle{ \sum_{i \in B} \frac{\partial I}{\partial p_i} \frac{Dp_i}{Dt} + \frac{\partial I}{\partial q_i} \frac{Dq_i}{Dt}} \end{array} $$

To evaluate the first sum, recall that

$$\displaystyle{ I(p,q) = \sum_{i \in X} p_i \ln (\frac{p_i}{q_i})} $$

$$ \displaystyle{\frac{\partial I}{\partial p_i}} =\displaystyle{1 + \ln (\frac{p_i}{q_i})} , \qquad \displaystyle{ \frac{\partial I}{\partial q_i}}= \displaystyle{- \frac{p_i}{q_i} } $$

Thus, we have

$$ \displaystyle{ \sum_{i,j \in X} \frac{\partial I}{\partial p_i} H_{ij} p_j + \frac{\partial I}{\partial q_i} H_{ij} q_j = \sum_{i,j\in X} (1 + \ln (\frac{p_i}{q_i})) H_{ij} p_j - \frac{p_i}{q_i} H_{ij} q_j } $$

We can rewrite this as

$$ \displaystyle{ \sum_{i,j \in X} H_{ij} p_j \left( 1 + \ln(\frac{p_i}{q_i}) - \frac{p_i q_j}{p_j q_i} \right) } $$

Since $H$ is infinitesimal stochastic we have $\sum_{i} H_{ij} = 0,$ so the first term, $\sum_{i,j} H_{ij} p_j,$ drops out, and we are left with

$$ \displaystyle{ \sum_{i,j \in X} H_{ij} p_j \left( \ln(\frac{p_i}{q_i}) - \frac{p_i q_j}{p_j q_i} \right) } $$

as desired. █
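Since Theorem 1 is an algebraic identity, it can be spot-checked at a single instant without solving any differential equation. In the sketch below we pick populations `p` and `q,` let the interior time derivatives be given by the master equation, and choose the boundary time derivatives freely. The specific numbers, the choice of state 0 as the only boundary state, and all variable names are assumptions for illustration.

```python
import math

# Hypothetical 3-state process with boundary B = {0}.
H = [
    [-1.0,  2.0,  0.0],
    [ 1.0, -3.0,  0.5],
    [ 0.0,  1.0, -0.5],
]
boundary = {0}
p = [0.5, 0.3, 0.2]
q = [0.4, 0.3, 0.6]
n = len(p)

# What the master equation alone would give: (Hp)_i and (Hq)_i.
Hp = [sum(H[i][j] * p[j] for j in range(n)) for i in range(n)]
Hq = [sum(H[i][j] * q[j] for j in range(n)) for i in range(n)]

# Actual time derivatives: arbitrary at the boundary, master equation inside.
dp = [0.7 if i in boundary else Hp[i] for i in range(n)]
dq = [-0.2 if i in boundary else Hq[i] for i in range(n)]

# Partial derivatives of I(p, q).
dI_dp = [1 + math.log(p[i] / q[i]) for i in range(n)]
dI_dq = [-p[i] / q[i] for i in range(n)]

# Left side: chain rule for dI/dt.
lhs = sum(dI_dp[i] * dp[i] + dI_dq[i] * dq[i] for i in range(n))

# Right side: internal part plus boundary part, with Dp_i/Dt = dp_i/dt - (Hp)_i.
internal = sum(
    H[i][j] * p[j] * (math.log(p[i] / q[i]) - (p[i] * q[j]) / (p[j] * q[i]))
    for i in range(n)
    for j in range(n)
)
boundary_part = sum(
    dI_dp[i] * (dp[i] - Hp[i]) + dI_dq[i] * (dq[i] - Hq[i]) for i in boundary
)
rhs = internal + boundary_part

print(lhs, rhs)
```

The two sides should agree up to rounding, whatever values are chosen for the boundary derivatives.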

Proof of Theorem 2. Thanks to Theorem 1, to prove

$$ \displaystyle{ \frac{d}{dt} I(p(t),q(t)) \; \le \; \sum_{i \in B} \frac{\partial I}{\partial p_i} \frac{Dp_i}{Dt} + \frac{\partial I}{\partial q_i} \frac{Dq_i}{Dt} } $$

it suffices to show that

$$ \displaystyle{ \sum_{i,j \in X} H_{ij} p_j \left( \ln(\frac{p_i}{q_i}) - \frac{p_i q_j}{p_j q_i} \right) \le 0 } $$

or equivalently (recalling the proof of Theorem 1):

$$ \displaystyle{ \sum_{i,j} H_{ij} p_j \left( \ln(\frac{p_i}{q_i}) + 1 - \frac{p_i q_j}{p_j q_i} \right) \le 0 } $$

The last two terms on the left hand side cancel when $i = j.$ Thus, if we break the sum into an $i \ne j$ part and an $i = j$ part, the left side becomes

$$\displaystyle{ \sum_{i \ne j} H_{ij} p_j \left( \ln(\frac{p_i}{q_i}) + 1 - \frac{p_i q_j}{p_j q_i} \right) \; + \; \sum_j H_{jj} p_j \ln(\frac{p_j}{q_j}) } $$

Next we can use the infinitesimal stochastic property of $H$ to write $H_{jj}$ as the sum of $-H_{ij}$ over $i$ not equal to $j,$ obtaining

$$ \displaystyle{ \sum_{i \ne j} H_{ij} p_j \left( \ln(\frac{p_i}{q_i}) + 1 - \frac{p_i q_j}{p_j q_i} \right) - \sum_{i \ne j} H_{ij} p_j \ln(\frac{p_j}{q_j}) } = \displaystyle{ \sum_{i \ne j} H_{ij} p_j \left( \ln(\frac{p_iq_j}{p_j q_i}) + 1 - \frac{p_i q_j}{p_j q_i} \right) } $$

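Since $H_{ij} \ge 0$ when $i \ne j$ and $\ln(s) + 1 - s \le 0$ for all $s > 0$ (the function $f(s) = \ln(s) + 1 - s$ has $f(1) = 0$ and $f'(s) = 1/s - 1,$ so it attains its maximum at $s = 1$), we conclude that this quantity is $\le 0.$ █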

Proof of Theorem 3. Now suppose also that $q$ is an equilibrium solution of the master equation. Then $Dq_i/Dt = dq_i/dt = 0$ for all states $i,$ so by Theorem 1 we need to show

$$ \displaystyle{ \sum_{i, j \in X} H_{ij} p_j \left( \ln(\frac{p_i}{q_i}) - \frac{p_i q_j}{p_j q_i} \right) \; = \; \frac{1}{2} \sum_{i,j \in X} J_{ij} A_{ij} }$$

We also have $\sum_{j \in X} H_{ij} q_j = 0,$ so the second term in the sum at left vanishes, since it equals $\sum_{i \in X} \frac{p_i}{q_i} \sum_{j \in X} H_{ij} q_j.$ Thus it suffices to show

$$ \displaystyle{ \sum_{i, j \in X} H_{ij} p_j \ln(\frac{p_i}{q_i}) \; = \; \frac{1}{2} \sum_{i,j \in X} J_{ij} A_{ij} }$$

By definition we have

$$ \displaystyle{ \frac{1}{2} \sum_{i,j} J_{ij} A_{ij}} = \displaystyle{ \frac{1}{2} \sum_{i,j} \left( H_{ij} p_j - H_{ji}p_i \right) \ln \left( \frac{q_j p_i}{q_i p_j} \right) } $$

This in turn equals

$$ \displaystyle{ \frac{1}{2} \sum_{i,j} H_{ij}p_j \ln \left( \frac{q_j p_i}{q_i p_j} \right) - \frac{1}{2} \sum_{i,j} H_{ji}p_i \ln \left( \frac{q_j p_i}{q_i p_j} \right) } $$

and we can switch the dummy indices $i,j$ in the second sum, obtaining

$$ \displaystyle{ \frac{1}{2} \sum_{i,j} H_{ij}p_j \ln \left( \frac{q_j p_i}{q_i p_j} \right) - \frac{1}{2} \sum_{i,j} H_{ij}p_j \ln \left( \frac{q_i p_j}{q_j p_i} \right) } $$

or simply

$$ \sum_{i,j} H_{ij} p_j \ln \left( \frac{q_j p_i}{q_i p_j} \right) $$

But this is

$$ \sum_{i,j} H_{ij} p_j \left(\ln ( \frac{q_j}{p_j}) + \ln (\frac{p_i}{q_i}) \right) $$

and the first term vanishes because $H$ is infinitesimal stochastic: $\sum_i H_{ij} = 0.$ We thus have

$$ \displaystyle{ \frac{1}{2} \sum_{i,j} J_{ij} A_{ij}} = \sum_{i,j} H_{ij} p_j \ln (\frac{p_i}{q_i} ) $$

as desired. █

You can read a discussion of this article on Azimuth, and make your own comments or ask questions there!