Summary: Small random perturbations may have a dramatic impact
on the long time evolution of dynamical systems, and large deviation
theory is often the right theoretical framework to understand these
effects. At the core of the theory lies the minimization of an
action functional, which in many cases of interest has to be
computed by numerical means. Here, a numerical method is presented
to effectively compute minimizers of the Freidlin-Wentzell action
functional in very general settings. The matlab source code of the
algorithm and some test problems is available for download
here.
Introduction
Small random perturbations often have a lasting effect on the
long-time evolution of dynamical systems. For example, they give rise
to transitions between otherwise stable equilibria, a phenomenon
referred to as metastability that is observed in a wide variety of
contexts, e.g. phase separation, population dynamics, chemical
reactions, climate regimes, neuroscience, or fluid dynamics. Since the
time-scale over which these transition events occurs is typically
exponentially large in some control parameter (for example the noise
amplitude), a brute-force simulation approach to compute these events
quickly becomes infeasible. Fortunately, it is possible to exploit the
fact that the mechanism of these transitions is often predictable when
the random perturbations have small amplitude: with high probability
the transitions occur by their path of maximum likelihood (PML), and
knowledge of this PML also permits to estimate their rate. This is the
essence of large deviation theory (LDT), which applies in a wide
variety of contexts. For example, systems whose evolution is governed
by a stochastic (ordinary or partial) differential equation driven by
a small noise or by a Markov jump process in which jumps occur often
but lead to small changes of the system state, or slow/fast systems in
which the fast variables are randomly driven and the slow ones feel
these perturbations through the effect fast variables only, all fit
within the framework of LDT. Note that, typically, the dynamics of
these systems fail to exhibit microscopic reversibility (detailed
balance) and the transitions therefore occur
out-of-equilibrium. Nevertheless, LDT still applies.
LDT also indicates that the PML is computable as the minimizer of a
specific objective function (action): the large deviation rate
function of the problem at hand. This is a non-trivial numerical
optimization problem which calls for tailor-made techniques for its
solution. Here we will focus on one such technique, the geometric
minimum action method, which was designed to perform the action
minimization over both the transition path location and its
duration. This computation gives the so-called quasipotential, whose
role is key to understand the long time effect of the random
perturbations on the system, including the mechanism of transitions
events induced by these perturbations.
Simplified geometric minimum action method
Consider $$dX = b(X) dt + \sqrt{\epsilon} \sigma(X) dW\,,$$ where $b:
\mathbb{R}^n \to\mathbb{R}^n$ denotes the drift term, $W$ is a
standard Wiener process in $\mathbb{R}^n$, $\sigma: \mathbb{R}^n \to
\mathbb{R}^n \times \mathbb{R}^n$ is related to the diffusion tensor
via $a(x) = (\sigma \sigma^\dagger)(x)$, and $\epsilon>0$ is a
parameter measuring the noise amplitude. Suppose that we want to
estimate the probability of an event, such as finding the solution in
a set $B\subset \mathbb{R}^n$ at time $T$ given that it started at
$X(0)=x$ at time $t=0$.
Minimization problem
In the limit as $\epsilon\to0$, this probability can be estimated
via a minimization problem
$$\mathcal{P}^x \left(X(T)\in B\right) \asymp \exp\left(-\epsilon^{-1} \min _{\phi\in
\mathcal {C}} S_T(\phi) \right)$$
for the action functional (or rate function)
$$ S_T(\phi) =
\tfrac12\int_0^T \langle\dot \phi - b(\phi),
\left(a(\phi)\right)^{-1}(\dot \phi - b(\phi))\rangle\,dt\,.$$
Here $\asymp$ denotes log-asymptotic equivalence (i.e. the ratio of
the logarithms of both sides tends to 1 as $\epsilon\to0$), the
minimum is taken over the set $\mathcal{C} = { \phi\in
C([0,T],\mathbb{R}^n): \phi(0)=x,\phi(T)\in B}$, and we assumed for
simplicity that $a(\phi)$ is invertible and
$\langle\cdot,\cdot\rangle$ denotes the Euclidean inner product in
$\mathbb{R}^n$. LDT also indicates that, as $\epsilon \to0$, when the
event occurs, it does so with $X$ being arbitrarily close to the
minimizer $\phi_* = \mathop{\text{argmin }} _{\phi\in \mathcal {C}}
S_T(\phi)$. We can define a Hamiltonian associated with the action
functional above via $H(\phi,\theta) =
\langle{b(\phi)},{\theta}\rangle+\tfrac12\langle{\theta},{a(\phi)
\theta}\rangle$. Our algorithm will turn out to be valid for more than
just Hamiltonians of this form.
As detailed in Proposition 2.1 of [Heymann, Vanden-Eijnden, CPAM
61 (8), 2008], it is possible to redefine the action functional to
a geometric form, $$\hat S(\phi) = \sup_{\theta: H(\phi,\theta=0)}
\int_0^1 \langle \phi', \theta\rangle\,ds$$ and we abbreviate $E(\varphi,\vartheta) = \int_0^1 \langle{\varphi'},{\vartheta}\rangle\,ds$. Let
$$ E_*(\varphi) = \sup_{\vartheta:H(\varphi,\vartheta)=0} E(\varphi,\vartheta)
$$
and $\vartheta_(\varphi)$ such that $E_(\varphi) = E(\varphi,
\vartheta_(\varphi))$. This implies that $\vartheta_$ fulfills the
Euler-Lagrange equation associated with
the constrained optimization problem, that is,
$$ D_\vartheta E(\varphi, \vartheta_) = \mu H_\vartheta(\varphi,\vartheta_)\,,$$
where on the right-hand side $\mu(s)$ is the Lagrange
multiplier added to enforce the constraint
$H(\varphi,\vartheta_)=0$. In particular, at $\vartheta=\vartheta_$,
we have
$$ \mu = \frac{|D_\vartheta E|^2}{\langle!\langle{D_\vartheta E},{H_\vartheta}\rangle!\rangle} = \frac{|\varphi'|^2}{\langle!\langle{\varphi'},{H_\vartheta}\rangle!\rangle}\,,
$$
where the inner product $\langle!\langle{\cdot},{\cdot}\rangle!\rangle$ and its induced norm
$|\cdot|$ can be chosen appropriately, for example as
$\langle{\cdot},{\cdot}\rangle$ or
$\langle{\cdot},{H_{\vartheta\vartheta}^{-1} \,\cdot}\rangle$.
At the minimizer $\varphi_$, the variation of $E_$ with respect to $\varphi$
vanishes. We conclude
$$\begin{aligned}
0 = D_\varphi E_(\varphi_) &= D_\varphi E(\varphi_,\vartheta_)
+ \left[D_\vartheta E D_\varphi \vartheta
\right]{(\varphi,\vartheta)=(\varphi,\vartheta_)}\nonumber\
&= -\vartheta_' + \mu \left[H_\vartheta
D_\varphi \vartheta\right]{(\varphi,\vartheta)=(\varphi,\vartheta_)}
\nonumber\
&= -\vartheta_' - \mu
H_\varphi(\varphi_,\vartheta_)\,,
\label{eq:optim-psi}
\end{aligned}$$
where in the last step we used $H(\varphi, \vartheta_)=0$ and
therefore $$H_\varphi(\varphi, \vartheta_) = -
H_\vartheta(\varphi,\vartheta_*) D_\varphi \vartheta.$$ Multiplying
the gradient with any positive definite matrix as
pre-conditioner yields a descent direction. It
is necessary to choose $\mu^{-1}$ as pre-conditioner to ensure
convergence around critical points, where $\varphi'=0$.
Algorithm
Summarizing, we have reduced the minimization of the geometric action
into two separate tasks:
-
For a given $\varphi$, find $\vartheta_(\varphi)$ by solving the
constrained optimization problem
$$\max_{\vartheta, H(\varphi,\vartheta)=0} E(\varphi,\vartheta)\,,$$
which is equivalent to solving
$$ D_\vartheta E(\varphi,\vartheta_) = \varphi' = \mu H_\vartheta(\varphi,\vartheta_)$$
for $(\mu,\vartheta_)$ under the constraint $H(\varphi,\vartheta_*)=0$. This
can be done via gradient descent, a second order algorithm for faster convergence
(e.g. Newton-Raphson) or, in many cases, analytically.
-
Find $\varphi_$ by solving the optimization problem
$$\min_{\varphi\in\hat{\mathcal{C}}{x,y}} E(\varphi)\,,$$
for example by pre-conditioned gradient descent, using as direction
$$-\mu^{-1} D_\varphi E_ = \mu^{-1}\vartheta_'(\varphi) + H_\varphi(\varphi,\vartheta_*(\varphi))\,,$$
with $\mu^{-1}$ as pre-conditioner. The constraint on the
parametrization, e.g. ${|\varphi'|=\textrm{const}}$, must be fulfilled
during this descent (see below).
Applications
Maier-Stein model: Breaking detailed balance
Maier and Stein's model [J. Stat. Phys 83, 3–4 (1996)] is a
simple system often used as benchmark in LDT calculations. It reads
$$ \begin{cases}
du = (u-u^3-\beta u v^2) dt + \sqrt{\epsilon}dW_u\
dv = -(1+u^2)v dt + \sqrt{\epsilon}dW_v\,,
\end{cases}$$
where $\beta$ is a parameter. For all values of $\beta$, the
deterministic system has the two stable fixed points,
$\varphi_-=(-1,0)$ and $\varphi_+=(1,0)$, and a unique unstable
critical point $\varphi_s=(0,0)$. However it satisfies detailed
balance only for $\beta=1$. To the right, the transition
paths between the two stable fixed points is depicted in
comparison to the heteroclinic orbit. The source code for this problem
is available below.
Eggers model: Meta-stable climate regimes
Egger [J. Atm. Sc. 38, 12 (1981)] introduces the following SDE system as a
crude model to describe weather regimes in central Europe:
$$\small \begin{aligned}
da =& kb(U-\beta/k^2)\,dt - \gamma a\,dt + \sqrt{\epsilon} dW_a\,,\
db =& -ka(U-\beta/k^2)\,dt + UH/k\,dt \&- \gamma b\,dt
+ \sqrt{\epsilon} dW_b\,,\
dU =& -bHk/2\,dt-\gamma(U-U_0)\,dt + \sqrt{\epsilon} dW_U\,.
\end{aligned}$$
When $\epsilon$ is small, these equation exhibit metastability between
a "blocked state" and a "zonal state". To the right, the transition
paths between these two states is depicted in comparison to the
heteroclinic orbit. The source code for this problem is available
below.
Allen-Cahn/Cahn-Hilliard model: Phase separation and growth
Consider the SDE system $$ d\phi= (\frac1\alpha Q (\phi-\phi^3) -
\phi)dt + \sqrt{\epsilon} dW$$ with $\phi=(\phi_1, \phi_2)$ and the
matrix $Q=((1,-1),(-1,1))$. This system does not satisfy detailed
balance, as its drift is made of two gradient terms with incompatible
mobility operators (namely $Q$ and $\textrm{Id}$). This model can be
seen as a 2-dimensional reduction to a discretized version of the
continuous Allen-Cahn/Cahn-Hilliard. When $\alpha$ is small, there
exists a "slow manifold", comprised of all points where
$Q(\phi-\phi^3)=0$ which is shown as a white dashed line to the
right. On this manifold, the deterministic dynamics are of order
$O(1)$, which is small in comparison to the dynamics of the $Q$-term,
which are of order $O(1/\alpha)$. This suggests that for small enough
$\alpha$ the transition trajectory will follow this slow manifold on
which the drift is small, rather than the heteroclinic orbit, to
escape the basin of the stable fixed points. This intuition is
confirmed in the plot to the right. The source code for this problem
is available below.
Genetic switch: A non-quadratic Hamiltonian
The algorithm can readily be applied to situations, where the
Hamiltonian is non-quadratic. This is the case for example for
birth-death-processes with a positive feedback loop, modeled as
continuous time Markov process. The model $$
\begin{aligned}H(\phi,\theta) &=
\frac{a_1}{1+\phi_b^3}(e^{\theta_a}-1) + \phi_a(e^{-\theta_a}-1)\&+
\frac{a_2}{1+\phi_a}(e^{\theta_b}-1) +
\phi_b(e^{-\theta_b}-1)\end{aligned}$$ with $a_1=156, a_2=30$ was
first defined by Roma et al. [Phys. Rev. E 71 (2005)] to model a genetic
switching mechanism in molecular dynamics. The deterministic dynamics
for this model follow along with minimizer and heteroclinic orbit for
this setup are depicted to the right (zoomed into the "uphill"
region).
Source code
Matlab source code with the generic algorithm and most of the examples
above is available here
Relevant publications
-
T. Grafke, and E. Vanden-Eijnden,
"Numerical computation of rare events via large deviation theory",
Chaos 29 (2019), 063118 (link)
-
T. Grafke, T. Schäfer, and E. Vanden-Eijnden, "Long Term Effects
of Small Random Perturbations on Dynamical Systems: Theoretical and
Computational
Tools",
Fields Institute Communications, In: Recent Progress and Modern
Challenges in Applied Mathematics, Modeling and Computational
Science (Springer, New York, NY) (2017)
(link)
-
T. Grafke, and E. Vanden-Eijnden, "Non-equilibrium transitions in
multiscale systems with a bifurcating slow
manifold"
J. Stat. Mech 2017/9 (2017) 093208
(link)
-
T. Grafke, M. Cates, and E. Vanden-Eijnden, "Spatiotemporal
Self-Organization of Fluctuating Bacterial
Colonies",
Phys. Rev. Lett. 119 (2017), 188003
(link)