44. Modeling COVID 19 with Differential Equations#
Contents
44.1. Overview#
Coauthored with Chris Rackauckas
This is a Julia version of code for analyzing the COVID-19 pandemic.
The purpose of these notes is to introduce economists to quantitative modeling of infectious disease dynamics, and to modeling with ordinary differential equations.
In this lecture, dynamics are modeled using a standard SEIR (Susceptible-Exposed-Infected-Removed) model of disease spread, represented as a system of ordinary differential equations where the number of agents is large and there are no exogenous stochastic shocks.
The first part of the model is inspired by
Notes from Andrew Atkeson and NBER Working Paper No. 26867
Estimating and Forecasting Disease Scenarios for COVID-19 with an SIR Model by Andrew Atkeson, Karen Kopecky and Tao Zha
Estimating and Simulating a SIRD Model of COVID-19 for Many Countries, States, and Cities by Jesús Fernández-Villaverde and Charles I. Jones
Further variations on the classic SIR model in Julia here.
We then extend this deterministic model in this lecture which build on this model, adding in aggregate shocks and policy tradeoffs.
The interest is primarily in
studying the impact of suppression through social distancing on the spread of the infection
the number of infections at a given time (which determines whether or not the health care system is overwhelmed); and
how long the caseload can be deferred (hopefully until a vaccine arrives)
In addition, we will be exploring the Ordinary Differential Equations package within the SciML ecosystem.
using LaTeXStrings, LinearAlgebra, Random, SparseArrays, Statistics
using OrdinaryDiffEq, Plots
44.2. The SEIR Model#
In the version of the SEIR model, all individuals in the population are assumed to be in a finite number of states.
The states are: susceptible (S), exposed (E), infected (I) and removed (R).
This type of compartmentalized model has many extensions (e.g. SEIRS relaxes lifetime immunity and allow transitions from
Comments:
Those in state R have been infected and either recovered or died. Note that in some variations, R may refer only to recovered agents.
Those who have recovered, and live, are assumed to have acquired immunity.
Those in the exposed group are not yet infectious.
44.2.1. Changes in the Infected State#
Within the SEIR model, the flow across states follows the path
We will ignore birth and non-covid death during our time horizon, and assume a large, constant, number of individuals of size
With this, the symbols
Since we have assumed that
The transitions between those states are governed by the following rates
is called the transmission rate or effective contact rate (the rate at which individuals bump into others and expose them to the virus). is called the infection rate (the rate at which those who are exposed become infected) is called the recovery rate (the rate at which infected people recover or die)
The rate
The SEIR model can then be written as
Here,
The first term of (44.1),
Individuals in the susceptible state (S) have a rate
of prolonged contacts with other individuals where transmission would occur if either was infectedOf these contacts, a fraction
will be with infected agents (since we assumed that exposed individuals are not yet infectious)Finally, there are
susceptible individuals.The sign indicates that the product of those terms is the outflow from the
state, and an inflow to the state.
44.2.2. Basic Reproduction Number#
If
When the transmission rate is time-varying, we will follow notation in [FVJ20] and refer to
Analyzing the system in (44.1) provides some intuition on the
Individual transitions from the infected to removed state occur at a Poisson rate
, the expected time in the infected state isProlonged interactions occur at rate
, so a new individual entering the infected state will potentially transmit the virus to an average of othersIn more complicated models, see [HSW05] for a formal definition for arbitrary models, and an analysis on the role of
.
Note that the notation
Prior to solving the model directly, we make a few changes to (44.1)
Re-parameterize using
Define the proportion of individuals in each state as
etc.Divide each equation in (44.1) by
, and write the system of ODEs in terms of the proportions
Since the states form a partition, we could reconstruct the “removed” fraction of the population as
44.2.3. Implementation#
We begin by implementing a simple version of this model with a constant
First, define the system of equations
function F_simple(x, p, t; gamma = 1 / 18, R_0 = 3.0, sigma = 1 / 5.2)
s, e, i, r = x
return [-gamma * R_0 * s * i; # ds/dt
gamma * R_0 * s * i - sigma * e; # de/dt
sigma * e - gamma * i; # di/dt
gamma * i]
end
F_simple (generic function with 1 method)
Given this system, we choose an initial condition and a timespan, and create a ODEProblem encapsulating the system.
# 33 = 1E-7*330 million population = initially infected
# 132 = 1E-7*330 million = initially exposed
i_0 = 1E-7
e_0 = 4.0 * i_0
s_0 = 1.0 - i_0 - e_0
r_0 = 0.0
x_0 = [s_0, e_0, i_0, r_0] # initial condition
tspan = (0.0, 350.0) # ≈ 350 days
prob = ODEProblem(F_simple, x_0, tspan)
ODEProblem with uType Vector{Float64} and tType Float64. In-place: false
Non-trivial mass matrix: false
timespan: (0.0, 350.0)
u0: 4-element Vector{Float64}:
0.9999995
4.0e-7
1.0e-7
0.0
With this, choose an ODE algorithm and solve the initial value problem. A good default algorithm for non-stiff ODEs of this sort might be Tsit5(), which is the Tsitouras 5/4 Runge-Kutta method.
sol = solve(prob, Tsit5())
plot(sol; labels = [L"s" L"e" L"i" L"r"], title = "SEIR Dynamics", lw = 2,
xlabel = L"t")
We did not provide either a set of time steps or a dt time step size to the solve. Most accurate and high-performance ODE solvers appropriate for this problem use adaptive time-stepping, changing the step size based the degree of curvature in the derivatives.
Or, as an alternative visualization, the proportions in each state over time
areaplot(sol.t, sol', labels = [L"s" L"e" L"i" L"r"],
title = "SEIR Proportions", xlabel = L"t")
While maintaining the core system of ODEs in
44.2.4. Extending the Model#
First, we can consider some additional calculations such as the cumulative caseload (i.e., all those who have or have had the infection) as
We will assume that the transmission rate follows a process with a reversion to a value
Finally, let
Define the cumulative number of deaths as
While we could integrate the deaths given the solution to the model ex-post, it is more convenient to use the integrator built into the ODE solver. That is, we add
This is a common trick when solving systems of ODEs. While equivalent in principle to using the appropriate quadrature scheme, this becomes especially convenient when adaptive time-stepping algorithms are used to solve the ODEs (i.e. there is not a regular time grid). Note that when doing so,
The system (44.2) and the supplemental equations can be written in vector form
Note that in those parameters, the targeted reproduction number,
The model is then
Note that if
44.2.5. Parameters#
The parameters,
As in Atkeson’s note, we set
to reflect an average incubation period of 5.2 days. to match an average illness duration of 18 days. to match a basic reproduction number of 1.6, and initially time-invariant for a one-percent mortality rate
As we will initially consider the case where
44.3. Implementation#
First, construct our
function F(x, p, t)
s, e, i, r, R_0, c, d = x
(; sigma, gamma, R_bar_0, eta, delta) = p
return [-gamma * R_0 * s * i; # ds/dt
gamma * R_0 * s * i - sigma * e; # de/dt
sigma * e - gamma * i; # di/dt
gamma * i; # dr/dt
eta * (R_bar_0(t, p) - R_0); # dR_0/dt
sigma * e; # dc/dt
delta * gamma * i]
end;
This function takes the vector x of states in the system and extracts the fixed parameters passed into the p object.
The only confusing part of the notation is the R_bar_0(t, p) which evaluates the p.R_bar_0 at this time (and also allows it to depend on the p parameter).
44.3.1. Parameters#
#####HERE ENDED The baseline parameters are put into a named tuple generator (see previous lectures using Parameters.jl) with default values discussed above.
function p_gen(; T = 550.0, gamma = 1.0 / 18, sigma = 1 / 5.2, eta = 1.0 / 20,
R_0_n = 1.6, delta = 0.01, N = 3.3E8,
R_bar_0 = (t, p) -> p.R_0_n)
return (; T, gamma, sigma, eta, R_0_n, delta, N, R_bar_0)
end
p_gen (generic function with 1 method)
Note that the default R_bar_0 function
Setting initial conditions, we choose a fixed
p = p_gen() # use all default parameters
i_0 = 1E-7
e_0 = 4.0 * i_0
s_0 = 1.0 - i_0 - e_0
x_0 = [s_0, e_0, i_0, 0.0, p.R_0_n, 0.0, 0.0]
tspan = (0.0, p.T)
prob = ODEProblem(F, x_0, tspan, p)
ODEProblem with uType Vector{Float64} and tType Float64. In-place: false
Non-trivial mass matrix: false
timespan: (0.0, 550.0)
u0: 7-element Vector{Float64}:
0.9999995
4.0e-7
1.0e-7
0.0
1.6
0.0
0.0
The tspan of (0.0, p.T) determines the
The time period we investigate will be 550 days, or around 18 months:
44.4. Experiments#
Let’s run some experiments using this code.
sol = solve(prob, Tsit5())
@show length(sol.t);
length(sol.t) = 45
We see that the adaptive time-stepping used approximately 45 time-steps to solve this problem to the desired accuracy. Evaluating the solver at points outside of those time-steps uses an interpolator consistent with the solution to the ODE.
While it may seem that 45 time intervals is extremely small for that range, for much of the
The solution object has built in plotting support.
plot(sol; idxs = [6, 7], label = [L"c(t)" L"d(t)"], lw = 2,
title = ["Cumulative Infected" "Death Proportion"],
xlabel = L"t", layout = (1, 2), size = (600, 300))
A few more comments:
If you want to ensure that there are specific points that the adaptive-time stepping must include (e.g. at known discontinuities) use tstops.
The built-in plots for the solutions provide all of the attributes in Plots.jl.
See here for details on analyzing the solution and extracting the output.
44.4.1. Experiment 1: Constant Reproduction Case#
Let’s start with the case where
We calculate the time path of infected people under different assumptions of
R_0_n_vals = range(1.6, 3.0, length = 6)
sols = [solve(ODEProblem(F, x_0, tspan, p_gen(R_0_n = R_0_n)),
Tsit5(); saveat = 0.5) for R_0_n in R_0_n_vals];
Here we chose saveat=0.5 to get solutions that were evenly spaced every 0.5.
Changing the saved points is just a question of storage/interpolation, and does not change the adaptive time-stepping of the solvers.
Let’s plot current cases as a fraction of the population.
labels = permutedims([L"R_0 = %$r" for r in R_0_n_vals])
infecteds = [sol[3, :] for sol in sols]
plot(infecteds; label = labels, legend = :topleft, lw = 2, xlabel = L"t",
ylabel = L"i(t)", title = "Current Cases")
As expected, lower effective transmission rates defer the peak of infections.
They also lead to a lower peak in current cases.
Here is cumulative cases, as a fraction of population:
cumulative_infected = [sol[6, :] for sol in sols]
plot(cumulative_infected; label = labels, legend = :topleft, lw = 2,
xlabel = L"t", ylabel = L"c(t)", title = "Cumulative Cases")
44.4.2. Experiment 2: Changing Mitigation#
Let’s look at a scenario where mitigation (e.g., social distancing) is
successively imposed, but the target (maintaining
To do this, we start with
In the simple case, where
We will examine the case where
The parameter eta controls the rate, or the speed at which restrictions are
imposed.
We consider several different rates:
eta_vals = [1 / 5, 1 / 10, 1 / 20, 1 / 50, 1 / 100]
labels = permutedims([L"\eta = %$eta" for eta in eta_vals]);
Let’s calculate the time path of infected people, current cases, and mortality
x_0 = [s_0, e_0, i_0, 0.0, 3.0, 0.0, 0.0]
sols = [solve(ODEProblem(F, x_0, tspan, p_gen(eta = eta)), Tsit5();
saveat = 0.5) for eta in eta_vals];
Next, plot the
Rs = [sol[5, :] for sol in sols]
plot(Rs; label = labels, legend = :topright, lw = 2, xlabel = L"t",
ylabel = L"R_0(t)", title = "Basic Reproduction Number")
Now let’s plot the number of infected persons and the cumulative number of infected persons:
infecteds = [sol[3, :] for sol in sols]
plot(infecteds; label = labels, legend = :topleft, lw = 2, xlabel = L"t",
ylabel = L"i(t)", title = "Current Infected")
cumulative_infected = [sol[6, :] for sol in sols]
plot(cumulative_infected; label = labels, legend = :topleft, lw = 2,
xlabel = L"t", ylabel = L"c(t)", title = "Cumulative Infected")
44.5. Ending Lockdown#
The following is inspired by additional results by Andrew Atkeson on the timing of lifting lockdown.
Consider these two mitigation scenarios:
choose
to target for 30 days and then for the remaining 17 months. This corresponds to lifting lockdown in 30 days. for 120 days and then for the remaining 14 months. This corresponds to lifting lockdown in 4 months.
For both of these, we will choose a large
The parameters considered here start the model with 25,000 active infections and 75,000 agents already exposed to the virus and thus soon to be contagious.
R_0_L = 0.5 # lockdown
R_bar_0_lift_early(t, p) = t < 30.0 ? R_0_L : 2.0
R_bar_0_lift_late(t, p) = t < 120.0 ? R_0_L : 2.0
p_early = p_gen(R_bar_0 = R_bar_0_lift_early, eta = 10.0)
p_late = p_gen(R_bar_0 = R_bar_0_lift_late, eta = 10.0)
# initial conditions
i_0 = 25000 / p_early.N
e_0 = 75000 / p_early.N
s_0 = 1.0 - i_0 - e_0
x_0 = [s_0, e_0, i_0, 0.0, R_0_L, 0.0, 0.0] # start in lockdown
# create two problems, with rapid movement of R_0(t) towards R_bar_0(t)
prob_early = ODEProblem(F, x_0, tspan, p_early)
prob_late = ODEProblem(F, x_0, tspan, p_late)
ODEProblem with uType Vector{Float64} and tType Float64. In-place: false
Non-trivial mass matrix: false
timespan: (0.0, 550.0)
u0: 7-element Vector{Float64}:
0.9996969696969696
0.00022727272727272727
7.575757575757576e-5
0.0
0.5
0.0
0.0
Unlike the previous examples, the tstops
Let’s calculate the paths:
sol_early = solve(prob_early, Tsit5(); tstops = [30.0, 120.0])
sol_late = solve(prob_late, Tsit5(); tstops = [30.0, 120.0])
plot(sol_early; idxs = [7], title = "Total Mortality", label = "Lift Early",
legend = :topleft)
plot!(sol_late; idxs = [7], label = "Lift Late", xlabel = L"t")
Next we examine the daily deaths,
flow_deaths(sol, p) = p.N * p.delta * p.gamma * sol[3, :]
plot(sol_early.t, flow_deaths(sol_early, p_early); title = "Flow Deaths",
label = "Lift Early")
plot!(sol_late.t, flow_deaths(sol_late, p_late); label = "Lift Late",
xlabel = L"t")
Pushing the peak of curve further into the future may reduce cumulative deaths if a vaccine is found, or allow health authorities to better smooth the caseload.
44.5.1. Randomness#
Despite its richness, the model above is fully deterministic. The policy
One way that randomness can lead to aggregate fluctuations is the granularity that comes through the discreteness of individuals. This topic, the connection between SDEs and the Langevin equations typically used in the approximation of chemical reactions in well-mixed media is explored in further lectures on continuous time Markov chains.
Instead, in the next lecture, we will concentrate on randomness that comes from aggregate changes in behavior or policy.