22. AR1 Processes#
22.1. Overview#
In this lecture we are going to study a very simple class of stochastic models called AR(1) processes.
These simple models are used again and again in economic research to represent the dynamics of series such as
labor income
dividends
productivity, etc.
AR(1) processes can take negative values but are easily converted into positive processes when necessary by a transformation such as exponentiation.
We are going to study AR(1) processes partly because they are useful and partly because they help us understand important concepts.
using LinearAlgebra, Statistics
using Distributions, LaTeXStrings, Plots
22.2. The AR(1) Model#
The AR(1) model (autoregressive model of order 1) takes the form

$$
X_{t+1} = a X_t + b + c W_{t+1} \tag{22.1}
$$

where $a, b, c$ are scalar-valued parameters.

This law of motion generates a time series $\{X_t\}$ as soon as we specify an initial condition $X_0$.

This is called the state process and the state space is $\mathbb{R}$.

To make things even simpler, we will assume that

the process $\{W_t\}$ is IID and standard normal,
the initial condition $X_0$ is drawn from the normal distribution $N(\mu_0, v_0)$, and
the initial condition $X_0$ is independent of $\{W_t\}$.
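Before turning to distributions, it may help to see what a typical realization looks like. Here is a minimal simulation sketch; the horizon of 200 periods is an arbitrary illustrative choice, and the parameter values ($a = 0.9$, $b = 0.1$, $c = 0.5$, $\mu_0 = -3$, $v_0 = 0.6$) match those used later in the lecture.

a, b, c = 0.9, 0.1, 0.5
mu_0, v_0 = -3.0, 0.6
T = 200
X = zeros(T + 1)
X[1] = mu_0 + sqrt(v_0) * randn()          # draw X_0 from N(mu_0, v_0)
for t in 1:T
    X[t + 1] = a * X[t] + b + c * randn()  # apply the law of motion (22.1)
end
plot(0:T, X, label = L"X_t", xlabel = L"t")

The path drifts up from the initial condition and then fluctuates around $b/(1-a) = 1$, foreshadowing the stationarity results below.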
22.2.1. Moving Average Representation#
Iterating backwards from time $t$, we obtain

$$
X_t = a X_{t-1} + b + c W_t
= a^2 X_{t-2} + a b + a c W_{t-1} + b + c W_t
= \cdots
$$

If we work all the way back to time zero, we get

$$
X_t = a^t X_0 + b \sum_{j=0}^{t-1} a^j + c \sum_{j=0}^{t-1} a^j W_{t-j} \tag{22.2}
$$

Equation (22.2) shows that $X_t$ is a well defined random variable, the value of which depends on

the parameters,
the initial condition $X_0$ and
the shocks $W_1, \ldots, W_t$ from time $t = 1$ to the present.

Throughout, the symbol $\psi_t$ will be used to refer to the density of the random variable $X_t$.
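As a quick numerical sanity check on (22.2), the following sketch (not part of the original lecture) draws one set of shocks and confirms that iterating the law of motion forward and evaluating the moving average formula give the same value for $X_t$. The horizon $t = 25$ and the starting point are arbitrary choices.

a, b, c = 0.9, 0.1, 0.5
t = 25
X_0 = -3.0
W = randn(t)                                  # the shocks W_1, ..., W_t
x_iter = foldl((x, w) -> a * x + b + c * w, W; init = X_0)  # iterate (22.1)
x_ma = a^t * X_0 + b * sum(a^j for j in 0:(t - 1)) +
       c * sum(a^j * W[t - j] for j in 0:(t - 1))           # evaluate (22.2)
x_iter ≈ x_ma                                 # returns true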
22.2.2. Distribution Dynamics#
One of the nice things about this model is that it's so easy to trace out the sequence of distributions $\{\psi_t\}$ corresponding to the time series $\{X_t\}$.

To see this, we first note that $X_t$ is normally distributed for each $t$.

This is immediate from (22.2), since linear combinations of independent normal random variables are normal.

Given that $X_t$ is normally distributed, we will know the full distribution $\psi_t$ if we can pin down its first two moments.

Let $\mu_t$ and $v_t$ denote the mean and variance of $X_t$ respectively.

We can pin down these values from (22.2) or we can use the following recursive expressions:

$$
\mu_{t+1} = a \mu_t + b
\quad \text{and} \quad
v_{t+1} = a^2 v_t + c^2 \tag{22.3}
$$

These expressions are obtained from (22.1) by taking, respectively, the expectation and variance of both sides of the equality.

In calculating the second expression, we are using the fact that $X_t$ and $W_{t+1}$ are independent.

(This follows from our assumptions and (22.2).)

Given the dynamics in (22.2) and initial conditions $\mu_0, v_0$, we obtain $\mu_t, v_t$ and hence

$$
\psi_t = N(\mu_t, v_t)
$$
The following code uses these facts to track the sequence of marginal distributions $\{\psi_t\}$.
The parameters are
a = 0.9
b = 0.1
c = 0.5
# initial conditions mu_0, v_0
mu = -3.0
v = 0.6
Here’s the sequence of distributions:
using StatsPlots
sim_length = 10
x_grid = range(-5, 7, length = 120)
plt = plot()
for t in 1:sim_length
    mu = a * mu + b
    v = a^2 * v + c^2
    dist = Normal(mu, sqrt(v))
    plot!(plt, x_grid, pdf.(dist, x_grid), label = L"\psi_{%$t}", linealpha = 0.7)
end
plt
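The normality claim behind this code can also be cross-checked by brute force: simulate many independent paths, stop them all at date $t = 10$, and compare a histogram of the terminal values with $N(\mu_{10}, v_{10})$. This sketch is not part of the original lecture, and the path count of 10_000 is an arbitrary choice; note that, after the loop above, mu and v already hold $\mu_{10}$ and $v_{10}$.

num_paths = 10_000
x_sample = -3.0 .+ sqrt(0.6) .* randn(num_paths)  # draws of X_0 from N(mu_0, v_0)
for t in 1:sim_length
    x_sample .= a .* x_sample .+ b .+ c .* randn(num_paths)  # advance every path
end
histogram(x_sample, normalize = :pdf, alpha = 0.4, label = "simulated")
plot!(x_grid, pdf.(Normal(mu, sqrt(v)), x_grid), label = L"\psi_{10}", linewidth = 2)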
22.3. Stationarity and Asymptotic Stability#
Notice that, in the figure above, the sequence $\{\psi_t\}$ appears to be converging to a limiting distribution.
This is even clearer if we project forward further into the future:
function plot_density_seq(mu_0 = -3.0, v_0 = 0.6; sim_length = 60)
    mu = mu_0
    v = v_0
    plt = plot()
    for t in 1:sim_length
        mu = a * mu + b
        v = a^2 * v + c^2
        dist = Normal(mu, sqrt(v))
        plot!(plt, x_grid, pdf.(dist, x_grid), label = nothing, linealpha = 0.5)
    end
    return plt
end
plot_density_seq()
Moreover, the limit does not depend on the initial condition.
For example, this alternative density sequence also converges to the same limit.
plot_density_seq(3.0)
In fact it's easy to show that such convergence will occur, regardless of the initial condition, whenever $|a| < 1$.
To see this, we just have to look at the dynamics of the first two moments, as given in (22.3).
When $|a| < 1$, these sequences converge to the respective limits

$$
\mu^* = \frac{b}{1 - a}
\quad \text{and} \quad
v^* = \frac{c^2}{1 - a^2}
$$

(See our lecture on one dimensional dynamics for background on deterministic convergence.)

Hence $\psi_t$ converges to $\psi^* = N(\mu^*, v^*)$ as $t \to \infty$.
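For instance, iterating the recursions (22.3) from the initial conditions used above shows the moments settling down at these limits. This small check is not in the original lecture; 100 iterations is an arbitrary but ample horizon given $a = 0.9$.

mu_path = [-3.0]                     # mu_0
v_path = [0.6]                       # v_0
for _ in 1:100
    push!(mu_path, a * mu_path[end] + b)
    push!(v_path, a^2 * v_path[end] + c^2)
end
mu_path[end] - b / (1 - a), v_path[end] - c^2 / (1 - a^2)   # both near zero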
We can confirm this is valid for the sequence above using the following code.
plt = plot_density_seq(3.0)
mu_star = b / (1 - a)
std_star = sqrt(c^2 / (1 - a^2)) # square root of v_star
psi_star = Normal(mu_star, std_star)
plot!(plt, x_grid, psi_star, color = :black, label = L"\psi^*", linewidth = 2)
plt
As claimed, the sequence $\{\psi_t\}$ converges to $\psi^*$.
22.3.1. Stationary Distributions#
A stationary distribution is a distribution that is a fixed point of the update rule for distributions.
In other words, if $\psi_t$ is stationary, then $\psi_{t+j} = \psi_t$ for all $j \geq 0$.

A different way to put this, specialized to the current setting, is as follows: a density $\psi$ on $\mathbb{R}$ is stationary for the AR(1) process if

$$
X_t \sim \psi
\quad \implies \quad
a X_t + b + c W_{t+1} \sim \psi
$$

The distribution $\psi^* = N(\mu^*, v^*)$ has this property.

(Of course, we are assuming that $|a| < 1$ so that $\psi^*$ is well defined.)

In fact, it can be shown that no other distribution on $\mathbb{R}$ has this property.

Thus, when $|a| < 1$, the AR(1) model has exactly one stationary density, and that density is $\psi^*$.
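The fixed point property is easy to probe by simulation, although the check below is a sketch rather than a proof: draw a large sample from $\psi^*$, push each draw once through the law of motion, and verify that the first two sample moments are, up to sampling error, unchanged. The sample size is an arbitrary choice, and mu_star and std_star are as computed above.

n_check = 100_000
x = rand(Normal(mu_star, std_star), n_check)      # X_t distributed as psi_star
x_next = a .* x .+ b .+ c .* randn(n_check)       # one application of (22.1)
mean(x_next) - mu_star, var(x_next) - std_star^2  # both near zero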
22.4. Ergodicity#
The concept of ergodicity is used in different ways by different authors.
One way to understand it in the present setting is that a version of the Law of Large Numbers is valid for $\{X_t\}$.
In particular, averages over time series converge to expectations under the stationary distribution.
Indeed, it can be proved that, whenever $|a| < 1$, we have

$$
\frac{1}{m} \sum_{t=1}^m h(X_t) \to \int h(x) \psi^*(x) \, dx
\quad \text{as } m \to \infty \tag{22.6}
$$
whenever the integral on the right hand side is finite and well defined.
Notes:
In (22.6), convergence holds with probability one.
The textbook by [MT09] is a classic reference on ergodicity.
For example, if we consider the identity function $h(x) = x$, we get

$$
\frac{1}{m} \sum_{t=1}^m X_t \to \int x \psi^*(x) \, dx
\quad \text{as } m \to \infty
$$
In other words, the time series sample mean converges to the mean of the stationary distribution.
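The following sketch, which is not part of the original lecture, checks this for a single long simulated path; the horizon of one million steps is arbitrary, and time_average is a helper name introduced here (the function wrapper just keeps the loop variables local).

function time_average(m; x0 = mu_star)
    x = x0
    total = 0.0
    for _ in 1:m
        x = a * x + b + c * randn()   # advance the state
        total += x                    # accumulate the running sum
    end
    return total / m
end
time_average(1_000_000) - mu_star     # close to zero, as (22.6) predicts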
As will become clear over the next few lectures, ergodicity is a very important concept for statistics and simulation.
22.5. Exercises#
22.5.1. Exercise 1#
Let $k$ be a natural number.

The $k$-th central moment of a random variable is defined as

$$
M_k := \mathbb{E}\left[ (X - \mathbb{E} X)^k \right]
$$

When that random variable is $N(\mu, \sigma^2)$, it is known that

$$
M_k =
\begin{cases}
0 & \text{if } k \text{ is odd} \\
\sigma^k (k - 1)!! & \text{if } k \text{ is even}
\end{cases}
$$

Here $n!!$ is the double factorial.

According to (22.6), we should have, for any $k \in \mathbb{N}$,

$$
\frac{1}{m} \sum_{t=1}^m (X_t - \mu^*)^k \approx M_k
$$

when $m$ is large.

Confirm this by simulation at a range of $k$ using the default parameters from the lecture.
22.5.2. Exercise 2#
Write your own version of a one dimensional kernel density estimator, which estimates a density from a sample.
Write it as a function $f$ that takes sample data $x_1, \ldots, x_n$ and a bandwidth $h$ and returns the estimate

$$
f(x) = \frac{1}{hn} \sum_{i=1}^n K\left( \frac{x - x_i}{h} \right)
$$

at each evaluation point $x$.

For $K$, use the Gaussian kernel ($K$ is the standard normal density).

To set the bandwidth, use Silverman's rule (see the "rule of thumb" discussion on this page).

Test the function you have written by going through the steps

simulate data $x_1, \ldots, x_n$ from distribution $\phi$
plot the kernel density estimate over a suitable range
plot the density of $\phi$ on the same figure

for distributions $\phi$ of the following types

beta distribution with $\alpha = \beta = 2$
beta distribution with $\alpha = 2$ and $\beta = 5$
beta distribution with $\alpha = \beta = 0.5$

Use $n = 500$.
Make a comment on your results. (Do you think this is a good estimator of these distributions?)
22.5.3. Exercise 3#
In the lecture we discussed the following fact: for the AR(1) process

$$
X_{t+1} = a X_t + b + c W_{t+1}
$$

with $\{W_t\}$ IID and standard normal, if $\psi_t$ is the distribution of $X_t$ and $\psi_t = N(\mu, s^2)$, then

$$
\psi_{t+1} = N(a \mu + b, \, a^2 s^2 + c^2)
$$

Confirm this, at least approximately, by simulation. Let

$$
a = 0.9, \quad b = 0.0, \quad c = 0.1, \quad \mu = -3, \quad s = 0.2
$$

First, plot $\psi_t$ and $\psi_{t+1}$ using the true distributions described above.

Second, plot $\psi_{t+1}$ on the same figure (in a different color) as follows:

Generate $n$ draws of $X_t$ from the $N(\mu, s^2)$ distribution
Update them all using the rule $X_{t+1} = a X_t + b + c W_{t+1}$
Use the resulting sample of $X_{t+1}$ values to produce a density estimate via kernel density estimation.

Try this for $n = 2000$ and confirm that the simulation based estimate of $\psi_{t+1}$ does indeed coincide with the theoretical distribution.
22.6. Solutions#
22.6.1. Exercise 1#
Here is one solution:
using Random
a = 0.9
b = 0.1
c = 0.5
mu_star = b / (1 - a)
std_star = sqrt(c^2 / (1 - a^2)) # square root of v_star
function sample_moments_ar1(k, m = 100_000, mu_0 = 0.0, sigma_0 = 1.0; seed = 1234)
    Random.seed!(seed)
    sample_sum = 0.0
    x = mu_0 + sigma_0 * randn()       # draw X_0 from N(mu_0, sigma_0^2)
    for t in 1:m
        sample_sum += (x - mu_star)^k  # accumulate (X_t - mu_star)^k
        x = a * x + b + c * randn()    # advance the state
    end
    return sample_sum / m
end

# double factorial n!! of an odd number n, i.e. n * (n - 2) * ... * 3 * 1
function double_factorial(n)
    return prod(1:2:n)
end

# k-th central moment of N(mu, std_star^2): zero for odd k, sigma^k (k-1)!! for even k
function true_moments_ar1(k)
    if k % 2 == 0
        return std_star^k * double_factorial(k - 1)
    else
        return 0
    end
end
k_vals = collect(1:6)
sample_moments = zeros(6)
true_moments = zeros(6)
for (k_idx, k) in enumerate(k_vals)
    sample_moments[k_idx] = sample_moments_ar1(k)
    true_moments[k_idx] = true_moments_ar1(k)
end
plt = plot()
plot!(plt, k_vals, true_moments, label = "true moments")
plot!(plt, k_vals, sample_moments, label = "sample moments")
plt
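If the graphical comparison feels coarse, the same numbers can be printed side by side; this small addition is not part of the original solution and just tabulates the two vectors computed above.

for (k, sm, tm) in zip(k_vals, sample_moments, true_moments)
    println("k = $k:  sample = $(round(sm, digits = 4)),  true = $(round(tm, digits = 4))")
end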
22.6.2. Exercise 2#
Here is one solution:
K(x) = pdf.(Normal(), x)
function f(x_val, x_data, h)
    return (1 / h) * mean(K((x_val .- x_data) / h))
end
function plot_kde(phi, n, plt, idx; x_min = -0.2, x_max = 1.2)
    x_data = rand(phi, n)
    x_grid = range(x_min, x_max, length = 100)
    c = std(x_data)
    h = 1.06 * c * n^(-1 / 5)  # Silverman's rule of thumb
    plot!(plt[idx], x_grid, f.(x_grid, Ref(x_data), h), label = "estimate")
    plot!(plt[idx], x_grid, pdf.(phi, x_grid), label = "true density")
    return plt
end
n = 500
parameter_pairs = [(2, 2), (2, 5), (0.5, 0.5)]
plt = plot(layout = (3, 1))
for (idx, (alpha, beta)) in enumerate(parameter_pairs)
    plot_kde(Beta(alpha, beta), n, plt, idx)
end
plt
We see that the kernel density estimator is effective when the underlying distribution is smooth but less so otherwise.
22.6.3. Exercise 3#
Here is our solution:
a = 0.9
b = 0.0
c = 0.1
mu = -3
s = 0.2
mu_next = a * mu + b
s_next = sqrt(a^2 * s^2 + c^2)
psi = Normal(mu, s)
psi_next = Normal(mu_next, s_next)
K(x) = pdf.(Normal(), x)
function f(x_val, x_data, h)
    return (1 / h) * mean(K((x_val .- x_data) / h))
end
n = 2000
x_draws = rand(psi, n)
x_draws_next = a * x_draws .+ b + c * randn(n)
# Silverman's rule of thumb; s_hat is used so the AR parameter c is not overwritten
s_hat = std(x_draws_next)
h = 1.06 * s_hat * n^(-1 / 5)
x_grid = range(mu - 1, mu + 1, length = 100)
plt = plot()
plot!(plt, x_grid, pdf.(psi, x_grid), label = L"$\psi_t$")
plot!(plt, x_grid, pdf.(psi_next, x_grid), label = L"$\psi_{t+1}$")
plot!(plt, x_grid, f.(x_grid, Ref(x_draws_next), h),
      label = L"estimate of $\psi_{t+1}$")
plt
The simulated distribution approximately coincides with the theoretical distribution, as predicted.