Kalman Filter 1D

In [1]:
import daft
In [2]:
from __future__ import division, print_function
In [3]:
%matplotlib inline

from matplotlib import pyplot as plt
import numpy as np

Markov Chain

In [4]:
def plot_markov_chain():
    pgm = daft.PGM([8.3, 1.05], origin=[-2., .3], aspect=1.)
    pgm.add_node(daft.Node("H_1", r"$H_1$", 0., 1.))
    pgm.add_node(daft.Node("H_2", r"$H_2$", 1., 1.))
    pgm.add_node(daft.Node("H_3", r"$H_3$", 2., 1.))
    pgm.add_node(daft.Node("H_4", r"$H_4$", 3., 1.))
    pgm.add_edge("H_1", "H_2")
    pgm.add_edge("H_2", "H_3")
    pgm.add_edge("H_3", "H_4")
In [5]:
plot_markov_chain()

The following independence assumption (Markov Assumption) holds for the graph:

$$ P(H_i \mid H_1, \dots, H_{i-1}) = P(H_i \mid H_{i-1}) $$

In other words, the current state is conditionally independent of all earlier states given the state one time step before (the chain starts at time $i=0$):

$$ \forall i \in \mathbb N_+ \; \forall k \in \mathbb N_+: (i-k \geq 0) \land k>1 \Rightarrow H_i \perp H_{i-k} \mid H_{i-1} $$
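
This can be checked empirically. Below is a minimal sketch (the two-state transition matrix T is an assumption chosen for illustration): it estimates $P(H_3=1 \mid H_2=1, H_1=0)$ and $P(H_3=1 \mid H_2=1)$ from samples, and under the Markov assumption both should agree up to sampling noise:

import numpy as np

rng = np.random.RandomState(0)
# assumed transition matrix: T[i, j] = P(H_t = j | H_{t-1} = i)
T = np.array([[0.9, 0.1],
              [0.3, 0.7]])

n = 100000
h1 = rng.choice(2, size=n)                 # uniform initial state
h2 = (rng.rand(n) < T[h1, 1]).astype(int)  # sample H_2 given H_1
h3 = (rng.rand(n) < T[h2, 1]).astype(int)  # sample H_3 given H_2

print(h3[(h2 == 1) & (h1 == 0)].mean())    # estimate of P(H_3=1 | H_2=1, H_1=0)
print(h3[h2 == 1].mean())                  # estimate of P(H_3=1 | H_2=1) = T[1, 1]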

Localization with motion on a grid

$H_t$ is the random variable for the state of a robot, $Val(H_t)$ are the grid indices.

  • e.g. $H_t=3$: the robot is located at time $t$ in grid cell with index 3.

Motion equation:

$$ P(H_t=i) = \sum_j P(H_{t-1}=j) \, P(H_t=i \mid H_{t-1}=j) $$
  • $P(H_t=i)$: Probability that the robot is located at grid cell index $i$ at time $t$
  • $P(H_t=i \mid H_{t-1}=j)$: Transition Probability

For localization the (relative) transition probabilities are typically translation invariant, e.g.

$$ P(H_t=2 \mid H_{t-1}=3) = P(H_t=1 \mid H_{t-1}=2)$$

Therefore the motion equation can be expressed mathematically as a convolution.

So if we know the location of the robot at time $t=0$ and we do not observe any variable (no measurement), our knowledge of the robot's location becomes more "blurry" with time.

  • 10 cells
  • the robot is located at time 0 in cell with grid index 4
  • motion:
    • go left with probability 10%
    • stay at position: 70%
    • go right: 20%
In [6]:
p = np.zeros([5,10])
p[0, 4] = 1 # the robot is located at time 0 in cell with grid index 4

# transition probabilities for one motion step:
# [P(go left), P(stay), P(go right)]; all other transition probabilities are zero
tp = np.array([0.1, 0.7, 0.2])
for t in range(4):
    p[t+1] = np.convolve(p[t], tp, mode='same')
print(p)
[[  0.000e+00   0.000e+00   0.000e+00   0.000e+00   1.000e+00   0.000e+00
    0.000e+00   0.000e+00   0.000e+00   0.000e+00]
 [  0.000e+00   0.000e+00   0.000e+00   1.000e-01   7.000e-01   2.000e-01
    0.000e+00   0.000e+00   0.000e+00   0.000e+00]
 [  0.000e+00   0.000e+00   1.000e-02   1.400e-01   5.300e-01   2.800e-01
    4.000e-02   0.000e+00   0.000e+00   0.000e+00]
 [  0.000e+00   1.000e-03   2.100e-02   1.530e-01   4.270e-01   3.060e-01
    8.400e-02   8.000e-03   0.000e+00   0.000e+00]
 [  1.000e-04   2.800e-03   3.020e-02   1.540e-01   3.601e-01   3.080e-01
    1.208e-01   2.240e-02   1.600e-03   0.000e+00]]
In [7]:
for t in range(4):
    plt.subplot(4, 1, t+1)
    plt.plot(range(10), p[t],'bo')
plt.xlabel("grid cells")

Note: This also works with a multimodal distribution.
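
As a quick check (a sketch reusing tp from above), the same convolution update applied to a bimodal belief keeps both modes and blurs each of them:

p_bi = np.zeros(10)
p_bi[[2, 7]] = 0.5                         # robot equally likely in cell 2 or 7
print(np.convolve(p_bi, tp, mode='same'))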

Hidden Markov Models

In [8]:
def plot_HMM():
    pgm = daft.PGM([8.3, 2.05], origin=[-1., -0.3], aspect=1.)
    pgm.add_node(daft.Node("Z_1", r"$Z_1$", 0., 0., observed=True))
    pgm.add_node(daft.Node("Z_2", r"$Z_2$", 1., 0., observed=True))
    pgm.add_node(daft.Node("Z_3", r"$Z_3$", 2., 0., observed=True))
    pgm.add_node(daft.Node("Z_4", r"$Z_4$", 3., 0., observed=True))
    pgm.add_node(daft.Node("H_1", r"$H_1$", 0., 1.))
    pgm.add_node(daft.Node("H_2", r"$H_2$", 1., 1.))
    pgm.add_node(daft.Node("H_3", r"$H_3$", 2., 1.))
    pgm.add_node(daft.Node("H_4", r"$H_4$", 3., 1.))
    pgm.add_edge("H_1", "H_2")
    pgm.add_edge("H_2", "H_3")
    pgm.add_edge("H_3", "H_4")

    pgm.add_edge("H_1", "Z_1")
    pgm.add_edge("H_2", "Z_2")
    pgm.add_edge("H_3", "Z_3")
    pgm.add_edge("H_4", "Z_4")
In [9]:
plot_HMM()

The graph $\mathcal G$ encodes the following conditional independence assumptions:

$$ P(Z_t \mid Z_1, \dots, Z_{t-1}, H_1, \dots, H_{t}) = P(Z_t \mid H_t) $$

$$ P(H_t \mid Z_1, \dots, Z_{t-1}, H_1, \dots, H_{t-1}) = P(H_t \mid H_{t-1}) $$
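
Together, these assumptions yield the usual factorization of the joint distribution of a Hidden Markov Model:

$$ P(H_1, \dots, H_T, Z_1, \dots, Z_T) = P(H_1) \prod_{t=2}^{T} P(H_t \mid H_{t-1}) \prod_{t=1}^{T} P(Z_t \mid H_t) $$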

Example: Robot localization with observations

Additionally, we have a sensor which gives us an observation $Z_t$ indicating where the robot is located.

Bayes' rule can be used to update our belief about the robot's location:

$$ P(H_t=i \mid Z_t) = \frac{P(Z_t \mid H_t=i)P(H_{t}=i)}{P(Z_t)} $$
  • the prior $P(H_{t}=i)$ is computed by the motion equation.
  • Measurement: $P(Z_t \mid H_t=i)$
  • Normalizer $P(Z_t) = \sum_i P(Z_t \mid H_t=i) P(H_{t}=i)$
In [10]:
plt.subplot(4, 1, 1)
plt.plot(range(10), p[4],'ro')
plt.xlabel("grid cells")
In [11]:
p_measurement = np.array([0., 0., 0., 0.05, 0.18, 0.54, 0.18, 0.05, 0., 0.])
plt.subplot(4, 1, 2)
plt.plot(range(10), p_measurement,'go')
plt.xlabel("grid cells")
In [12]:
posterior = p_measurement * p[4]
posterior = posterior / posterior.sum()
plt.subplot(4, 1, 3)
plt.plot(range(10), posterior,'bo')
plt.xlabel("grid cells")

1D Continuous State Space

Motion and Prediction


  • The position is described by a Gaussian with parameters $\mu, \sigma$.

Limitation due to the Gaussian representation: the belief is unimodal.


  • the state transition probability $p(h_t \mid h_{t-1}, v_t)$ is linear with Gaussian noise (called linear Gaussian); in general for 1D:
$$ h_t = h_{t-1} + v_t + \epsilon $$
  • $v_t$ is the traveling distance in time $\Delta t$ (one time step).
  • $\epsilon$ is Gaussian noise with standard deviation $\sigma_v$ (the noise of $v_t$ in $\Delta t$)
$$ p(h_t \mid h_{t-1}, v_t) = \mathcal N(h_{t-1} + v_t, \sigma_v) $$

So the motion is given by a Gaussian with parameters $\mu_v (=v_t), \sigma_v$

The initial position at $t=1$ changes due to the motion. The initial uncertainty of the position is increased by the uncertainty of the state transition probability.
This is described mathematically by a convolution of the two Gaussians.

The convolution of two Gaussians is again a Gaussian with (see e.g. [Thr06], [Bro14]):

  • $\mu' = \mu + \mu_v$
  • $\sigma'^2 = \sigma^2 + \sigma^2_v$
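
This can be verified numerically; the following sketch (grid bounds and resolution are assumptions) convolves the two densities of the example below and compares the result with the closed-form Gaussian:

import scipy.stats

dx = 0.01
x = np.arange(-30., 30., dx)
g1 = scipy.stats.norm(loc=-2., scale=1.5).pdf(x)   # position belief
g2 = scipy.stats.norm(loc=2.5, scale=1.9).pdf(x)   # motion
num = np.convolve(g1, g2, mode='same') * dx        # numerical convolution
closed = scipy.stats.norm(loc=-2. + 2.5,
                          scale=np.sqrt(1.5**2 + 1.9**2)).pdf(x)
print(np.abs(num - closed).max())                  # small; limited by the grid resolution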

Function for the prediction:

In [13]:
def predict(mu, sigma, mu1, sigma1):
    # convolution of two gaussians
    mu_ = mu + mu1
    sigma_ = np.sqrt(sigma**2+sigma1**2) 
    return mu_, sigma_
In [14]:
import scipy.stats  
In [15]:
h_space = np.arange(-10.,10.,.05)
v_space = np.arange(-5.,9.,.05)
mu0 = -2.
sigma0 = 1.5
mu_v = 2.5
sigma_v = 1.9
In [16]:
h0_pdf = scipy.stats.norm(loc=mu0, scale=sigma0).pdf(h_space)
v_pdf = scipy.stats.norm(loc=mu_v, scale=sigma_v).pdf(v_space)
mu1,sigma1 = predict(mu0, sigma0, mu_v, sigma_v)
h1_pdf = scipy.stats.norm(loc=mu1, scale=sigma1).pdf(h_space)
In [17]:
plt.figure(figsize= (8,3))
plt.plot(v_space, v_pdf,'g-')
plt.ylabel("probability density")
In [18]:
plt.plot(h_space, h0_pdf,'b-', label="Position at t=1")
plt.plot(h_space, h1_pdf,'r-', label="Predicted Position at t=2")
plt.ylabel("probability density")
plt.legend()


Measurement and Update

Assumption: The measurement model is linear with Gaussian noise:

$$ z_t = h_t + \delta $$


  • $\delta$: Gaussian noise of the measurement

or $$ h_t = z_t - \delta $$

(Note: the pdf of the Gaussian noise is symmetric, so viewed as a function of $h_t$ the likelihood is a Gaussian centered at $z_t$.)

$$ p(z_t \mid h_t) = \mathcal N(z_t, \sigma_o) $$

For the measurement update we use Bayes' rule with probability densities:

$$ p(h_t \mid z_t) = \frac{p(z_t \mid h_t)p(h_t)}{p(z_t)} $$


  • $h \in \mathbb R$
  • The prior is the predicted position after a movement step: $p(H_{t}=h)=\mathcal N(\mu',\sigma')$
  • Measurement Likelihood: $p(z_t \mid h_t) = \mathcal N(\mu_o,\sigma_o)$
    • The measurement locates the robot at position $z_t = \mu_o$.
    • $\sigma_o$ is the standard deviation of the measurement (e.g. sensor uncertainty).
  • Normalizer $p(z_t) = \int_{\mathcal H} p(z_t \mid H_t=h) p(H_{t}=h) dh$
Product of two Gaussians

The normalized product of two Gaussian densities is again a Gaussian with:

$$ \mu = \frac{\sigma_o^2 \mu' +\sigma'^2\mu_o}{\sigma'^2+\sigma_o^2} $$

$$ \sigma^2 = \frac{1}{1/\sigma'^2 + 1/\sigma_o^2} $$
In [19]:
def measurement(mu, sigma, mu1, sigma1):
    # normalized product of two gaussians: new mean and standard deviation
    mu_ = (sigma1**2 * mu + sigma**2 * mu1) / (sigma**2 + sigma1**2)
    sigma_ = np.sqrt(1. / (1./sigma**2 + 1./sigma1**2))
    return mu_, sigma_
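
As a quick numerical check of these formulas (a sketch with arbitrary illustrative parameters):

x = np.arange(-20., 20., 0.01)
g1 = scipy.stats.norm(loc=1., scale=2.).pdf(x)
g2 = scipy.stats.norm(loc=-1., scale=3.).pdf(x)
prod = g1 * g2
prod /= prod.sum() * 0.01                          # normalize the product numerically
mu_c, sigma_c = measurement(1., 2., -1., 3.)
closed = scipy.stats.norm(loc=mu_c, scale=sigma_c).pdf(x)
print(np.abs(prod - closed).max())                 # should be close to 0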
In [20]:
sigma_o = 3.
mu_o = -1.
In [21]:
o_pdf = scipy.stats.norm(loc=mu_o, scale=sigma_o).pdf(h_space)
mu, sigma = measurement(mu1, sigma1, mu_o, sigma_o)
h_pdf = scipy.stats.norm(loc=mu, scale=sigma).pdf(h_space)
In [22]:
plt.plot(h_space, h1_pdf,'r-', label="Prior")
plt.plot(h_space, o_pdf,'g-', label="Likelihood")
plt.plot(h_space, h_pdf,'b-', label="Posterior")
plt.ylabel("probability density")
plt.legend()


Finally, we iterate prediction and measurement update over a sequence of observations and velocities:

In [23]:
sigma_o = 1.6  
sigma_v = .9  
In [24]:
observations = np.array([-2, -1.5, -0.4, 1.2, 2.1])
velocity = np.array([1, 1.1, 1.2, 1.2, 1.2])
h_space = np.arange(-6., 10., 0.05)
In [25]:
mu = 0. # initial position
sigma = 6. # initial standard deviation of the position

h_pdf = scipy.stats.norm(loc=mu, scale=sigma).pdf(h_space)
plt.subplot(6, 1, 1)
plt.plot(h_space, h_pdf,'b-', label="Position at t=0")
for i, (o, v) in enumerate(zip(observations, velocity)):
    plt.subplot(6, 1, i+2)
    # prediction step (motion update)
    mu_, sigma_ = predict(mu, sigma, v, sigma_v)
    h_pdf = scipy.stats.norm(loc=mu_, scale=sigma_).pdf(h_space)
    plt.plot(h_space, h_pdf,'r-', label="Prediction at t=%i"%(i+1))
    # observation likelihood
    h_pdf = scipy.stats.norm(loc=o, scale=sigma_o).pdf(h_space)
    plt.plot(h_space, h_pdf,'g-', label="Observation at t=%i"%(i+1))
    # measurement update (correction step)
    mu, sigma = measurement(mu_, sigma_, o, sigma_o)
    h_pdf = scipy.stats.norm(loc=mu, scale=sigma).pdf(h_space)
    plt.plot(h_space, h_pdf,'b-', label="Position at t=%i"%(i+1))
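
The prediction and update steps above can be packaged into a small filter loop. A minimal sketch (the function name kalman_filter_1d is illustrative; it reuses predict and measurement from above):

def kalman_filter_1d(mu, sigma, observations, velocity, sigma_v, sigma_o):
    estimates = []
    for o, v in zip(observations, velocity):
        mu, sigma = predict(mu, sigma, v, sigma_v)       # motion update
        mu, sigma = measurement(mu, sigma, o, sigma_o)   # measurement update
        estimates.append((mu, sigma))
    return estimates

for t, (m, s) in enumerate(kalman_filter_1d(0., 6., observations, velocity,
                                            sigma_v, sigma_o), start=1):
    print("t=%i: mu=%.3f, sigma=%.3f" % (t, m, s))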

Literature and Links: