from __future__ import division, print_function
%matplotlib inline
import daft
import numpy as np
from matplotlib import pyplot as plt
def plot_markov_chain():
    pgm = daft.PGM([8.3, 1.05], origin=[-2., .3], aspect=1.)
    pgm.add_node(daft.Node("H_1", r"$H_1$", 0., 1.))
    pgm.add_node(daft.Node("H_2", r"$H_2$", 1., 1.))
    pgm.add_node(daft.Node("H_3", r"$H_3$", 2., 1.))
    pgm.add_node(daft.Node("H_4", r"$H_4$", 3., 1.))
    pgm.add_edge("H_1", "H_2")
    pgm.add_edge("H_2", "H_3")
    pgm.add_edge("H_3", "H_4")
    pgm.render()
plot_markov_chain()
The following independence assumption (Markov Assumption) holds for the graph:
$$ P(H_t \mid H_1, \dots ,H_{t-1}) = P(H_t \mid H_{t-1}) $$
In other words, given the state one time step before, the current state is statistically independent of all earlier states, i.e. if the chain starts at time $t=1$:
$$ \forall t \in \mathbb N^{+}, \forall t' \in \mathbb N^{+}: (t-t' \geq 2) \Rightarrow H_t \perp H_{t'} \mid H_{t-1} $$
$H_t$ is the random variable for the state of the robot; its possible values $Val(H_t)$ are the grid cell indices.
$$ P(H_t=i) = \sum_j P(H_{t-1}=j)P(H_t=i \mid H_{t-1}=j) $$
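In matrix form this prediction step is a vector-matrix product. A minimal sketch with a made-up 3-state transition matrix `T` (not from the text), where `T[j, i]` $= P(H_t=i \mid H_{t-1}=j)$:
# made-up 3-state example: T[j, i] = P(H_t=i | H_{t-1}=j), rows sum to 1
T = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.7, 0.2],
              [0.0, 0.1, 0.9]])
p_prev = np.array([1., 0., 0.])  # P(H_{t-1})
p_next = p_prev.dot(T)           # P(H_t=i) = sum_j P(H_{t-1}=j) * T[j, i]
print(p_next)                    # [0.7 0.2 0.1]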
For localization the (relative) transition probabilities are typically translation invariant, e.g.
$$ P(H_t=2 \mid H_{t-1}=3) = P(H_t=1 \mid H_{t-1}=2) $$
The transition probability then depends only on the local difference between the states: $H_t - H_{t-1}$.
$$ P(H_t) = \sum_{H_{t-1}} P(H_{t-1})P(H_t - H_{t-1}) $$
So the motion equation is mathematically a convolution.
$$ (f * g)(H_t) = \sum_{H_{t-1}} f(H_{t-1}) g(H_t - H_{t-1}) $$
So if we know the location of the robot at time $t=1$ and we do not observe any variable (no measurements), our knowledge of the robot's location becomes more "blurry" with time.
p = np.zeros([5, 10])
p[0, 4] = 1  # at time 0 the robot is located in the cell with grid index 4
# transition probabilities for the displacement d = H_t - H_{t-1}:
# [P(d=-1), P(d=0), P(d=+1)]; all other transition probabilities are zero
tp = np.array([0.1, 0.7, 0.2])
np.set_printoptions(precision=3)
for t in range(4):
    p[t+1] = np.convolve(p[t], tp, mode='same')
print(p)
plt.figure(figsize=(10, 8))
for t in range(4):
    plt.subplot(4, 1, t+1)
    plt.plot(range(10), p[t], 'bo')
    plt.ylim(0., 1.1)
    plt.ylabel("probability")
    _ = plt.xlabel("grid cells")
Note: This also works with a multimodal distribution.
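A minimal sketch illustrating this, reusing the transition probabilities `tp` from above (the split of the initial mass is a made-up example): each mode is blurred independently by the convolution.
p_multi = np.zeros([5, 10])
p_multi[0, 2] = 0.5  # half of the probability mass in cell 2
p_multi[0, 7] = 0.5  # half of the probability mass in cell 7
for t in range(4):
    p_multi[t+1] = np.convolve(p_multi[t], tp, mode='same')
print(p_multi)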
def plot_HMM():
    pgm = daft.PGM([8.3, 2.05], origin=[-1., -0.3], aspect=1.)
    pgm.add_node(daft.Node("Z_1", r"$Z_1$", 0., 0., observed=True))
    pgm.add_node(daft.Node("Z_2", r"$Z_2$", 1., 0., observed=True))
    pgm.add_node(daft.Node("Z_3", r"$Z_3$", 2., 0., observed=True))
    pgm.add_node(daft.Node("Z_4", r"$Z_4$", 3., 0., observed=True))
    pgm.add_node(daft.Node("H_1", r"$H_1$", 0., 1.))
    pgm.add_node(daft.Node("H_2", r"$H_2$", 1., 1.))
    pgm.add_node(daft.Node("H_3", r"$H_3$", 2., 1.))
    pgm.add_node(daft.Node("H_4", r"$H_4$", 3., 1.))
    pgm.add_edge("H_1", "H_2")
    pgm.add_edge("H_2", "H_3")
    pgm.add_edge("H_3", "H_4")
    pgm.add_edge("H_1", "Z_1")
    pgm.add_edge("H_2", "Z_2")
    pgm.add_edge("H_3", "Z_3")
    pgm.add_edge("H_4", "Z_4")
    pgm.render()
plot_HMM()
The graph $\mathcal G$ encodes the following conditional independence assumptions:
$$ P(Z_t \mid Z_1, \dots, Z_{t-1}, H_1, \dots, H_{t}) = P(Z_t \mid H_t) $$
$$ P(H_t \mid Z_1, \dots, Z_{t-1}, H_1, \dots, H_{t-1}) = P(H_t \mid H_{t-1}) $$
Additionally, we have a sensor that gives us an observation $Z_t$, which indicates where the robot is located.
Bayes' rule can be used to update our belief about the robot's location:
$$ P(H_t=i \mid Z_t) = \frac{P(Z_t \mid H_t=i)P(H_{t}=i)}{P(Z_t)} $$
prior = p[4]
p_measurement = np.array([0., 0., 0., 0.05, 0.18, 0.54, 0.18, 0.05, 0., 0.])
posterior = p_measurement * prior
posterior = posterior / posterior.sum()
plt.figure(figsize=(8, 10))
plt.subplot(3, 1, 1)
plt.plot(range(10), prior, 'ro')
plt.ylim(0., 1.1)
plt.ylabel("probability")
plt.xlabel("grid cells")
plt.title('Prior')
plt.subplot(3, 1, 2)
plt.plot(range(10), p_measurement, 'go')
plt.ylim(0., 1.1)
plt.ylabel("probability")
plt.xlabel("grid cells")
plt.title('Likelihood')
plt.subplot(3, 1, 3)
plt.plot(range(10), posterior, 'bo')
plt.ylim(0., 1.1)
plt.ylabel("probability")
plt.xlabel("grid cells")
plt.title('Posterior')
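Prediction and measurement update can be combined into one step of a complete discrete Bayes filter (histogram filter). A minimal sketch, reusing `tp` and the likelihood `p_measurement` from above (the function name is ours, not from a library):
def discrete_bayes_step(belief, tp, likelihood):
    # predict: convolve the belief with the transition probabilities
    predicted = np.convolve(belief, tp, mode='same')
    # update: pointwise product with the likelihood, then normalize
    posterior = likelihood * predicted
    return posterior / posterior.sum()

belief = np.zeros(10)
belief[4] = 1.  # the robot starts in cell 4
belief = discrete_bayes_step(belief, tp, p_measurement)
print(belief)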
Here we investigate the 1D Kalman filter, i.e. a one-dimensional state with a linear motion model.
Assumption (linear motion with additive Gaussian noise $\epsilon \sim \mathcal N(0, \sigma_v)$):
$$ h_t = h_{t-1} + v_t + \epsilon $$
$$ p(h_t \mid h_{t-1}, v_t) = \mathcal N(h_{t-1} + v_t, \sigma_v) $$
So the motion is described by a Gaussian $\mathcal N$.
The initial position at $t=1$ changes due to the motion. The initial uncertainty of the position is increased by the uncertainty of the state transition probability.
Mathematically this is described by the convolution of the two Gaussians.
The convolution of two Gaussians is again a Gaussian, with mean $\mu = \mu_1 + \mu_2$ and variance $\sigma^2 = \sigma_1^2 + \sigma_2^2$ (see e.g. [Thr06], [Bro14]):
Function for the prediction:
def predict(mu, sigma, mu1, sigma1):
    # convolution of two Gaussians
    mu_ = mu + mu1
    sigma_ = np.sqrt(sigma**2 + sigma1**2)
    return mu_, sigma_
import scipy.stats
h_space = np.arange(-10.,10.,.05)
v_space = np.arange(-5.,9.,.05)
mu0 = -2.
sigma0 = 1.5
mu_v = 2.5
sigma_v = 1.9
h0_pdf = scipy.stats.norm(loc=mu0, scale=sigma0).pdf(h_space)
v_pdf = scipy.stats.norm(loc=mu_v, scale=sigma_v).pdf(v_space)
mu1,sigma1 = predict(mu0, sigma0, mu_v, sigma_v)
h1_pdf = scipy.stats.norm(loc=mu1, scale=sigma1).pdf(h_space)
plt.figure(figsize=(8, 3))
plt.title(r"traveling distance in time $\Delta t$")
plt.plot(v_space, v_pdf,'g-')
plt.ylabel("probability density")
plt.xlabel("v")
plt.figure(figsize=(8, 3))
plt.plot(h_space, h0_pdf, 'b-', label="Position at $t$")
plt.plot(h_space, h1_pdf, 'r-', label=r"Predicted position at $t+\Delta t$")
plt.ylabel("probability density")
plt.xlabel("h")
plt.legend()
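A quick Monte Carlo sanity check of `predict` (a sketch; sample size and seed are arbitrary): the sum of samples from the two Gaussians should reproduce the predicted mean and standard deviation.
np.random.seed(0)
samples = (np.random.normal(mu0, sigma0, 100000) +
           np.random.normal(mu_v, sigma_v, 100000))
print(samples.mean(), samples.std())  # approximately mu1, sigma1
print(mu1, sigma1)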
Assumption: the measurement model is linear with additive Gaussian noise $\delta \sim \mathcal N(0, \sigma_o)$:
$$ z_t = h_t + \delta $$
or
$$ h_t = z_t - \delta $$
(Note: the pdf of the Gaussian noise is symmetric.)
$$ p(z_t \mid h_t) = \mathcal N(h_t, \sigma_o) $$
which, viewed as a likelihood function of $h_t$, equals $\mathcal N(z_t, \sigma_o)$ evaluated at $h_t$.
For the measurement update we use Bayes' rule with the probability densities:
$$ p(h_t \mid z_t) = \frac{p(z_t \mid h_t)\,p(h_t)}{p(z_t)} $$
The (normalized) product of the two Gaussians is again a Gaussian with
$$ \mu = \frac{\sigma_o^2 \mu' + \sigma'^2 \mu_o}{\sigma'^2 + \sigma_o^2} $$
$$ \sigma^2 = \frac{1}{1/\sigma'^2 + 1/\sigma_o^2} $$
(for the derivation see e.g. [Bro14]). Here $\mu', \sigma'$ are the parameters of the prediction (prior), and $\mu_o = z_t$, $\sigma_o$ those of the measurement.
def measurement(mu, sigma, mu1, sigma1):
    # product of two Gaussians
    mu_ = (sigma1**2 * mu + sigma**2 * mu1) / (sigma**2 + sigma1**2)
    sigma_ = np.sqrt(1. / (1./sigma**2 + 1./sigma1**2))
    return mu_, sigma_
sigma_o = 3.
mu_o = -1.
o_pdf = scipy.stats.norm(loc=mu_o, scale=sigma_o).pdf(h_space)
mu, sigma = measurement(mu1, sigma1, mu_o, sigma_o)
h_pdf = scipy.stats.norm(loc=mu, scale=sigma).pdf(h_space)
plt.figure(figsize=(8,3))
plt.plot(h_space, h1_pdf,'r-', label="Prior")
plt.plot(h_space, o_pdf,'g-', label="Likelihood")
plt.plot(h_space, h_pdf,'b-', label="Posterior")
plt.ylabel("probability density")
plt.xlabel("h")
plt.legend()
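The closed-form posterior can also be checked numerically: the pointwise product of prior and likelihood, normalized on the grid (spacing 0.05 from `h_space`), should match `h_pdf` up to small discretization error.
posterior_num = h1_pdf * o_pdf
posterior_num = posterior_num / (posterior_num.sum() * 0.05)  # normalize on the grid
print(np.allclose(posterior_num, h_pdf, atol=1e-3))  # True (up to numerical error)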
sigma_o = 1.6
sigma_v = .9
observations = np.array([-2, -1.5, -0.4, 1.2, 2.1])
velocity = np.array([1, 1.1, 1.2, 1.2, 1.2])
h_space = np.arange(-6., 10., 0.05)
mu = 0.  # initial position
sigma = 6.  # initial standard deviation of the position
plt.figure(figsize=(10, 12))
h_pdf = scipy.stats.norm(loc=mu, scale=sigma).pdf(h_space)
plt.subplot(6, 1, 1)
plt.plot(h_space, h_pdf, 'b-', label="Position at t=0")
plt.ylim(0., 0.4)
plt.legend()
for i, (o, v) in enumerate(zip(observations, velocity)):
    plt.subplot(6, 1, i+2)
    plt.ylim(0., 0.4)
    mu_, sigma_ = predict(mu, sigma, v, sigma_v)
    h_pdf = scipy.stats.norm(loc=mu_, scale=sigma_).pdf(h_space)
    plt.plot(h_space, h_pdf, 'r-', label="Prediction at t=%i" % (i+1))
    h_pdf = scipy.stats.norm(loc=o, scale=sigma_o).pdf(h_space)
    plt.plot(h_space, h_pdf, 'g-', label="Observation at t=%i" % (i+1))
    mu, sigma = measurement(mu_, sigma_, o, sigma_o)
    h_pdf = scipy.stats.norm(loc=mu, scale=sigma).pdf(h_space)
    plt.plot(h_space, h_pdf, 'b-', label="Position at t=%i" % (i+1))
    plt.legend()
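Equivalently, the measurement update can be written in the usual Kalman gain form with $K = \sigma'^2 / (\sigma'^2 + \sigma_o^2)$, so that $\mu = \mu' + K(z_t - \mu')$ and $\sigma^2 = (1 - K)\,\sigma'^2$. A quick check with the values from the last loop iteration:
K = sigma_**2 / (sigma_**2 + sigma_o**2)   # Kalman gain
mu_gain = mu_ + K * (observations[-1] - mu_)
sigma_gain = np.sqrt((1. - K) * sigma_**2)
print(mu_gain, sigma_gain)  # should match the last (mu, sigma)
print(mu, sigma)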