Kalman Filter 1D

In [1]:
import daft
In [2]:
from __future__ import division, print_function
In [3]:
%matplotlib inline

#from IPython.core.pylabtools import figsize
from matplotlib import pyplot as plt
import numpy as np

Markov Chain

In [4]:
def plot_markov_chain():
    pgm = daft.PGM([8.3, 1.05], origin=[-2., .3], aspect=1.)
    pgm.add_node(daft.Node("H_1", r"$H_1$", 0., 1.))
    pgm.add_node(daft.Node("H_2", r"$H_2$", 1., 1.))
    pgm.add_node(daft.Node("H_3", r"$H_3$", 2., 1.))
    pgm.add_node(daft.Node("H_4", r"$H_4$", 3., 1.))
   
    
    pgm.add_edge("H_1", "H_2")
    pgm.add_edge("H_2", "H_3")
    pgm.add_edge("H_3", "H_4")
 
    pgm.render()
In [5]:
plot_markov_chain()

The following independence assumption (Markov Assumption) holds for the graph:

$$ P(H_t \mid H_1, \dots ,H_{t-1}) = P(H_t \mid H_{t-1}) $$

In other words, given the state one time step before, the current state is statistically independent of all earlier states; i.e., if the chain starts at time $t=1$:

$$ \forall t \in \mathbb N^{+}, \forall t' \in \mathbb N^{+}: (t-t' \geq 2) \Rightarrow H_t \perp H_{t'} \mid H_{t-1} $$
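As an illustration of the Markov assumption, the following added sketch (not part of the original notebook; the transition matrix T is an arbitrary example) samples a chain state by state, each draw using only the current state:

In [ ]:
# Added sketch: sampling a Markov chain over three states.
# T[j, i] = P(H_t = i | H_{t-1} = j); the values are made up.
T = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.7, 0.2],
              [0.0, 0.3, 0.7]])
np.random.seed(0)
h = 0                                # start state H_1
for t in range(2, 7):
    h = np.random.choice(3, p=T[h])  # next state uses only the current one
    print("H_%i = %i" % (t, h))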

Localization with motion on a grid

$H_t$ is the random variable for the state of a robot, $Val(H_t)$ are the grid indices.

  • e.g. $H_t=3$: the robot is located at time $t$ in grid cell with index 3.

Motion equation:

$$ P(H_t=i) = \sum_j P(H_{t-1}=j)P(H_t=i \mid H_{t-1}=j) $$

  • $P(H_t=i)$: Probability that the robot is located at grid cell index $i$ at time $t$
  • $P(H_t=i \mid H_{t-1}=j)$: transition probability

For localization the (relative) transition probabilities are typically translation invariant, e.g.

$$ P(H_t=2 \mid H_{t-1}=3) = P(H_t=1 \mid H_{t-1}=2) $$

The transition probability depends only on the difference between the states: $H_t - H_{t-1}$.

$$ P(H_t) = \sum_{H_{t-1}} P(H_{t-1})P(H_t - H_{t-1}) $$

So the motion equation is mathematically a convolution.

$$ (f * g)(H_t) = \sum_{H_{t-1}} f(H_{t-1}) g(H_t - H_{t-1}) $$
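As a quick sanity check (an added sketch, not in the original notebook), this sum is exactly what np.convolve computes:

In [ ]:
# Added sanity check: the convolution sum written out by hand,
# compared against np.convolve (default 'full' mode).
f = np.array([0., 0., 1., 0., 0.])   # some belief over H_{t-1}
g = np.array([0.1, 0.7, 0.2])        # kernel over the difference H_t - H_{t-1}
full = [sum(f[j] * g[i - j] for j in range(len(f)) if 0 <= i - j < len(g))
        for i in range(len(f) + len(g) - 1)]
print(np.allclose(full, np.convolve(f, g)))  # True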

So if we know the location of the robot at time $t=1$ and we do not observe any variable (no measurement), our knowledge of the robot's location becomes more "blurry" with time.

Example:
  • 10 cells
  • the robot is located at time 0 in cell with grid index 4
  • motion:
    • go left with probability 10%
    • stay at position: 70%
    • go right: 20%
In [6]:
p = np.zeros([5,10])
p[0, 4] = 1 # the robot is located at time 0 in cell with grid index 4

# transition probabilities indexed by the displacement d = H_t - H_{t-1}:
# [P(d=-1), P(d=0), P(d=+1)] = [go left, stay, go right];
# all other transition probabilities are zero
tp = np.array([0.1, 0.7, 0.2])
np.set_printoptions(precision=3)
for t in range(4):
    p[t+1] = np.convolve(p[t], tp, mode='same')
print(p)
[[0.000e+00 0.000e+00 0.000e+00 0.000e+00 1.000e+00 0.000e+00 0.000e+00
  0.000e+00 0.000e+00 0.000e+00]
 [0.000e+00 0.000e+00 0.000e+00 1.000e-01 7.000e-01 2.000e-01 0.000e+00
  0.000e+00 0.000e+00 0.000e+00]
 [0.000e+00 0.000e+00 1.000e-02 1.400e-01 5.300e-01 2.800e-01 4.000e-02
  0.000e+00 0.000e+00 0.000e+00]
 [0.000e+00 1.000e-03 2.100e-02 1.530e-01 4.270e-01 3.060e-01 8.400e-02
  8.000e-03 0.000e+00 0.000e+00]
 [1.000e-04 2.800e-03 3.020e-02 1.540e-01 3.601e-01 3.080e-01 1.208e-01
  2.240e-02 1.600e-03 0.000e+00]]
In [7]:
plt.figure(figsize=(10,8))
for t in range(4):
    plt.subplot(4, 1, t+1)
    plt.plot(range(10), p[t],'bo')
    plt.ylim(-0.0,1.1)
    plt.ylabel("probability")
_ = plt.xlabel("grid cells")

Note: This also works with a multimodal distribution.
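As a small check of this note (an added sketch reusing the transition kernel tp from above), start with a bimodal belief; both modes get blurred by the same convolution:

In [ ]:
# Added sketch: bimodal start -- the robot is in cell 2 or cell 7
# with equal probability; each mode spreads out under the motion model.
p_multi = np.zeros(10)
p_multi[2] = p_multi[7] = 0.5
for t in range(3):
    p_multi = np.convolve(p_multi, tp, mode='same')
print(p_multi)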

Hidden Markov Models

In [8]:
def plot_HMM():
    pgm = daft.PGM([8.3, 2.05], origin=[-1., -0.3], aspect=1.)
    pgm.add_node(daft.Node("Z_1", r"$Z_1$", 0., 0., observed=True))
    pgm.add_node(daft.Node("Z_2", r"$Z_2$", 1., 0., observed=True))
    pgm.add_node(daft.Node("Z_3", r"$Z_3$", 2., 0., observed=True))
    pgm.add_node(daft.Node("Z_4", r"$Z_4$", 3., 0., observed=True))
    
    pgm.add_node(daft.Node("H_1", r"$H_1$", 0., 1.))
    pgm.add_node(daft.Node("H_2", r"$H_2$", 1., 1.))
    pgm.add_node(daft.Node("H_3", r"$H_3$", 2., 1.))
    pgm.add_node(daft.Node("H_4", r"$H_4$", 3., 1.))
   
    pgm.add_edge("H_1", "H_2")
    pgm.add_edge("H_2", "H_3")
    pgm.add_edge("H_3", "H_4")

    pgm.add_edge("H_1", "Z_1")
    pgm.add_edge("H_2", "Z_2")
    pgm.add_edge("H_3", "Z_3")
    pgm.add_edge("H_4", "Z_4")
  
    pgm.render()
In [9]:
plot_HMM()

The graph $\mathcal G$ encodes the following conditional independence assumptions:

$$ P(Z_t \mid Z_1, \dots, Z_{t-1}, H_1, \dots, H_{t}) = P(Z_t \mid H_t) $$

$$ P(H_t \mid Z_1, \dots, Z_{t-1}, H_1, \dots, H_{t-1}) = P(H_t \mid H_{t-1}) $$

Example: Robot localization with observations

Additionally, we have a sensor which gives us an observation $Z_t$ that indicates where the robot is located.

Bayes' rule can be used to update our belief about the robot's location:

$$ P(H_t=i \mid Z_t) = \frac{P(Z_t \mid H_t=i)P(H_{t}=i)}{P(Z_t)} $$

  • Prior: $P(H_{t}=i)$, computed by the motion equation
  • Likelihood (measurement): $P(Z_t \mid H_t=i)$
  • Normalizer: $P(Z_t) = \sum_i P(Z_t \mid H_t=i) P(H_{t}=i)$
In [10]:
prior = p[4]
In [11]:
plt.figure(figsize=(8,10))
plt.subplot(4, 1, 1)
plt.plot(range(10), prior,'ro')
plt.ylim(-0.0,1.1)
plt.ylabel("probability")
plt.xlabel("grid cells")
plt.title('Prior')
Out[11]:
Text(0.5,1,'Prior')
In [12]:
plt.figure(figsize=(8,10))
p_measurement = np.array([0., 0., 0., 0.05, 0.18, 0.54, 0.18, 0.05, 0., 0.])
plt.subplot(4, 1, 2)
plt.plot(range(10), p_measurement,'go')
plt.ylim(-0.0,1.1)
plt.ylabel("probability")
plt.xlabel("grid cells")
plt.title('Likelihood')
Out[12]:
Text(0.5,1,'Likelihood')
In [13]:
plt.figure(figsize=(8,10))
posterior = p_measurement * prior
posterior = posterior / posterior.sum()
plt.subplot(4, 1, 3)
plt.plot(range(10), posterior,'bo')
plt.ylim(-0.0,1.1)
plt.ylabel("probability")
plt.xlabel("grid cells")
plt.title('Posterior')
Out[13]:
Text(0.5,1,'Posterior')
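Putting motion and measurement together yields one cycle of the discrete Bayes (histogram) filter. The following added sketch (the helper name bayes_filter_step is ours, not from the original notebook) reproduces the posterior computed above, starting from p[3]:

In [ ]:
# Added sketch of one full discrete Bayes filter cycle:
# predict by convolution, then update by pointwise Bayes rule.
def bayes_filter_step(belief, tp, likelihood):
    belief = np.convolve(belief, tp, mode='same')  # motion (predict)
    belief = belief * likelihood                   # measurement (update)
    return belief / belief.sum()                   # normalize

print(bayes_filter_step(p[3], tp, p_measurement))  # equals the posterior above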

1D Continuous State Space

Motion and Prediction

Assumption:

  • The position is described by a Gaussian.

Limitation due to the Gaussian: the belief is unimodal.

A Kalman filter is an instance of a Bayes filter: the information is represented by a probability density function.

Here we investigate the 1D Kalman filter, i.e. a linear motion model in a one-dimensional state space.

Assumption:

  • the state transition probability $p(h_t \mid h_{t-1}, v_t)$ is linear with additive Gaussian noise (called linear Gaussian); in 1D:

$$ h_t = h_{t-1} + v_t + \epsilon $$

  • $v_t$ is the traveling distance in time $\Delta t$ (one time step).
  • $\epsilon$ is zero-mean Gaussian noise with standard deviation $\sigma_v$ (the noise of $v_t$ in $\Delta t$).

$$ p(h_t \mid h_{t-1}, v_t) = \mathcal N(h_{t-1} + v_t, \sigma_v) $$

So the motion is described by a Gaussian $\mathcal N$.

The initial position at $t=1$ changes due to the motion. The initial uncertainty of the position is increased by the uncertainty of the state transition probability.
Mathematically, this is described by the convolution of the two Gaussians.

The convolution of two Gaussians is again a Gaussian (see e.g. [Thr06], [Bro14]) with:

  • $\mu' = \mu + \mu_v$
  • $\sigma'^2 = \sigma^2 + \sigma^2_v$

Function for the prediction:

In [14]:
def predict(mu, sigma, mu1, sigma1):
    # convolution of two Gaussians: the means add, the variances add
    mu_ = mu + mu1
    sigma_ = np.sqrt(sigma**2 + sigma1**2)
    return mu_, sigma_
In [15]:
import scipy.stats  
In [16]:
h_space = np.arange(-10.,10.,.05)
v_space = np.arange(-5.,9.,.05)
mu0 = -2.
sigma0 = 1.5
mu_v = 2.5
sigma_v = 1.9
In [17]:
h0_pdf = scipy.stats.norm(loc=mu0, scale=sigma0).pdf(h_space)
v_pdf = scipy.stats.norm(loc=mu_v, scale=sigma_v).pdf(v_space)
mu1,sigma1 = predict(mu0, sigma0, mu_v, sigma_v)
h1_pdf = scipy.stats.norm(loc=mu1, scale=sigma1).pdf(h_space)
In [18]:
plt.figure(figsize= (8,3))
plt.title("traveling distance in time $\Delta t$")
plt.plot(v_space, v_pdf,'g-')
plt.ylabel("probability density")
plt.xlabel("v")
Out[18]:
Text(0.5,0,'v')
In [19]:
plt.figure(figsize=(8,3))
plt.plot(h_space, h0_pdf,'b-', label="Position at $t$")
plt.plot(h_space, h1_pdf,'r-', label=r"Predicted Position at $t+\Delta t$")
plt.ylabel("probability density")
plt.xlabel("h")
plt.legend()
Out[19]:
<matplotlib.legend.Legend at 0x117091438>

Measurement

Assumption: the measurement model is linear with additive Gaussian noise

$$ z_t = h_t + \delta $$ with:

  • $\delta$: zero-mean Gaussian noise of the measurement

or $$ h_t = z_t - \delta $$

(Note: the pdf of the Gaussian noise is symmetric.)

$$ p(z_t \mid h_t) = \mathcal N(z_t, \sigma_o) $$

For the measurement we use Bayes rule with the probability densities

$$ p(h_t \mid z_t) = \frac{p(z_t \mid h_t)p(h_t)}{p(z_t)} $$

with

  • $h \in \mathbb R$
  • The prior is the predicted position after a motion step: $p(H_{t}=h)=\mathcal N(\mu',\sigma')$
  • Measurement Likelihood: $p(z_t \mid h_t) = \mathcal N(\mu_o,\sigma_o)$
    • The measurement locates the robot at position $z_t = \mu_o$.
    • $\sigma_o$ is the standard deviation of the measurement (e.g. sensor uncertainty).
  • Normalizer $p(z_t) = \int_{\mathcal H} p(z_t \mid H_t=h) p(H_{t}=h) dh$
Product of two Gaussians

$$ \mu = \frac{\sigma_o^2 \mu' +\sigma'^2\mu_o}{\sigma'^2+\sigma_o^2} $$

$$ \sigma^2 = \frac{1}{1/\sigma'^2 + 1/\sigma_o^2} $$

(for the derivation see e.g. [Bro14])
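Equivalently, the same update can be written in the familiar Kalman-gain form (a standard reformulation, added here; it follows from the product formulas above by simple algebra):

$$ K = \frac{\sigma'^2}{\sigma'^2 + \sigma_o^2} $$

$$ \mu = \mu' + K(\mu_o - \mu'), \qquad \sigma^2 = (1 - K)\,\sigma'^2 $$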

In [27]:
def measurement(mu, sigma, mu1, sigma1):
    # product of two Gaussians (Bayes update)
    mu_ = (sigma1**2 * mu + sigma**2 * mu1) / (sigma**2 + sigma1**2)
    sigma_ = np.sqrt(1. / (1./sigma**2 + 1./sigma1**2))
    return mu_, sigma_
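For comparison, here is the same update in gain form (the helper name measurement_gain_form is ours, an added sketch); it agrees with measurement up to floating point rounding:

In [ ]:
def measurement_gain_form(mu, sigma, mu1, sigma1):
    # added sketch: Kalman-gain form of the product-of-Gaussians update
    K = sigma**2 / (sigma**2 + sigma1**2)  # Kalman gain
    mu_ = mu + K * (mu1 - mu)
    sigma_ = np.sqrt((1. - K) * sigma**2)
    return mu_, sigma_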
In [21]:
sigma_o = 3.
mu_o = -1.
In [22]:
o_pdf = scipy.stats.norm(loc=mu_o, scale=sigma_o).pdf(h_space)
mu, sigma = measurement(mu1, sigma1, mu_o, sigma_o)
h_pdf = scipy.stats.norm(loc=mu, scale=sigma).pdf(h_space)
In [23]:
plt.figure(figsize=(8,3))
plt.plot(h_space, h1_pdf,'r-', label="Prior")
plt.plot(h_space, o_pdf,'g-', label="Likelihood")
plt.plot(h_space, h_pdf,'b-', label="Posterior")
plt.ylabel("probability density")
plt.xlabel("h")
plt.legend()
Out[23]:
<matplotlib.legend.Legend at 0x1171b9da0>

Example

In [24]:
sigma_o = 1.6  
sigma_v = .9  
In [25]:
observations = np.array([-2, -1.5, -0.4, 1.2, 2.1])
velocity = np.array([1, 1.1, 1.2, 1.2, 1.2])
h_space = np.arange(-6., 10., 0.05)
In [26]:
mu = 0. # initial position
sigma = 6. # initial standard deviation of the position
plt.figure(figsize=(10,12))

h_pdf = scipy.stats.norm(loc=mu, scale=sigma).pdf(h_space)
plt.subplot(6, 1, 1)
plt.plot(h_space, h_pdf,'b-', label="Position at t=0")
plt.ylim(0.,0.4)
plt.legend()
for i,(o, v) in enumerate(zip(observations, velocity)):
    plt.subplot(6, 1, i+2)  
    plt.ylim(0.,0.4)
    mu_, sigma_ = predict(mu, sigma, v, sigma_v) 
    h_pdf = scipy.stats.norm(loc=mu_, scale=sigma_).pdf(h_space)
    plt.plot(h_space, h_pdf,'r-', label="Prediction at t=%i"%(i+1))
    
    h_pdf = scipy.stats.norm(loc=o, scale=sigma_o).pdf(h_space)
    plt.plot(h_space, h_pdf,'g-', label="Observation at t=%i"%(i+1))
    
    mu, sigma = measurement(mu_, sigma_, o, sigma_o)
    h_pdf = scipy.stats.norm(loc=mu, scale=sigma).pdf(h_space)
    plt.plot(h_space, h_pdf,'b-', label="Position at t=%i"%(i+1))
    plt.legend()
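A closing observation (added note, standard scalar Kalman filter algebra, not from the original notebook): in the plots the position uncertainty first shrinks and then stabilizes. Iterating the predict and update rules on the variance gives the fixed-point equation

$$ \sigma_\ast^2 = \frac{(\sigma_\ast^2 + \sigma_v^2)\,\sigma_o^2}{\sigma_\ast^2 + \sigma_v^2 + \sigma_o^2} $$

which can be checked numerically for the values above:

In [ ]:
# Added sketch: iterate the variance recursion to its fixed point.
s2 = 6.**2  # initial variance of the position
for _ in range(50):
    s2 = (s2 + sigma_v**2) * sigma_o**2 / (s2 + sigma_v**2 + sigma_o**2)
print(np.sqrt(s2))  # approx. 1.04 for sigma_v=0.9, sigma_o=1.6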


Literature and Links: