feed-forward-neural-network-theano slides

Feed Forward Neural Networks

with Theano

Note: The code is for demonstration purposes only. Some Python functions exist only to produce plots for the slides (IPython reveal), so please don't blame me for the code quality.

from IPython.display import Image
Image(filename='./pics/Deeplearning 2.png') 

For each layer:

Input: $\vec x^T = (x_1, x_2, \dots, x_n)$

Output of the j-th neuron: $$ h_j = \sigma(\sum_{i=1}^n w_{ij} x_i + b_j) $$

In matrix form ($W$ is a matrix):

$$ \vec h = \sigma(\vec x \cdot W + \vec b) $$
  • The activation function $\sigma$ is applied element-wise.
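The layer equation can be checked with a few numbers. The following is a minimal sketch in plain Python (not Theano, so it runs without extra dependencies). The helper names `sigmoid` and `layer_forward` and the toy values for `x`, `W` and `b` are made up for illustration and are not part of the original slides.

```python
import math

def sigmoid(z):
    """Logistic activation for a single number."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(x, W, b, activation=sigmoid):
    """Compute h = activation(x . W + b) for one input vector.

    x: list of n inputs
    W: n rows of m weights; W[i][j] connects input i to neuron j
    b: list of m biases
    """
    h = []
    for j in range(len(b)):
        # weighted sum over all inputs plus the bias of neuron j
        pre_activation = sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
        # the activation is applied to each neuron separately (element-wise)
        h.append(activation(pre_activation))
    return h

# hypothetical toy values: 2 inputs, 3 neurons
x = [1.0, -2.0]
W = [[0.5, 0.0, -1.0],
     [0.25, 0.0, 1.0]]
b = [0.0, 1.0, 3.0]
h = layer_forward(x, W, b)
```

Here the first and third neurons receive a weighted sum of exactly 0, so their activity is 0.5, the midpoint of the logistic function.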

Training data: not linearly separable

Feature Transformation

$$ \phi_1(\vec x) = x_1^2 $$

$$ \phi_2(\vec x) = x_2^2 $$

In the new feature space, the data is linearly separable.
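As a rough illustration of why squaring the coordinates helps, the sketch below assumes the two classes lie on concentric circles (radius 1 and radius 2). That layout, the threshold of 2.5 and all helper names are assumptions for this example. In the feature space, the circle $x_1^2 + x_2^2 = r^2$ becomes the straight line $\phi_1 + \phi_2 = r^2$, so a linear rule is enough.

```python
import math
import random

def phi(x):
    """Feature map: (x1, x2) -> (x1^2, x2^2)."""
    return (x[0] ** 2, x[1] ** 2)

def sample_on_circle(radius, n, rng):
    """Draw n points on a circle of the given radius (assumed data layout)."""
    points = []
    for _ in range(n):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points

def linear_classifier(features, threshold=2.5):
    """Linear decision rule in feature space: phi_1 + phi_2 > threshold."""
    return 1 if features[0] + features[1] > threshold else 0

rng = random.Random(0)
inner = sample_on_circle(1.0, 50, rng)  # class 0
outer = sample_on_circle(2.0, 50, rng)  # class 1

predictions_inner = [linear_classifier(phi(p)) for p in inner]
predictions_outer = [linear_classifier(phi(p)) for p in outer]
```

No straight line in the original $(x_1, x_2)$ plane separates the two circles, but the single threshold on $\phi_1 + \phi_2$ classifies every point correctly.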

Learning of the feature transformation by neural networks

  • The activity vector of the hidden layers can be interpreted as a transformation of the input vector.

import theano
import theano.tensor as T

def logistic_function(x):
    return 1./(1. + T.exp(-x))

def relu(x):  # rectified linear unit: 0 for x < 0, else x
    return T.switch(x<0, 0, x)

Feed Forward Neural Network with a hidden layer

Activity vector of the first hidden layer:

$$ \vec h^{(1)} = \sigma_1 \left(\vec x \cdot W^{(1)} + \vec b^{(1)} \right) $$

Activity of the output $\vec o$ (with only one output neuron, $o$ is a scalar):

$$ \vec o = \vec h^{(2)}= \sigma_2 \left( \vec h^{(1)} \cdot W^{(2)} + \vec b^{(2)} \right) $$
# (first) hidden layer
a = T.dot(X, W_h) + b_h

# activation function: rectified linear unit (ReLU)
h = relu(a)

# output neuron:
y = logistic_function(T.dot(h, W_o) + b_o)
fn_predict = theano.function(inputs = [X], outputs = y)

Cost function and L2 regularization

cross_entropy = T.sum(-(T.dot(target, T.log(y)) + T.dot((1.-target), T.log(1.-y))))

l2_reg = T.mean(T.sqr(W_h)) + T.mean(T.sqr(W_o))
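To make the two terms concrete, here is a small plain-Python sketch that evaluates them by hand. The slides do not show how the terms are combined into the final cost, so the weighted sum and the factor `lambda_reg` are assumptions, as are the helper names and the toy numbers.

```python
import math

def cross_entropy(targets, outputs):
    """Summed binary cross-entropy over all examples."""
    return -sum(t * math.log(y) + (1.0 - t) * math.log(1.0 - y)
                for t, y in zip(targets, outputs))

def mean_square(weights):
    """Mean of the squared entries of a weight matrix (nested lists)."""
    flat = [w for row in weights for w in row]
    return sum(w * w for w in flat) / len(flat)

# hypothetical toy values
targets = [1.0, 0.0]
outputs = [0.5, 0.5]           # network is maximally unsure on both examples
W_h = [[1.0, -1.0], [2.0, 0.0]]
W_o = [[3.0], [1.0]]

lambda_reg = 0.1               # assumed regularization strength
ce = cross_entropy(targets, outputs)
l2 = mean_square(W_h) + mean_square(W_o)
cost = ce + lambda_reg * l2    # assumed combination of the two terms
```

With both outputs at 0.5, the cross-entropy is $2 \ln 2$. The L2 term grows with the weight magnitudes, so minimizing the combined cost penalizes large weights.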

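def get_train_functions(cost, v, target, learning_rate=0.01):
    # NOTE: `params` is not an argument; it is assumed to be a list of the
    # shared variables to train that is defined elsewhere in the notebook.
    # symbolic gradient of the cost with respect to each parameter
    gparams = []
    for param in params:
        gparam = T.grad(cost, param)
        gparams.append(gparam)

    # gradient descent step: param <- param - learning_rate * gradient
    updates = []
    for param, gparam in zip(params, gparams):
        updates.append((param, param - gparam * learning_rate))
    learn_fn = theano.function(inputs = [v, target],
                                   outputs = cost,
                                   updates = updates)
    return learn_fn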
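The update rule used here is plain gradient descent: each parameter moves a small step against its gradient. The sketch below shows the same rule without Theano, on a made-up quadratic cost whose gradient is written out by hand. The function names and the toy cost are assumptions for illustration only.

```python
def gradient_descent(grad_fn, params, learning_rate=0.01, n_steps=100):
    """Repeatedly apply param <- param - learning_rate * gradient."""
    params = list(params)
    for _ in range(n_steps):
        grads = grad_fn(params)
        params = [p - learning_rate * g for p, g in zip(params, grads)]
    return params

# hypothetical toy cost with its minimum at (3, -1)
def toy_cost(params):
    return (params[0] - 3.0) ** 2 + (params[1] + 1.0) ** 2

# gradient of the toy cost, derived by hand
def toy_grad(params):
    return [2.0 * (params[0] - 3.0), 2.0 * (params[1] + 1.0)]

start = [0.0, 0.0]
trained = gradient_descent(toy_grad, start, learning_rate=0.1, n_steps=200)
```

In the Theano version, `T.grad` derives the gradients symbolically, and the `updates` list tells the compiled function to apply the step on every call.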

Decision Boundary

Curse of dimensionality

