## Linear Regression with Theano

In [67]:
plt.plot(train_x, train_y, 'b+')
plt.xlabel("x")
plt.ylabel("y")

Out[67]:
<matplotlib.text.Text at 0x10bfd7a10>

#### Input of the computational graph

In [68]:
# theano symbolic variables
x = T.vector("x")
target = T.vector("t")


#### Linear Model

In [69]:
def linear_model(x, p0, p1):
    return p0 + x * p1

# parameters of the model are theano shared variables
param0 = theano.shared(0., name="p0")
param1 = theano.shared(0., name="p1")

prediction = linear_model(x, param0, param1)

In [71]:
from IPython.display import Image
Image(filename=pics_file)

Out[71]:

### Cost function

In [72]:
cost = T.mean(T.sqr(target - prediction))
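
For reference, the mean squared error that this symbolic expression computes can be written directly in NumPy. A minimal sketch (the arrays here are made-up illustration data, not the notebook's `train_x`/`train_y`):

```python
import numpy as np

def mse(target, prediction):
    # same quantity as T.mean(T.sqr(target - prediction))
    return np.mean((target - prediction) ** 2)

t = np.array([1.0, 2.0, 3.0])
pred = np.array([1.0, 2.5, 2.0])
mse(t, pred)  # (0.0 + 0.25 + 1.0) / 3 ≈ 0.4167
```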


Update rule: $$\vec \theta \leftarrow \vec \theta - \alpha \nabla J(\vec \theta)$$

• Parameters as vector $\vec \theta$
• Cost function $J(\vec \theta)$
• Learning rate $\alpha$
• Nabla-operator: $\nabla = (\partial/\partial \theta_1, \partial/\partial \theta_2, \dots)^T$

#### Parameter update during learning

$$\theta_i \leftarrow \theta_i - \alpha \frac{\partial J(\vec \theta)}{\partial \theta_i}$$
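
Theano derives these partial derivatives automatically (next cell), but for the linear model with MSE cost they are easy to write down by hand: with residuals $r = t - (p_0 + p_1 x)$, $\partial J/\partial p_0 = -2\,\mathrm{mean}(r)$ and $\partial J/\partial p_1 = -2\,\mathrm{mean}(r \cdot x)$. A NumPy sketch with a finite-difference sanity check (the data here is made up for illustration):

```python
import numpy as np

def cost(p0, p1, x, t):
    # MSE cost J(p0, p1)
    return np.mean((t - (p0 + p1 * x)) ** 2)

def grad(p0, p1, x, t):
    # analytic gradients of the MSE w.r.t. p0 and p1
    r = t - (p0 + p1 * x)  # residuals
    return -2.0 * r.mean(), -2.0 * (r * x).mean()

x = np.array([0.0, 1.0, 2.0])
t = np.array([1.0, 3.0, 5.0])
g0, g1 = grad(0.5, 0.5, x, t)

# central finite differences should agree with the analytic gradients
eps = 1e-6
fd0 = (cost(0.5 + eps, 0.5, x, t) - cost(0.5 - eps, 0.5, x, t)) / (2 * eps)
fd1 = (cost(0.5, 0.5 + eps, x, t) - cost(0.5, 0.5 - eps, x, t)) / (2 * eps)
```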

#### Calculation of Gradients in Theano

In [73]:
g_param0 = T.grad(cost=cost, wrt=param0)
g_param1 = T.grad(cost=cost, wrt=param1)

In [74]:
# learning rate
alpha = 0.0005

# update rule - a step in gradient descent
updates = [[param0, param0 - alpha * g_param0],
           [param1, param1 - alpha * g_param1]]


#### Training function

In [75]:
train_func = theano.function(inputs=[x, target],
                             outputs=cost,
                             updates=updates,
                             allow_input_downcast=True)


#### Training

In [76]:
nb_epochs = 500
cost_over_epochs = np.ndarray([nb_epochs])
for epoch in range(nb_epochs):
    # full batch: one gradient step per epoch on the whole training set
    c = train_func(train_x, train_y)
    cost_over_epochs[epoch] = c

plt.plot(np.arange(nb_epochs), cost_over_epochs)
plt.xlabel("epochs")
plt.ylabel("cost")

Out[76]:
<matplotlib.text.Text at 0x10b429990>
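
The same training loop can be reproduced without Theano by applying the analytic gradients directly. A NumPy sketch on synthetic data (the true parameters 2.0 and 1.5, the noise level, and the hyperparameters here are assumptions of this example, not the notebook's dataset):

```python
import numpy as np

# synthetic data: y = 2.0 + 1.5 * x plus Gaussian noise (illustration only)
rng = np.random.default_rng(0)
train_x = rng.uniform(-5.0, 5.0, 100)
train_y = 2.0 + 1.5 * train_x + rng.normal(0.0, 0.3, 100)

p0, p1 = 0.0, 0.0  # same initial values as the shared variables
alpha = 0.005      # learning rate
for epoch in range(2000):
    r = train_y - (p0 + p1 * train_x)            # residuals of the current fit
    p0 -= alpha * (-2.0 * r.mean())              # gradient step for the intercept
    p1 -= alpha * (-2.0 * (r * train_x).mean())  # gradient step for the slope

# p0, p1 end up close to the generating values (2.0, 1.5), up to noise
```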
In [77]:
plt.plot(train_x, train_y, '.')
p0 = param0.eval()
p1 = param1.eval()
plt.plot(np.array([-5., 5.]), np.array([p0 - 5. * p1, p0 + 5. * p1]), '-')

Out[77]:
[<matplotlib.lines.Line2D at 0x109cb5b10>]