Christian Herta, HTW Berlin
Talk slides and teaching material for deep learning at
http://christianherta.de
Read (high-dimensional) data and transform it into a "higher-level" representation in order to perform tasks / reach goals.
(typical) representation:
a vector (or sequence of vectors)
$$ \vec h = (h_1, h_2, \dots, h_n) $$
(distributed representations)
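For illustration (the values are made up), such a representation is simply a dense numeric vector:

```python
import numpy as np

# a distributed representation: the information is spread over all
# components, rather than sitting in a single "active" entry
h = np.array([0.12, -0.87, 0.33, 0.05])   # h = (h_1, ..., h_n), here n = 4
```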
Affine transformation
$$ \vec z = \hat W \cdot \vec x + \vec b $$
followed by an element-wise application of a non-linear function $\sigma(\cdot)$:
$$ \vec h = \sigma ( \vec z ) $$
e.g., for classification, the representations in the last hidden layer are linearly separable.
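A minimal NumPy sketch of one such layer (the dimensions, the random weights, and the choice of the logistic function for $\sigma$ are illustrative assumptions):

```python
import numpy as np

def sigma(z):
    # element-wise non-linearity; the logistic function is one common choice
    return 1.0 / (1.0 + np.exp(-z))

def hidden_layer(x, W, b):
    z = W @ x + b       # affine transformation  z = W x + b
    return sigma(z)     # element-wise non-linearity  h = sigma(z)

rng = np.random.default_rng(0)
x = rng.normal(size=3)        # input vector x
W = rng.normal(size=(4, 3))   # weight matrix W (4 hidden units)
b = np.zeros(4)               # bias vector b
h = hidden_layer(x, W, b)     # hidden representation h, shape (4,)
```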
e.g. Natural Language Processing:
representations for words (learned from sentences)
The nearest word to $\vec U - \vec G + \vec B$ (Ukraine $-$ Germany $+$ Berlin) is Kiev.
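A toy sketch of this vector arithmetic (the embedding values and the word list are made up for illustration; real word vectors are learned from large text corpora):

```python
import numpy as np

# Hand-crafted toy word vectors in which the capital-of relation is a
# constant offset; learned embeddings show a similar structure.
embeddings = {
    "Germany": np.array([1., 0., 1., 0.]),
    "Berlin":  np.array([1., 0., 0., 1.]),
    "Ukraine": np.array([0., 1., 1., 0.]),
    "Kiev":    np.array([0., 1., 0., 1.]),
    "Moscow":  np.array([0., 0.5, 0., 1.]),  # distractor
}

def nearest_word(query, embeddings, exclude=()):
    # return the word whose vector has the highest cosine similarity
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in embeddings if w not in exclude),
               key=lambda w: cos(query, embeddings[w]))

q = embeddings["Ukraine"] - embeddings["Germany"] + embeddings["Berlin"]
print(nearest_word(q, embeddings, exclude=("Ukraine", "Germany", "Berlin")))
# -> Kiev
```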