
## Deep Learning

#### an overview

Christian Herta, HTW Berlin

Talk slides and teaching material for deep learning at
http://christianherta.de

### "Traditional" approach for image and speech

Engineering of features.

### Deep learning is feature learning

##### Learning of representations

Depth is the number of transformation steps.

### "Machine Perception"¶

Read (high-dimensional) data and transform it into a "higher" representation to perform tasks / reach goals.

##### High-dimensional data
• Images / Videos
• Sound (Voice / Music)
• Natural Language
• Time Series

### What kinds of representations?

A (typical) representation is a vector (or a sequence of vectors):

$$\vec h = (h_1, h_2, \dots, h_n)$$

(a distributed representation)

### (Simple) Feed Forward Neural Network


• Transforming the input vector through many layers.
• The hidden state of each layer corresponds to a representation of the input.

#### Layer of a simple feed forward neural network

Affine transformation

$$\vec z = \hat W \cdot \vec x + \vec b$$

followed by an element-wise application of a non-linear function $\sigma(\cdot)$:

$$\vec h = \sigma ( \vec z )$$
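
A minimal sketch of such a layer in plain NumPy (all names and sizes are illustrative):

```python
import numpy as np

def sigmoid(z):
    # element-wise non-linearity: sigma(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def layer(x, W, b):
    # affine transformation z = W x + b, followed by element-wise sigma
    z = W @ x + b
    return sigmoid(z)

# toy example: map a 4-dimensional input to a 3-dimensional representation
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W = rng.normal(size=(3, 4))
b = np.zeros(3)
h = layer(x, W, b)  # h = sigma(W x + b)
```

Stacking such layers yields the deep network: the output $\vec h$ of one layer is the input $\vec x$ of the next.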
##### Prediction is easy if we have good representations

• e.g. for classification: the representations of the last hidden layer are linearly separable

### Word Embeddings

representations for words (learned from sentences)


• "low" dimensional space ($\sim 10^2$)
• Syntactic and semantic information is encoded in the space (directions)
• With simple vector arithmetics we can answer questions like
• Man is is related to Woman like King to ?
• Germany ($\vec G$) is related to Berlin ($\vec B$) like Ukraine ($\vec U$) to ?

The word nearest to $$\vec U - \vec G + \vec B$$ is *Kiev*.
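
Such analogy queries can be run directly on pretrained vectors; a sketch using the gensim library (the vector file is a placeholder, e.g. pretrained word2vec vectors):

```python
from gensim.models import KeyedVectors

# load pretrained word vectors (placeholder path)
kv = KeyedVectors.load_word2vec_format("word2vec_vectors.bin", binary=True)

# "Germany is related to Berlin as Ukraine is to ?"
# i.e. the word nearest to  vec(Ukraine) - vec(Germany) + vec(Berlin)
result = kv.most_similar(positive=["Ukraine", "Berlin"], negative=["Germany"], topn=1)
print(result)  # expected: something close to "Kiev"
```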

#### Feature Transformations

• learning representations through many layers

### Convolutional Neural Networks

The typical neural network architecture for image and video processing.
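
The core building block is a learned convolution. A minimal sketch of a single 2-D convolution (cross-correlation) in plain NumPy, with illustrative names:

```python
import numpy as np

def conv2d(image, kernel):
    # "valid" cross-correlation of a 2-D image with a small kernel,
    # the basic operation inside a convolutional layer
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# toy example: a 3x3 vertical-edge filter applied to a 5x5 image
image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1.0, 0.0, -1.0]] * 3)
feature_map = conv2d(image, kernel)  # shape (3, 3)
```

In a convolutional layer the kernel values are learned, and many such kernels are applied in parallel to produce a stack of feature maps.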

### Image Recognition

• Classification of Images
• ImageNet Dataset

#### Recurrent Neural Networks

• for sequence data
• have an internal state which acts like a memory (sketched below)
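
A minimal sketch of the recurrent state update that implements this memory (plain NumPy; names and sizes are illustrative):

```python
import numpy as np

def rnn_step(h_prev, x, W_h, W_x, b):
    # the internal state h acts as a memory of the sequence seen so far:
    # h_t = tanh(W_h h_{t-1} + W_x x_t + b)
    return np.tanh(W_h @ h_prev + W_x @ x + b)

# toy example: a 3-dimensional state over a sequence of 4-dimensional inputs
rng = np.random.default_rng(0)
W_h = rng.normal(size=(3, 3))
W_x = rng.normal(size=(3, 4))
b = np.zeros(3)

h = np.zeros(3)                      # initial state
for x in rng.normal(size=(5, 4)):    # sequence of 5 input vectors
    h = rnn_step(h, x, W_h, W_x, b)  # h summarizes the inputs seen so far
```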

e.g. Natural Language Processing:

• RNN Language Models
  • represent sentences
  • can generate (new) unseen sentences

#### Language Model (RNN unrolled in time)

### Generative Models

Examples:
• Variational Autoencoder
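
For example, the variational autoencoder learns an encoder $q_\phi(\vec z \mid \vec x)$ and a decoder $p_\theta(\vec x \mid \vec z)$ by maximizing the evidence lower bound (ELBO):

$$\mathcal{L}(\theta, \phi; \vec x) = \mathbb{E}_{q_\phi(\vec z \mid \vec x)}\left[ \log p_\theta(\vec x \mid \vec z) \right] - D_{\mathrm{KL}}\left( q_\phi(\vec z \mid \vec x) \,\|\, p(\vec z) \right)$$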

## Encoder-Decoder Models

#### Neural Machine Translation

(image from http://opennmt.net/)

#### Image Captions


• Encoder: transforms the image into a vector representation.
• Decoder: a language-model RNN transforms the vector representation into a sentence (sketched below).
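
A minimal structural sketch of such an encoder-decoder model in PyTorch (all module choices and sizes are illustrative assumptions, not the architecture from the slides):

```python
import torch
import torch.nn as nn

class CaptionModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        # encoder: a small CNN that maps an image to a single vector
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, hidden_dim),
        )
        # decoder: an RNN language model conditioned on the image vector
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        h0 = self.encoder(images).unsqueeze(0)  # image vector as initial RNN state
        x = self.embed(captions)                # embed the (shifted) caption words
        y, _ = self.rnn(x, h0)
        return self.out(y)                      # next-word logits per position

# toy forward pass: 2 images, captions of length 7, vocabulary of 1000 words
model = CaptionModel(vocab_size=1000)
logits = model(torch.randn(2, 3, 64, 64), torch.randint(0, 1000, (2, 7)))
```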

#### Image-to-Image Translation

by conditional generative models