The possible outcomes $a_i$ of a random experiment determine the value of a random variable $x \in \Omega_{\mathcal X} = \{ a_1, a_2, \dots, a_l\}$
Examples: the toss of a coin ($\Omega_{\mathcal X} = \{\text{heads}, \text{tails}\}$), the throw of a die ($\Omega_{\mathcal X} = \{1, \dots, 6\}$), or the draw of a letter from the alphabet.
Every outcome $a_i$ has a probability $P(x = a_i)$ (short: $p_i$)
with $$ p_i \geq 0 \quad \text{and} \quad \sum_{i=1}^{l} p_i = 1 $$
The probability of a subset $T$ of $\Omega_{\mathcal X}$ is given by the sum of the probabilities of all elements in the subset:
$$ P(T) = \sum_{a_i \in T} p_i $$
Example:
Probability of a vowel: $P(V) = \sum_{a_i \in V} p_i$ with $V = \{a, e, i, o, u\}$
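A minimal Python sketch of this summation (the four-letter alphabet and its probabilities are assumed toy values, not from the text):

```python
# Toy alphabet with assumed letter probabilities p_i (they sum to 1).
p = {"a": 0.4, "b": 0.3, "c": 0.2, "e": 0.1}
vowels = {"a", "e", "i", "o", "u"}

# P(V) = sum of p_i over all letters a_i in the subset V
P_vowel = sum(p_i for a_i, p_i in p.items() if a_i in vowels)
print(P_vowel)  # 0.4 + 0.1 = 0.5
```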
The outcome of a random experiment can also be an ordered pair (or, in general, a tuple) of random variables $x_1, x_2$.
$P(x_1, x_2)$ is the joint probability of $x_1$ and $x_2$.
The random variables $x_1$ and $x_2$ need not be independent (see below).
Example:
Throw of a die with $x \in \{\text{even}, \text{odd}\}$ and $y \in \{\text{prime}, \text{notPrime}\}$
The specification of the probabilities of all possible states (outcomes) determines a probability distribution.
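The joint distribution of the die example can be built by enumerating the elementary outcomes; a minimal Python sketch (state names chosen for illustration):

```python
from fractions import Fraction

# Fair die: each outcome 1..6 has probability 1/6.
primes = {2, 3, 5}

# Build the joint distribution P(x, y) for
# x in {even, odd} and y in {prime, notPrime}.
joint = {}
for k in range(1, 7):
    x = "even" if k % 2 == 0 else "odd"
    y = "prime" if k in primes else "notPrime"
    joint[(x, y)] = joint.get((x, y), Fraction(0)) + Fraction(1, 6)

print(joint)
# (odd, notPrime): 1/6, (even, prime): 1/6,
# (odd, prime): 1/3, (even, notPrime): 1/3
```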
The marginal probability $P(x)$ can be obtained from the joint probabilities by summation:
$$ P(x_1 = a_i) \equiv \sum_{x_2 \in \Omega_{\mathcal X_2}} P(x_1 = a_i,x_2) . $$
or analogously for the marginal distribution $P(x_2)$ (in shorter notation):
$$ P(x_2) \equiv \sum_{x_1 \in \Omega_{\mathcal X_1}} P(x_1, x_2) $$
Example:
For the throw of the die (see above): $P(prime) = P(even, prime) + P(odd, prime) = 1/6 + 2/6 = 1/2$
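The same marginalization in code, reusing the joint table of the die example:

```python
from fractions import Fraction

# Joint distribution of the die example (see above).
joint = {("even", "prime"): Fraction(1, 6), ("even", "notPrime"): Fraction(1, 3),
         ("odd",  "prime"): Fraction(1, 3), ("odd",  "notPrime"): Fraction(1, 6)}

# Marginal P(y = prime) = sum over x of P(x, prime)
P_prime = sum(p for (x, y), p in joint.items() if y == "prime")
print(P_prime)  # 1/2
```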
The conditional probability is defined by: $$ P(x_1 = a_i \mid x_2 = b_j) \equiv \frac{P(x_1 = a_i, x_2 = b_j)}{P(x_2 = b_j)} $$
Example: For the throw of the die (see above) $$ P(prime \mid even) = \frac{P(even, prime)}{P(even)} = \frac{1/6}{1/6 + 2/6} = 1/3 $$
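The same computation in code:

```python
from fractions import Fraction

joint = {("even", "prime"): Fraction(1, 6), ("even", "notPrime"): Fraction(1, 3),
         ("odd",  "prime"): Fraction(1, 3), ("odd",  "notPrime"): Fraction(1, 6)}

# P(prime | even) = P(even, prime) / P(even)
P_even = sum(p for (x, y), p in joint.items() if x == "even")
print(joint[("even", "prime")] / P_even)  # 1/3
```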
Product rule: $$ P(x_1,x_2) = P(x_1 \mid x_2 ) P( x_2) = P(x_2 \mid x_1) P( x_1) $$
Sum rule: $$ P(x_1) = \sum_{x_2} P(x_1,x_2) = \sum_{x_2} P(x_1 \mid x_2) P(x_2) $$
From the product rule follows Bayes' rule: $$ P(x_2 \mid x_1) = \frac{ P(x_1 \mid x_2 ) P( x_2 ) } { P(x_1 ) } = \frac{ P(x_1 \mid x_2 ) P( x_2 ) } { \sum_{x_2'} P(x_1 \mid x_2' ) P( x_2') } $$
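A numeric check of Bayes' rule on the die example, computing $P(even \mid prime)$ from $P(prime \mid even)$:

```python
from fractions import Fraction

joint = {("even", "prime"): Fraction(1, 6), ("even", "notPrime"): Fraction(1, 3),
         ("odd",  "prime"): Fraction(1, 3), ("odd",  "notPrime"): Fraction(1, 6)}

P_even  = sum(p for (x, y), p in joint.items() if x == "even")
P_prime = sum(p for (x, y), p in joint.items() if y == "prime")
P_prime_given_even = joint[("even", "prime")] / P_even

# Bayes' rule: P(even | prime) = P(prime | even) P(even) / P(prime)
print(P_prime_given_even * P_even / P_prime)  # 1/3
print(joint[("even", "prime")] / P_prime)     # 1/3, directly from the definition
```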
Two random variables are statistically independent if and only if (iff): $$ P(x_1, x_2) = P(x_1) P(x_2) $$
or, equivalently, $$ P(x_2 \mid x_1) = P(x_2) $$
Notation: $x_1 \perp x_2$
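For the die example this product test fails, i.e. the two variables are not independent:

```python
from fractions import Fraction

joint = {("even", "prime"): Fraction(1, 6), ("even", "notPrime"): Fraction(1, 3),
         ("odd",  "prime"): Fraction(1, 3), ("odd",  "notPrime"): Fraction(1, 6)}

# Check P(x, y) == P(x) * P(y) for every state:
for (x, y), p in joint.items():
    P_x = sum(q for (x2, _), q in joint.items() if x2 == x)
    P_y = sum(q for (_, y2), q in joint.items() if y2 == y)
    print(x, y, p == P_x * P_y)  # all False: e.g. 1/6 != 1/2 * 1/2
```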
The joint probability of $n$ random variables $x_1, x_2, \dots x_n$ can be decomposed with the chain rule:
$$ p(x_1, x_2, \dots x_n) = p(x_1) p(x_2 \mid x_1) p(x_3 \mid x_1, x_2)\dots p(x_n \mid x_1, x_2, \dots x_{n-1}) $$
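A numeric check of the chain rule on a randomly generated (assumed toy) joint distribution over three binary variables:

```python
import itertools
import random

random.seed(0)

# Arbitrary joint distribution over three binary variables.
states = list(itertools.product([0, 1], repeat=3))
weights = [random.random() for _ in states]
joint = {s: w / sum(weights) for s, w in zip(states, weights)}

def marg(fixed):
    """Probability that the variables in `fixed` take the given values."""
    return sum(p for s, p in joint.items()
               if all(s[i] == v for i, v in fixed.items()))

# p(x1, x2, x3) = p(x1) * p(x2 | x1) * p(x3 | x1, x2)
for (a, b, c), p in joint.items():
    chain = (marg({0: a})
             * marg({0: a, 1: b}) / marg({0: a})
             * marg({0: a, 1: b, 2: c}) / marg({0: a, 1: b}))
    assert abs(p - chain) < 1e-12
print("chain rule verified")
```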
Two random variables $x_1, x_2$ are conditionally independent given a third variable $x_3$ iff:
$$ P(x_1, x_2 \mid x_3) = P(x_1 \mid x_3) P(x_2 \mid x_3) $$
Notation: $x_1 \perp x_2 \mid x_3$
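A sketch verifying the definition on a joint that is conditionally independent by construction (all numbers are assumed toy values):

```python
import itertools

p3   = {0: 0.3, 1: 0.7}                              # p(x3)
p1_3 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}    # p(x1 | x3)
p2_3 = {0: {0: 0.2, 1: 0.8}, 1: {0: 0.5, 1: 0.5}}    # p(x2 | x3)

# Joint built as p(x3) p(x1|x3) p(x2|x3), so x1 is independent
# of x2 given x3 by construction.
joint = {(a, b, c): p3[c] * p1_3[c][a] * p2_3[c][b]
         for a, b, c in itertools.product([0, 1], repeat=3)}

# Verify P(x1, x2 | x3) == P(x1 | x3) * P(x2 | x3) for every state:
for (a, b, c), p in joint.items():
    assert abs(p / p3[c] - p1_3[c][a] * p2_3[c][b]) < 1e-12
print("x1 and x2 are conditionally independent given x3")
```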
For discrete variables the expectation value of a function $f(x)$ is:
$$ \mathbb{E}_{\mathcal{X}}[f(x)] = \sum_x f(x) p(x) $$
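For example, for a fair die with $f(x) = x^2$:

```python
from fractions import Fraction

# E[f(x)] for a fair die with f(x) = x**2:
p = {k: Fraction(1, 6) for k in range(1, 7)}
print(sum(k**2 * p_k for k, p_k in p.items()))  # 91/6
```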
For continuous variables $p(x)$ is a probability density function (pdf): $$ p(x) \geq 0, \int_{-\infty}^\infty p(x) dx = 1 $$
The probability that the value of $x$ lies in the interval $[a,b]$ is:
$$ P(a \leq x \leq b) = \int_a^b p(x) dx $$
Marginalization works analogously to the discrete case: $$ p(x_1) = \int_{\mathcal X_2} p(x_1,x_2) \, dx_2 $$
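A numeric sketch with the standard normal density as an assumed example pdf (integrals approximated with np.trapz on a fine grid):

```python
import numpy as np

# Standard normal pdf as a concrete example of a continuous p(x):
def p(x):
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

x = np.linspace(-10.0, 10.0, 100_001)
print(np.trapz(p(x), x))               # ~1.0 (normalization)

# P(a <= x <= b) = integral of p(x) over [a, b]:
mask = (x >= -1.0) & (x <= 1.0)
print(np.trapz(p(x[mask]), x[mask]))   # ~0.6827
```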
The expectation value of a function $f(x)$ is then:
$$ \mathbb{E}_{\mathcal{X}}[f(x)] = \int_{-\infty}^\infty f(x) p(x) \, dx = \int_\mathcal{X} f(x) \, dp(x) $$
The expectation value of a function $f(x_1, x_2)$ is: $$ \mathbb{E}_{\mathcal{X}_1,\mathcal{X}_2}[f(x_1,x_2)] = \int_{\mathcal{X}_1} \int_{\mathcal{X}_2} f(x_1,x_2) p(x_1,x_2) \, dx_1 dx_2 = \int_{\mathcal{X}_1\times\mathcal{X}_2} f(x_1,x_2) \, dp(x_1,x_2) $$
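The expectation value can be approximated the same way, e.g. $\mathbb{E}[x^2] = 1$ for the standard normal pdf from the sketch above:

```python
import numpy as np

def p(x):  # standard normal pdf, as above
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

# E[f(x)] = integral of f(x) p(x) dx, here with f(x) = x**2:
x = np.linspace(-10.0, 10.0, 100_001)
print(np.trapz(x**2 * p(x), x))  # ~1.0
```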