Feedforward Neural Networks
https://towardsdatascience.com/backpropagation-made-easy-e90a4d5ede55
- The input to the network is an $n$-dimensional vector.
- The network contains $L-1$ hidden layers, each with $n$ neurons.
- Finally, there is one output layer containing $k$ neurons.
- Each neuron in the hidden and output layers can be split into two parts:
    - pre-activation $(a_i)$
    - activation $(h_i)$
- The input layer is called the 0-th layer, and the output layer the $L$-th layer.
- Between layer $i-1$ and layer $i$ $(0 < i < L)$:
    - Weight = $W_i \in \mathbb{R}^{n \times n}$
    - Bias = $b_i \in \mathbb{R}^n$
- Between the last hidden layer and the output layer:
    - Weight = $W_L \in \mathbb{R}^{k \times n}$
    - Bias = $b_L \in \mathbb{R}^k$
- The pre-activation at layer $i$ is given by
$$
a_i = b_i + W_i h_{i-1}
$$
where $h_{i-1}$ is the activation of the previous layer and $h_0 = x$ is the input.
- The activation at layer $i$ is given by
$$
h_i = g(a_i)
$$
where $g$ is called the activation function; it is applied element-wise (common choices are sigmoid and $\tanh$).
- The activation at the output layer is given by
$$
f(x) = h_L = O(a_L)
$$
where $O$ is the output activation function (e.g., softmax for classification).
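
The recurrence above translates directly into code. Below is a minimal NumPy sketch of the forward pass with the same shapes as in the notes; sigmoid for $g$ and softmax for $O$ are illustrative assumptions, since the notes leave both functions generic.

```python
import numpy as np

def sigmoid(a):
    # Element-wise logistic activation; stands in for the generic g.
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    # Output activation O; softmax is an assumption (common for classification).
    e = np.exp(a - np.max(a))
    return e / e.sum()

def init_params(n, k, L, rng):
    # W_i in R^{n x n}, b_i in R^n for the L-1 hidden layers;
    # W_L in R^{k x n}, b_L in R^k for the output layer.
    Ws = [rng.standard_normal((n, n)) * 0.1 for _ in range(L - 1)]
    bs = [np.zeros(n) for _ in range(L - 1)]
    Ws.append(rng.standard_normal((k, n)) * 0.1)
    bs.append(np.zeros(k))
    return Ws, bs

def forward(x, Ws, bs):
    # h_0 = x; then a_i = b_i + W_i h_{i-1} and h_i = g(a_i) per hidden layer.
    h = x
    for W, b in zip(Ws[:-1], bs[:-1]):
        a = b + W @ h
        h = sigmoid(a)
    # Output layer: f(x) = h_L = O(a_L).
    a_L = bs[-1] + Ws[-1] @ h
    return softmax(a_L)

rng = np.random.default_rng(0)
Ws, bs = init_params(n=4, k=3, L=3, rng=rng)  # 2 hidden layers of 4 neurons, 3 outputs
y = forward(rng.standard_normal(4), Ws, bs)
print(y, y.sum())  # k values summing to 1 under the softmax assumption
```

Keeping the weights and biases in parallel per-layer lists mirrors the indexed parameters $W_i$, $b_i$ used above, so each loop iteration computes exactly one layer of the recurrence.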