Feedforward Neural Networks
https://towardsdatascience.com/backpropagation-made-easy-e90a4d5ede55
- The input to the network is an $n$-dimensional vector.
- The network contains $L-1$ hidden layers, each with $n$ neurons.
- Finally, there is one output layer containing $k$ neurons.
- Each neuron in the hidden and output layers can be split into two parts:
    - pre-activation $(a_i)$
    - activation $(h_i)$
- The input layer is called the 0-th layer, and the output layer the $L$-th layer.
- Between layer $i-1$ and layer $i$ $(0 < i < L)$:
    - Weight = $W_i \in \mathbb{R}^{n \times n}$
    - Bias = $b_i \in \mathbb{R}^n$
- Between the last hidden layer and the output layer:
    - Weight = $W_L \in \mathbb{R}^{k \times n}$
    - Bias = $b_L \in \mathbb{R}^k$
- The pre-activation at layer $i$ is given by
$$
a_i = b_i + W_i h_{i-1}
$$
where $h_{i-1}$ is the activation of the previous layer and $h_0 = x$ is the input.
- The activation at layer $i$ is given by
$$
h_i = g(a_i)
$$
where $g$ is called the activation function; it is applied element-wise (common choices are sigmoid and $\tanh$).
- The activation at the output layer is given by
$$
f(x) = h_L = O(a_L)
$$
where $O$ is the output activation function (e.g., softmax for classification).
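
The recurrence above translates directly into code. Below is a minimal NumPy sketch of the forward pass with the same shapes as in the notes; sigmoid for $g$ and softmax for $O$ are illustrative assumptions, since the notes leave both functions generic.

```python
import numpy as np

def sigmoid(a):
    # Element-wise logistic activation; stands in for the generic g.
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    # Output activation O; softmax is an assumption (common for classification).
    e = np.exp(a - np.max(a))
    return e / e.sum()

def init_params(n, k, L, rng):
    # W_i in R^{n x n}, b_i in R^n for the L-1 hidden layers;
    # W_L in R^{k x n}, b_L in R^k for the output layer.
    Ws = [rng.standard_normal((n, n)) * 0.1 for _ in range(L - 1)]
    bs = [np.zeros(n) for _ in range(L - 1)]
    Ws.append(rng.standard_normal((k, n)) * 0.1)
    bs.append(np.zeros(k))
    return Ws, bs

def forward(x, Ws, bs):
    # h_0 = x; then a_i = b_i + W_i h_{i-1} and h_i = g(a_i) per hidden layer.
    h = x
    for W, b in zip(Ws[:-1], bs[:-1]):
        a = b + W @ h
        h = sigmoid(a)
    # Output layer: f(x) = h_L = O(a_L).
    a_L = bs[-1] + Ws[-1] @ h
    return softmax(a_L)

rng = np.random.default_rng(0)
Ws, bs = init_params(n=4, k=3, L=3, rng=rng)  # 2 hidden layers of 4 neurons, 3 outputs
y = forward(rng.standard_normal(4), Ws, bs)
print(y, y.sum())  # k values summing to 1 under the softmax assumption
```

Keeping the weights and biases in parallel per-layer lists mirrors the indexed parameters $W_i$, $b_i$ used above, so each loop iteration computes exactly one layer of the recurrence.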