Activation Function

An activation function is a mathematical function applied to the weighted sum of a neuron's inputs (plus any bias) to produce the neuron's output, or "activation." It introduces non-linearity into the network, enabling the model to learn complex, non-linear relationships and make predictions that a purely linear model could not.

How an Activation Function Works

1. Purpose:

  1. Non-linearity: Without a non-linear activation function, a neural network with many layers would still behave like a single linear layer, because composing linear transformations yields another linear transformation. Non-linearity allows the model to capture complex patterns and relationships in the data.
  2. Thresholding: Activation functions help decide whether, and how strongly, a neuron is activated based on the weighted sum of its inputs.
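
The non-linearity point can be checked numerically. The following is a minimal sketch, assuming two small layers with made-up weights and no biases; the helper names (`matvec`, `matmul`, `relu`) and the values of `W1`, `W2`, and `x` are illustrative and not taken from any particular library.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Multiply matrices A and B (each a list of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def relu(v):
    """Apply ReLU element-wise to a vector."""
    return [max(0.0, vi) for vi in v]


# Hypothetical weights for two layers (biases omitted for brevity).
W1 = [[1.0, -2.0], [0.5, 1.0]]
W2 = [[2.0, 1.0], [-1.0, 3.0]]
x = [1.0, 2.0]

# Two linear layers with no activation in between...
two_layers = matvec(W2, matvec(W1, x))
# ...give the same result as one linear layer with weights W2 * W1.
one_layer = matvec(matmul(W2, W1), x)

# Placing ReLU between the layers breaks that equivalence.
with_relu = matvec(W2, relu(matvec(W1, x)))

print(two_layers, one_layer, with_relu)
```

For these particular values, `two_layers` and `one_layer` are identical, while `with_relu` differs because the negative intermediate value is clipped to zero before the second layer sees it.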

2. Common Activation Functions:

  1. Sigmoid: Outputs values between 0 and 1. It's an S-shaped curve.
  2. Tanh (Hyperbolic Tangent): Outputs values between -1 and 1. It's also an S-shaped curve but centered at zero.
  3. ReLU (Rectified Linear Unit): Outputs the input if it's positive, otherwise outputs zero. It's computationally efficient and has become one of the default choices for hidden layers.
  4. Leaky ReLU: A variant of ReLU that allows a small, non-zero gradient when the input is less than zero, addressing the "dying ReLU" problem, in which a neuron gets stuck outputting zero for every input and stops learning.
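
The four functions listed above can be sketched in a few lines of plain Python. This is an illustrative scalar version, not a library implementation: the function names are chosen here for readability, and the Leaky ReLU slope `alpha=0.01` is an assumed, commonly used value rather than one specified in the text.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, 1 / (1 + e^-x); output lies between 0 and 1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x):
    """Hyperbolic tangent; output lies between -1 and 1, centered at zero."""
    return math.tanh(x)


def relu(x):
    """Return x if it is positive, otherwise zero."""
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    """Like ReLU, but scale negative inputs by a small slope (alpha is assumed)."""
    return x if x > 0 else alpha * x


for value in (-2.0, 0.0, 3.0):
    print(value, sigmoid(value), tanh(value), relu(value), leaky_relu(value))
```

Note how `relu(-2.0)` is exactly zero while `leaky_relu(-2.0)` is a small negative number; that non-zero slope is what keeps a gradient flowing for negative inputs.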
