An activation function is a mathematical function applied to the weighted sum of a neuron's inputs; its result is the neuron's output, or "activation." It introduces non-linearity into the network, enabling the model to learn complex relationships rather than only linear ones.
1. How Activation Functions Work:
- Non-linearity: Without an activation function, a neural network with many layers would still behave like a single linear layer, because composing linear transformations yields another linear transformation. Non-linearity allows the model to capture complex patterns and relationships in the data.
- Thresholding: Activation functions help decide whether a neuron should be activated or not based on the weighted sum of its input.
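The collapse of stacked linear layers can be checked numerically. The following is a minimal sketch in plain Python; the weight matrices `W1` and `W2`, the input `x`, and the helper names are illustrative assumptions, not part of any particular library.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def relu_vec(v):
    """Apply ReLU element-wise to a vector."""
    return [max(0.0, a) for a in v]

# Hypothetical weights for two layers and one input vector.
W1 = [[1.0, -2.0], [0.5, 1.0]]
W2 = [[2.0, 1.0], [-1.0, 3.0]]
x = [1.0, 2.0]

# Two linear layers with no activation in between.
stacked = matvec(W2, matvec(W1, x))

# A single layer whose weights are the product W2 * W1.
collapsed = matvec(matmul(W2, W1), x)

# The same two layers with ReLU applied after the first one.
with_relu = matvec(W2, relu_vec(matvec(W1, x)))

print(stacked)    # [-3.5, 10.5]
print(collapsed)  # [-3.5, 10.5], identical to the stacked result
print(with_relu)  # [2.5, 7.5], no single weight matrix reproduces this for all inputs
```

Without the activation, the two layers are equivalent to one layer with weights `W2 * W1`. Inserting ReLU between them breaks that equivalence, which is what lets deeper networks represent more than a linear map.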
2. Common Activation Functions:
- Sigmoid: Outputs values between 0 and 1. It's an S-shaped curve.
- Tanh (Hyperbolic Tangent): Outputs values between -1 and 1. It's also an S-shaped curve but centered at zero.
- ReLU (Rectified Linear Unit): Outputs the input if it's positive, otherwise outputs zero. It's computationally efficient and has become one of the default choices.
- Leaky ReLU: A variant of ReLU that allows a small, non-zero gradient when the input is less than zero. This addresses the "dying ReLU" problem, where a neuron gets stuck outputting zero and stops learning.
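The four functions above can be sketched in a few lines of plain Python. This is an illustrative scalar version, not a framework implementation; the slope `alpha=0.01` for Leaky ReLU is a commonly used value but is an assumption here, since the text only says "small."

```python
import math

def sigmoid(x):
    """S-shaped curve with outputs strictly between 0 and 1."""
    # Branch on the sign of x so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def tanh(x):
    """S-shaped curve with outputs between -1 and 1, centered at zero."""
    return math.tanh(x)

def relu(x):
    """Return x for positive inputs, zero otherwise."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Like ReLU, but negative inputs are scaled by a small slope alpha."""
    return x if x > 0 else alpha * x

for value in (-2.0, 0.0, 3.0):
    print(value, sigmoid(value), tanh(value), relu(value), leaky_relu(value))
```

For a negative input such as `-2.0`, `relu` returns zero while `leaky_relu` returns a small negative value, so the gradient on that side is `alpha` rather than zero.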