Gradient Descent is a first-order iterative optimization algorithm used in machine learning and computational mathematics to find the minimum of a function. In Machine Learning and Data Science, it's commonly used to optimize loss functions, adjusting parameters to minimize the error of a predictive model.
How Gradient Descent works
Gradient Descent starts with an initial set of parameter values and iteratively moves toward a set of parameter values that minimize the function. This iterative minimization is achieved by taking steps in the negative direction of the function gradient.
Below is a simplified version of how Gradient Descent works:
1. Initialize random weights for your input features and set a learning rate.
2. Calculate the gradient of the loss function. The gradient is a vector that points in the direction in which the function increases the most; in the case of gradient descent, we move in the opposite direction of the gradient because we want to minimize the loss.
3. Update the weights by subtracting the product of the gradient and the learning rate from the current weights. We subtract because we want to move in the opposite direction of the gradient.
4. Repeat steps 2 and 3 until the gradient is close to zero (you've found a minimum) or you've reached a maximum number of iterations.
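The steps above can be sketched in a few lines of code. This is a minimal illustration, not a production optimizer; the function and parameter names are chosen here for clarity:

```python
import numpy as np

def gradient_descent(grad_fn, init_w, learning_rate=0.1, max_iters=1000, tol=1e-6):
    """Minimize a function given its gradient, following the steps above."""
    w = np.asarray(init_w, dtype=float)   # step 1: initial weights
    for _ in range(max_iters):
        grad = grad_fn(w)                 # step 2: gradient of the loss
        w = w - learning_rate * grad      # step 3: step against the gradient
        if np.linalg.norm(grad) < tol:    # step 4: stop when the gradient is near zero
            break
    return w

# Example: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_min = gradient_descent(lambda w: 2 * (w - 3), init_w=[0.0])
print(w_min)  # approaches 3.0
```

With a well-chosen learning rate, each update shrinks the distance to the minimizer; too large a rate can overshoot and diverge, too small a rate converges slowly.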
There are different types of Gradient Descent algorithms based on how much data we use to compute the gradient of the objective function. These include Batch Gradient Descent, Mini-Batch Gradient Descent, and Stochastic Gradient Descent.
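The three variants differ only in how much of the dataset feeds each gradient computation. A minimal sketch on toy linear-regression data (all names and sizes here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # 100 samples, 3 features (toy data)
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

def mse_gradient(w, X_batch, y_batch):
    """Gradient of mean squared error for a linear model on the given batch."""
    errors = X_batch @ w - y_batch
    return 2 * X_batch.T @ errors / len(y_batch)

w = np.zeros(3)
# Batch Gradient Descent: the full dataset per update.
g_batch = mse_gradient(w, X, y)
# Mini-Batch Gradient Descent: a random subset (here 16 samples) per update.
idx = rng.choice(len(y), size=16, replace=False)
g_mini = mse_gradient(w, X[idx], y[idx])
# Stochastic Gradient Descent: a single random sample per update.
i = rng.integers(len(y))
g_sgd = mse_gradient(w, X[i:i + 1], y[i:i + 1])
```

Batch gradients are exact but expensive per step; stochastic gradients are cheap but noisy; mini-batches trade off between the two and are the common default in practice.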