A hyperparameter is a configuration variable that is external to the model and whose value cannot be estimated from the data. It is set before the learning process begins. Hyperparameters define properties of the model such as its complexity, its capacity to learn, or how fast it should learn.
How hyperparameters work
Hyperparameter values can significantly influence the performance of a learning algorithm. Examples include the learning rate for models trained iteratively, such as neural networks; the regularization parameter C and the kernel parameter sigma in Support Vector Machines; and the maximum depth of a decision tree.
In a neural network, the learning rate is a hyperparameter that determines how much to change the model in response to the estimated error each time the model weights are updated. If the learning rate is small, the model may learn slowly, but it might also have a better chance of finding a good or optimal solution. If the learning rate is large, the model may learn quickly, but it could overshoot and never find a good solution.
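The effect of the learning rate can be seen even without a neural network. The sketch below runs plain gradient descent on the toy function f(w) = (w − 3)², which has its minimum at w = 3; the function, the starting point, and the specific learning rates are illustrative assumptions, not values from any particular model.

```python
def gradient_descent(lr, steps=50, w=0.0):
    """Minimize f(w) = (w - 3)^2 by gradient descent; the gradient is 2 * (w - 3)."""
    for _ in range(steps):
        w = w - lr * 2 * (w - 3)
    return w

print(gradient_descent(lr=0.01))  # small: steady but slow progress toward 3
print(gradient_descent(lr=0.5))   # well-chosen: reaches the minimum quickly
print(gradient_descent(lr=1.1))   # too large: each step overshoots and diverges
```

With lr=0.01 the iterate is still well short of 3 after 50 steps, while lr=1.1 bounces further away from the minimum on every update, mirroring the slow-learning and overshooting behaviors described above.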
However, choosing the right hyperparameters can be quite challenging, and no single set of values works for every problem. Different models and datasets require different hyperparameters. Therefore, practitioners rely on trial and error or on systematic approaches such as grid search, random search, or automated methods like Bayesian optimization and genetic algorithms.
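The two simplest systematic approaches can be sketched in a few lines. This toy example assumes a hypothetical validation `score` function of two hyperparameters (a learning rate and a tree depth); in practice that score would come from cross-validation on real data, and libraries such as scikit-learn provide ready-made search utilities.

```python
import itertools
import random

def score(lr, depth):
    # Hypothetical validation score, peaking at lr=0.1, depth=5 (an assumption
    # standing in for an expensive train-and-evaluate run).
    return -((lr - 0.1) ** 2) - 0.01 * (depth - 5) ** 2

# Grid search: exhaustively evaluate every combination of candidate values.
grid = itertools.product([0.01, 0.1, 1.0], [3, 5, 7])
best_grid = max(grid, key=lambda p: score(*p))

# Random search: sample the same ranges at random instead of on a grid.
random.seed(0)
samples = [(10 ** random.uniform(-2, 0), random.randint(3, 7)) for _ in range(9)]
best_random = max(samples, key=lambda p: score(*p))

print("grid search best:", best_grid)      # (0.1, 5)
print("random search best:", best_random)
```

Both strategies evaluate the same number of candidates here; random search often finds good regions faster when only a few hyperparameters actually matter, because it does not waste trials on a rigid grid.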