LoRA (Low-Rank Adaptation) is a fine-tuning technique that speeds up the training of large models while using less memory. It adds pairs of rank-decomposition weight matrices (known as update matrices) to the existing weights and trains only those added matrices, so a large model can be adapted without the computational cost of retraining all of its weights.
How LoRA Works
- Addition of Update Matrices: Instead of retraining the entire model, LoRA introduces pairs of rank-decomposition weight matrices to the existing pretrained weights of the model. These matrices are called update matrices.
- Frozen Pretrained Weights: LoRA keeps the pretrained weights unaltered during training. This helps prevent catastrophic forgetting, where a model loses previously learned knowledge.
- Memory Efficiency: The rank-decomposition matrices introduced by LoRA have far fewer parameters than the original model. This makes the newly trained weights (LoRA weights) lightweight and easy to transfer.
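The three points above can be illustrated with a minimal NumPy sketch of a single linear layer. The layer sizes, the rank `r`, the `scale` argument, and the function name `lora_forward` are hypothetical choices for illustration, and the initialization (one update matrix small and random, the other zero) is an assumption based on common practice rather than anything specified here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes and rank, chosen only for illustration.
d_out, d_in, r = 64, 128, 4

# Frozen pretrained weight: it is never updated during fine-tuning.
W = rng.standard_normal((d_out, d_in))

# Update matrices (assumed initialization): A is small and random,
# B starts at zero so the adapted layer initially matches the original.
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))


def lora_forward(x, W, A, B, scale=1.0):
    """Apply the frozen weight plus the low-rank update B @ A."""
    return x @ W.T + scale * (x @ A.T) @ B.T


x = rng.standard_normal((8, d_in))  # a batch of 8 input vectors
y = lora_forward(x, W, A, B)

# Only A and B would be trained; compare their size with W.
full_params = W.size            # 64 * 128
lora_params = A.size + B.size   # 4 * 128 + 64 * 4
print(full_params, lora_params)
```

In this toy setting the update matrices hold 768 values against 8,192 in the frozen weight, which is why the trained LoRA weights are small enough to store and share separately from the base model.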