LSTM (Long Short-Term Memory) is a recurrent neural network (RNN) architecture designed to learn patterns in sequential data, including dependencies that span many time steps. It mitigates the vanishing gradient problem that makes such long-term dependencies hard to learn in standard RNNs. LSTMs are widely used in deep learning for tasks where the order of the data matters, such as machine translation, speech recognition, and time-series prediction.

How LSTM works

An LSTM network replaces the simple recurrent unit of a standard RNN with LSTM cells. Each cell maintains a cell state (its long-term memory) and a hidden state (its output at each time step), and regulates them with three gates: the input gate, the forget gate, and the output gate.

  1. Input gate decides how much new information is stored in the cell state. It works in two steps: first, a sigmoid function decides which values will be updated; second, a tanh function creates new candidate values that could be added to the state. The two results are multiplied element-wise and added to the cell state.
  2. Forget gate decides how much of the existing cell state (the past memory) is discarded. A sigmoid function outputs a value between 0 and 1 for each element of the cell state: 0 forgets that element completely, 1 retains it fully, and values in between keep a proportional share.
  3. Output gate determines the cell's output based on the cell state, the current input, and the previous output. A sigmoid function decides which parts of the cell state will be output; the cell state is passed through a tanh function to push its values into the range -1 to 1, and the result is multiplied by the output of the sigmoid gate.
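
The three gates can be sketched as one time step of a single LSTM cell. This is a minimal illustration in plain Python, not a reference implementation: the names `lstm_step`, `init_params`, and `affine`, the parameter keys (`Wf`, `Wi`, `Wg`, `Wo` and their biases), and the small random initialization are all assumptions made for the example. In practice a deep learning library would be used instead.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, b, v):
    """Return W @ v + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias
            for row, bias in zip(W, b)]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; x, h_prev and c_prev are lists of floats."""
    z = h_prev + x  # concatenate previous hidden state and current input
    f = [sigmoid(a) for a in affine(params["Wf"], params["bf"], z)]    # forget gate
    i = [sigmoid(a) for a in affine(params["Wi"], params["bi"], z)]    # input gate
    g = [math.tanh(a) for a in affine(params["Wg"], params["bg"], z)]  # candidate values
    o = [sigmoid(a) for a in affine(params["Wo"], params["bo"], z)]    # output gate
    # New cell state: keep part of the old memory, add part of the candidates.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # Hidden state (the cell's output): squashed cell state, filtered by the output gate.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

def init_params(input_size, hidden_size, seed=0):
    """Hypothetical helper: small random weights, zero biases."""
    rng = random.Random(seed)
    n = input_size + hidden_size
    params = {}
    for name in ("f", "i", "g", "o"):
        params["W" + name] = [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                              for _ in range(hidden_size)]
        params["b" + name] = [0.0] * hidden_size
    return params

# Run a three-step sequence of 3-dimensional inputs through a 2-unit cell.
params = init_params(input_size=3, hidden_size=2)
h, c = [0.0, 0.0], [0.0, 0.0]
for x in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]):
    h, c = lstm_step(x, h, c, params)
```

The same `params` are reused at every step, which is what makes the network recurrent: only `h` and `c` change as the sequence is consumed.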

By carefully regulating the flow of information through these gates, an LSTM cell can maintain or forget its memory over long sequences, making it possible to learn from many time steps in the past.
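
To make the memory claim concrete, the sketch below tracks a single scalar cell state over 100 time steps while the input gate is held closed, so only the forget gate acts on it. The function name `run_cell`, the bias values, and the step count are assumptions chosen for illustration; a trained network learns its gate values from data rather than having them fixed.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def run_cell(forget_bias, steps=100):
    """Return the scalar cell state after `steps` updates with the input gate closed."""
    c = 1.0                   # information written into the cell at step 0
    f = sigmoid(forget_bias)  # forget gate activation, held constant here
    for _ in range(steps):
        c = f * c             # nothing new is added, so only the forget gate acts
    return c

retained = run_cell(forget_bias=10.0)  # gate close to 1: memory is kept
forgotten = run_cell(forget_bias=0.0)  # gate at 0.5: memory decays quickly
```

With the forget gate close to 1, `retained` is still above 0.99 after 100 steps, while a gate of 0.5 shrinks `forgotten` to a value that is effectively zero.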
