#TheAIAlphabet: L for Long Short-Term Memory (LSTM)

The AI Alphabet | Published September 28, 2023 | Susanna Myrtle Lazarus

How does a restaurant server remember all the orders their customers have placed? They have a short-term memory, which can only hold a limited amount of information for a short period of time. So, if a customer places a complex order, they might have to write it down so they don’t forget.

But what if they had a long-term memory, where they could store information for a much longer period of time? This would allow them to remember all of their customers’ orders, even if they’re very complex.


Long Short-Term Memory (LSTM) is a type of artificial intelligence (AI) model that works like a long-term memory.

It can store information for long periods of time, and it’s very good at learning long-term dependencies in data.

To put it simply, it’s a type of recurrent neural network, but way smarter than your average one. Here’s how it works: think of your brain as a series of rooms, where each room stores information. LSTM, however, has a magical door in each room. These doors decide which info to keep and which to toss out. If something is super important, like your grandma’s secret recipe, it’ll stay in the room for a long time. But if it’s something trivial, like a pesky fly buzzing around, it gets booted out quickly.

In technical parlance, LSTM networks are made up of units called cells. Each cell has a long-term memory state (the “cell state”), which is the information the cell remembers over time, along with a short-term hidden state it outputs at each step.

LSTM cells also have three gates: an input gate, an output gate, and a forget gate (see the code sketch after this list).

  • The input gate controls how much new information is added to the cell’s long-term memory state.
  • The output gate controls how much information from the cell’s long-term memory state is exposed as the cell’s output at each step.
  • The forget gate controls how much information is removed from the cell’s long-term memory state.
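
To make the gates concrete, here’s a minimal sketch of a single LSTM cell step in plain NumPy. The weights are randomly initialized and the variable names are ours, chosen for clarity; this illustrates the standard gate equations rather than any production implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One step of a single LSTM cell. W, U, b each hold four sets of
# parameters: forget gate, input gate, output gate, candidate memory.
def lstm_cell_step(x, h_prev, c_prev, W, U, b):
    Wf, Wi, Wo, Wc = W
    Uf, Ui, Uo, Uc = U
    bf, bi, bo, bc = b

    f = sigmoid(Wf @ x + Uf @ h_prev + bf)        # forget gate: what to erase
    i = sigmoid(Wi @ x + Ui @ h_prev + bi)        # input gate: what to write
    o = sigmoid(Wo @ x + Uo @ h_prev + bo)        # output gate: what to expose
    c_tilde = np.tanh(Wc @ x + Uc @ h_prev + bc)  # candidate new memory

    c = f * c_prev + i * c_tilde  # update the long-term memory state
    h = o * np.tanh(c)            # short-term (output) state
    return h, c

# Toy sizes: 4-dimensional inputs, 3-dimensional memory.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
W = [rng.standard_normal((n_hid, n_in)) for _ in range(4)]
U = [rng.standard_normal((n_hid, n_hid)) for _ in range(4)]
b = [np.zeros(n_hid) for _ in range(4)]

h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(5):  # walk the cell through a 5-step sequence
    h, c = lstm_cell_step(rng.standard_normal(n_in), h, c, W, U, b)
print("hidden state:", h)
```

Notice that the memory state c is only ever scaled (by the forget gate) and added to (by the input gate), which is what lets important information survive across many steps.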

LSTM networks are trained by feeding them data and adjusting the parameters of the cells, typically with an algorithm called backpropagation through time, so that the network produces the desired output.
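
As a rough illustration of that training loop, the sketch below uses PyTorch’s built-in nn.LSTM on a toy task: given 20 past points of a sine wave, predict the next one. The sizes, learning rate, and task are all invented for the example.

```python
import torch
import torch.nn as nn

# Toy dataset: sliding windows over a sine wave.
t = torch.linspace(0, 20, 400)
wave = torch.sin(t)
xs = torch.stack([wave[i:i + 20] for i in range(380)]).unsqueeze(-1)  # (380, 20, 1)
ys = wave[20:].unsqueeze(-1)                                          # (380, 1)

class Forecaster(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, 1)

    def forward(self, x):
        out, _ = self.lstm(x)         # out: (batch, seq_len, hidden)
        return self.head(out[:, -1])  # predict from the last hidden state

model = Forecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# "Feeding data and adjusting parameters": each pass nudges the gates'
# weights so the network's output gets closer to the desired one.
for epoch in range(100):
    opt.zero_grad()
    loss = loss_fn(model(xs), ys)
    loss.backward()  # backpropagation through time
    opt.step()
print(f"final training loss: {loss.item():.4f}")
```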

Now, how’s this used in AI? Imagine you’re teaching a computer to understand text. LSTM helps it remember the context of words in a sentence, so it doesn’t mix up “apple” the fruit with “Apple” the tech company. So, basically, LSTM is AI’s memory champ, ensuring it remembers what matters and forgets the noise.
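
Here’s a toy sketch of that idea; everything in it, from the vocabulary to the two sense labels, is invented for illustration, and the model is untrained. The point is that the LSTM carries the earlier words of a sentence forward, so a classifier can tell which “apple” it just read.

```python
import torch
import torch.nn as nn

# Tiny made-up vocabulary; real systems use far larger ones.
vocab = {"<pad>": 0, "i": 1, "ate": 2, "an": 3, "apple": 4,
         "bought": 5, "shares": 6, "in": 7}

class SenseClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(len(vocab), 16)
        self.lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, 2)  # two senses: fruit vs. company

    def forward(self, token_ids):
        out, _ = self.lstm(self.embed(token_ids))
        # The last hidden state has "read" the whole sentence, so words
        # like "ate" or "shares" provide context for the final word.
        return self.head(out[:, -1])

model = SenseClassifier()
sentence = torch.tensor([[1, 2, 3, 4]])  # "i ate an apple"
print(model(sentence))  # untrained logits for the two senses
```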

Other uses include:

  • Machine translation: LSTM networks can be used to translate text from one language to another. Google Translate’s original neural system was built on LSTM networks, covering more than 100 languages.
  • Speech recognition: LSTM networks can be used to recognize speech and convert it to text. Voice assistants such as Siri and Alexa have used LSTM networks to understand what you’re saying.
  • Time series forecasting: LSTM networks can be used to predict future events based on historical data. They can be used to predict the stock market or the weather.

Bonus: Watch Jürgen Schmidhuber, who co-invented LSTMs with Sepp Hochreiter, talk about their creation and capabilities.

Check out #TheAIAlphabet series here.