#TheAIAlphabet: L for Long Short-Term Memory (LSTM)

The AI Alphabet | Published September 28, 2023 | Susanna Myrtle Lazarus

How does a restaurant server remember all the orders their customers have placed? They have a short-term memory, which can only hold a limited amount of information for a short period of time. So, if a customer places a complex order, they might have to write it down so they don’t forget.

But what if they had a long-term memory, where they could store information for a much longer period of time? This would allow them to remember all of their customers’ orders, even if they’re very complex.


Long Short-Term Memory (LSTM) is a type of artificial intelligence (AI) model that works like a long-term memory.

It can store information for long periods of time, and it’s very good at learning long-term dependencies in data.

To put it simply, it’s a type of recurrent neural network, but way smarter than your average one. Here’s how it works: think of your brain as a series of rooms, where each room stores information. LSTM, however, has a magical door in each room. These doors decide which info to keep and which to toss out. If something is super important, like your grandma’s secret recipe, it’ll stay in the room for a long time. But if it’s something trivial, like a pesky fly buzzing around, it gets booted out quickly.

In technical parlance, LSTM networks are made up of units called cells. Each cell has a long-term memory state (the “cell state”), which is the information the cell remembers over time, along with a short-term hidden state it outputs at each step.

LSTM cells also have three gates: an input gate, an output gate, and a forget gate (see the code sketch after this list).

  • The input gate controls how much new information is added to the cell’s long-term memory state.
  • The output gate controls how much information from the cell’s long-term memory state is exposed as the cell’s output at each step.
  • The forget gate controls how much information is removed from the cell’s long-term memory state.
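
To make the gates concrete, here’s a minimal sketch of a single LSTM cell step in plain NumPy. The weights are randomly initialized and the variable names are ours, chosen for clarity; this illustrates the standard gate equations rather than any production implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One step of a single LSTM cell. W, U, b each hold four sets of
# parameters: forget gate, input gate, output gate, candidate memory.
def lstm_cell_step(x, h_prev, c_prev, W, U, b):
    Wf, Wi, Wo, Wc = W
    Uf, Ui, Uo, Uc = U
    bf, bi, bo, bc = b

    f = sigmoid(Wf @ x + Uf @ h_prev + bf)        # forget gate: what to erase
    i = sigmoid(Wi @ x + Ui @ h_prev + bi)        # input gate: what to write
    o = sigmoid(Wo @ x + Uo @ h_prev + bo)        # output gate: what to expose
    c_tilde = np.tanh(Wc @ x + Uc @ h_prev + bc)  # candidate new memory

    c = f * c_prev + i * c_tilde  # update the long-term memory state
    h = o * np.tanh(c)            # short-term (output) state
    return h, c

# Toy sizes: 4-dimensional inputs, 3-dimensional memory.
rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
W = [rng.standard_normal((n_hid, n_in)) for _ in range(4)]
U = [rng.standard_normal((n_hid, n_hid)) for _ in range(4)]
b = [np.zeros(n_hid) for _ in range(4)]

h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(5):  # walk the cell through a 5-step sequence
    h, c = lstm_cell_step(rng.standard_normal(n_in), h, c, W, U, b)
print("hidden state:", h)
```

Notice that the memory state c is only ever scaled (by the forget gate) and added to (by the input gate), which is what lets important information survive across many steps.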

LSTM networks are trained by feeding them data and adjusting the parameters of the cells, typically with an algorithm called backpropagation through time, so that the network produces the desired output.
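
As a rough illustration of that training loop, the sketch below uses PyTorch’s built-in nn.LSTM on a toy task: given 20 past points of a sine wave, predict the next one. The sizes, learning rate, and task are all invented for the example.

```python
import torch
import torch.nn as nn

# Toy dataset: sliding windows over a sine wave.
t = torch.linspace(0, 20, 400)
wave = torch.sin(t)
xs = torch.stack([wave[i:i + 20] for i in range(380)]).unsqueeze(-1)  # (380, 20, 1)
ys = wave[20:].unsqueeze(-1)                                          # (380, 1)

class Forecaster(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, 1)

    def forward(self, x):
        out, _ = self.lstm(x)         # out: (batch, seq_len, hidden)
        return self.head(out[:, -1])  # predict from the last hidden state

model = Forecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# "Feeding data and adjusting parameters": each pass nudges the gates'
# weights so the network's output gets closer to the desired one.
for epoch in range(100):
    opt.zero_grad()
    loss = loss_fn(model(xs), ys)
    loss.backward()  # backpropagation through time
    opt.step()
print(f"final training loss: {loss.item():.4f}")
```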

Now, how’s this used in AI? Imagine you’re teaching a computer to understand text. LSTM helps it remember the context of words in a sentence, so it doesn’t mix up “apple” the fruit with “Apple” the tech company. So, basically, LSTM is AI’s memory champ, ensuring it remembers what matters and forgets the noise.
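
Here’s a toy sketch of that idea; everything in it, from the vocabulary to the two sense labels, is invented for illustration, and the model is untrained. The point is that the LSTM carries the earlier words of a sentence forward, so a classifier can tell which “apple” it just read.

```python
import torch
import torch.nn as nn

# Tiny made-up vocabulary; real systems use far larger ones.
vocab = {"<pad>": 0, "i": 1, "ate": 2, "an": 3, "apple": 4,
         "bought": 5, "shares": 6, "in": 7}

class SenseClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(len(vocab), 16)
        self.lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, 2)  # two senses: fruit vs. company

    def forward(self, token_ids):
        out, _ = self.lstm(self.embed(token_ids))
        # The last hidden state has "read" the whole sentence, so words
        # like "ate" or "shares" provide context for the final word.
        return self.head(out[:, -1])

model = SenseClassifier()
sentence = torch.tensor([[1, 2, 3, 4]])  # "i ate an apple"
print(model(sentence))  # untrained logits for the two senses
```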

Other uses include:

  • Machine translation: LSTM networks can be used to translate text from one language to another. Google Translate’s original neural system was built on LSTM networks, covering more than 100 languages.
  • Speech recognition: LSTM networks can be used to recognize speech and convert it to text. Voice assistants such as Siri and Alexa have used LSTM networks to understand what you’re saying.
  • Time series forecasting: LSTM networks can be used to predict future events based on historical data. They can be used to predict the stock market or the weather.

Bonus: Watch Jürgen Schmidhuber, who co-invented LSTMs with Sepp Hochreiter, talk about their creation and capabilities.

Check out #TheAIAlphabet series here.