AI, AI on the screen, which route is the best of all?
If AI concepts were retold as fairy tales, this twist of the famous dialog from Snow White and The Seven Dwarves would be the best way to summarize the Bellman Equation.
Simply put, this fundamental block of Reinforcement Learning helps us figure out the best route to pursue, under a given set of conditions. In this process of giving the best possibility, it takes into account both the immediate and future rewards. It can be used to train robots to play games, control robots, and even make financial decisions.
Let’s take Google Maps. One destination, two routes. To show you the best route, the intelligence behind it considers traffic and distance.
- Route A is considerably shorter than route B
- But route A has a bad case of traffic
- So, the waiting time + travel time in route A exceeds the travel time in route B
The system is sure to recommend route B, despite it being longer.
If we had to make it more concise,
Bellman Equation = Reward + γ * Value of next state