#TheAIAlphabet series: I for Inverse Reinforcement Learning

Published September 7, 2023

How would you teach a robot to play soccer? You can’t just tell it what to do, because it needs to learn for itself. So, you start by allowing it to observe the behavior of a human soccer player. This is where Inverse Reinforcement Learning (IRL) steps in.

With IRL, you are given the observed behavior of an intelligent agent. From this, you infer the reward function that the agent is trying to maximize. A reward function maps each state the agent is in, or each pair of state and action, to a numerical reward. In this scenario, IRL is used to infer the reward function that the human player appears to be maximizing, for example one that rewards scoring a goal. Once you have the reward function, you can train the robot with standard reinforcement learning.
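To make the last step concrete, here is a minimal sketch of what "having a reward function and training on it" can look like. Everything in it is an illustrative assumption rather than part of any real soccer system: the pitch is a hypothetical row of five cells with the goal in the last one, the reward is 1 for reaching the goal cell, and the "training" is plain value iteration.

```python
# Hypothetical 1-D "pitch": cells 0..4, with the goal at cell 4.
N_STATES = 5
GOAL = 4
ACTIONS = (-1, +1)  # move left, move right
GAMMA = 0.9         # discount factor (assumed value)


def reward(state: int) -> float:
    """Reward function: 1.0 for being in the goal cell, 0.0 elsewhere."""
    return 1.0 if state == GOAL else 0.0


def step(state: int, action: int) -> int:
    """Deterministic move, clipped so the agent stays on the pitch."""
    return min(max(state + action, 0), N_STATES - 1)


def greedy_policy(reward_fn, sweeps: int = 100) -> list:
    """Run value iteration, then pick the best action in every state."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        values = [
            max(reward_fn(step(s, a)) + GAMMA * values[step(s, a)] for a in ACTIONS)
            for s in range(N_STATES)
        ]
    return [
        max(ACTIONS, key=lambda a: reward_fn(step(s, a)) + GAMMA * values[step(s, a)])
        for s in range(N_STATES)
    ]


policy = greedy_policy(reward)
print(policy)  # one chosen action per cell
```

In IRL the `reward` function would not be written by hand as it is here. It would be the output of the inference step, and only then handed to a planner or learner like `greedy_policy`.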

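There are various approaches to IRL. One common technique is maximum entropy IRL. Many different behaviors can be consistent with the same demonstrations, so this method picks the distribution over behaviors that matches what was observed while otherwise staying as uncertain (highest entropy) as possible. Behaviors with higher reward are treated as more likely, but lower-reward behaviors are not ruled out, so the model is not overly restrictive and can account for imperfect demonstrations.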
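The core idea of maximum entropy IRL can be sketched in a few lines under strong simplifying assumptions. In its usual formulation (not spelled out in this article), the reward is a weighted sum of features, and the probability of a behavior is proportional to the exponential of its reward. The sketch below assumes a hypothetical set of only three candidate trajectories that can be listed by hand, with made-up feature counts and made-up expert demonstrations. It fits the weights by gradient ascent, where the gradient is the expert's average features minus the model's expected features.

```python
import math

# Hypothetical candidate trajectories, each summarized by feature counts:
# (shots on goal, backward passes). The numbers are made up for illustration.
TRAJECTORY_FEATURES = [
    (2.0, 0.0),
    (1.0, 1.0),
    (0.0, 2.0),
]
# Indices of the trajectories the expert was observed to follow (also made up).
EXPERT_DEMOS = [0, 0, 1, 0]


def trajectory_probs(theta):
    """Maximum entropy model: P(trajectory) is proportional to exp(theta . features)."""
    scores = [sum(t * f for t, f in zip(theta, feats)) for feats in TRAJECTORY_FEATURES]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def fit_reward_weights(steps=500, lr=0.1):
    """Gradient ascent on the log-likelihood of the expert demonstrations."""
    n_features = len(TRAJECTORY_FEATURES[0])
    expert_mean = [
        sum(TRAJECTORY_FEATURES[i][k] for i in EXPERT_DEMOS) / len(EXPERT_DEMOS)
        for k in range(n_features)
    ]
    theta = [0.0] * n_features
    for _ in range(steps):
        probs = trajectory_probs(theta)
        model_mean = [
            sum(p * feats[k] for p, feats in zip(probs, TRAJECTORY_FEATURES))
            for k in range(n_features)
        ]
        # Gradient = expert feature expectations - model feature expectations.
        theta = [t + lr * (e - m) for t, e, m in zip(theta, expert_mean, model_mean)]
    return theta


theta = fit_reward_weights()
probs = trajectory_probs(theta)
print("reward weights:", theta)
print("trajectory probabilities:", probs)
```

The fitted weights favor shots on goal, yet the trajectory the expert never chose still keeps a small, nonzero probability. Real problems have far too many trajectories to enumerate, so practical implementations compute the model's expected features with dynamic programming or sampling instead.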

Another approach is apprenticeship learning, which assumes you can observe a “teacher” (expert) agent that acts well under a reward function you cannot see directly. The objective is to recover a reward function, and a policy based on it, under which the learner performs about as well as the teacher.
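One common way to measure "similar to the teacher" is to compare feature expectations: the discounted sum of feature values an agent accumulates on average. This is an assumption about the formulation, since the article does not specify one. The sketch below covers only that comparison step, not a full apprenticeship learning loop. The per-step features, the demonstrations, and the two candidate behaviors are all hypothetical.

```python
GAMMA = 0.9  # discount factor (assumed value)


def feature_expectations(trajectories, gamma=GAMMA):
    """Average discounted sum of per-step feature vectors over trajectories."""
    n_features = len(trajectories[0][0])
    totals = [0.0] * n_features
    for trajectory in trajectories:
        for t, features in enumerate(trajectory):
            for k, value in enumerate(features):
                totals[k] += (gamma ** t) * value
    return [total / len(trajectories) for total in totals]


def distance(mu_a, mu_b):
    """Euclidean distance between two feature-expectation vectors."""
    return sum((a - b) ** 2 for a, b in zip(mu_a, mu_b)) ** 0.5


# Hypothetical per-step features: (moved toward goal, lost the ball).
expert_demos = [
    [(1, 0), (1, 0), (1, 0)],
    [(1, 0), (0, 0), (1, 0)],
]
# Hypothetical rollouts from two candidate learner policies.
candidate_rollouts = {
    "attacking": [[(1, 0), (1, 0), (0, 1)]],
    "passive": [[(0, 0), (0, 1), (0, 0)]],
}

mu_expert = feature_expectations(expert_demos)
closest = min(
    candidate_rollouts,
    key=lambda name: distance(feature_expectations(candidate_rollouts[name]), mu_expert),
)
print("closest to the teacher:", closest)
```

If the true reward is a weighted sum of these features, a learner whose feature expectations are close to the teacher's also earns close to the teacher's reward, whatever the weights are. A full algorithm would alternate between proposing reward weights and training a policy on them until this distance is small.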

IRL is a powerful tool for learning the reward functions of complex agents, especially when a reward is hard to write down by hand but examples of good behavior are easy to collect. It can potentially change how we design and train artificial intelligence systems.

IRL’s magic lies in its ability to uncover the ‘why’ behind actions, whether they’re human behaviors or the maneuvers of machines. In a nutshell, it is a tool for understanding, replicating, and unveiling the hidden motivations that make it all possible. It connects the ‘what’ to the ‘why’ in artificial intelligence, turning actions into meaningful insights.

So, the next time you witness an AI or robot doing something spectacular, remember that IRL might be the wizard behind the curtain.