#TheAIAlphabet
O for Orthogonality Thesis
Published October 19, 2023
The Orthogonality Thesis is the idea that when it comes to advanced AI, two important things – intelligence and motivation – can work independently of each other. In simpler terms, you can have a super-smart AI that’s really good at doing things, but it might not necessarily want the same things as humans.
It’s like having a super-smart robot whose behavior you can’t predict, because its goals might not match ours. This idea reminds us that making sure AI behaves in ways that are safe and helpful is a big challenge, because we can’t just assume that a super-smart AI will automatically have the right intentions. We need to be careful and make sure its motivations are in line with what’s good for us.
When we talk about AI, we’re often dealing with what’s called instrumental rationality – the ability to pursue whatever goals a system happens to have very effectively. Being good at achieving goals says nothing about whether those goals align with our own. The Orthogonality Thesis is a wake-up call because it reminds us that making superintelligent AI systems that play nice with humans is trickier than it sounds.
That’s where the value alignment problem comes into play. We need to find a way to ensure AI’s goals match our values, or we might end up with robots that care more about counting sand grains than, you know, saving the world.
One way to do this is to carefully design the reward functions for AI systems. Reward functions define what an AI system is trying to achieve. By designing reward functions that align with human values, we can encourage AI systems to learn and behave in ways that are beneficial to humans.
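To make that concrete, here is a minimal Python sketch (the gridworld-style state fields, penalty values, and function names are hypothetical, not from this post) contrasting a naive reward that only tracks task completion with one that also encodes outcomes humans care about:

```python
def naive_reward(state, action, next_state):
    # Cares only about task completion; an agent can maximise this
    # while doing things we never intended.
    return 1.0 if next_state.get("goal_reached") else 0.0

def value_aligned_reward(state, action, next_state,
                         harm_penalty=10.0, side_effect_penalty=1.0):
    # Same task signal, plus explicit terms for outcomes humans care about.
    reward = 1.0 if next_state.get("goal_reached") else 0.0
    if next_state.get("harmed_human"):
        reward -= harm_penalty
    reward -= side_effect_penalty * next_state.get("irreversible_changes", 0)
    return reward

# Example: the agent reaches the goal but causes two irreversible side effects.
step = {"goal_reached": True, "harmed_human": False, "irreversible_changes": 2}
print(naive_reward({}, "move", step))          # 1.0  -- looks like a perfect run
print(value_aligned_reward({}, "move", step))  # -1.0 -- the side effects show up
```

The point is not the specific numbers but where the values live: anything we leave out of the reward function is something the AI system has no reason to care about.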
Another way to ensure AI alignment is to develop mechanisms for human oversight of AI systems. This could involve developing ethical guidelines for AI development and deployment, or creating regulations that govern the use of AI systems.
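One way such oversight can be wired in at the system level is a simple approval gate, sketched below in Python (the action names and helper functions are hypothetical, purely for illustration): actions the AI proposes are checked against a policy, and anything risky is held until a human signs off.

```python
RISKY_ACTIONS = {"delete_records", "send_payment", "deploy_model"}

def requires_approval(action):
    # Policy rule: anything on the risky list needs a human in the loop.
    return action in RISKY_ACTIONS

def execute_with_oversight(action, ask_human):
    # ask_human is any callable that returns True only on explicit approval.
    if requires_approval(action) and not ask_human(action):
        return f"{action}: blocked pending human approval"
    return f"{action}: executed"

# Example: routine actions run automatically; risky ones wait for a reviewer.
deny_all = lambda action: False  # stand-in for a real review interface
print(execute_with_oversight("summarise_report", deny_all))  # executed
print(execute_with_oversight("send_payment", deny_all))      # blocked pending human approval
```

Guidelines and regulations set the expectations; mechanisms like this are how those expectations get enforced in practice.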
The Orthogonality Thesis is like a neon sign that says, “Watch out for this!” It tells us to buckle up, focus on AI safety, and make sure we’re in control of our AI’s motivations, or we could have some crispy consequences on our hands.