The Berkeley Robot for the Elimination of Tedious Tasks—aka Brett, of course—holds one of those puzzle cubes for kids in one hand and with the other tries to jam a rectangular peg into a hole. It is unhappily, hilariously toddler-like in its struggles. The peg strikes the cube with a clunk, and Brett pulls back, as if startled.
But Brett is no quitter, because Brett is no ordinary robot: Nobody told it how to even get anywhere close to the right-shaped hole. Someone just gave it a goal. Yet with attempt after attempt, Brett improves, learning by trial and error how to eventually nail the execution. Like a hulking child, it has taught itself to solve a puzzle.
La-di-da, right? So easy a child could do it? Nope. This is actually a big deal in robotics, because if humans want the machines of tomorrow to be truly intelligent and truly useful, the things are going to have to teach themselves not only to manipulate novel objects, but also to navigate new environments and solve problems on their own.
If you want to teach a robot something, you can program it with strict commands to, say, assemble cars. But these days, you can also get a robot to learn in two cleverer ways. The first is known as imitation learning, in which you demonstrate how the robot should do something by joysticking it around. (Some robot arms also respond to you grabbing them and guiding their movements.)
The other way is known as reinforcement learning. This is how Brett goes about things. At no point does a human have to say, “Brett, this is how you get the peg in the hole.” Brett is just told that it’s something it needs to do. The AI powering the robot gets a reward (hence the term reinforcement learning) every time it gets closer to its goal. And over the course of about 10 minutes, Brett invents a solution.
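The core idea, stripped of the robot hardware, can be sketched in a few lines of code. The snippet below is a toy illustration of learning by trial and error, not Brett's actual algorithm: a "peg" starts at position 0, the reward is simply how close it is to a (made-up) hole position, and random nudges are kept only when they increase the reward.

```python
import random

def reward(position, goal):
    """Reward grows as the peg gets closer to the hole (negative distance)."""
    return -abs(goal - position)

def learn_by_trial_and_error(goal=7.0, episodes=200, seed=0):
    """Toy trial-and-error loop: try a random nudge, keep it only if the
    reward improves. Nobody tells the learner *how* to reach the goal."""
    rng = random.Random(seed)
    position = 0.0
    best = reward(position, goal)
    for _ in range(episodes):
        candidate = position + rng.uniform(-1.0, 1.0)
        r = reward(candidate, goal)
        if r > best:  # reinforcement: improvements are kept, failures discarded
            position, best = candidate, r
    return position
```

Two hundred of these cheap trials land the toy peg essentially on target; the point of the next few paragraphs is that a physical robot can't run trials nearly that fast.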
Now, you’ve probably heard of AI using this kind of learning in a simulator. One famous and fascinating example is the bipedal AI that researchers told to move forward as fast as it could. Over time, it taught itself to walk and eventually run. That’s right, it invented running.
In a simulator, the AI can go through trial and error like that rapidly. But in real life, a robot works far slower. “If you think about something like reinforcement learning, where you learn from trial and error, the challenge is that often you need a lot of trial and error before you get somewhere,” says UC Berkeley roboticist Pieter Abbeel, who leads the learning research with Brett. “And so if you run it all in the real robot, it’s not always that easy to do.”
Part of the problem is that humans are still writing and refining the algorithms that allow a robot to learn. So what these researchers are chasing now is taking learning to the next level, specifically “learning to learn.” A programmer could keep tweaking Brett’s algorithm to get it to learn ever faster, sure. But what if the robot had the power to tweak itself? Meaning, the learning algorithm is itself learned.
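One way to picture "the learning algorithm is itself learned" is as two nested loops. In this toy sketch (an illustration, not the Berkeley group's method), an inner learner does plain gradient descent on a simple task, and an outer loop tunes the inner learner's step size, the kind of knob a human programmer would normally tweak by hand, so that learning itself gets faster.

```python
def inner_learn(step_size, target=3.0, steps=10):
    """Inner loop: gradient descent on the loss (x - target)^2, from x = 0.
    Returns the final loss; lower means the learner learned faster."""
    x = 0.0
    for _ in range(steps):
        grad = 2.0 * (x - target)
        x -= step_size * grad
    return (x - target) ** 2

def outer_learn(candidates=(0.01, 0.1, 0.5, 0.9)):
    """Outer loop: instead of a human hand-tuning the step size, search for
    the one whose inner learner ends up with the least loss."""
    return min(candidates, key=inner_learn)
```

Here the outer loop is a crude search over four hand-picked candidates; real meta-learning methods make the outer loop itself a learned optimization, but the nesting is the same.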
“You could hope that maybe as a consequence you end up with a better algorithm than one that humans can design,” says Abbeel. “And you might have a reinforcement learning algorithm that maybe can have a robot learn to walk in a few hours rather than two weeks, maybe even faster.”
This is essential for building a robotic future that isn’t totally maddening. Without robots learning to learn, humans will have to hold their hands. “If we want a robot to be able to act intelligently in this incredibly diverse world that we have, it needs to be able to adapt very quickly to new scenarios,” says Chelsea Finn, a PhD student in Abbeel’s lab. “Every living room is different in a home, and if we train a robot just on a single living room it’s not going to be able to handle yours.”
Solving peg puzzles, then, is literally and figuratively child’s play. Brett’s descendants will be smarter, faster, and more dexterous—truly capable of navigating the chaos that is the human world. They’ve just got to learn a thing or two first.