Robot needs 50 tries to learn how to flip pancakes.

iandanforth · on July 23, 2010

Two interesting things to notice from the video:

1. In the failure trials the robot does not move its arm toward the object as it falls. This is distinctly inhuman! When we initiate a motor program we are constantly checking our prediction for the behavior of both our body and external objects against reality and then updating that program in real time. This bot apparently updates its program only after each trial.

2. The robot moves back to its standard starting position after each successful trial. This demonstrates the specificity of the motor pattern it has developed. Because of time pressure and the complexity of the variations we deal with, it is usually advantageous for us to learn a generalized pattern rather than a single pattern that works only for a constrained set of starting conditions.

As a side note on the difficulty of this task, I agree with sukuriant that the paucity of information the robot has, especially the lack of fine-grained touch, is a huge impediment. Second, recall that in the human brain more than half of the neurons live in the cerebellum, which is strongly implicated in storing and updating fine-grained motor patterns (gross patterns and intentions being initiated in the motor cortex).

Retric · on July 23, 2010

Watch a professional basketball player shoot free throws and you see an attempt to replicate initial conditions. It’s actually fairly normal when someone is trying to be precise, e.g. darts, billiards, tennis, etc.

spif · on July 23, 2010

Interesting how the failures make it look eerily human. And it seems to me while watching that the robot becomes frustrated when failing. I know that's me projecting, but it seems interesting to apply that to human interaction: how much is projection and how much is real empathy?

wlievens · on July 25, 2010

If you haven't yet, read up on the concept of the "Uncanny Valley". It's relevant to what you express here.

ehsanul · on July 23, 2010

What I find particularly interesting is the sort of "superstition" many of these kinds of robots show. What I'm referring to in this case is how, after 50 tries, the robot arm here moves to its right before every flip attempt. I believe the same thing shows up with solutions from genetic algorithms and neural networks.

Seems like there is always some non-negligible probability that within the factors any learning robot takes into account as part of its success is a factor which is actually totally irrelevant. That's probably somewhat how our own brains operate as well.

chaosmachine · on July 24, 2010

Perhaps running the learning process multiple times and averaging(?) the results would eliminate irrelevant motions. Or perhaps the irrelevant parts are critical to the rest of the movements, and removing them would break things.
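A toy sketch of the averaging idea, not the method used in the video: a simple keep-it-if-it-improves learner tunes two made-up parameters, of which only the first affects the (hypothetical) reward. Within any single run the irrelevant parameter drifts away from zero and stays there, which is the "superstition"; averaged over many independent runs it lands near zero. All names and numbers here are assumptions for illustration.

```python
import random

def reward(params):
    # Hypothetical reward: only params[0] (say, flip strength) matters.
    # params[1] (say, a sideways shift before the flip) has no effect at all.
    return -(params[0] - 1.0) ** 2

def learn(seed, trials=50, step=0.3):
    """Keep a random tweak only if the reward improves (simple hill climbing)."""
    rng = random.Random(seed)
    best = [0.0, 0.0]
    best_reward = reward(best)
    for _ in range(trials):
        candidate = [p + rng.gauss(0.0, step) for p in best]
        candidate_reward = reward(candidate)
        if candidate_reward > best_reward:
            # The useless tweak to params[1] is kept along with the useful one.
            best, best_reward = candidate, candidate_reward
    return best

runs = [learn(seed) for seed in range(200)]
mean_relevant = sum(r[0] for r in runs) / len(runs)
mean_irrelevant = sum(r[1] for r in runs) / len(runs)
# Root-mean-square size of the irrelevant parameter within individual runs.
rms_irrelevant = (sum(r[1] ** 2 for r in runs) / len(runs)) ** 0.5

print(f"relevant parameter, averaged over runs:   {mean_relevant:.2f}")
print(f"irrelevant parameter, averaged over runs: {mean_irrelevant:.2f}")
print(f"irrelevant parameter, typical single run: {rms_irrelevant:.2f}")
```

This only covers the first possibility. Averaging would not help if the seemingly irrelevant motion were actually coupled to the rest of the movement.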

sesqu · on July 24, 2010

If the motions are truly irrelevant, then yes, they would disappear. But they could disappear as a function of the number of trials, so anything that still persists after 50 trials would take a long time to get rid of.

It is, in general, a hard problem with adaptive learning algorithms that they only produce results. You can't tell the superstition from the intuition.

wlievens · on July 25, 2010

That's because this is an evolutionary (in a broad sense) method of learning, i.e. it does not involve analytical understanding of what it's trying to optimize, but rather it uses feedback from previous iterations to achieve the objective.

Evolution works in the same way: there's no analysis from the phenotypical world back to the genes ("oh, we should be taller; let's just tweak this gene", i.e. Lamarckism). Instead, it's just massively scalable trial and error.

In fact, the logical disconnect between effect and cause is probably a strength rather than a weakness.
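To make the "feedback without analysis" point concrete, here is a minimal sketch of a population-based trial-and-error loop. The `fitness` function, its hidden target, and all parameters are hypothetical; the point is only that `evolve` never looks inside `fitness`. It just keeps whatever scored best and mutates it.

```python
import random

def fitness(genome):
    # Opaque score: the loop below only ever sees the returned number.
    target = [0.5, -1.0, 2.0]  # hypothetical optimum, unknown to the learner
    return -sum((g - t) ** 2 for g, t in zip(genome, target))

def evolve(generations=50, pop_size=20, keep=5, sigma=0.2, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-3, 3) for _ in range(3)] for _ in range(pop_size)]
    history = []  # best score seen at the start of each generation
    for _ in range(generations):
        # Selection: rank by score alone and keep the top few.
        population.sort(key=fitness, reverse=True)
        survivors = population[:keep]
        history.append(fitness(survivors[0]))
        # Variation: children are randomly mutated copies of survivors.
        children = [
            [g + rng.gauss(0.0, sigma) for g in rng.choice(survivors)]
            for _ in range(pop_size - keep)
        ]
        population = survivors + children
    return max(population, key=fitness), history

best, history = evolve()
print("best score at start:", round(history[0], 3))
print("best score at end:  ", round(fitness(best), 3))
```

Because the survivors are carried over unchanged, the best score can never get worse from one generation to the next, even though nothing in the loop knows why a given genome scores well.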

ehsanul · on July 29, 2010

> In fact, the logical disconnect between effect and cause is probably a strength rather than a weakness.

Absolutely. I remember reading about an artificial life program which had evolving creatures try to survive in a very harsh artificial world. The strategies developed seemed less than optimal on the face of it, but when the writers of the program hand-coded their own supposedly "perfect" strategy, the increased efficiency of their strategy actually led to a lower overall survival rate. Only after that could they see why.

nazgulnarsil · on July 24, 2010

We probably make a "pointless move to the right" in some of our cognition and never notice it.

johnfn · on July 23, 2010

What's interesting to me is that the robot has a little pan wobble at the end of the flip. I wonder if that has some sort of advantageous effect on the outcome of a flip? Or maybe the robot is just 'superstitious' because it had a more successful run once when it added the wobble.

Terretta · on July 23, 2010

From the Vimeo description:

> "After 50 trials, the robot learns that the first part of the task requires a stiff behavior to throw the pancake in the air, while the second part requires the hand to be compliant in order to catch the pancake without having it bounced off the pan."
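As a rough illustration of why stiffness matters in the two phases (a sketch under assumed numbers, not the controller from the video): treat the hand as a critically damped spring pulling back toward a set point, and compare how far the same impact pushes it when the spring is stiff versus compliant. The gains, mass, and impact velocity are all hypothetical.

```python
import math

def stiffness(phase):
    # Hypothetical gains in N/m; the video description gives no numbers.
    return {"throw": 400.0, "catch": 20.0}[phase]

def peak_deflection(phase, impact_velocity=1.0, mass=1.0, dt=0.001, steps=3000):
    """Largest displacement from the set point after an impact, in meters."""
    k = stiffness(phase)
    d = 2.0 * math.sqrt(k * mass)  # critical damping for this stiffness
    x, v = 0.0, impact_velocity    # the impact shows up as an initial velocity
    peak = 0.0
    for _ in range(steps):
        force = -k * x - d * v     # spring-damper pulling back toward x = 0
        v += (force / mass) * dt   # semi-implicit Euler integration
        x += v * dt
        peak = max(peak, abs(x))
    return peak

stiff_peak = peak_deflection("throw")
compliant_peak = peak_deflection("catch")
print(f"stiff hand gives way by     {stiff_peak * 100:.1f} cm")
print(f"compliant hand gives way by {compliant_peak * 100:.1f} cm")
```

With these made-up gains the compliant hand gives way several times further than the stiff one, which is what would let it absorb the landing instead of bouncing the pancake back off the pan.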

sukuriant · on July 23, 2010

The associated article seems to think that 50 steps is a long time for learning this task. I would argue that 50 steps is nearly no time at all. Even though a human being may take fewer attempts to learn, we 1) are likely taking input from more senses than the robot about what's occurring (we sense by stereoscopic sight, feel, etc.; the robot may not sense by all of these), 2) can apply knowledge from other domains in solving this problem (if memory serves me, this is part of the Holy Grail for artificial intelligence), and 3) may make multiple attempts in our own minds before attempting to perform the activity physically again.

While I'm primarily experienced in genetic algorithms and NNs (so not reinforcement learning, so much), 50 steps (or generations in a non-steady-state GA) is a very short amount of time, and so learning to properly coordinate multiple degrees of freedom into a successful activity in only 50 steps is, to me, pretty impressive.

sprout · on July 23, 2010

I'd say it's pretty impressive even for a human.

We also have the advantage of having some 'physics simulation' software, if you will, that lets us do some runs in our head before doing it physically.

electromagnetic · on July 23, 2010

Our biology is designed to anticipate gravity. Regardless of athletic training the average person is capable of adeptly throwing and catching a ball.

I work in construction and I regularly throw and catch objects like rolls of tape (duct-tape-sized rolls), 18V drills, levels, etc. It's only when we throw something light (i.e. something on which air resistance has a real effect) that we start getting screwed up. We don't expect things to slow down dramatically when they are thrown or fall.

hitonagashi · on July 23, 2010

We already have a substantial 'muscle memory' from doing other things with our arms as well, I'd imagine: from an early age we are throwing balls, running around, picking things up.

caxap · on July 23, 2010

Reinforcement learning was applied after the basic model was initialized with imitation. Maybe that can partially explain the small number of steps.
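A minimal sketch of that two-stage idea, with everything hypothetical (the parameter vector, the reward, the numbers): start from parameters copied from an imperfect demonstration, then refine them by exploring around the current guess and moving toward the perturbations that scored well. This is a generic reward-weighted update chosen for illustration, not necessarily the algorithm the authors used.

```python
import math
import random

TARGET = [1.0, -0.5]  # hypothetical ideal parameters, unknown to the learner

def reward(theta):
    # Hypothetical reward in (0, 1], highest when theta matches TARGET.
    dist2 = sum((a - b) ** 2 for a, b in zip(theta, TARGET))
    return math.exp(-4.0 * dist2)

def refine(theta, updates=20, rollouts=25, sigma=0.2, seed=0):
    """Reward-weighted refinement of an initial parameter guess."""
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(updates):
        # Explore: try random perturbations around the current parameters.
        noises = [[rng.gauss(0.0, sigma) for _ in theta] for _ in range(rollouts)]
        rewards = [reward([t + e for t, e in zip(theta, eps)]) for eps in noises]
        total = sum(rewards)
        # Update: move by the reward-weighted average of those perturbations.
        theta = [
            t + sum(r * eps[i] for r, eps in zip(rewards, noises)) / total
            for i, t in enumerate(theta)
        ]
    return theta

demonstration = [0.8, -0.2]  # imperfect parameters copied from a demonstration
refined = refine(demonstration)

print("reward from scratch:      ", round(reward([0.0, 0.0]), 4))
print("reward after imitation:   ", round(reward(demonstration), 4))
print("reward after refinement:  ", round(reward(refined), 4))
```

In this toy setup the demonstration alone starts the learner at a reward of roughly 0.59, versus about 0.007 for an all-zero start, so far fewer trials are spent just finding the right neighborhood.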

fragmede · on July 23, 2010

re: 1) Vision assisted robots can do some amazing things - http://www.hizook.com/blog/2009/08/03/high-speed-robot-hand-...

(I think reinforcement learning was more the point than 'can we teach a robot to flip pancakes'.)

jal278 · on July 23, 2010

What sort of neuroevolution stuff do you do? I'm doing a PhD in that area right now.

sukuriant · on July 24, 2010

By the end of it, you'll probably know more about the subject than I do. I'm a master's* student in my last semester. I don't actually do any work in the area, but I've taken a couple of classes on neural networks and a class on genetic algorithms. Those were my major focus.

My preferred neural network structure/design is NEAT (developed by Dr. Kenneth Stanley) and its various derivatives. If I were to work more with neural networks, I would work to expand that design. (my work interests have expanded to include other stochastic algorithms, such as Monte Carlo Localization, as well)

* I was a doctoral student, but I couldn't acquire the advisor I desired (funding) and so I reduced to master's level

ezy · on July 23, 2010

I realize this isn't exactly the point of the whole exercise, but a stiff pancake that easily slides onto the pan is kind of cheating. :-)

A deformable pancake would make the experiment batter.

hugh3 · on July 23, 2010

If that was a pun, it was awful. If it was a Freudian slip, it was epic.

msg · on July 23, 2010

You seem to have very high standards for puns, and very low standards for Freudian slips.

Here is the epic pun you are looking for:

http://puzzlet.org/personal/wiki.php/KnockKnock/SoThisGuy

solfox · on July 23, 2010

I'm pretty sure it would take me about as long to master...

alain94040 · on July 23, 2010

Especially if you are blindfolded, which I believe is the condition the robot is working under.

billybob · on July 23, 2010

Maybe. But once you learn it, would you be set for life? Or would you have to relearn a bit the next day?

It's interesting to combine the idea of trial-and-error learning with "save that strategy in permanent memory."

tassl · on July 23, 2010

Even though it doesn't use any learning process, I think that you might find this video interesting:

http://video.google.com/videoplay?docid=3757897210640719617#

The control uses the dynamics of the robot to optimize a trajectory to increase the weight that the robot can lift.

Tichy · on July 23, 2010

Next: try pancakes of varying shapes and sizes. I have a suspicion they might struggle with their approach.

bfung · on July 23, 2010

Or how about something more challenging... toss the pancake higher, or make the pancake complete more rotations when tossed =P

chengas123 · on July 23, 2010

I'm curious if anyone knows why it sounded like there was a jet engine in the background throughout the video? Was it just a bad recording or was there some reason why the room had to have a crazy amount of ventilation?

Spark23 · on July 23, 2010

Has anybody found a better link, describing more technical details?

Lagged2Death · on July 23, 2010

"Artificial pancake." You know, as opposed to the ones that grow on trees. Great.

hackermom · on July 23, 2010

"WITH THIS ROBOT, HE PLANS TO ENSLAVE THE WORLD!" dramatic choir

Sarcasm aside, I'm sure stressed-out mothers and financiers alike rejoice...