1. In the failure trials the robot does not move its arm toward the object as it falls. This is distinctly inhuman! When we initiate a motor program we are constantly checking our prediction for the behavior of both our body and external objects against reality and then updating that program in real time. This bot apparently updates its program only after each trial.
2. The robot moves back to its standard starting position after each successful trial. This demonstrates the specificity of the motor pattern it has developed. Because of time pressure and the variety of situations we humans deal with, it is usually advantageous for us to learn a generalized pattern rather than a single pattern that works only for a constrained set of starting conditions.
As a side note on the difficulty of this task, I agree with sukuriant that the paucity of the information the robot has, especially the lack of fine-grained touch, is a huge impediment. Second, recall that in the human brain more than half of the neurons live in the cerebellum, which is strongly implicated in storing and updating fine-grained motor patterns (gross patterns and intentions being initiated in the motor cortex).
Watch a professional basketball player shoot free throws and you see an attempt to replicate initial conditions. It's actually fairly normal when someone is trying to be precise, e.g. darts, billiards, tennis, etc.
Interesting how the failures make it look eerily human. And it seems to me while watching that the robot becomes frustrated when failing. I know that's me projecting, but it seems interesting to apply that to human interaction: how much is projection and how much is real empathy?
What I find particularly interesting is the sort of "superstition" many of these kinds of robots show. What I'm referring to in this case is how, after 50 tries, the robot arm here moves to its right before every flip attempt. I believe the same thing shows up with solutions from genetic algorithms and neural networks.
It seems there is always some non-negligible probability that, among the factors a learning robot credits for its success, at least one is actually totally irrelevant. That's probably somewhat how our own brains operate as well.
Perhaps running the learning process multiple times and averaging(?) the results would eliminate irrelevant motions. Or perhaps the irrelevant parts are critical to the rest of the movements, and removing them would break things.
If the motions are truly irrelevant, then yes, they would disappear. But they could disappear as a function of the number of trials, so anything that still persists after 50 trials would take a long time to get rid of.
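For what it's worth, here is a toy sketch of the averaging idea. It is purely illustrative and has nothing to do with the robot's actual controller: the `reward` function, the two parameters, the step size, and the number of runs are all made-up assumptions. A simple hill climber tunes two numbers, but only the first one affects the reward, so the second stands in for a "superstitious" motion.

```python
import random
from statistics import mean

def reward(params):
    # Assumed toy reward: only params[0] matters (best at 1.0).
    # params[1] plays the role of an irrelevant "superstitious" motion.
    return -(params[0] - 1.0) ** 2

def learn(seed, trials=50, step=0.3):
    """One learning run: keep a random tweak only if the reward improves."""
    rng = random.Random(seed)
    best = [0.0, 0.0]
    best_reward = reward(best)
    for _ in range(trials):
        # Both parameters are perturbed together, so an accepted tweak
        # drags the irrelevant one along with the relevant one.
        candidate = [p + rng.gauss(0.0, step) for p in best]
        r = reward(candidate)
        if r > best_reward:
            best, best_reward = candidate, r
    return best

runs = [learn(seed) for seed in range(200)]
relevant = [r[0] for r in runs]
irrelevant = [r[1] for r in runs]

print("mean relevant parameter:    ", mean(relevant))
print("mean |irrelevant| per run:  ", mean(abs(x) for x in irrelevant))
print("|mean irrelevant| over runs:", abs(mean(irrelevant)))
```

In this toy, each single run ends with some leftover drift in the second parameter, while averaging over runs lets drifts of opposite sign cancel. That only works here because the parameter is truly irrelevant by construction; it says nothing about the case where the odd-looking motion is actually critical to the rest of the movement.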
It is in general a hard problem with adaptive learning algorithms that they only produce results, not explanations. You can't tell the superstition from the intuition.
That's because this is an evolutionary (in a broad sense) method of learning, i.e. it does not involve analytical understanding of what it's trying to optimize, but rather it uses feedback from previous iterations to achieve the objective.
Evolution works in the same way: there's no analysis from the phenotypical world back to the genes ("oh, we should be taller; let's just tweak this gene", i.e. Lamarckism). Instead, it's just massively scalable trial and error.
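To make the "trial and error with no analysis" point concrete, here is a minimal sketch of a mutate-and-select loop. Everything in it is an assumption for illustration (the hidden target pattern, the population size, the mutation rate); it is not how the pancake robot learns. The point is only that the loop never looks inside `fitness`: it sees a score and nothing else.

```python
import random

def fitness(genome):
    # A black box from the learner's point of view (assumed toy objective):
    # the number of genes that match a hidden target pattern.
    target = [1, 0, 1, 1, 0, 0, 1, 0]
    return sum(g == t for g, t in zip(genome, target))

def evolve(generations=50, pop_size=20, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    length = 8
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Selection: keep the fitter half, judged only by the score.
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: pop_size // 2]
        # Variation: copy survivors with random bit flips, no targeted edits.
        children = [
            [1 - g if rng.random() < mutation_rate else g for g in parent]
            for parent in survivors
        ]
        population = survivors + children
    return history

history = evolve()
print("best score, first generation:", history[0])
print("best score, last generation: ", history[-1])
```

Because the survivors are carried over unchanged, the best score can never drop from one generation to the next, yet at no point does the code reason about which gene to change.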
In fact, the logical disconnect between effect and cause is probably a strength rather than a weakness.
> In fact, the logical disconnect between effect and cause is probably a strength rather than a weakness.
Absolutely. I remember reading about an artificial life program which had evolving creatures try to survive in a very harsh artificial world. The strategies developed seemed less than optimal on the face of it, but when the writers of the program hand-coded their own supposedly "perfect" strategy, the increased efficiency of their strategy actually led to a lower overall survival rate. Only after that could they see why.
What's interesting to me is that the robot has a little pan wobble at the end of the flip. I wonder if that has some sort of advantageous effect on the outcome of a flip? Or maybe the robot is just 'superstitious' because it had a more successful run once when it added the wobble.
> "After 50 trials, the robot learns that the first part of the task requires a stiff behavior to throw the pancake in the air, while the second part requires the hand to be compliant in order to catch the pancake without having it bounced off the pan."
The associated article seems to think that 50 steps is a long time for learning this task. I would argue that 50 steps is nearly no time at all. Even though a human being may take fewer attempts to learn,
1) we are likely taking in more sensory input than the robot about what's occurring (we sense by stereoscopic sight, feel, etc.; the robot may not sense by all of these),
2) we can apply knowledge from other domains in solving this problem (if memory serves me, this is part of the Holy Grail for artificial intelligence), and
3) we may make multiple attempts in our own minds before attempting to perform the activity physically again.
While I'm primarily experienced in Genetic Algorithms and NNs (so not reinforcement learning, so much), 50 steps (or generations in a non-steady-state GA) is a very short amount of time, and so learning to properly coordinate multiple degrees of freedom into a successful activity in only 50 steps is, to me, pretty impressive.
Our biology is designed to anticipate gravity. Regardless of athletic training the average person is capable of adeptly throwing and catching a ball.
I work in construction and I regularly throw and catch objects like rolls of tape (duct-tape-sized rolls), 18V drills, levels, etc. It's only when we throw something light (i.e. something air resistance has a noticeable effect on) that you start getting screwed up. We don't expect things to slow down dramatically when they are thrown or fall.
We already have a substantial 'muscle memory' from doing other things with our arms as well I'd imagine...from an early age we are throwing balls, running around, picking things up.
By the end of it, you'll probably know more about the subject than I do. I'm a master's* student in my last semester. I don't actually do any work in the area, but I've taken a couple of classes on neural networks and a class on genetic algorithms. Those were my major focus.
My preferred neural network structure/design is NEAT (developed by Dr. Kenneth Stanley) and its various derivatives. If I were to work more with neural networks, I would work to expand that design. (my work interests have expanded to include other stochastic algorithms, such as Monte Carlo Localization, as well)
* I was a doctoral student, but I couldn't acquire the advisor I desired (funding) and so I reduced to master's level
I'm curious if anyone knows why it sounded like there was a jet engine in the background throughout the video? Was it just a bad recording or was there some reason why the room had to have a crazy amount of ventilation?