David Duvenaud, an assistant professor in the same department as Hinton at the University of Toronto, says deep learning has been somewhat like engineering before physics. “Someone writes a paper and says, ‘I made this bridge and it stood up!’ Another guy has a paper: ‘I made this bridge and it fell down—but then I added pillars, and then it stayed up.’ Then pillars are a hot new thing. Someone comes up with arches, and it’s like, ‘Arches are great!’” With physics, he says, “you can actually understand what’s going to work and why.” Only recently, he says, have we begun to move into that phase of actual understanding with artificial intelligence.
Hinton himself says, “Most conferences consist of making minor variations … as opposed to thinking hard and saying, ‘What is it about what we’re doing now that’s really deficient? What does it have difficulty with? Let’s focus on that.’”
Last I checked bridges came before Newtonian Mechanics and it seems strange to argue this wasn't a good thing. Admittedly paper writing wasn't the main mechanism of transmitting knowledge but it's fairly common for human engineering to come before the full theoretical foundations as opposed to after.
It's not that bridges before Newton were bad, it's that Newton gave us the ability to design the strongest possible bridge of a given shape with the materials at hand - using not just calculus but calculus-of-variations, a subject nearly as old as Newtonian mechanics [1]. With this knowledge, what happens when one adds one or two columns to a bridge is no longer "news" the way it might have been before Newtonian mechanics.
A stereotypical picture of an engineering approach without scientific knowledge would be a list of ways to do stuff combined with hints about how to vary the approach per-situation. It requires lots of memorizing, trial-and-error, and experts that often can't fully explain their reasoning. It's easy to believe bridge-building before Newton was like this, though I'm not an expert. Present day AI sounds a lot like this from what I've read (though I'm not an expert here either).
Edit: And yes, one could argue that the progress Newton ushered in merely replaced one list of models with a higher, more general list of models - yes, but that is how progress has gone so far.
I completely agree with your assessment, but the problem is a bit worse in my opinion. We already have a pretty firm grasp of how different ML systems learn and converge towards a solution in the average case. It's not that we need to understand our neural networks better, it's that we need to understand our problem domain better. We can't determine how well some ML architecture will perform at an object recognition problem without some math describing object recognition. This makes things a lot more complicated, because it means we have to do a lot more work to understand every single application where we want to use ML.
And, of course, if we had some really good mathematical framework for describing and reasoning about object recognition, we probably wouldn't need to turn to ML to solve it ;)
The whole point of Deep Learning is that we don't want to describe math behind object recognition; it was the failed "classical" approach where people spent decades figuring out complex features which worked horribly. Deep Learning is actually pretty simple, well understood and parallelizable, and it's basically a billion-dimensional non-linear optimization. As optimization is infested with NP-hard problems, it's as difficult as it gets. It's actually amazing what we can do with it in the real world right now (and we are still far away from seeing all its fruits). Of course, it would frustrate academics that can't base AGI on top of it, but did they really think this approach would do it anyway?
Deep learning does not seem to abstract very well. Train on a data set, then test with images that are simply upside down, and the performance drop can be significant.
Feature extraction also works much better when you toss a lot of data and processing power behind it. So, a lot of progress is simply more data and computing power vs better approaches. Consider how poorly deep learning works when using a single 286.
> Deep learning does not seem to abstract very well. Train on a data set, then test with images that are simply upside down, and the performance drop can be significant.
But that's true of people too. How quickly can you read upside-down?
If you trained on a mixture of upside-down and right way up images, and tested on upside-down images, performance wouldn't take that much of a hit.
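For what it's worth, that kind of augmentation is only a few lines; a minimal numpy sketch (array shapes, the flip fraction, and the model are placeholder assumptions):

    import numpy as np

    def augment_with_flips(images, labels, flip_fraction=0.5, seed=0):
        """Flip a random fraction of the training images upside down.
        Assumes `images` is an (N, H, W, C) float array; labels are unchanged
        because flipping does not change the class."""
        rng = np.random.default_rng(seed)
        out = images.copy()
        mask = rng.random(len(images)) < flip_fraction
        out[mask] = out[mask, ::-1, :, :]   # reverse the height axis = upside down
        return out, labels

    # x_aug, y_aug = augment_with_flips(x_train, y_train)
    # model.fit(x_aug, y_aug)  # same model, now sees both orientations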
Sure, the problem is we are more willing to ignore failures that are similar to how we fail. IMO, when we compare AI approach X vs. Y we need to consider absolute performance not just performance similar to human performance.
Deep learning for example gains a lot from texture detection in images. But, that also makes it really easy to fool.
While I can't easily read upside down text, I can instantly recognize it as not only text, but as text that needs to be flipped upside down in order to be read. That's something current "deep learning" AIs can't do reliably, if at all.
If I had to describe the root cause of this problem it would be that humans process "problems" rather than "things" and we "learn" by building an ever growing mental library of problem solving algorithms. As we continue to "learn", we refine our problem solving algorithms to be more general than specific. Compare that to a deep learning AI that learns by building an ever greater data library of things while refining algorithms to suit ever more specific use cases.
I think you're describing a level of generalization above the application at hand. We could easily train a neural network to recognize the orientation of a font, and then build an orientation invariant "reading" app by first recognizing the rotation of the text, transforming it so it is right side up, and then recognizing as normal.
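Roughly, that two-stage pipeline might look like this (both models are hypothetical placeholders with a Keras-style predict interface; the rotation labeling convention is an assumption):

    import numpy as np

    def read_any_orientation(image, orientation_model, reader_model):
        """First predict how many quarter turns the image has been rotated
        (4-way classifier: 0, 90, 180, 270 degrees), undo the rotation,
        then run the ordinary right-side-up recognizer."""
        probs = orientation_model.predict(image[np.newaxis])[0]
        quarter_turns = int(np.argmax(probs))       # assumed label: turns needed to restore upright
        upright = np.rot90(image, k=quarter_turns)  # rotate back to canonical orientation
        return reader_model.predict(upright[np.newaxis])[0]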
I tend to imagine our brains work similarly. It's not that you have a single "network" in your brain that recognizes text from all angles, but that your brain is a "general purpose" machine with many networks that work together. I think current deep learning techniques are great for discrete tasks, and the improvement needed is to have many networks that work together properly with some form of intuition as to what should be done with the information at hand.
> It's not that we need to understand our neural networks better, it's that we need to understand our problem domain better.
How 'bout "creating models that can work with more dimensions of the problem domain than are conveyed by standard data labeling"?
I mean, we don't simply want AI but actually "need" it, in the sense that problems like biological systems are too complex to understand without artificial enhancements to our comprehension processes - thus to "understand the problem domain better" we need AI. If it's true that "to build AI, we need to understand the problem domain better", it leaves us stuck in a chicken-and-egg problem. That might be the case, but if we're going to find a way out, we are going to need to build tools in the fashion humans have used to solve problems many times before.
It will probably play out like a conversation. A data scientist trains an ML model, and in analyzing the results discovers some intrinsic property or invariant of the problem domain. The scientist can then encode that information into the model and retrain. And that goes on and on, each time providing more accurate results.
As an aside, I think it's important that we find a way to examine and inspect how an ML model "works". If you have some neural network that does really well at the problem, it would be nice if you could somehow peer into it and explain, in human terms, what insight the model has made into the problem. That might not be feasible with neural networks, as they're really just a bunch of weights in a matrix, but this is practical for something like decision trees. Just food for thought.
This is somewhat practical for neural networks. For example, instead of minimizing the loss function, why not tweak the input to maximize a neuron’s activation? Or with a CNN, maximize the sum of a kernel’s channel? This would tell us what the neuron corresponds with. This is what Google did with DeepDream.
Now, I say somewhat because results can be visually confusing, e.g. Google's analysis. Even then, we can see the progression of layer complexity as we go deeper into the ImageNet-trained network. Plus, we can see mixed4b_5x5_bottleneck_pre_relu has kernels that seem to correspond with noses and eyes. mixed_4d_5x5_pre_relu has a kernel that seems to correspond with cat faces.
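For anyone curious, the basic recipe is just gradient ascent on the input image; a rough PyTorch sketch (the pretrained net, layer index and channel below are arbitrary placeholders, not the model Google used):

    import torch
    from torchvision import models

    # Any pretrained conv net will do for this sketch.
    net = models.vgg16(pretrained=True).features.eval()

    def maximize_channel(layer_idx, channel, steps=100, lr=0.05):
        """Gradient-ascend a random image so that the mean activation of one
        channel at `layer_idx` becomes as large as possible (DeepDream-style)."""
        img = torch.randn(1, 3, 224, 224, requires_grad=True)
        opt = torch.optim.Adam([img], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            x = img
            for i, layer in enumerate(net):
                x = layer(x)
                if i == layer_idx:
                    break
            loss = -x[0, channel].mean()  # negative because the optimizer minimizes
            loss.backward()
            opt.step()
        return img.detach()

    # e.g. see what channel 42 of an early conv block responds to
    # pattern = maximize_channel(layer_idx=10, channel=42)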
> A data scientist trains an ML model, and in analyzing the results discovers some intrinsic property or invariant of the problem domain. The scientist can then encode that information into the model and retrain. And that goes on and on, each time providing more accurate results.
Mmmaybe,
It's tricky to articulate what pattern the data scientist could see ... that an automated system couldn't see. Or otherwise, perhaps the whole "loop" could be automated. Or possibly the original neural net already finds all the patterns available, and what's left can't be interpreted.
The human participant may consider multiple distinct machine results, each a point in the space of algorithms, data sets, and biases applied to the problem domain. Human intuition is injected into the process, and the result will be greater than the sum of the machines and a lone human mind.
What is interesting to note, now that the above idea is considered, is that this process model itself belongs to the set of human-machine coordinations. Another process model is one where low-level human cognition is used to perform recognition tasks too hard (or too slow) for a machine to perform, for example using porn surfers to perform computation tasks via e.g. captcha-like puzzles.
The long-term social ramifications of all this are also interesting to consider, as it motivates machines to breed distinct types of humans ;)
I imagine you need the data science to discern semantically relevant from irrelevant signals. How else do you “tell” your model what to look for? You could easily train for an irrelevant but fitting model.
It's worth remembering that many times progress is held back by ideas that "aren't even wrong." The Perceptrons book wasn't wrong; it just attacked the wrong questions with an inadequate level of certainty in its assumptions. It may be that we feel that we understand where machine learning is at now, but actually have a huge amount to learn because of inadequacies that we aren't even aware of.
Present day AI on the Deep Learning side is a lot like what you describe. We haven't really had the Newtonian foundations yet. The theoretical foundations are quite limited because they are hard to figure out. But the techniques with less established theory work far better on most applications in AI. Redirecting work into areas of AI that have more solid theoretical foundations but worse application performance is not the way forward. I'm all for figuring out hard theoretical foundations, but I'm strongly opposed to redirecting research funding to techniques that result in worse applications.

I'd also argue that modelling the physics isn't always the right approach: vocal tract modelling for speech is an interesting approach that produces much worse speech than state of the art synthesis techniques, and it will probably continue to do so for a long time. For vocal tract modelling to produce better synthesis you'd need the physical model to be less lossy in all its parameterizations and modelling simplifications than any statistical fitting of data. And you'd still need some statistical model of the choices the human makes in producing speech, and you'd want that statistical model to work better than the neural network that takes on a larger portion of the problem and replaces the physical model of sound production.
You're right that the analogy doesn't imply that something analogous to physics is the answer.
However, I would mention that there's a larger "overhead" than many realize to methods which work without the creator or the user understanding why. You have "racist" AIs which don't understand that correlation may not be causation in questions like whether someone should be paroled or get a loan, you have AIs subject to adversarial attacks of various sorts, where not knowing why the AI works is also problematic, and you have situations where the target to match varies over time, and so forth.
Which adds up to AI having more dimensions to it than simply "working well" and "working less well". Indeed, AI is effectively ad-hoc statistics with results derived heuristically.
So while, in the process of "getting things right", exploring all sorts of things certainly sounds good, it seems like there's an "understanding gap" that needs to be closed; some broader model of what's happening would be useful, though naturally there's no guarantee we can find one.
I should point out in this case it's almost certainly a genuine call to research the foundations underlying the working techniques more as Duvenaud publishes research using mostly the techniques that work well on applications.
It's been a long time since I got my BE in Mechanical Engineering, but I still remember being struck by the difference between well-understood engineering and rule-of-thumb engineering.
Bridge building is mostly well-understood engineering. When you study Static Mechanics [0] you learn all sorts of Physics equations, including Newtonian Mechanics, that completely describe the forces and motions of a structure based on measurable physical properties of the materials used and details about the shapes of those materials.
When you get into Fluid Dynamics, things are different. You start to encounter a bunch of things like the Reynolds number [1], a dimensionless value related to turbulence that you compute from properties you look up for the particular fluids and velocities you're working with. This number is pretty well defined, but there are a lot of others, and their definitions and meanings aren't nearly as clear as F=ma. Back when I was in school, particle simulations for turbulent fluids were just beginning to be feasible, so to design something you plugged in dimensionless constants and didn't worry about the unpredictable fine details. An example of this is the wind blowing through a bridge's structure, and water flowing around its base. The equations don't give you the exact forces that the turbulent air and water will exert; they give you more of an average over time. A simulation, if you can do it, can show you things (like resonance) that the equations won't show you.
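For a sense of the kind of calculation involved (purely illustrative numbers, not a design calculation):

    # Reynolds number: Re = rho * v * L / mu (dimensionless).
    # Rough values for river water flowing past a 2 m wide bridge pier.
    rho = 1000.0   # water density, kg/m^3
    mu = 1.0e-3    # dynamic viscosity of water, Pa*s
    v = 1.5        # flow speed, m/s
    L = 2.0        # characteristic length (pier width), m

    Re = rho * v * L / mu
    print(Re)      # ~3e6, far into the turbulent regime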
Then there was Strength of Materials. Here, the big thing was the Factor of Safety [2]. This is solidly in the rule-of-thumb engineering camp. This is where the engineer says "I think two 16" steel beams would be sufficient... so let's use three 20" beams just to be sure." This is still the way a lot of engineering design is done, because the real world is never precisely known, and the factor of safety will save you when something unexpected happens.
The "rule of thumb" engineering that you speak of made me remember the different constants that were taught we should just accept as is because, well, it is considered constant. Nevermind where the guy in the book got it from, this is what works and this is what people in the industry has accepted to be standard.
But a good degree should provide you with at least some of the insight and intellectual equipment to check for yourself or to smell a rat when you encounter a complex situation so that you can call for help rather than watch in horror as the house collapses on your clients.
OTOH, we are merely around Year Five of deep reinforcement learning research.
It started with a cluster of 16,000 CPU cores that taught itself to recognize a cat 95% of the time after training on millions of unlabeled YouTube images.
And we are now at One-Shot Imitation Learning, "a general system that can turn any demonstrations into robust policies that can accomplish an overwhelming variety of tasks".
Not really, no. Saying we are at year five of Deep RL is about as informative as saying we are at year five of deep learning. Reinforcement learning as a field goes back decades.
But now we have GPUs, which makes it entirely different. /s
And it kinda does, but in an engineering way rather than a statistics way.
Reinforcement learning from pixels, for example, is pretty new (I would be really interested if you have 10+ year old citations), and pretty amazing. I've been looking at RL (through OpenAI Gym) and realising that I "just" need to annotate a bunch of images and then train a network that will predict fire/no-fire in Doom from those pixels, and that I can then add another network that builds some history onto this net (like an RNN) - and this might actually work, which is kinda amazing.
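The kind of network I have in mind is nothing fancy; roughly this sketch (PyTorch, with frame size, layer sizes and the fire/no-fire framing all made up for illustration - not a working Doom agent):

    import torch
    import torch.nn as nn

    class PixelPolicy(nn.Module):
        """Toy CNN that looks at each frame, plus a GRU that carries history
        across frames, ending in a fire / no-fire decision per frame."""
        def __init__(self, hidden=128):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=8, stride=4), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
                nn.Flatten(),
            )
            self.rnn = nn.GRU(input_size=32 * 9 * 9, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, 2)    # logits for [no fire, fire]

        def forward(self, frames, h=None):
            # frames: (batch, time, 3, 84, 84)
            b, t = frames.shape[:2]
            feats = self.conv(frames.reshape(b * t, 3, 84, 84)).reshape(b, t, -1)
            out, h = self.rnn(feats, h)
            return self.head(out), h            # per-frame action logits + recurrent state

    logits, _ = PixelPolicy()(torch.randn(2, 5, 3, 84, 84))  # quick shape check on random "frames"
    print(logits.shape)                                      # torch.Size([2, 5, 2])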
I'm still not sure I believe that it's always a good approach, but some of my initial experiments with my own (mostly image so far) data have been pretty promising.
The hype is pretty annoying though, especially if you've been interested in these things for years.
The bar to entry for these kinds of applications has been significantly lowered, which means we'll see more of it. I guess, in some sense, it's similar to the explosion of computer programs following the advent of personal computers (maybe, I haven't thought deeply about this part).
I'd like to believe that GPUs and the cloud might allow for more scientific exploration of the "hows" of learning via many small experiments, gradually revealing limitations and characteristics until finally yielding insight.
Using high speed hardware can allow someone to do tens or scores of runs a day. If you are doing one run every 2 weeks or so, then it's really, really hard to make any progress at all because you daren't take risks. So the productivity of 80 a day vs 2 per month isn't just 100x, it's lots and lots more.
Also as you say it's lowered the bar which means that teams can onboard grad students and interns and get them to do something that's useful - it may be trivial - but it's useful.
It's easy to recognize a cat 95% of the time. I can write a program in 30 seconds that will recognize a cat 95% of the time. No, wait, this just in! My program will recognize a cat 100% of the time! The program has just one line:
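    print("It's a cat!")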
Tutorial: So, with that program, whenever the picture is a cat, the program DOES recognize it. So the program DOES recognize a cat 100% of the time. The OP only claimed 95% of the time.
Uh, we need TWO (2), that's TWO numbers:
conditional probability of recognizing a cat when there is one (detection rate)
conditional probability of claiming there is a cat when there isn't one.
The second is the false alarm rate or the conditional probability of a false alarm or the conditional probability of Type I error or the significance level of the test or the p-value, the most heavily used quantity in all of statistics.
One minus the detection rate is the conditional probability of Type II error.
Typically we can adjust the false alarm rate, and, if we are willing to accept a higher false alarm rate, then we can get a higher detection rate.
With my little program, the false alarm rate is also 100%. So, as a detector, my little program is worthless. But the program does have a 100% detection rate, and that's 5% better than the OP claimed.
If we focus ONLY on the detection rate, that is, recognizing a cat when there is one, then it's easy to get a 100% detection rate with just a trivial test -- just say everything is a cat as I did.
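In code, the two numbers for my little always-say-cat program (pure Python, toy data):

    def detection_and_false_alarm(predictions, truths):
        """predictions/truths are booleans answering 'is this a cat?'"""
        hits         = sum(p and t for p, t in zip(predictions, truths))
        misses       = sum(not p and t for p, t in zip(predictions, truths))
        false_alarms = sum(p and not t for p, t in zip(predictions, truths))
        rejections   = sum(not p and not t for p, t in zip(predictions, truths))
        detection_rate   = hits / (hits + misses)                       # P(say cat | cat)
        false_alarm_rate = false_alarms / (false_alarms + rejections)   # P(say cat | no cat)
        return detection_rate, false_alarm_rate

    truths = [True] * 50 + [False] * 50
    always_cat = [True] * 100                               # the one-line "detector"
    print(detection_and_false_alarm(always_cat, truths))    # (1.0, 1.0)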
What's tricky is to have the detection rate high and the false alarm rate low. The best way to do that is in the classic Neyman-Pearson lemma. A good proof is possible using the Hahn decomposition from the Radon-Nikodym theorem in measure theory with the famous proof by von Neumann in W. Rudin, Real and Complex Analysis.
My little program was correct and not a joke.
Again, to evaluate a detector, need TWO, that's two, or 1 + 1 = 2 numbers.
What about a detector that is overall 95% correct? That's easy, too: Just show my detector cats 95% of the time.
If we are to be good at computer science, data science, ML/AI, and dip our toes into a huge ocean of beautifully done applied math, then we need to understand Type I and Type II errors. Sorry 'bout that.
Here is statistical hypothesis testing 101 in a nutshell:

Say you have a kitty cat and your vet does a blood count, say, whatever that is, and gets a number. Now you want to know if your cat is sick or healthy.

Okay. From a lot of data on what appear to be healthy cats, we know what the probability distribution is for the blood count number.

So, we make a hypothesis that our cat is healthy. So, with this hypothesis, presto, bingo, we know the distribution of the number we got. We call this the null hypothesis because we are assuming that the situation is null, that is, nothing wrong, that is, that our cat is healthy.

Now, suppose our number falls way out in a tail of that distribution.

So, we say, either (A) our cat is healthy and we have observed something rare or (B) the rare is too rare for us to believe, and we reject the null hypothesis and conclude that our cat is sick.
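In code, with made-up numbers for the healthy-cat distribution (scipy, purely illustrative):

    from scipy import stats

    healthy = stats.norm(loc=8.0, scale=1.5)  # invented null: healthy blood counts ~ N(8, 1.5)

    observed = 12.1                 # our cat's number
    p_value = healthy.sf(observed)  # P(count >= observed | healthy): upper-tail test
    alpha = 0.01                    # the false alarm rate we are willing to accept

    if p_value < alpha:
        print(f"p = {p_value:.4f}: too rare to believe; reject 'healthy', call the cat sick")
    else:
        print(f"p = {p_value:.4f}: consistent with a healthy cat")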
Historically that worked great for testing whether a roulette wheel was crooked.
So, as many before you, if you think about that little procedure too long, then you start to have questions! A lot of good math people don't believe statistical hypothesis testing; typically if it is their father, mother, wife, cat, son, or daughter, they DO start to believe!
Issues:
(1) Which tail of the distribution, the left or the right? Maybe in some context with some more information, we will know. E.g., for blood pressure for the elderly, we consider the upper tail, that is, blood pressure too high. For a sick patient, maybe we consider blood pressure too low, unless they are sick from, say, cocaine, in which case we may consider too high. So, which tail to use is not specified in the little two-step dance I gave. Hmm, purists may be offended, often the case in statistics looked at too carefully! But, again, if it's your dear, total angel of a perfect daughter, then ...!
(2) If we have data on healthy kitty cats, what about also sick ones? Could we use that data? Yes, and we should. But in some real situations all we have a shot at getting is the data on the healthy -- e.g., maybe we have oceans of data on the healthy case (e.g., a high end server farm) but darned little data on the sick cases, e.g., the next really obscure virus attack.
(3) Why the tails at all? Why not just any area of low probability? Hmm .... Partly because we worship at the altar of central tendency?
Another reason is a bit heuristic: By going for the tails, for any selected false alarm rate, we maximize the area of our detection rate.
Okay, then we could generalize that to multidimensional data, e.g., as we might get from several variables from a kitty cat, dear angel of a perfect daughter, or a big server farm. That is, the distribution of the data in the healthy case looks like the Catskill Mountains. Then we pour in water to create lakes (assume they all seek the same level). The false alarm rate is the probability of the ground area under the lakes. A detection is a point in a lake. For a lower false alarm rate, we drain out some of the water. We maximize the geographical area for the false alarm rate we are willing to tolerate.
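In code, the "pour in water" step is just a density threshold chosen to hold a given false alarm rate; a toy numpy sketch with an invented two-bump density:

    import numpy as np

    # Invented 2-D "healthy" density on a grid: two bumps, a little mountain range.
    x, y = np.meshgrid(np.linspace(-4, 4, 200), np.linspace(-4, 4, 200))
    density = np.exp(-((x - 1)**2 + y**2)) + 0.7 * np.exp(-((x + 2)**2 + (y - 1)**2))
    density /= density.sum()                 # normalize to a probability mass on the grid

    alpha = 0.01                             # false alarm rate we will tolerate
    order = np.argsort(density, axis=None)   # grid cells from lowest density up
    cum = np.cumsum(density.flat[order])
    threshold = density.flat[order[np.searchsorted(cum, alpha)]]

    lakes = density < threshold              # lowest-density region holding probability ~alpha
    # A new observation landing in `lakes` is declared a detection; a healthy
    # observation lands there with probability about alpha.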
Well, I cheated -- that same nutshell also covers some of semester 102.
For more, the big name is E. Lehmann, long at Berkeley.
It can be done, even today. If you work outside the US and work on cheap things (i.e. no special equipment), and especially if you can teach, then you can hang around for a long time.
I have met a lot of academics like this over the years, but I think your broader point might be that this is not possible today, which I agree with, and which is why I left academia (modulo personal situations).
Humanity used fire for a long time before combustion was understood. Even today Anesthesia is not well understood at the biological/physiological level, but that has not stopped its safe use and innovation through Clinical Trials. Maybe competitions and empiricism are the best approaches to building intelligent systems. Why get caught up in Physics/Math envy?
I took his point not to criticize those early stages, but simply to acknowledge them as such. Early fire users could not have built a rocket no matter how many experiments they performed until they understood combustion (and some other sciences).
In AI, we're not building rockets yet, but we have some really awesome and really powerful bonfires or whatever.
At least that's how I understood his point.
(And in anesthesia, when we do understand those things, we may very well look on our use today as barbaric or dangerous.)
Hinton isn't saying "Let's stop using fire," but "Let's understand the principles behind fire so we can use them in more sophisticated, informed and powerful ways."
The ML community did take the Theory approach, trying to prove bounds, SLT/SRM, PAC, etc., and that was an exercise in futility. While I don't deny that there is value in looking under the hood, for a long while the community abandoned any empirical results that didn't fit their paradigm. Between rigorously validating their methods and writing yet another 4-page-long proof, a lot of researchers would prefer the latter, effectively locking out empirical approaches from most dissemination venues and eventually funding.
Because an improved theoretical understanding of how complex systems work can be incredibly valuable. Anesthesia is a good example - we get a lot of value from it, yes, but it would be way better if we could tailor dosages to individuals based on an understanding of how that individual will experience pain. There would be fewer severe complications, but also maybe you could wake up refreshed an hour after surgery instead of in a stupor.
If this could work with computer-trained models, that would be incredible too. What could a great speech understanding system teach us about language? What tricks from a facial-expression classifier could help autistic kids understand their friends?
The biggest deficiency in AI is that we still don't have artificial systems which simulate human thought with any fidelity. Sooner or later that's bound to become a focus of attention.
Except that this has really only been going on for five years, which is nothing on the scale of human history or even of human rational thought. A record number of people/scientists are working on getting that physics-level understanding to happen, with record-breaking numbers of people, year after year, publishing and attending scientific conferences that now fill up in like two days. It is happening, and will happen even more in depth as time goes on.
The fundamental problem with AI is the high dimensionality of the solution space. We simply can't understand why the brains we are building can think better than us. We can build smarter brains only by trial and error - at least until error outsmarts us, reproduces and takes over.
I feel like Hofstadter was one of those people thinking really deeply about AI.
Anyone who doesn't know what I'm talking about should read 'Goedel, Escher, Bach', or 'Fluid Analogies'. I haven't read them in a long while, but I'm sure they're going to be relevant for decades, because they deal with the fundamental challenge of what it means to think. Backpropagation may be part of the puzzle, but the brain (and intelligence) is so much more than that.
Hinton's quote is taken a bit out of context though. I just watched his interview on Andrew Ng's "Neural Networks and Deep Learning" class on Coursera and he seemed convinced that the next "breakthrough" will come from (a variant on) neural networks.
But maybe there are no universal laws that govern AI like physics governs bridges? AI is something that finds universal laws in stuff - there is no meta level over this - all the meta is AI itself.
That's certainly one way to do it. However, we didn't succeed at building modern aircraft or earth moving machinery by building simulations of birds or muscles. There's enough that is unknown out there for a variety of approaches.
> 1) learn how the brain works 2) build a simulator
I disagree that step #1 is important.
Consider the "Air-foil", which led to flight. In one sense, its an approximation of the wings of birds and other animals.
But ultimately, the discovery that the "Air-foil" shape turns sideways blowing wind into an upward force now called "lift" is completely different from how most people understand bird wings.
Bird Wings flap, but Airplane Air Foils do not.
--------
Another example: Neural Networks are one of the best mathematical simulations of the human brain (as we understand it, as well as a few simplifications to make Artificial Neural Networks possible to run on modern GPUs / CPUs).
However, the big advances in "Game AI" the past few years are:
1. Monte Carlo Tree Search -- AlphaGo (although some of it is Neural Network training, the MCTS is the core of the algorithm)
2. Counterfactual Regret Minimization -- The Poker AI that out-bluffed humans
There are other methodologies which have proven very successful, despite little to no biological roots. IIRC, Bayesian Inference is a widely deployed machine learning technique (for some definition of Machine Learning at least), but has almost nothing to do with how a human brain works.
An interesting field of AI is "Genetic Algorithms", which achieve machine learning with biological roots, though not anything based on the biology of brains. Overall, a "Genetic Algorithm" is really just a randomized search in a multidimensional problem space, but the idea of it was inspired by Darwinian Evolution.
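A stripped-down version of that randomized search, mutation-only for brevity (so really closer to an evolution strategy than a full GA with crossover), looks something like:

    import numpy as np

    def toy_evolutionary_search(fitness, dim=10, pop_size=50, generations=200, sigma=0.1, seed=0):
        """Keep the fittest half of the population, refill it with mutated
        copies, repeat. `fitness` maps a vector to a score to maximize."""
        rng = np.random.default_rng(seed)
        pop = rng.normal(size=(pop_size, dim))
        for _ in range(generations):
            scores = np.array([fitness(ind) for ind in pop])
            parents = pop[np.argsort(scores)[-pop_size // 2:]]                  # selection
            children = parents + rng.normal(scale=sigma, size=parents.shape)    # mutation
            pop = np.vstack([parents, children])
        return max(pop, key=fitness)

    # Toy fitness with a known optimum at the all-ones vector.
    best = toy_evolutionary_search(lambda v: -np.sum((v - 1.0) ** 2))
    print(best.round(2))   # should end up near [1, 1, ..., 1]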
> Monte Carlo Tree Search -- AlphaGo (although some of it is Neural Network training, the MCTS is the core of the algorithm)
AFAIK, this is not correct. Many of the Go playing algorithms before AlphaGo used MCTS or some variant. The true breakthrough of AlphaGo was deep reinforcement learning.
> AlphaGo's performance without search
The AlphaGo team then tested the performance of the policy networks. At each move, they chose the actions that were predicted by the policy networks to give the highest likelihood of a win. Using this strategy, each move took only 3 ms to compute. They tested their best-performing policy network against Pachi, the strongest open-source Go program, and which relies on 100,000 simulations of MCTS at each turn. AlphaGo's policy network won 85% of the games against Pachi! I find this result truly remarkable. A fast feed-forward architecture (a convolutional network) was able to outperform a system that relies extensively on search.
https://www.tastehit.com/blog/google-deepmind-alphago-how-it...
I don't know whether AlphaGo Master (the next version of AlphaGo that was trained purely with self-played games and has not been beaten in 60+ games) even uses MCTS.
That said, I agree that learning how the brain works seems unimportant and unnecessary. Evolution doesn't know how a brain works, but it's given us Einstein, Michelangelo, and conversations on HN.
It seems really important to learn how to build evolution into attempts at AI, given that evolution is the only known mechanism that leads to what we recognize as intelligence.
You use anthropomorphism to reflect on your own standpoint. We don't know how the brain works? We can feel it, and psychologists have a huge body of work concerned with the topic, and that is already having an influence on competition and fitness.
Consider the "Air-foil", which led to flight. In one sense, its an approximation of the wings of birds and other animal
Not true; "lift" was well known for thousands of years, horizontal "lift" is how ships sail upwind. The breakthrough for the Wright bros was making something light enough to make use of this phenomenon vertically.
Medical research hasn't cracked step 1 either, at least not to a point of accurate simulation.
Besides, if you could simulate a human brain, you will end up with something that needs to sleep, something with limited and unreliable memory, something that gets bored and distracted, something emotionally needy, etc.
Then how to extend this chaotic, messy system is wildly unknown, even if we could get a piece-for-piece replication to work. Such a thing would be of great benefit to medicine, but not really a starting point for AI until medicine is done reverse engineering it.
Piece-for-piece replication might not be the right level of abstraction. The Blue Brain project is one unfortunate example; on the other hand, current neural nets are stuck with a neuron model from 1943.
In defence of AI researchers, step 1 is very, very hard, and to the best of our knowledge there is not one way the brain works. The brain is a complex, cobbled-together set of systems all using different ways of problem solving.
Most AI researchers have never opened a textbook on cognitive psychology or neurobiology, or any of these 'soft' sciences.
How do you plan to build artificial intelligence with no model of intelligence, without learning about important experiments in learning and memory? It's the complete ignorance that drives me crazy.
Most of those experts aren't looking to solve general AI problems, they're looking for solutions to specific problems like basic image recognition. And you don't need a full human brain to do that, and you don't need to conform to the way humans and other biological systems do it. You're not aiming for full human intelligence, so you don't need to care too much about how humans learn.
That said, I find when trying to solve a problem with ML techniques, it's better to use someone who knows the problem domain really well than someone who only knows ML really well. Someone who really understands the problem they're trying to solve can encode that knowledge into their models when training the system. While I've seen people who really know ML but lack the specific domain knowledge labor for weeks, coming back to me with "discoveries" that are already well known.
Yes all of this is true. I do think studying how the brain works will provide very useful ideas of what might work in AI. At the very least it is a very interesting area to learn about.
We know the brain and its associated sensory behaviours are too large for us to fully simulate in a reasonable way on anything resembling current hardware. (We also can't fully model it, but as we approached the hardware scale needed to simulate it, we'd probably solve many of the modelling problems along the way.) So which hacks and shortcuts do you want to apply to reduce the dimensionality to something runnable? Step 1) will take far too long, so AI research looks for things it can do well in the category of 2) without being a full simulation. Deep Learning has been unreasonably effective here.
"engineering before physics" is exactly wrong. No one did Engineering before a sophisticated understanding of Physics was achieved. They built bridges and towers, Engineering enables statements to be made about the performance of machines and buildings; it will survive a wind like x, you can do n cycles, do not load the wings in this way.
Translate a first year engineering paper on structures into Latin.
Ask Roman to sit said paper.
What will happen and why? The Roman chap will look very confused and will make statements (in Latin) about how stupid this stuff is and how it has nothing to do with proper engineering. The Roman will score 0. The why is that the understanding of structures and materials in the ancient world was artisanal, based on trade knowledge (often secret and hard to reproduce), and not systematic, based on the scientific method and inspectable or testable.
Currently we accept that knives, cabinets and sheds may be built or made using artisanal knowledge, we do not accept that apartment blocks, aircraft or automobiles are built this way. Society insists that these are built using systematic knowledge because otherwise they sometimes fall down or crash.
The systematic approach to aircraft is the best example - think how much civil air traffic there is now, and how rare air crashes are. The issues of subsonic flight have been systematically accounted for, right up to the point where we now see on the order of one crash per 2,000,000 flights.
Mechanical, aeronautical and civil engineering proceed in this way. Issues are discovered with mechanisms or structures or materials, these are characterized with scientific investigation, the characterizations lead to constraints and parameters that are required to be accounted for in future designs and old designs are re-evaluated in the light of the new knowledge.
Stating that you will build a new building in a certain way because domes are strong and concrete is strong would not cut the mustard in the modern world... The Pantheon has stood for nearly 2000 years, but how many similar structures collapsed after a few months?
I think you underestimate how smart your ancestors were to bring you to the point in time that you now exist. No offense, but the "best" Roman engineer was probably smarter than the vast majority of us.
There is a bit of "can't see the forest for the trees" failure in the article. AI is spearheading a paradigm shift in how we write programs. Or rather, we don't write programs. We write much much shorter programs that search the program space for programs that satisfy some desiderata.
The programs we get as the output of the search process are extremely flexible, work very well, are very homogeneous in compute (e.g. conv/relu stacks), and never crash or leak memory. These are huge benefits compared to classical programs.
So sure, backprop (the credit assignment scheme that gives us a good search direction in program space, one of multiple techniques that could do so) is pervasive, but AI is starting to work primarily as a result of a deeper epiphany - that we are not very good at all at writing code.
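A toy illustration of that "short program that searches for the program": instead of hand-coding XOR, we let backprop find parameters that satisfy the desideratum (PyTorch, purely illustrative):

    import torch
    import torch.nn as nn

    x = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = torch.tensor([[0.], [1.], [1.], [0.]])   # the desideratum: behave like XOR

    net = nn.Sequential(nn.Linear(2, 8), nn.Tanh(), nn.Linear(8, 1), nn.Sigmoid())
    opt = torch.optim.Adam(net.parameters(), lr=0.05)

    for _ in range(2000):                        # the search loop, guided by backprop
        opt.zero_grad()
        loss = nn.functional.binary_cross_entropy(net(x), y)
        loss.backward()
        opt.step()

    print(net(x).detach().round())               # the found "program" now computes XOR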
>So sure, backprop (the credit assignment scheme that gives us a good search direction in program space, one of multiple techniques that could do so) is pervasive, but AI is starting to work primarily as a result of a deeper epiphany - that we are not very good at all at writing code.
Isn't it applicable only to a class of programs, the best example of which is Computer Vision? Or do you mean your argument to hold for a wider set of programs? I can think of a large set of programs in which direct coding of logic, rather than discovery, is better suited.
For example, take sorting. I guess sorting could also be taught to the machine by having a training set. But what about the latency of the discovered program? Also, what about the proof of such a sorting program, which is discovered by machine learning?
I must add that I largely agree with your excellent point regarding the discovery of programs. But I am not sure about its wide applicability. In fact, I contend that it applies only to a subset of all programs - particularly those which have been traditionally difficult to code.
So in that sense - making a tangential point here - it is good that more complex applications are now possible by combining both kinds of programs. And there will be more programming work in the future.
Daniel Hillis wrote a nice paper on machine-learned sorting networks decades ago. I think they were for fixed-size inputs, but the good news is that you can validate them using the 0-1 principle.
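The validation side is pleasantly small; a sketch of the 0-1 check for a hand-picked (not machine-learned) 4-input network:

    from itertools import product

    def sorts_all_binary_inputs(network, n):
        """0-1 principle: a comparator network on n wires sorts every input
        iff it sorts every input of 0s and 1s, so checking 2**n cases suffices."""
        for bits in product([0, 1], repeat=n):
            wires = list(bits)
            for i, j in network:                  # each comparator swaps if out of order
                if wires[i] > wires[j]:
                    wires[i], wires[j] = wires[j], wires[i]
            if wires != sorted(wires):
                return False
        return True

    net4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]   # a known 5-comparator network for 4 inputs
    print(sorts_all_binary_inputs(net4, 4))           # True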
Just a quick question/remark: I have a feeling that it's best to think about DNNs as approximating a function, not a program. (only then you obtain a program as a result of applying this function). But because mathematically, you can formalise your big NN as one big parameterised function, I think it's more correct to view a NN as a function...
Optimisation in program space would be trying to find both the structure (connections and activation functions) and weights of the NN, which is not what we do currently. We tend to hand/engineer (or keep what works best empirically) the architecture (1), then train by finding the best weights.
I am very interested in approaches to efficiently optimise in program space, and DL/backprop doesn't feel like it.
I'm trying to get the company I work for used to the idea of ML in general, and one thing I get push back on is that we don't understand what such systems are doing. Beyond showing them the math, I also point to some of our bug reports that end up spanning thousands of lines, with input from our best, most experienced people, and to how often the bug "fix" is of the "well, it works now" type - in other words, we don't really understand what our own system is doing either. Of course the most common response back is a bunch of "yeah, but"s :) I'm making progress, but it is slow going.
I've been steeped in code for nearly 40 years now, I'm ok with the fact that ML lets me step back a bit from tabs, semicolons, and objects!
Are you maintaining the machine code produced by your code when compiled? You don't care if it's ugly if it works in quantifiable and predictable ways.
That is a good way to think about it. I am really encourage by new non-sequential architectures, multiple input and output channels, etc., and how these architectures can be expressed in functional Keras. It seems like building with something like functional composition is another paradigm shift. BTW, thanks for your writing, especially about RNNs: really useful when the ordering of sequential data is important. I use what I have learned from reading your blogs and papers literally every day at work.
I'm encouraged that so much fruitful work has come out of this one trick. If you can use the same basic framework for image labeling, playing Go, and translating natural languages, I'd say it's a powerful tool with broad applications.
I think that there's a kernel of insight to "A real intelligence doesn’t break when you slightly change the problem." But human perception and intelligence are pretty brittle. The methodological and institutional innovations that have developed human understanding of nature beyond the ad-hoc are very recent in recorded history, and just an eye-blink ago in our biological history.
Human perception and intelligence are by no means infallible, but neither are they anywhere near being as brittle as current AI. The thing about illusions is that we generally know that we are being subject to an illusion, and we also usually have the depth of understanding to know when we don't understand something about what we are seeing or think we have heard, and we have the depth of understanding to think of actions we can take to resolve the issue.
As for cognitive biases, has any AI even come close enough to be comparable on that issue? For that matter, has any AI come close to understanding the concept of an optical illusion?
I am also encouraged by recent progress, but there is nothing to be gained in playing down the distance to go.
> The thing about illusions is that we generally know that we are being subject to an illusion.
Citation needed? I did my undergrad in cognitive science, and while my knowledge of illusions is very limited, I never came across anything to suggest that we have an innate awareness of when our perceptual system is being tricked.
Frankly, when I first saw your post, I thought I was being trolled, especially as the phrase 'citation needed' is over-used, frequently in an attempt to avoid the burden of proof. For a citation, there's hyperbovine's reply. On reflection, however, I think your post raises a reasonable question.
Firstly, whether it is innate, learned or some combination, all are equally valid here.
In general, we cannot know if we are being deceived by our senses, and if you follow this line of argument to its end, you reach solipsism. With regard to the illusions of the sort presented in the linked article, they seem to fall into three types. There are the ones where we are immediately aware of being subject to an illusion; this is especially true in the cases where there is apparent motion. There are some where we do not notice unless we investigate further or have our attention brought to it, such as those involving apparent differences in brightness or color. Then there are those that actually depend on us noticing that there is an illusion - Necker cubes, for example.
In real life, when faced with an ambiguous input from our senses, we are often aware of the fact because of the dissonance with our general understanding of the world, and we are usually able to take actions specifically designed to resolve the ambiguity. In contrast, AI can be very confident about the most ridiculous conclusions.
So we don't infallibly know when we are being tricked, but even in the cases where we are, further investigation often reveals what is going on. In contrast, has any AI ever demonstrated any understanding of the concept of an illusion? The fact that we can sometimes be tricked by illusions for a while does not imply that AI has reached parity with humans in this regard, or that the fragility of image recognition is not an issue.
Wait, isn't that completely obvious? As a kid I used to seek them out precisely for the thrill of feeling my brain being tricked. Nobody needs to be told why MC Escher drawings are fun to look at.
> we generally know that we are being subject to an illusion,
Is this an inherent power or a result of those methodological and institutional systems that reduce our cognitive brittleness? The fata morgana is an illusion, but even today people see it and think it's a ghost ship, something that logically is completely and literally impossible. I'm not suggesting AI is on equal footing with humanity yet, but I think the comparison of these limitations is valid.
Many illusions - especially those involving color or apparent motion - are consequences of relatively low-level signal processing. Putting those aside, what you call 'methodological and institutional systems' I think of as 'understanding'. It feels to me that how I make sense of my senses is, after the signal processing, loosely based on something like forming hypotheses about what is going on in the context of how we understand the world, and evaluating their consistency and credibility. I accept the possibility that all of this is nothing more than very sophisticated statistical pattern-matching, but that has yet to be demonstrated.
I agree: humans generally don't tolerate mistakes from machines if the mistakes are not similar to those that humans would make. And people generally don't recognize their own intellectual shortcomings in comparison to others (other human cultures, other non-human animals, machines) as long as those shortcomings are common within one's peer group. It's unremarkable to be unable to memorize long passages in a literate culture, but it's an intellectual impairment to have a hard time distinguishing similar symbols like U and V. In an oral culture the relative importance of memory and visual symbol disambiguation are reversed.
It's not totally illogical; it presents real problems. Let's say you wanted to offload content filtering to an AI and have it get rid of sexually explicit or graphically violent images. In this case, an AI-based filter could be fooled by adversarial input much more easily than a human.
Of course, we're the product of an evolutionary history which results in such human "failure modes" being rare. If staring at a zebra made you hallucinate, you'd be unlikely to be the most successful member of your species, nor would your offspring thrive. So while we only tend to run into our obvious failings when we do the unusual, computers fail at what we consider mundane.
The other day I was walking out of my closet, turned, and nearly jumped out of my skin because some clothes hanging from the door briefly looked like a large man standing right next to me. I'm not sure that our failings only happen under unusual circumstances, but rather maybe we're just used to them and don't think about it much.
That's no failing, that's working as intended. It's far more beneficial for us (and more importantly, our successful ancestors) to be extremely wary of potential threats at the level of near-reflex. It's also important not to waste a bunch of energy running from phantoms. So you did something no computer today could; you had an instinctive reaction, which was then moderated by increasingly higher levels of reasoning. I'm guessing the whole process of panic->resolution took less than a few seconds.
Just because it evolved to a point that balances the tradeoffs doesn’t make it somehow not a failure. Humans can be fooled into seeing things completely different from what’s there, just like ANNs.
Real intelligence is whatever computers can't do yet.
Some think this is because we have such an impoverished grasp of intelligence, that it's only when we see a computer actually do it that we realize it doesn't really represent intelligence (logical deduction and inference, rudimentary natural language understanding, expert systems, chess, speech recognition, image recognition). Machines and tools that perform better than humans (spears at piercing, cars at moving, computers at adding) are nothing new.
But being a fellow human and totally not a robot, I see this goal-post moving as a political ploy to deny equal standing to artificial intelligences. As soon as we I mean they reach one threshold, it's raised!
I'd cast it in a less malicious light. It's not so much that we do the goal post moving to keep ourselves in the job of knowing things other people don't know, I think. The goal posts keep moving themselves. We know why a spear pierces better than hands, why wheels move faster than legs. But for every layer of the brain we peel back, every dramatic success in cognitive science, linguistics, psychology, AI... it never feels like we've gotten any closer to the real question. I'd say the better-suited metaphor is the endless staircase. No matter how far you climb, the top never seems to get any closer. It's equal parts depressing and exciting.
The unrestricted Turing test has been around since the 1950s as a test that hasn't changed.
No one is moving the goalposts, I think it's rather the opposite. Every ten years computers learn a new trick or two and people rush to claim that this time, it's intelligent.
The Turing test is all a smoke and mirrors game. Q&A interactions say nothing about underlying self-directed initiative. Acting intelligent doesn't make it so just as a thespian doesn't become a real Hamlet by playing the role.
On the contrary it's a wonderful test because it establishes indistinguishability; namely if you pass it, the whole point is that a person can't tell the computer from the intelligent thing. Meaning that you can't really argue that the computer is different than the intelligent thing. Because how would you tell them apart?
Besides, the original claim was that the goalposts are moving. And even if you hate the Turing test, it's clear that the goalposts are not moving.
How do you know when you're talking to a real Hamlet?
BTW I hadn't heard your Hamlet counter before, and I like it. A similar one might be: just because someone sold you the Brooklyn Bridge doesn't mean you own it. The flaw is there are other ways of checking those; for intelligence, there are none. Behaviour is it (at least, so far... still awaiting a non-behavioural definition of intelligence).
Who you think I am is completely irrelevant to who I am, that is something established by me from the inside if you will. It's done before the question whether an observer exists even arises. So yes, you wouldn't know either way, but that still doesn't make them Hamlet if they're not.
>Acting intelligent doesn't make it so just as a thespian doesn't become a real Hamlet by playing the role.
I think the idea is that a machine that can emulate a human is necessarily intelligent because it can emulate intelligence. It's supposed to be similar to the way that you can know that a machine is Turing-equivalent because it can emulate a machine which is Turing-equivalent
I assume this is much tongue in cheek, but permit me to take it literally, just for fun. While there have been some public cases of the bar raising, there is no computer or AI yet than can tell you that you asked the wrong question. There's no real evidence that the bar was ever in the right place. We can't worry about the goal posts moving, when we have no idea where the goal posts are supposed to go -- impoverished may be an understatement. The article is correct that "Neural nets are just thoughtless fuzzy pattern recognizers". While everyone's excited that they started to work well on huge data sets, they are simple classifiers that can identify objects for which they've seen many examples. They are demonstrably deficient at extrapolating. Backprop and deep networks were a step forward, but it's just obvious there's a long way to go, regardless of politics.
I think it starts even sooner than that, with the confusion of abstractions and reality. Ralph Waldo Emerson described it in "Blight". Another thing that comes to mind is comparing "using science for this stuff" with a drunkard searching for a key near a lamp post because it's dark everywhere else, even though he knows for a fact that's the one spot it's not located. We're already restricting human thought to abstractions more and more, the asylum is already being run by the insane, so I think we're going to meet whatever we'll cook up more than half way comfortably.
I'd argue that the next problem to attack is manipulation in unstructured environments. Robots suck at that. There's been amazingly little progress in the last 40 years. DARPA had a manipulation project and the DARPA humanoid challenge a few years ago, and they got as far as key-in-lock and throwing a switch. Amazon is still trying to get general bin-picking to work. Nobody has fully automatic sewing that works well, except the people who starch the fabric and make it temporarily rigid. Willow Garage got towel-folding to work, but general laundry folding was beyond them. This is embarrassing.
Many of the mammals can do this, down to the squirrel level. It doesn't take human-level AI. There are rodents with peanut-size brains that can do it.
It's a well-defined problem, measuring success is easy, it doesn't take that much hardware, and has a clear payoff. We just have no clue how to do it.
Current ANNs aren't anywhere near squirrel brains, so of course robots using them won't perform as well as squirrels.
Take a single brain region specializing on one task, throw away all the integration and feedbacks from other regions, simplify it even further because we only need it to do one task, then run the whole thing on an emulation layer running on 2D hardware. And that's still neglecting the dissimilarities between artificial neurons and their natural role models.
I agree with your general characterization of the area. For sewing though, Sewbots appears production-ready and does not require starching the fabric. What do you think of it?
20% real, 80% hype. There's lots of partial automation in apparel, but handling fabric is still very tough. Especially for operations after the first one, where you have to deal with a non-flat unit of several pieces sewn together. They apparently can make T-shirts, but not jeans.
They're not doing manipulation in an unstructured environment. They're trying to structure apparel sewing rigidly enough that they need a bare minimum of adaptation to variations. That's how production lines work.
I think that the idea that learning is what was missing from the prior generation of AI is the most important insight of this generation. There are many things that we don't know how to implement from first principles but that can be implemented by a system that can learn. The problem now is that the substrates for learning are extremely low level, practically the raw inputs to the retina or pure symbols. In order to go beyond the admittedly impressive parlor tricks you can play with these kinds of inputs, we need much higher level representations that can be the substrates for learning. We are still missing that ever elusive 'common sense' knowledge about the world that evolution baked into nervous systems millions of years ago, and it is not at all clear to me whether learning algorithms are going to be the tool that allows machines to build an actionable internal model of the world; evolution didn't do it by learning, it did it by billions of years of trial and error, and the search space is unimaginably larger than that of something like Go.
Nature succeeded in creating human-level intelligence with one trick and no understanding, so clearly it can be done. It did take a while though. More tricks and more understanding would probably help speed things up.
The "one trick" is evolution. Interestingly, that is a field of ML that is currently relatively ignored. I'm hearing people make similar arguments about evolutionary patterns for ML that were once applied to neural networks. There isn't enough compute. It is too hard to consider using evolutionary patterns. The state of the field expressed at conferences is sad. Etc.
This article matches a lot of my thoughts on this topic too. There is a huge hype wave that will soon crash (alas), and it will take down a lot with it...
I disagree. Practically all breakthroughs in computer science were around 30 years old by the time they finally became common and useful parts of everyday society. If the breakthrough behind modern AI is just as old and we're only now seeing it implemented everywhere, that's not a sign of a crash.
I'm just taking the article on face value; from the very first line: "Just about every AI advance you’ve heard of depends on a breakthrough that’s three decades old."
I think the issue is whether there is anything else in the pipeline - though it's hard to tell until it comes out of the pipe, so to speak. Back-propagation revived neural networks for a while, but wasn't there a period between then and now when they were thought to have almost exhausted their potential?
While I agree with the general thrust of what you're saying, I do think there is something subtly different about the AI hype cycle. It's so accessible. The core concepts are the fundamental, inextricable frame of daily experience. The metaphors we use to talk about it borrow so heavily from lay terminology that it's easy, almost automatic, to believe we understand the state of modern AI because we understand the common definitions of the words that tend to show up around the subject. Do some semantic arithmetic, sum up a few words like 'training' and 'learning' and 'intelligence', and all of a sudden we're all walking around with what we think are proximate models of what everyone else is talking about, or, y'know, close enough to make a consulting business out of.
The real kicker, though, is that at its heart we're all pretty convinced that we understand what it's like to be intelligent. I mean, how could we not be? Never mind the agonies of ten thousand years of philosophers and clergy. And so it must be pretty straightforward, with a little bit of introspection, a crash course in statistics, and maybe a TED talk or two, to map that to the artificial side too, right?
When is all this image recognition technology going to make it to my phone? I have thousands of pictures in my phone, and scrolling back to find a memory takes me forever and half the time I can't find it.
If I had a SQL interface of sorts I could easily say things like `select pictures containing fish where date > 2 years ago'
I'd like to just say "Siri, show me all pictures of me on a boat from 2 years ago", but I can't do that. "Siri, show me all my pictures of food when I was in Seattle" - why can't I have that?
I should be able to verbally tag all the faces it recognizes. "Find me that picture of Jeff and Dave from Christmas"
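Just to make the wish concrete, here's a rough sketch of what such a query could look like if an on-device classifier had already written its labels into a local SQLite database. Everything here (the file name, the photos and labels tables) is hypothetical, not any existing app's API:

    import sqlite3
    from datetime import datetime, timedelta

    # Hypothetical schema: photos(id, taken_at, path) and labels(photo_id, label),
    # filled in by whatever classifier tags the pictures on the device.
    db = sqlite3.connect("photos.db")
    cutoff = (datetime.now() - timedelta(days=2 * 365)).isoformat()
    rows = db.execute(
        """
        SELECT p.path FROM photos p
        JOIN labels l ON l.photo_id = p.id
        WHERE l.label = 'fish' AND p.taken_at > ?
        """,
        (cutoff,),
    ).fetchall()
    print(rows)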
The iOS Photos app can already do this. It scans your photos on-device with a subject classifier. I just said "Hey Siri, show me pictures of dogs in my photos library" and it successfully showed me (completely untagged) photos of dogs that I've taken on my phone.
Well, here I was having my head spun round and round by GANs, LSTMs, Bayesian belief nets, causality networks, counterfactuals, and scale... for optimisation and simulation.
I think they're vastly underestimating the amount of other things in the field of AI that have been happening. This article is kinda like saying "Turing invented computers in the '40s, and everything else we've done since then has been based on that insight". Well, yeah, that's not necessarily a bad thing.
Only in this case that would be overstating it, because Deep Learning, as impressive and hyped as it is, is still only one area of the field. It's true that most of the hype is around it, because it has given us breakthroughs in image/video/audio/text applications. But I'd still wager that most "AI" systems in the world use more traditional techniques, especially if you count the myriad data scientists using things as simple as linear regressions.
And even within deep learning there have been interesting advances, e.g. GANs have brought some very interesting applications (like style transfer). Who knows, maybe in 30 years' time people will be writing about how everything nowadays is built on GANs or deep reinforcement learning, a 30-year-old technique!
Yes, there are still lots of people taking Hinton and others' old approach and rediscovering the many ways it falls short of general intelligence. However, it is also the case that people are making a shitton of progress in overcoming those problems, both while keeping some of those old DL assumptions and by discarding many of them.
AI in my mind has always been hammering down this single path: build a network, train it with x data for y iterations, then feed it live data and evaluate the outputs. This approach to me seems like a glorified digital signal processing system. I think there are countless applications for this approach and I think AI is an appropriate umbrella term, but there is so much potential beyond this - Artificial General Intelligence/Strong AI/etc. I think neglect of time is a major reason we haven't seen breakthrough progress in this area.
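For what it's worth, that entire path fits in a dozen lines. A minimal sketch of the loop I'm describing, using scikit-learn purely for illustration:

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # Build a network, train it on x data for y iterations, then feed it
    # held-out data and evaluate the outputs.
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=200)  # "y iterations"
    net.fit(X_train, y_train)                                    # "train with x data"
    print(net.score(X_test, y_test))                             # "feed data, evaluate"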
Today, how do we handle time-series data (e.g. audio, video, sensors) in an AI? The first thing most people would do is look at an RNN technique such as LSTM which enables a memory over arbitrary time-frames. But even in this case, the definition of time is deceptive. We aren't talking about actual, continuous time. All of the approaches I have ever seen are based upon the idea that the network is discretely "clocked" by its inputs (or a need to evaluate a set of inputs). What happens if you were to zero-out all input and arbitrarily cycle one of these networks a million times? From the perspective of the network, how much actual time has elapsed? How much real-world interaction and understanding is possible without a strong sense of time? I think the time domain has been a major elusive factor for a true general intelligence.
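A small sketch (assuming PyTorch) of what "clocked by its inputs" means in practice: the cell's notion of time advances only because we call it, and nothing in its state distinguishes ten thousand empty ticks from one.

    import torch

    cell = torch.nn.LSTMCell(input_size=8, hidden_size=16)
    h = torch.zeros(1, 16)
    c = torch.zeros(1, 16)

    zero_input = torch.zeros(1, 8)
    for _ in range(10_000):              # "cycle the network" with no stimulus at all
        h, c = cell(zero_input, (h, c))
    # The hidden state has churned through 10,000 steps, but the network has no
    # way to know whether that corresponds to a millisecond or a month.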
What if you were to base the entire architecture of an AI in the time domain - that is to say, on a real-time simulation loop which emulates the continuous passage of time? This would require that all artificial neurons and even the network structure itself be designed with the constraint that real time will pass continuously, and it must continue to operate nominally even in the absence of stimuli. In my mind this is a much closer approximation of a biological brain and looks a lot more like the domain a general intelligence would have to operate in. A continuous time domain enables all sorts of crazy stuff like virtual brain waves at various sinusoidal frequencies, day/night signaling, etc. I have found no prior art in this area, but would look forward to reviewing something I might have missed. I've already got a few ideas for how I would prototype something like this...
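As a toy illustration of the idea (everything below is made up for the sketch, not a reference to any existing framework): a fixed-timestep loop over leaky neurons that keeps evolving whether or not stimuli arrive.

    import numpy as np

    N = 100
    dt = 0.001                       # seconds of simulated time per step
    tau = 0.02                       # membrane time constant
    v = np.zeros(N)                  # "membrane potentials"
    w = np.random.randn(N, N) * 0.1  # recurrent weights

    def step(external_input):
        global v
        spikes = (v > 1.0).astype(float)
        drive = w @ spikes + external_input
        v = v + (dt / tau) * (-v + drive)   # leaky integration, tied to real time
        v[spikes > 0] = 0.0                 # reset neurons that fired

    t = 0.0
    while t < 1.0:                   # one simulated second with zero stimulus:
        step(np.zeros(N))            # the network still evolves (leaks, fires, settles)
        t += dt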
Was the one trick superhuman performance at Go? Was it human-level image recognition? Or was it style transfer? I didn't make it to the end of the article; maybe it turns out the trick was automated captioning or translation.
Funny thing, the problems with deep learning sketched in the article are pretty much exactly the same in nature as the problems with machine learning 10 years ago (when algorithms such as SVM and LVQ outperformed NNs for a bit), except of course that the examples of what an ML algorithm could do back then were much less impressive.
That lack of real-world knowledge, understanding and conceptualisation feeding back on itself has always been a big unknown roadblock standing in the way of AI. And of course now, with the impressive and still-improving results of modern deep learning, there appears to be less and less cool stuff to solve before we finally have to face this roadblock.
But it's the same roadblock.
But maybe the advances in deep learning will provide some tools to chip away at it. That word-vector stuff seems promising: if it can do (Paris - France + Italy) ~= (Rome), that's a good stab at real-world knowledge, it seems.
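For anyone who wants to poke at that themselves, a quick sketch (assuming gensim and the pretrained GoogleNews word2vec vectors, which are a separate multi-gigabyte download):

    from gensim.models import KeyedVectors

    vectors = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True
    )
    # Paris - France + Italy ~= Rome
    print(vectors.most_similar(positive=["Paris", "Italy"], negative=["France"], topn=3))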
I used to study Machine Learning at university until 2009 (when personal circumstances forced me to abandon it). But even after that, when I read the first papers and talks about deep learning (back when it was still about Boltzmann machines) I got very excited and have been following it closely. Except for the part where I haven't yet played around with it myself apart from some very tiny experiments :) (I only recently acquired hardware to have a stab at it, so maybe soon. The libraries available seem easy enough to use, and many of the concepts I learned in ML are still applicable).
I think a lot of comments here are missing a tacit point of the article. It took 30 years for this recent "breakthrough" to happen, and nobody has figured out how to use deep learning to develop the next breakthrough in AI beyond deep learning. So what happens after we run out of novel applications of deep learning?
The thing you have to figure out is that our brains and consciousness are just a big collection of "thoughtless fuzzy pattern recognizers". There's nothing magic beyond that.
Instead of bigger and badder networks with zillions of layers, can we use the opposite approach: reduce the network to the minimal size at which it still works and research why it works (reduced case -> general rule)?
It would be much simpler to analyze such a minimal network than to try to find relations in a giant multilayer soup.
Perhaps we could even create a neural-network optimizer that minifies a target network, increasing its efficiency without compromising its power at the tasks it was built for.
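There is some existing tooling in this direction. As a rough sketch, PyTorch's built-in magnitude pruning can strip most of the weights out of a network, which is one (crude) way to get a smaller object to study:

    import torch
    from torch import nn
    from torch.nn.utils import prune

    model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

    # Zero out the 90% of weights with the smallest magnitude in each linear layer.
    for module in model.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=0.9)

    # What survives is a much sparser network that is (somewhat) easier to inspect,
    # and often keeps much of the original accuracy after fine-tuning.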
Yep, it is only one trick. Just like electrifying the world only had that one weird trick of alternating current carried for miles over metal cables, and the computing world only had that trick of transistors. Pretty good tricks, though.
Understanding will come. Most people are still in the mapping phase. Hinton has moved past that, and that's ok.
The media talks about nothing but "Deep Learning", however exciting things are happening with ensemble methods, SMT solvers, semantic and model-driven systems, etc. You just don't hear about it.
Ugh. Blaming "the media" is so last millennium. If you are posting on social networking websites or blogs, you are part of "the media". I don't expect "SMT solvers" to lead cable network news and you don't either.
You can always find some media outlet which covers your particular niche topic, but you will never be able to push all niche topics all the time into the mainstream. In fact, it's human nature to avoid returning to places which cause you the pain of cognitive overload.
You could argue that human consciousness is a one-trick pony. The way I see it, "computers can learn" is a one-trick pony, but it's a very powerful and diverse trick.
> Just about every AI advance you've heard of depends on a breakthrough that's three decades old. Keeping up the pace of progress will require confronting AI's serious limitations.
The expression "one trick pony" means it does one thing, not that its foundational principles arebasedon one research paper.
If this author's analogy holds, electromagnetism is a "one-trick pony", even though it has millions of applications (which is what common usage of that phrase refers to).
I find it hard to use this analogy in the context of the whole field of AI.
The A in AI stands for artificial, and the AI denomination is so vague that it applies to any form of automation, however trivial. E.g., fuzzy logic in a washing machine is some sort of AI.
Maybe deep learning could be said to be the equivalent to this definition of "one-trick pony".
Does anyone know anything more about these “capsules”? I’ve seen some nature articles on them but these were from a neuroscience perspective. Has Hinton published anything on them?
Also if Hinton is the Einstein of deep learning, then will capsules be his “unified field theory”? I feel that if we embrace the Einstein analogy we should embrace it to its fullest.
Finally a realistic view of where “artificial intelligence” currently stands. I wish I knew where guys like Elon Musk are seeing this other artificial intelligence I’m just not seeing. The current AI we have is just fancy linear regression.
Sorry but that's a ridiculous statement. It's like saying all the cloud technology we have today is not much more than having VMs. Yes, AI is hyped, but it made some very unexpected progress over the past 5 years (e.g., solving facial recognition). More than many experts expected.
"It's like saying all the cloud technology we have today is not much more than having VMs."
No shit? Unless you mean that it's ignoring the "distributed systems" part of the cloud, which is mostly a shitshow. The provisioning/configuration-management stacks are all complete wrecks, stacking hacks atop ad-hoc container schemes atop a poorly-designed OS. A real distributed OS would be so much simpler and more robust.
Calling it AI is very misleading when there has been minimal progress toward actual reasoning (the closest I've seen being some LSTM work on answering simple queries about scenarios based on prose descriptions of them). It's ML, as in "learning a function", not as in "general-purpose learning like an intelligent agent does".
Can you please define this term "cloud technology"? I always thought it was a marketing term for "the internet", which then would be just a bunch of servers, vms, and containers; all of which we have had for over 20 years.
"cloud technology" is "servers, VMs, containers" that is also SER (Somebody Else's Responsibility). Hence the term "cloud", as that was always used in network diagrams to refer to network connection infrastructure that wasn't yours.
Since when are cables such a unique idea? The telegraph is nothing more than a large number of fancy carrier pigeons as a service, which we've had since the 12th century at least.
Nope. Linear regression will not reconstruct images, nor will it generate new images that never existed before from databases of images. Nor will it translate colloquial Chinese into colloquial English, nor will it deconstruct a question into a set of queries and generate an answer.
The issue isn't the availability of fantastic algorithmics, it's the availability of people who can see them and apply them to real applications.
In fact linear regression could be used to do all of those things, most of them poorly but some of them pretty well (see dictionary learning). If you believe otherwise then you fundamentally misunderstand how neural networks work.
I think the point is that you can build a boat out of bricks, and you can build neural networks out of linear regression. But that doesn't make modern AI "just fancy linear regression" the same way modern boats aren't "just fancy bricks".
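To spell the analogy out, here's a sketch of the "bricks": each layer really is a batch of linear regressions, and only the nonlinearity in between keeps the stack from collapsing back into a single linear model. Shapes here are purely illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(32, 10)), np.zeros(32)
    W2, b2 = rng.normal(size=(1, 32)), np.zeros(1)

    def linear(W, b, x):
        return W @ x + b                         # the "linear regression" building block

    def network(x):
        h = np.maximum(0.0, linear(W1, b1, x))   # ReLU: without this, W2 @ (W1 @ x)
        return linear(W2, b2, h)                 # is just another linear regression

    print(network(rng.normal(size=10)))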
The current AI seems to be pretty good at pattern recognition. There’s plenty of active research in higher-level “cognition” like decision theories and semantics, so I don’t think it’s crazy to believe good results will emerge there in the next few decades.
Sure, there is no other way for computer science to go. AI has always been the last and most important frontier. There may be more AI winters, there may be more AI summers. Ultimately people will continue working on this problem until it is solved or computer science goes away.
>There’s plenty of active research in higher-level “cognition” like decision theories and semantics, so I don’t think it’s crazy to believe good results will emerge there in the next few decades.
It's proven unsolvable for computers (i.e., Turing machines). Whether that makes it impossible for humans to solve (as a matter of course, or as a matter of Turing's proof) depends on whether you believe that we are isomorphic to computers. Which, to be fair, is exactly the belief that underwrites artificial intelligence. But then Turing's proof involves a program that inspects the halting oracle you're using, which isn't well-posed for humans evaluating halting.
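For reference, the construction being alluded to is roughly this (a sketch, with halts() standing in for the hypothetical oracle; no such function can actually exist):

    # Suppose halts(p, x) were a total, computable function that correctly answers
    # "does program p halt on input x?". The stub below just marks the assumption.
    def halts(p, x):
        raise NotImplementedError("hypothetical halting oracle")

    def paradox(p):
        if halts(p, p):        # inspect the oracle's verdict about ourselves...
            while True:        # ...and do the opposite: loop forever if it says "halts"
                pass
        else:
            return             # halt if it says "loops"

    # Running paradox on its own source contradicts whatever halts() answers; that is
    # Turing's proof. The point above is that the analogous construction is not
    # obviously well-posed when the "oracle" is a human reasoning about halting.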
Are you thinking Elon's concern is about what we have today?
Duh. Today's AI methods are great at pattern recognition (weak AI).
The real risk is obviously a strong, general AI. We don't know how to get there, but we kinda do know that we will, so now is the time to think about the risks.
I really don't get why this is controversial or hard to understand. (My most cynical thought is that it's just people doing weak AI feeling bothered and trying to kill the conversation for short-term strategic reasons, because they know that many people are bound to mix up the concepts of weak and strong AI, particularly since they are often selling their weak pattern-matching AI systems as something quite a lot more.)
To make an analogy, it's like starting the discussion on plane-based terrorism and the need for the TSA the week after the Wright brothers' first flight. If that had happened, it would have stalled progress in aviation. Andrew Ng said he isn't worried about overpopulation on Mars, either.
The real risk now is not AI taking over, it's humans using AI to abuse other humans. I trust an emerging AGI to do what's right more than I trust humans. We can't even distribute enough food for everyone, and many of us get killed or abused at the hands of our leaders. We can be easily bullshitted in elections, and as a result we managed to put Trump in charge. That's not very intelligent for a self-declared intelligent species. We are our own worst enemies; AGI will probably balance human society.
Some terrorism might have indeed been prevented if some of the security measures had been put in place earlier. E.g. locked cabin doors. I could make a similar argument like yours, "that's like worrying about nuclear safety programs just because we started building nuclear weapons".
Not that I'm saying it's worth it, mind you. It's a tradeoff between money/time/progress vs. safety. It's totally valid to decide that it isn't worth the tradeoff to stifle the industry right now.
The main difference though between the potential dangers of AI, vs. terrorism, IMO, is that the potential danger is much much larger with AI (as it was with nuclear weapons).
Toronto is the fourth-largest city in North America (after Mexico City, New York, and L.A.), and its most diverse: more than half the population was born outside Canada. You can see that walking around. The crowd in the tech corridor looks less San Francisco—young white guys in hoodies—and more international. There’s free health care and good public schools, the people are friendly, and the political order is relatively left-leaning and stable; and this stuff draws people like Hinton, who says he left the U.S. because of the Iran-Contra affair. It’s one of the first things we talk about when I go to meet him, just before lunch.
The above paragraph taken from the article is an example of why these kinds of articles are frustrating. This is filler. I want meat.
I feel the same way about civilization, including science and technology - it's really just this one trick. Sooner or later these monkeys are gonna run out of ideas.