
I remember reading about Cyc when I was a kid, thinking it was pretty cool. I remember reading breathless articles with emotional quotes from the system where it asked existential questions like "am I alive?". I grew up with Sci-Fi, Video Games and Anime where the protagonist had an AI that did all sorts of wonderful things for them. I'm by default a fan of the Cyc project and cheer a little whenever I see anything about Cyc.

So this is not meant to take anything away from the folks who work on (Open)Cyc, but I think it's time to ask some hard questions about this kind of AI approach:

Cyc (open or otherwise) is about 30 years old and has yet to demonstrate any sort of serious advancement over other, much lower-friction approaches (actually, it's hard to find utility at all). My personal feeling is that it represents a failed approach. That is not the same as calling it a failure: it was a very important experiment that tried to answer the question (paraphrasing the WP article), "is it possible to craft an AI that can perform human-like reasoning using an ontology of common sense knowledge?" I think after 30 years the result has been an interesting curiosity, but the answer is a pretty strong "no".

There are more important questions though. Let's say that OpenCyc represents 30% of the stated goal -- and since the goal from the WP page is a bit nebulous, let's refine it and say 30% of the reasoning capacity of a 6-8 year old human child (6-8YOHC). And let's say that, miracle upon miracles, Cyc is able to move closer to that goal at a steady 1% per year...and after 70 more years Cyc is as smart as a 6-8YOHC. Is this desirable? I mean, is this a good thing to have? I don't mean this question to come from a place of deep philosophy or moralism...but have you ever tried to get a 6-8YOHC to do any task more complex than following a few simple instructions with lots of supervision (even setting the child's personality and desires aside)? A digital 6-8YOHC measured just on intelligence might, just might, be smart enough to follow a simple request like "find me news articles about the Libor scandal". But is that worthwhile? Is 100 years of R&D to achieve this useful?

The approach in the past was not to tackle the general-purpose problems Cyc was always meant to, but to narrow the focus to specific problem sets and build "expert systems". This is apparently what the (limited) list of use cases Cyc is being applied to amounts to. And that, again, means that Cyc has failed. "Common sense" knowledge means "general purpose": a 6-8YOHC knows as much about terrorism or biomedicine as it does about car engines...in other words, it should perform equally well on any of those things, because none of them are "common sense". "It's hot outside so I shouldn't wear a winter coat" is a common sense thing.

So what are we left with? We're back to ontologies and various flavors of knowledge management and reasoning systems, all of which, in my experience, are faddish cover terms for the kind of AI research that just didn't go anywhere and hasn't demonstrated much more than adding an interesting aside to what were already pretty darn good search results in my favorite search engine when I go hunting down showtimes for a movie I want to see.

The number of dead, dying and useless Semantic Graph projects out in the world is astonishing. The lack of any real, useful tools built with them is even more astonishing. And I don't mean tools that let you inspect the Knowledge Base (very exciting stuff, e.g. "is President Obama in the Knowledge Base? What is he connected to?"), which I guess are useful as a sort of search engine in contexts where the information is impossibly well curated (e.g. Wikipedia). I mean things like automated teachers, accurate digital diagnosticians, AIs that write good pop songs and hold interviews with the entertainment press, anything that might represent intelligence. It seems like whenever some of the pieces of these things show up (WP, Khan Academy, etc.), they eschew the direction that Semantic Research is taking us and go with a lower-friction, higher-payoff route. Why isn't there an AI that can suck in WP and Project Gutenberg and then generate and give me Khan Academy lessons for K-12? Or, at the very least, provide NPCs in a game that are capable of holding at least a 6-8YOHC level of conversation?

The answer has always been "wait till we add more objects/entities/facts/assertions/things/stuff/etc.". This is a bad answer, and unfortunately it's the one that challengers to this approach get served back to them. Feeding "stuff" into a model and waiting for the spark of life to kick in just doesn't seem to be cutting it. There are ontologies with literally billions of things in them that are nothing more than strangely structured databases with odd query interfaces.
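To make that last point concrete, here's a toy sketch (Python, made-up facts, Cyc-flavored predicate names like "isa" and "genls"; this is not OpenCyc's actual API or contents) of what "strangely structured database with an odd query interface" looks like in practice: everything is a triple, a query is just pattern-matching over triples, and even trivial inferences only happen if you hand-write the rule.

    # Toy triple store -- hypothetical facts, not OpenCyc's real API or data.
    triples = {
        ("BarackObama", "isa", "Person"),
        ("BarackObama", "holdsOffice", "PresidentOfTheUS"),
        ("Person", "genls", "Mammal"),   # "genls" is the Cyc-style subclass-of
        ("Mammal", "genls", "Animal"),
    }

    def query(s=None, p=None, o=None):
        """Return every triple matching the pattern; None is a wildcard."""
        return [t for t in triples
                if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])]

    # The underwhelming "what is Obama connected to?" KB-inspection query:
    print(query(s="BarackObama"))

    # Even "is Obama an animal?" needs an explicit transitive-closure rule;
    # the stored facts alone never answer it.
    def isa(thing, kind):
        parents = {o for (_, p, o) in query(s=thing) if p in ("isa", "genls")}
        return kind in parents or any(isa(parent, kind) for parent in parents)

    print(isa("BarackObama", "Animal"))  # True, but only because we wrote the rule

Scale that up to a few billion triples and you have the strangely structured database in question, with no spark of life anywhere in sight.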

I'm a fan of Cyc, and it feels like it makes sense, and I'd love to be proven wrong, but I'm forced to lump it and similar approaches into yet another dustbin of evolutionary dead ends, along with atomic-powered cars and dinosaurs. I'm afraid Cory Doctorow was right: http://www.well.com/~doctorow/metacrap.htm

I'd love to know, for example, that Cyc, as the most mature common sense reasoning system, was the "go to" when Google started building voice search in Jelly Bean and Google Semantic Search, and that every few months we'd just be blown away by what happens when a 30% 6-8YOHC is turned loose on nearly every bit of human knowledge ever collected, and that it would start doing really, really useful, amazing stuff for us instead of saving me a few smudges on my screen when I want to set an appointment or look up how tall Brad Pitt is.

I hate to turn this into a vent, but I just feel so...crestfallen...at the seemingly limitless lack of serious progress in the entire field (not just Cyc) despite what are probably millions of man-years put into it.

The final hard question then is...is this kind of approach the right one?




+1, a very good vent, and one that anyone who has had a long-term involvement with AI can agree with in part or in whole.

Do you really think that we should give up after 50 years of little success in the really difficult AI problems? I think we should keep trying. Failures are OK if they are affordable and we can learn from them.


To be honest, I don't know. Six months ago I was ready to write off the whole endeavor; then I see things like this https://plus.google.com/100130762972482716067/posts/BN5qjTEN...

and feel like mankind just took a little baby step in the right direction.

I don't think a 6-8 year old human child (6-8YOHC) is the right goal. Nor is a Star Trek-like computer; talking to computers gets boring (talking to 6-8 year olds gets boring too!). It's the AI equivalent of the Gorilla Arm in interface design (perhaps "Gorilla Mind" is a good term?), and like I said, is achieving this really useful? Even if achieving the near-term goal isn't useful, does it lead into a next goal that is?

I think the right goal is to brainstorm ideas -- "what would we like a computer to do for us?" -- and then start inching towards those.

As an off-the-cuff example, build an AI scaffold of some sort, point it at Wikipedia and Project Gutenberg and have it generate a Khan Academy style educational program for K-12. Get it to start teaching, then use the feedback from the teaching to better model how humans think and learn.

There, that's my contribution to the world of AI. A quick brainstorm with something that could be useful for people. I would find it hard to believe that after half a century, researchers in the field haven't come up with similar kinds of exercises. But I keep seeing more myopic answers.

Getting an AI to tell me a famous movie star's shoe size is not interesting because it really only saves me time; it doesn't do something new for me. Likewise expert systems for diagnostics: any experienced human with a reference can do this job! AI seems too focused on replacing a human who can do this job with a reference (RAHWCDTJWAR) and not enough on augmenting what humans can already do to make them better, or on doing complex tasks like instructing a class. The problem is not that we're emotionally driven to keep the "human with a reference" in the equation, but that over time, that human has proven to provide better results!

Any brainstorming ideas that fit this mold should be rejected as vectors for the field. If it even smells like a project is turning into a RAHWCDTJWAR run!

Similarly, the 6-8YOHC is the wrong direction. Let's be honest, 6-8YOHCs aren't very useful or knowledgeable. Let's stop trying to make AIs that are do-nothing ignoramuses. I don't need an AI that knows that when it's hot out I shouldn't wear a winter coat. I already know this.

Like I said at the beginning, I think Cyc and similar approaches have been valuable as a line of research and inquiry, but they have ultimately provided so much failure and so little progress that it's obvious they're not the right way to go. Knowing this is so very important. But I keep feeling like this message isn't getting across, and this basic approach to AI has long, long overstayed its welcome.


I understand where you're coming from. In my college days I bought all those thick academic books on "machine learning" when the term meant Lenat more than it meant statistics.

That route does seem to have failed us. Or at least not have gone much of anywhere in the years since. At the same time, I'm not sure the Google-y approach to machine learning has made real progress in the last 10 years.

Sure, we've got cars that may or may not be able to handle actual road conditions, but search -- and more importantly any sign whatsoever of computers knowing what we want to do -- has stalled out for quite some time now.

I dunno. There's got to be another approach that yields more progress than either of the semantic or statistical paths.


> That route does seem to have failed us. Or at least not have gone much of anywhere in the years since.

I'd like to reference the recent Higgs result as a compare-and-contrast example from a different field.

The search for the Higgs is slightly younger than the search for AI -- but of about the same age, so it's worth comparing. It took a very long time to yield basic results -- namely, "does it exist?" The search for the Higgs was pure Research. The day after the Higgs discovery, nothing changed in the world except that we now know it exists. Given 20-30 more years of R&D we might get a hoverboard, or faster blenders, or something, and the total time investment will have been about 70 years from "notion on a chalkboard" to "hoverboard".

AI researchers might use the Higgs as an example of why their field shouldn't be pooh-poohed, since they are still in that long, long stretch between theoretical proposal and working discovery. Detractors might say, "but all you propose is just shoving more factoids into your AI model and hoping it springs to life!" The analogy with the Higgs is that researchers were, for a long time, simply proposing to build bigger and bigger accelerators until the Higgs fell out.

I'm not a physicist, but I'm hoping that there was a stronger theoretical framework surrounding the Higgs than "let's keep crashing stuff into each other harder and harder till what we want comes out of it". Likewise, I'm not a Cyc-style semantic AI researcher (or an AI researcher of any particular type), but I'm hoping that the field has more going for it than "let's keep tossing factoids into our Semantic Graph until it springs to life".

I'm willing to think that a 70-year R&D timeline is worth it if we end up with commercially ready, 1.0-equivalent Minds at the end: http://en.wikipedia.org/wiki/Mind_(The_Culture)

But I don't think we're any closer to this than we were 20 or 30 years ago. It's a perpetual Research horizon at this point. To put it back into perspective with the Higgs: 15 years ago they were starting construction on the LHC.

AI researchers will lament the lack of funding in their field, etc. But I have yet to hear a compelling research direction the field would go if it were suddenly gifted the cost of an LHC or two to advance the field.


This isn't a retort to such a well-written comment, more just a question as to how you would view http://www.20q.net -- as I think it's fair to say they've created a system that would likely outperform a 6-7YOHC at playing twenty questions.

As to human utility, they did manage to entertain a fairly large number of people with their one-off electronic 20q devices.


You know, this brings up a very valid point, and in some ways this represents traversing the guts of a kind of simple expert system. Is it useful? Dunno, I suppose if I was in a foreign country and had an object and didn't know what it was, and it could help me identify it, then sure.

But the more valuable point is this: what kind of 6-7YOHC answers people's questions? Most children at that age have achieved a kind of self-feeding curiosity and spend most of their time asking "why" questions...slowly populating their own knowledge base, provided there's a convenient and willing human to answer them.

In the case of a human, the intelligence precedes the facts. We're wired to desire to build our model of reality. Autodidacticism (on some level) is the default. Our extant models of intelligence (living creatures) aren't pumped full of factoids till one day, sometime before sexual maturity, the spark of intelligence kicks in and off we go! Our fundamental AI approach is literally putting the cart before the horse. Perhaps this is for lack of enough study into building productive self-learning systems?

On to 20q:

The problem with 20q is that it largely depends on its own internal model fitting yours. In three tries it sort of got one answer right (it guessed stereo speaker when I was thinking headphones), and I was just picking stuff up off of my desk and answering its questions. My guess as to where it fell down is that my model of the world might be less firm than its own.

Using headphones as the example, it asked two interesting questions:

13: Is it used by the police? "Sometimes"

14: Does a basketball player use it? "Irrelevant"

But my answers could just as easily have been "irrelevant" to both, or "sometimes" to both, or "yes" to both, and off down a different decision path it goes.

A naive improvement would be for the system, at the end of a "round", to say "so I dunno, what is it?" and fit the answer into its model. But the honest truth is that it would then also slowly be populated with junk, both intentional and unintentional.
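Just to make that "so I dunno, what is it?" idea concrete, here's a minimal sketch -- my own toy code, nothing to do with how 20q.net's actual engine works (which is a statistically trained model) -- of a yes/no guesser that grows its knowledge base by asking the player what the object was whenever it loses:

    # Toy "20 questions"-style learner: a yes/no decision tree that adds one
    # question and one object every time it fails to guess. Hypothetical
    # sketch only -- not the real 20q.net engine.

    class Node:
        def __init__(self, text, yes=None, no=None):
            self.text = text          # a question, or an object name at a leaf
            self.yes, self.no = yes, no

        def is_leaf(self):
            return self.yes is None and self.no is None

    def ask(prompt):
        return input(prompt + " (y/n) ").strip().lower().startswith("y")

    def play(node):
        if not node.is_leaf():
            play(node.yes if ask(node.text) else node.no)
            return
        if ask("Is it a %s?" % node.text):
            print("Got it!")
            return
        # The "so I dunno, what is it?" step: learn from the miss --
        # and trust whatever the player types, junk and all.
        thing = input("I give up. What was it? ")
        question = input('Give me a yes/no question that is "yes" for a %s '
                         'but "no" for a %s: ' % (thing, node.text))
        node.yes, node.no = Node(thing), Node(node.text)
        node.text = question

    tree = Node("Is it man-made?", yes=Node("stereo speaker"), no=Node("cow"))
    while True:
        play(tree)
        if not ask("Play again?"):
            break

Every joke or mistaken answer becomes a permanent branch in the one shared tree, which is exactly the single-model failure mode described next.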

Fitting everybody's personal model of the world into a single model is only going to fail. Even in the best of cases, the models of the world that people walk around with in their heads are merely similar; they are often grossly different, or even contradictory within a single person's model, let alone across multiple people (don't even get me started on the discussion my wife and I had last night over a topic we were at absolutely opposite ends of)!

Yet! People deal with this every day by interacting with multiple people, weighing new input, and either keeping or discarding the information -- because it fits well into their existing model, or for a vast array of other reasons (forgot it, were drunk, heard it during an emotional moment, needed it for a test, etc.)...we can even remember negative assertions: "a cow is not a moon rocket".

So 20q is a great example of the RAHWCDTJWAR ("replace a human who can do this job with a reference") syndrome. Clever, but ultimately not as useful as just asking somebody.



