Computer reads the internet (techeye.net)
37 points by ankimal on Oct 12, 2010 | 23 comments



At the moment NELL thinks that the First Amendment is a musical instrument, the Second Amendment is a 'hobby,' and is completely unwilling to admit to any knowledge of the fifth amendment at all.

You can insert your own joke here, but I actually think those are some fascinating insights about the internet.


Interesting errors might be a better gauge of intelligence than the things a learner gets right.

Surely we have all experienced young kids getting something wrong in a way that showed how bright they are?


Just for the fun of it I asked Cleverbot (cleverbot.com) what the first amendment was. Here was the response:

    What is the first amendment?
    I don't know. What is the first duty?
NELL at least has a guess, even if it is wrong. Of course, if someone told Cleverbot that the first amendment was a musical instrument, it would probably repeat that as well.


I once had Cleverbot and an ELIZA clone "talk" to each other. After a few back and forth exchanges, Cleverbot spit out this:

  Eliza clone: What does that suggest to you?
  Cleverbot: A program like Eliza.


Official site for the project: http://rtw.ml.cmu.edu/rtw/


Some of these facts are super hilarious and interesting. Ex I just saw: Long Term Risk is a board game ;p The irony of that statement is awesome.


I'm glad stuff like this is being worked on.

But the fact is, it won't really work until we have strong AI. The more I learn about it, the more I'm convinced that nothing except human-level intelligence can properly parse satire, irony, metaphor and simile.

Natural language is just too contextual. Sentence parsing and vocabulary identification/disambiguation are really hard problems, but even if we solved them entirely we still wouldn't be able to make sense out of a plain-text corpus.

That said, I think NLP research is one of the most promising routes to discovering how to build a strong AI.


"Strong AI" isn't really a meaningful term anymore, because we keep moving the goalposts:

) "A computer will be intelligent when it can beat a human at chess. Oh, Deep Blue did that? Umm, ok, then...."

) "A computer will be intelligent when it can recognize a human face. Oh, every digital camera on the market has basic versions of that built in? Ummm, ok, then...."

) "A computer will be intelligent when it can TRANSLATE between human languages! Hah! Try THAT on for size, you geeky boffins! Wait, what? Google has been doing that for years, and AltaVista did it back in 1995? Oh. Ummmm, well...."

) "A computer will be intelligent when it can pass an unrestricted Turing tes- [someone else whispers to the speaker] What's that? Computers have passed the Turing test? And humans have failed it? Um, right, moving on!"

"Strong AI" is the magic incantation that means "a computer that is so smart we don't understand how it works any more"--because, if we DO understand how it works, then it's just another computer program.

This says interesting things about why we regard humans as intelligent. If we actually advance neuroscience et al to a sufficient point that we deeply understand the brain and the mind, will we stop thinking of ourselves as self-aware and free-willed?


I disagree. I think people have always had a pretty good idea what they meant by intelligence - the examples you cite are just things that they have pulled out of their hat as examples of something "only" an intelligent entity could do.

Turns out they were (partially) wrong about the capabilities of non-intelligent systems, but it doesn't mean that the concept of general intelligence isn't meaningful. Hard to define, yes, but I think most of us would know it when we see it (or interact with it). I'd actually be pretty comfortable defining it by the original discussion - can it meaningfully process a corpus of idiomatic human language, "learn" from it, and respond intelligibly to arbitrary questions on it? This certainly isn't a lower bound on intelligence, by any means, but if it does happen, I don't think there's anyone who won't be convinced.

By the way, do you happen to have citations for the computers passing the Turing test? I'd love to see transcripts of that... It is of course possible to pass a Turing test with an Eliza-like program, but it depends largely on the judge and the time allotted.


Tom Mitchell is on a roll. He also got press for the fMRI mind reader: http://singularityhub.com/2009/04/24/devices-that-read-peopl...


It's about time. I wondered why no one had hooked up a learning AI to Wikipedia/dbpedia and the rest of the Internet. Cyc always seemed like a good idea but bad execution to me. Manually feeding in data takes a lot of time and effort. Why not just let something soak up the entire Internet and correct it if it learns the wrong things? Good luck to CMU. Maybe this will become the search engine of the future.


Sorry to single you out, but comments like this really get me. It's OBVIOUSLY MUCH harder than it looks to you.

It's about time! I wondered why no one had hooked up a fuel cell to cleanly power a car.

It's about time! I wondered why no one had proved P != NP.

It's about time! I wondered why no one had used [insert pop science news headline] to solve [insert major science-related problem].

If you've got a simple solution to a complex problem, you're probably not seeing the whole picture. Or you should be implementing it yourself.


Even as I was typing the above post, I knew I would be singled out because of how I said it. I understand the complexity of what they're trying to do (BS comp sci, coder for 20 years). What I meant to imply was that finally someone is tackling such difficult problems head on and out in the open. This is science fiction becoming a reality. This could be AI that does more than drive cars or fulfil my Netflix orders. So color me excited with optimism. But yes, I see your point and thanks for calling me out. I should have been more explicit in my comment.


http://xkcd.com/793/

Edit: I thought this was highly relevant to what the parent was saying.


Well, Cyc comes from a somewhat different philosophical viewpoint. It is "commonsense reasoning" broadly, but its goal is more to encode "correct" knowledge for a whole range of basic things, rather than necessarily just common knowledge in the sense of knowledge that everyone has. Less "imitate humans" and more "analyze these domains correctly and consistently and encode them in the computer once and for all as a basic substrate on which other things can be built/learned".

Part of the reason for that approach is that it uses a fairly standard logic, in which you really need some consistency, contradictions are a problem, etc.; whereas if you were to just collect the sum of what the average people on the street think, there are a ton of inconsistencies, outright contradictions, omissions, illogical beliefs, etc., which you'd need a completely different model from standard logic to deal with.

It harkens back somewhat to the 19th-century dream of a "philosophy machine", where we'd formalize various kinds of concepts (Newtonian physics, different theories of causation, theories of time, etc.), then the computer could tell us which propositions are valid under which theories. So Cyc actually hires philosophers to encode some of its ontology, runs some things by physicists, etc. It's very much a non-psychological approach, attempting to build an ontology that in some sense is a "correct" representation of the world by consulting experts on each area to be formalized and making sure all the areas are consistent with each other.


> why no one had hooked up a learning AI to Wikipedia/dbpedia and the rest of the Internet.

This is because Natural Language Processing is apparently #@!$%@^ hard. Sure, it's easy for a computer to extract things from a pile of assertions, and it's not even that difficult to work with fuzzy/probabilistic assertions. But turning ordinary everyday English (or Spanish or whatever) into a pile of assertions (with appropriate certainties) is something that's still being figured out.
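
To make the "easy part" concrete, here is a minimal sketch of working with a pile of probabilistic assertions once you already have them. Everything in it is an assumption for illustration: the tuple layout, the `combine` and `query` helpers, the sample confidences, and the choice of noisy-OR to merge repeated evidence. None of it is taken from NELL or Cyc.

```python
from collections import defaultdict

# Hypothetical assertions: (subject, relation, object, confidence).
# The values are made up for illustration only.
ASSERTIONS = [
    ("Risk", "is_a", "board game", 0.6),
    ("Risk", "is_a", "board game", 0.5),
    ("Risk", "is_a", "hobby", 0.3),
]

def combine(assertions):
    """Merge repeated triples with noisy-OR: 1 - product of (1 - confidence)."""
    disbelief = defaultdict(lambda: 1.0)
    for subj, rel, obj, conf in assertions:
        disbelief[(subj, rel, obj)] *= 1.0 - conf
    return {triple: 1.0 - d for triple, d in disbelief.items()}

def query(beliefs, subj, rel, threshold=0.5):
    """Return (object, confidence) pairs at or above threshold, strongest first."""
    hits = [
        (obj, conf)
        for (s, r, obj), conf in beliefs.items()
        if s == subj and r == rel and conf >= threshold
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

beliefs = combine(ASSERTIONS)
print(query(beliefs, "Risk", "is_a"))
```

Two weak sightings of the same claim reinforce each other, while the single low-confidence claim falls below the default threshold. The hard part described above, producing those tuples and confidences from raw text, is exactly what this sketch skips.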


Google and many other programs have been learning from the Internet for years now.


This story would be better titled: "Computer reads and learns from the Internet"

Text-to-voice is already very established. However, "machine learning from the Internet" with its intelligent crawling, parsing, computational linguistics logic, and AI involved is the truly impressive aspect here. Don't you think?


Project homepage: http://rtw.ml.cmu.edu/rtw/ No source but you can browse or download the database.

NELL even has a twitter feed: http://twitter.com/#!/cmunell


Particularly interesting and somewhat comedic is its complete lack of bias. I'm curious as to how it is absorbing all this data, as the order in which it processes the data could greatly affect how it chooses its beliefs.


These commenters deserve many internet points:

  Henry T. - so when does NELL become self aware? :-)
  Jon M - As soon as it reads this page ;-)


-1 blogspam; this has been in the NY Times and everyplace else...


Why would they correct it and say Klingon isn't an ethnicity? It pretty much is, only a fictional one.



