Stanford AI Class Lecture Notes (docs.google.com)
147 points by ziyadb on Oct 10, 2011 | hide | past | favorite | 22 comments



Google still has a very long way to go before it can produce an adequate translation between a language such as Chinese and English. I'd be surprised if Google were taking such a pedestrian approach to solving that particular problem.

As it stands, languages such as Chinese are intrinsically implicit in nature. In fact, the more adept you become at the language, the more you can express with less. If you follow the literature back a couple of thousand years, the amount expressed in a few characters is absolutely astounding.

If you take the example they use at the bottom regarding wonton, it's downright criminal to map the grammar in such a hurried manner. For one, just from the romanization of wonton, the AI should be able to gauge that it's looking for two characters and not one (one character per syllable). However, in the case of the menu, the wonton egg drop soup drops a character to save some space.
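The syllable-count heuristic above can be sketched in a few lines. This is a toy illustration, not real pinyin segmentation: it just counts vowel groups in the romanization as a rough proxy for syllables, which happens to work for "wonton".

```python
# Crude sketch of the heuristic described above: each Mandarin syllable
# maps to one character, so the romanization should tell the translator
# how many characters to expect. Vowel-group counting is a toy stand-in
# for proper pinyin segmentation.
import re

def rough_syllable_count(romanization):
    """Count vowel groups as a rough proxy for syllables."""
    return len(re.findall(r"[aeiou]+", romanization))

expected_chars = rough_syllable_count("wonton")
print(expected_chars)  # 2 -> the system should look for two characters
```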

Taking a straightforward CFG approach will never result in an accurate translation. What may work better is multi-pass contextual analysis performed in parallel.


The statistical translation that Google uses works quite well between most Indo-European languages. However, it has problems with other languages like Chinese and Finnish, though those are not necessarily insurmountable if you combine the statistical approach with some domain logic and a lot of material.

Developers of Google Translate told me that Finnish causes problems because the extensive inflection [1] in the Finnish language requires a couple of orders of magnitude more translated material for the statistical approach to work. At least at that time, such material was not easy to obtain. Interestingly enough, official EU documents and meeting translations are among the best sources of 1-to-1 translations, as they are translated into all EU languages.
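To make the "statistical approach plus parallel material" idea concrete, here is a toy sketch in the spirit of IBM Model 1 word alignment, trained by EM on an invented three-sentence parallel corpus. This is an illustration of the general technique, not Google's actual pipeline.

```python
# Toy IBM-Model-1-style word alignment trained with EM.
# The tiny "parallel corpus" is invented for illustration; real systems
# need millions of sentence pairs (hence the EU-documents point above).
from collections import defaultdict

corpus = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a book", "ein buch"),
]

e_vocab = {w for e, _ in corpus for w in e.split()}
f_vocab = {w for _, f in corpus for w in f.split()}

# Uniform initialization of translation probabilities t(f|e)
t = {(f, e): 1.0 / len(f_vocab) for e in e_vocab for f in f_vocab}

for _ in range(10):  # a few EM iterations
    count = defaultdict(float)
    total = defaultdict(float)
    # E-step: collect expected alignment counts
    for e_sent, f_sent in corpus:
        es, fs = e_sent.split(), f_sent.split()
        for f in fs:
            norm = sum(t[(f, e)] for e in es)
            for e in es:
                c = t[(f, e)] / norm
                count[(f, e)] += c
                total[e] += c
    # M-step: re-estimate t(f|e) from the counts
    for (f, e) in t:
        t[(f, e)] = count[(f, e)] / total[e] if total[e] else 0.0

# The counts disambiguate: "haus" aligns most strongly with "house"
best = max(e_vocab, key=lambda e: t[("haus", e)])
print(best)  # house
```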

[1] http://en.wikipedia.org/wiki/Inflection#Uralic_languages_.28...


I agree up to a point. For one, I would not group Finnish and Chinese together. Now, I can't speak much for Finnish, but I am positive a statistical approach to Chinese produces very poor results. That said, I do agree that domain logic and a large relational data set are a good approach. Essentially, relationships between words and symbols need to be established, with false-positive and false-negative associations broken down and re-established in near real time, to come close to a perfect solution. Basically, the problem is too expensive to solve.


One of the reasons that the EU documents work so well is that they're painstakingly written simultaneously in all the official languages. There is no master copy from which the other translations are made. This also means that if, say, the German working group writes a clause that is only approximately translatable into French, both groups will have to rewrite their versions so that there is as close as possible to an exact correspondence between them.

Other possible corpora are mostly literary, and of course subject to significant rewriting for stylistic reasons.


I hope we can get discussion like this going for every lecture (if not here, then on Reddit or in the official discussion forums). Do you have any pointers to reading on translation strategies that are more "realistic"?


Couldn't this understanding of intelligence limit the ways in which we can describe and emulate the activity of intelligent creatures?

For example, the interfaces and processors are all very clearly defined and separated in those diagrams. Unfortunately, natural intelligence does not seem to work that way. The inputs to a real human do not get processed in the same places, even when they come from the same sensor. Obviously the patellar reflex doesn't make it past the spinal cord, and I've never believed that the spectrum of intelligent behaviors can be sorted into "conscious" and "unconscious" categories by including some sort of wet Boolean or whatever.

We could think of the brain's implementation as the sum of its internal and external interfaces, but how the hell would we model that without involving unreasonable error margins?


Brains are a problem that is drastically out of scope as far as AI is concerned. You're absolutely right: these are just very simplistic models and examples, meant to convey the general AI jargon. This isn't intended to apply to "complex" systems; for example, the idea that you have a very basic sensor, a black box, and a very basic actuator works well for a specific problem set, but it's not intended to work as a model for brains (or even parts of brains).
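That sensor/black-box/actuator loop can be sketched in a few lines, assuming a toy thermostat environment (all names here are invented for illustration, not from the lecture notes):

```python
# Minimal sketch of the percept-action loop: a simple reflex agent
# for a toy thermostat. Sensor -> agent function (black box) -> actuator.

def sense(environment):
    """Sensor: read a single percept from the environment."""
    return environment["temperature"]

def agent_function(percept):
    """Black box: map the current percept to an action."""
    if percept < 18:
        return "heat_on"
    if percept > 22:
        return "heat_off"
    return "no_op"

def act(environment, action):
    """Actuator: the chosen action changes the environment."""
    if action == "heat_on":
        environment["temperature"] += 1
    elif action == "heat_off":
        environment["temperature"] -= 1

env = {"temperature": 15}
for _ in range(5):  # run the loop a few steps
    act(env, agent_function(sense(env)))
print(env["temperature"])  # 18
```

The point of the sketch is how narrow the model is: one percept, one action, no memory, which is exactly why it doesn't scale to brains.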

I believe the source of your disappointment is a matter of expectation about what AI research is for. The objective of AI is not to create intelligent beings; it's to model and create programs that solve narrowly predefined problems. AI as a field is not at all identical to AGI (artificial general intelligence). Whenever you're talking about brains or things like "common sense", that's AGI. Over the years, AI research has produced many good models for single components of our mental subsystems, but researchers did not really concern themselves with AGI until very recently.


That's true, but most of modern AI is about designing useful intelligent agents, rather than creating artificial consciousness.


Well, that's a cop-out.


This is extremely frustrating. Has the login been removed from ai-class.com? I can't get the videos to work (except the intro), and I can't find anywhere to log in. I've tried Chrome and FF.

Update: Now the videos are saying I need Flash 9 (the intro previously worked). Bizarre. I just went to YouTube to watch the videos; unfortunately, they are not well organized or queued, so searching for them is a pain.

This page at least has them all easily accessible: http://www.youtube.com/user/knowitvideos#p/u


Exported to PDF and re-uploaded for convenience.

http://dl.dropbox.com/u/5137/pdf-link/StanfordAI-UnitOne.pdf


Most excellent sir!

Do you think you will be doing this more?

I would be incredibly appreciative if you were to set up a mailing list! :D

It's only an idea ;)



Thanks!


Thanks for that. Google Docs still doesn't play nicely with Google Hosted, and it's a bit of a pain to work around.


LOL, here's what I just got when I tried to log in:

'Ooops

Our servers are off having a quick coffee break. Wait a second and refresh the page. If you still get this message, we apologize and ask that you try again a little later.'


If you repeatedly get this message, remember to click reload. It seems that it can get cached in your browser.

EDIT: It seems that it isn't just that, and the site is simply flaky (overloaded, I guess).


I thought they were going to be posting videos of the lectures... but when I logged in today, I couldn't see them anywhere. Did I misunderstand something? Have they been posted? If so, where?


They're on the left sidebar. Click 'Welcome to AI' and you'll see the videos. :)


Actually, come to think of it, I bet IBM's G2 system would work amazingly well for translation.


I too would like to have discussions like xarian's thoughts for all lectures.

Regarding the lecture notes: is there a wiki we can all contribute to? Earlier today, the Google doc was complaining about too many people editing the document.


This is great. Thanks for sharing and please keep it coming! =D



