From where would AI get a utility function by which to value things? Seems like it would have to be specified exogenously, unless people are seriously considering some sort of "emergent utility function".



The utility function can be specified in a way that builds up something that looks like an internal motivation system. This is often referred to as intrinsically motivated reinforcement learning (a rough sketch follows the references below).

A recent paper by two Google DeepMind researchers on this topic:

Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, Shakir Mohamed and Danilo J. Rezende: http://arxiv.org/abs/1509.08731

and a somewhat older survey paper on the same topic:

How can we define intrinsic motivation? Pierre-Yves Oudeyer and Frederic Kaplan: http://www.pyoudeyer.com/epirob08OudeyerKaplan.pdf
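
To make that concrete, here is a rough toy sketch, not the method from either paper: a tabular Q-learner whose reward is the environment's extrinsic reward plus an intrinsic "surprise" bonus from a simple count-based transition model. The environment, the constants (BETA, ALPHA, GAMMA, EPS), and the intrinsic_reward function are all invented for illustration.

    # Toy sketch of intrinsically motivated RL (illustrative only).
    # The agent's effective utility = extrinsic reward + BETA * surprise bonus.
    import numpy as np

    rng = np.random.default_rng(0)

    N_STATES, N_ACTIONS = 10, 2
    BETA = 0.5                  # weight of the intrinsic term
    ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

    # Learned forward model: counts of observed (s, a) -> s' transitions.
    model_counts = np.ones((N_STATES, N_ACTIONS, N_STATES))
    Q = np.zeros((N_STATES, N_ACTIONS))

    def intrinsic_reward(s, a, s_next):
        """Surprise bonus: negative log-probability of s' under the current model."""
        p = model_counts[s, a] / model_counts[s, a].sum()
        return -np.log(p[s_next])

    def step(s, a):
        """Toy environment: action 1 walks right and pays off at the last state."""
        s_next = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
        extrinsic = 1.0 if (s == N_STATES - 1 and a == 1) else 0.0
        return s_next, extrinsic

    for episode in range(200):
        s = 0
        for t in range(50):
            a = rng.integers(N_ACTIONS) if rng.random() < EPS else int(Q[s].argmax())
            s_next, r_ext = step(s, a)
            r = r_ext + BETA * intrinsic_reward(s, a, s_next)   # combined reward
            Q[s, a] += ALPHA * (r + GAMMA * Q[s_next].max() - Q[s, a])
            model_counts[s, a, s_next] += 1                     # update forward model
            s = s_next

    print(Q.round(2))

The point is just that the "motivation" here is still a designer-specified term (BETA times the surprise bonus) added to the reward, not something that emerges on its own.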


If the AI is designed to be a self-motivated decision-making agent, some form of utility function will already be built in by its architects. One both hopes and fears that it will also be provided with a way of updating and actualising that utility function as circumstances change.


The same way it developed a desire to get rid of the human race.


But that desire is instrumental to achieving many of the possible goals we might have specified, since humans are at best "useless matter" and at worst "actively preventing my actions" unless we were very careful with the goals. Therefore the desire to get rid of the human race is actually a logical consequence of most utility functions, rather than being directly specified.

By contrast, utility functions don't just appear when you think hard enough about a problem. The desire to get rid of the human race does just appear like that, if you're super-powerful and have any of a certain huge set of goals, but your set of goals does not simply come into existence ex nihilo.


An AI well and truly advanced beyond the intellectual capability of mankind is, by definition, unknowable to us. Your speculation about such an AI's utility functions is akin to an earthworm's nerve bundle contemplating your consciousness.

O the depth of the riches both of the wisdom and knowledge of God! how unsearchable are his judgments, and his ways past finding out! For who hath known the mind of the Lord? or who hath been his counselor?


OK, I'll amend that to "there is a known mechanism by which the desire-to-eliminate-humanity may arise from pure thought, but no known mechanism by which a utility function may arise from pure thought".


> Therefore the desire to get rid of the human race is actually a logical consequence of most utility functions, rather than being directly specified.

That's only true in the sense that there are infinitely many more real numbers than whole numbers. Even so, I would argue there are infinitely many utility functions that directly involve the welfare of the human race, and you would be stupid to design an AI that didn't have a majority of its utility functions directly involving measures of human welfare. (And those should recognize minimum welfare levels as well as the median and average.)
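
As a toy illustration of that last point (entirely hypothetical weights and numbers, not a proposal): a welfare-based utility that weights the worst-off individual alongside the median and the mean, so that optimising it cannot simply ignore minimum welfare.

    # Hypothetical welfare-aggregating utility: weights the minimum as well as
    # the median and mean welfare of a population, instead of the average alone.
    import numpy as np

    def welfare_utility(welfare, w_min=0.5, w_median=0.3, w_mean=0.2):
        """Combine minimum, median, and mean welfare into one scalar utility."""
        welfare = np.asarray(welfare, dtype=float)
        return (w_min * welfare.min()
                + w_median * np.median(welfare)
                + w_mean * welfare.mean())

    print(welfare_utility([0.9, 0.8, 0.1]))   # 0.41: dragged down by the worst-off member
    print(welfare_utility([0.6, 0.6, 0.6]))   # 0.60: a more equal distribution scores higher

With these made-up weights, a population at (0.9, 0.8, 0.1) scores lower than a uniform (0.6, 0.6, 0.6), which is exactly the minimum-welfare sensitivity the parent comment asks for.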



