
It’s amazing how someone so smart can be so naive. I do understand, conceptually, the idea that if we create intelligence greater than our own, we could struggle to control it.

But does anyone have any meaningful thoughts on how this plays out? I hear our industry thought leaders clamoring over this but not a single actual concrete idea of what this means in practice. We have no idea what the fundamental architecture for superintelligence would even begin to look like.

Not to mention the very real counter argument of “if it’s truly smarter than you it will always be one step ahead of you”. So you can think you have safety in place but you don’t. All of your indicators can show it’s safe. Every integration test can pass. But if you were to create a superintelligence with volition, you will truly never be able to control it, short of pulling the plug.

Even more so, let’s say you do create a safe superintelligence. There isn’t going to be just one instance. Someone else will do the same, but make it either intentionally unsafe or incidentally through lack of controls. And then all your effort is academic at best if unsafe superintelligence really does mean doomsday.

But again, we’re so far from this being a reality that it’s wacky to act as if there’s a real problem space at hand.




While the topic of "safe reasoning" may seem more or less preliminary before a good implementation of reasoning exists, it remains a theoretical discipline with its own importance and should be studied alongside the rest, largely regardless of its stage.

> We have no idea what the fundamental architecture for superintelligence would even begin to look like

Ambiguous expression. Not having implemented it technically does not mean we would not know what to implement.


You’re assuming a threat model where the AI has goals and motivations that are unpredictable and therefore risky, which is certainly the one that gets a lot of attention. But even if the AI’s goals and motivations can be perfectly controlled by its creators, you’re still at the mercy of the people who created the AI. In that respect it’s more of an arms race. And like many arms races, the goal might not necessarily be to outcompete everyone else so much as maintain a balance of power.


There’s no safe intelligence, so there’s no safe superintelligence. If you want safer superintelligence, you figure out how to augment the safest intelligence.



"how someone so smart can be so naive"

Do you really think Ilya has not thought deeply about each and every one of your points here? There are plenty of answers to your criticisms if you look around instead of attacking.


I actually do think they have not thought deeply about it or are willfully ignoring the very obvious conclusions to their line of thinking.

Ilya has an exceptional ability to extrapolate into the future from current technology. Their assessment of the eventual significance of AI is likely very correct. They should then understand that there will not be universal governance of AI. It’s not a nuclear bomb. It doesn’t rely on controlled access to difficult-to-acquire materials. It is information. It cannot be controlled forever. It will not be limited to nation states, but deployed - easily - by corporations, political action groups, governments, and terrorist groups alike.

If Ilya wants to make something that is guaranteed to avoid, say, curse words and be incapable of generating porn, then sure. They can probably achieve that. But there is this naive, and in all honesty deceptive, framing that any amount of research, effort, or regulation will establish an airtight seal to prevent AI from being used in incredibly malicious ways.

Most of all because the most likely and fundamentally disruptive near term weaponization of AI is going to be amplification of disinformation campaigns - and it will be incredibly effective. You don’t need to build a bomb to dismantle democracy. You can simply convince its populace to install an autocrat favorable to your cause.

It is as naive as it gets. Ilya is an academic and sees a very real and very challenging academic problem, but all conversations in this space ignore the reality that knowledge of how to build AI safely will be very intentionally disregarded by those with an incentive to build AI unsafely.


It seems like you're saying that if we can't guarantee success then there is no point even trying.

If their assessment of the eventual significance of AI is correct like you say, then what would be your suggested course of action to minimize risk of harm?


No, I’m saying that even if successful the global outcomes Ilya dreams of are entirely off the table. It’s like saying you figured out how to build a gun that is guaranteed to never fire when pointed at a human. Incredibly impressive technology, but what does it matter when anyone with violent intent will choose to use one without the same safeguards? You have solved the problem of making a safer gun, but you have gotten no closer to solving gun violence.

And then what would true success look like? Do we dream of a global governance, where Ilya’s recommendations are adopted by utopian global convention? Where Vladimir Putin and Xi Jinping agree this is for the best interest of humanity, and follow through without surreptitious intent? Where in countries that do agree this means that certain aspects of AI research are now illegal?

In my honest opinion, the only answer I see here is to assume that malicious AI will be ubiquitous in the very near future, to society-dismantling levels. The cat is already out of the bag, and the way forward is not figuring out how to make all the other AIs safe, but figuring out how to combat the dangerous ones. That is truly the hard, important problem we could use top minds like Ilya’s to tackle.


If someone ever invented a gun that is guaranteed to never fire when pointed at a human, assuming the safeguards were non-trivial to bypass, that would certainly reduce gun violence, in the same way that a fingerprint lock does - you don't need to wait for 100% safety to make things safer. The government would then put restrictions on unsafe guns, and you'd see fewer of them around.

It wouldn't prevent war between nation-states, but that's a separate problem to solve - the solutions to war are orthogonal to the solutions to individual gun violence, and both are worthy of being addressed.


> how to make all the other AIs safe, but figuring out how to combat the dangerous ones.

This is clearly the end state of this race, observable in nature, and very likely understood by Ilya. Just like OpenAI's origins, they will aim to create good-to-extinguish-bad ASI, but whatever unipolar outcome is achieved, the creators will fail to harness and enslave something that is far beyond our cognition. We will be ants in the dirt in the way of Google's next data center.


I mean if you just take the words on that website at face value, it certainly feels naive to talk about it as "the most important technical problem of our time" (compared to applying technology to solving climate change, world hunger, or energy scarcity, to name a few that I personally think are more important).

But it's also a worst-case interpretation of motives and intent.

If you take that webpage for what it is - a marketing pitch - then it's fine.

Companies use superlatives all the time when they're looking to generate buzz and attract talent.


A lot of people think superintelligence can "solve" politics which is the blocker for climate change, hunger, and energy.


> There isn’t going to be just one instance. Someone else will do the same

NK AI (!)


We're really not that far. I'd argue superintelligence has already been achieved, and it's perfectly and knowably safe.

Consider, GPT-4o or Claude are:

• Way faster thinkers, readers, writers and computer operators than humans are

• Way better educated

• Way better at drawing/painting

... and yet, appear to be perfectly safe because they lack agency. There's just no evidence at all that they're dangerous.

Why isn't this an example of safe superintelligence? Why do people insist on defining intelligence in only one rather vague dimension (being able to make cunning plans)?


Yann LeCun said it best in an interview with Lex Fridman.

LLMs don't consume more energy when answering more complex questions. That means there's no inherent understanding of questions.

(which you could infer from their structure: LLMs recursively predict the next word, possibly using words they just predicted, and so on).
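
For anyone who hasn't seen it spelled out, here's a minimal sketch of that next-word loop; `model` and `tokenizer` are hypothetical stand-ins rather than any particular library's API:

    # Hypothetical autoregressive decoding loop: each new token is predicted
    # from the prompt plus everything generated so far, and each token costs
    # roughly one forward pass regardless of how hard the question is.
    def generate(model, tokenizer, prompt, max_new_tokens=64):
        tokens = tokenizer.encode(prompt)
        for _ in range(max_new_tokens):
            next_token = model(tokens).argmax()  # greedy pick, for simplicity
            if next_token == tokenizer.eos_id:
                break
            tokens.append(next_token)
        return tokenizer.decode(tokens)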


> LLMs don't consume more energy when answering more complex questions.

They can. With speculative decoding (https://medium.com/ai-science/speculative-decoding-make-llm-...) there's a small fast model that makes the initial prediction for the next token, and a larger slower model that evaluates that prediction, accepts it if it agrees, and reruns it if not. So a "simple" prompt for which the small and large models give the same output will run faster and consume less energy than a "complex" prompt for which the models often disagree.
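
A rough sketch of that draft-then-verify idea, with hypothetical `draft_model` and `target_model` objects; it's simplified in that real implementations verify all of the draft tokens in a single parallel pass of the large model rather than one call per token:

    # Speculative decoding sketch: the small model proposes k tokens cheaply,
    # the large model keeps the prefix it agrees with and substitutes its own
    # token at the first disagreement.
    def speculative_step(draft_model, target_model, tokens, k=4):
        proposed = []
        for _ in range(k):
            proposed.append(draft_model(tokens + proposed).argmax())
        accepted = []
        for t in proposed:
            choice = target_model(tokens + accepted).argmax()
            accepted.append(choice)        # same as t when they agree
            if choice != t:
                break                      # disagreement: discard the rest
        return tokens + accepted

So the fewer disagreements, the fewer large-model steps per emitted token, which is where the energy difference comes from.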


I don't think speculative decoding proves that they consume less/more energy per question.

Regardless of whether the question/prompt is simple (for any definition of simple), if the target output is T tokens, the larger model needs to generate at least T tokens; if the small and large models disagree, the large model will be called to generate more than T tokens. The observed speedup is because you can infer K+1 tokens in parallel based on the drafts of the smaller model instead of having to do it sequentially. But I would argue that the "important" computation is still done (also, the smaller model will be called the same number of times regardless of the difficulty of the question, bringing us back to the same problem: LLMs won't vary their energy consumption dynamically as a function of question complexity).

Also, the rate of disagreement does not necessarily change when the question is more complex; it could be that the two models have learned different things and disagree on a "simple" question.


Or alternatively a lot of energy is wasted answering simple questions.

The whole point of the transformer is to take words and iteratively, layer by layer, use the context to refine their meaning. The vector you get out is a better representation of the true meaning of the token. I’d argue that’s loosely akin to ‘understanding’.
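
As a loose illustration of that layer-by-layer refinement, here's a stripped-down transformer block in PyTorch (masking, dropout, and other details omitted; a real model stacks dozens of these):

    import torch.nn as nn

    class Block(nn.Module):
        def __init__(self, d_model=512, n_heads=8):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.mlp = nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            self.ln1 = nn.LayerNorm(d_model)
            self.ln2 = nn.LayerNorm(d_model)

        def forward(self, x):                  # x: (batch, seq_len, d_model)
            h = self.ln1(x)
            attn_out, _ = self.attn(h, h, h)   # mix in context from other tokens
            x = x + attn_out
            x = x + self.mlp(self.ln2(x))      # transform each position
            return x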

The fact that the transformer architecture can memorize text is far more surprising to me than the idea that it might understand tokens.


LLMs do consume more energy for complex questions. That's the original CoT insight. If you give them the space to "think out loud" their performance improves.

The current mainstream models don't really incorporate that insight into the core neural architectures as far as anyone knows, but there are papers that explore things like pause tokens, which let the model do more computation without emitting words. This doesn't seem like a fundamental limitation, let alone something that should be core to the definition of intelligence.
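
Back-of-envelope version of that point: decoding cost scales roughly with model size times tokens generated, so letting the model emit a long reasoning trace really does spend more compute on harder questions. The numbers here are illustrative assumptions, not measurements:

    # ~2 * N FLOPs per decoded token for an N-parameter model (rough rule of thumb)
    PARAMS = 70e9                       # assumed 70B-parameter model
    FLOPS_PER_TOKEN = 2 * PARAMS

    def decode_flops(n_generated_tokens):
        return n_generated_tokens * FLOPS_PER_TOKEN

    terse = decode_flops(20)            # short direct answer
    cot = decode_flops(400)             # step-by-step chain of thought

    print(f"terse: {terse:.1e} FLOPs, CoT: {cot:.1e} FLOPs ({cot / terse:.0f}x)")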

After all, to my eternal sadness humans don't seem to use more energy to answer complex questions either. You can't lose weight by thinking about hard stuff a lot, even though it'd be intuitive that you can. Quite the opposite. People who sit around thinking all day tend to put on weight.


> Way faster thinkers, readers, writers and computer operators than humans are

> Way better educated

> Way better at drawing/painting

I mean this nicely, but you have fallen for the anthropomorphizing of LLMs by marketing teams.

None of this is "intelligent", rather it's an incredibly sophisticated (and absolutely beyond human capabilities) lookup and classification of existing information.

And I am not arguing that this has no value, it has tremendous value, but it's not superintelligence in any sense.

LLMs do not "think".


Yeah well, sorry, but I have little patience anymore for philosophical word games. My views are especially not formed by marketing teams: ChatGPT hardly has one. My views are formed via direct experience and paper reading.

Imagine going back in time five years and saying "five years from now there will be a single machine that talks like a human, can imagine creative new artworks, write Supreme Court judgements, understand and display emotion, perform music and can engage in sophisticated enough reasoning to write programs. Also, HN posters will claim it's not really intelligent". Everyone would have laughed. They'd think you were making a witticism about the way people reclassify things as not-really-AI the moment they actually start to work well, a well known trope in the industry. They wouldn't have thought you were making a prediction of the future.

At some point, what matters is outcomes. We have blown well past the point of super-intelligent outcomes. I really do not care if GPT-4o "thinks" or does not "think". I can go to chatgpt.com right now and interact with something that is for all intents and purposes indistinguishable from super-intelligence ... and everything is fine.



