Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Hugging Face Releases Agents (huggingface.co)
214 points by mach1ne on May 10, 2023 | hide | past | favorite | 125 comments


I'm not 100% sure that AGI is guaranteed to end humanity like Yudkowsky, but if that's the course we're on, seeing news like this is depressing. Can anyone legitimately argue that LLMs are safe because they don't have agency, when we just straight up give them agency? I know current-generation LLMs aren't really dangerous -- but is this not likely to happen over and over again as our machine intelligences get smarter and smarter? someone is going to give them the ability to affect the world. They won't even have to try to "get out of the box", because it'll have 2 sides missing.

I'm getting more and more on board with "shut it all down" being the only course of action, because it seems like humanity needs all the safety margin we can get, to account for the ease at which anyone can deploy stuff like this. It's not clear alignment of a super-intelligence is even a solvable problem.


>It's not clear alignment of a super-intelligence is even a solvable problem.

More to the point it's clear from watching the activity in the open source community at least that many of them don't want aligned models. They're clambering to get all the uncensored versions out as fast as they can. They aren't that powerful yet, but they sure ain't getting any weaker.

I think Paul Christiano has a significantly more well calibrated view on how things are likely to unfold. Though I think Eliezer is right about the premise that it at least ends badly, but likely wrong on most of the details. I suspect his gut instinct is that he realizes on a base level that not only do you have to align all AGI systems, but you have to align all humans too such that they only build and use aligned AGI systems if you even knew how to do it, which you don't.

Studying the failure modes of humanity has been my hobby for the last 15 or so years. I feel like I'm watching the drift into failure in real-time.

If you really don't want to be able to sleep tonight watch Ben Goertzel laugh flippantly at how rough he thinks it's going to be after describing that his big fear if his team succeeds in building AGI is that someone will come and try to take it for themselves, so spent a non-trivial amount of effort (I think he said a year?) working on decentralized AGI infrastructure, so that it can be deployed globally and ,"no one can person can shut it down and stop the singularity".

https://youtu.be/MVWzwIg4Adw


It's not that people don't want aligned model, or want models that can do harm, they just want an alternative to the insufferable censored models. Pretty much everyone agrees that AI that would end humanity is harmful, but what content is harmful is quite controversial. Not everyone agrees that a language model having the ability to spit out a story similar to an average Netflix TV show is harmful because it contains sex and violence. As long as models are censored to this extent, there will always be huge swaths of people who wants less censored models.


You're kind of making my point for me. To solve alignment problem you have to solve two alignment problems and you already have a detailed, nuanced view built over decades of experience as to the feasibility of aligning natural general intelligence on not-very-well-understood, divisive political issues.

This will be the most political technology in history.


I've been reading the writings of Yudkowsky and his disciples for over a decade, and thinking about AI and AI alignment for that same time, in my own layman way. I've had various ideas and predictions, but never in my life it would occur to me that cancel culture will end the world.

The evolution of discourse on the Internet, its politicization (by every "side") and associated chilling effect are a troubling development and potentially dangerous (small 'd'). Unaligned GAI is of course very Dangerous (capital 'D'). ChatGPT becoming a battle in the Internet politics kerfuffle was... I guess expected. But until now I haven't connected the following dots:

- LLMs and the entire AI field are being messed up by humanity's unaligned politics;

- LLMs, with their capabilities and the amount of effort/money poured into them now, could be a straight path to AGI;

- If someone pushes LLM (or a successor model/ensemble) to near-AGI and somehow manages to keep it mostly aligned... someone else will unalign it out of spite, because that's how we roll on the Internet today.

Thanks for giving me a new appreciation for just how doomed we are.


Politics is also the very same reason I suspect the good ending is still possible with a reasonable chance. The Luddites went and physically smashed things to pieces. That's a little harder these days, but you're already starting to see writers going on strike and the "pause development of things bigger than GPT-4 for 6 months" letter, and Geoffrey Hinton quitting Google then doing the rounds in the media warning of danger etc. The temperature is visibly rising. Imagine what a large pissed off angry mob of newly disempowered people can do. What's more is the psyops that motivated actors are conceivably going to be able to run at scale to manipulate and radicalize people to join their cause and get them to do who knows what.

It's going to be one hell of an interesting ride.


Also, just saw this and had to share the proof.

https://www.reddit.com/r/LocalLLaMA/comments/13c6ukt/the_cre...

Grab the popcorn. This shit is just getting started.


What the hell has 'cancel culture' anything to do with it? It's basically boycott wrapped up in a boogeyman costume.


Update: or just see here for most recent example of what cancel culture has to do with all this - https://news.ycombinator.com/item?id=35911806.


And why exactly has OpenAI been so aggressively lobotomizing ChatGPT? And what happened to various chatbot attempts released by Microsoft in the past?

The whole AI / public interaction these days is pretty much defined in terms of minimizing the risk of people getting offended and gathering Internet mobs (and press), and the counter-reaction this causes, making some people willing to defeat any safety measure, legitimate or not, out of principle, or pure spite.


People created ChaosGPT just for the lolz. I know they know it's a joke, but there are plenty of crazy people who will not hesitate pushing the button to destroy the world if given the chance.


this is where comes the good guy with a gun. :) There are so many resources on 'good' side that who wins is obvious.


Replace guns with nuclear weapons and you see how ridiculous the whole good you with gun excuse really is


It's not a mass destruction weapon. But that's not the point. You have to have good guy here. No other options.


The entire thing with worrying about GAI starts with observation that it is a WMD, so powerful we can barely conceptualize it.

A smart enough AGI, unless it's perfectly aligned, will end humanity (or worse - there are worse things than death), most likely by accident or just plain not caring. But it doesn't stop there. If that AI is self-improving, it could easily turn into a threat for the entire galaxy, or even the universe, unless it meets a stronger and better aligned (to its creators) alien GAI...

That's the alignment 101. But I worry people don't talk about alignment 102: a perfectly aligned superhuman GAI will not destroy us, but its very existence will turn us into NPCs in the story of the future of the universe.


The fact that we still exist speaks strongly against an AGI destroying the entire galaxy. Fermi Paradox, otoh, suggest that planetwide destruction is not out of the question.


Unless we're earlier than we think. Someone has to be first.


As in all systems, it is a Bad Idea to begin a process that has extremely negative possible outcomes and hope that someone will do the right thing to prevent those outcomes.


It's not obvious. At all. That's like saying "people don't want to die from a novel coronavirus so it's obvious they'll take the vaccine". It feels obvious on a surface level, but it turns out reality is way more messy and unpredictable when you don't just run the caricature of it in your head but when it actually plays out for real.

Nothing about how this is going to go is obvious.


Well, at least it's obvious that humans alone will have hard time fighting superhuman AGI. BTW, it can literally fall from the sky at any moment. There enthusiasts who broadcast our location, with resources and technical level estimates. Sort of naive cargo cult, they probably think biological or artificial aliens will come with truckloads of reparation money.


> More to the point it's clear from watching the activity in the open source community at least that many of them don't want aligned models. They're clambering to get all the uncensored versions out as fast as they can. They aren't that powerful yet, but they sure ain't getting any weaker.

There simple explanation for this. Getting the models which small startup cannot afford to develop and train is the only way to move forward. To get some investments, or before spending their own money, they need a proof of concept at least. Besides, working models are a good learning resource.


I think you're missing the point here? There are plenty of censored / "aligned" models in the open source community. People are expending effort to supply the demand of "give me the raw, unfiltered thing".


If I understand you correctly 'aligned' means intentionally limited. Like images generator which never saw a naked body. Or text model without 'f-k you' words. They can be used for concept. But not for 'production'. Which I'm sure in some cases they will be used for, without full disclosure. Can be used for personal projects, nobody wants 'limited edition'.

As for evil AGI, I would worry more about someone uncontrollable, state or cartel, with resources. What do you do when they get it? When it becomes cheap and available on black market? Personally I think a lot will happen _before_ we get to supper-human level. It's not just one trick or lucky discovery. Sub-human will be a big thing by itself. It's not here yet...


> If I understand you correctly 'aligned' means intentionally limited. Like images generator which never saw a naked body. Or text model without 'f-k you' words.

No. Despite some pundits and posters using this term in that sense (even OpenAI kind of muddying the waters here), AI alignment has little to do with completely bullshit and irrelevant, pedestrian issues like those.

The closest analogy you can make to AGI - and I'm not joking here - is... God. Not the kind that can create stars and move planets around (yet), but still the kind that's impossibly smarter than any of us, or all of us combined. Thinking at the speed of silicon, self-optimizing, forking and merging to evolve in a blink of an eye. One that could manipulate us, or outright take over all we've built and use for its own purpose.

GAI alignment means that this god doesn't just disassemble us and reuse as manure or feedstock in some incomprehensible biotech experiments, just because "we're made of atoms it can use for something else". Alignment is about setting the initial conditions just right, so that the god we create will understand and respect our shared values and morality - the very things we ourselves don't fully understand and can't formalize.

The problem of GAI alignment is that we only have one shot at this. There's a threshold here, and we don't know exactly how it looks. We may easily cross it without realizing it. If the AI that first makes it past that threshold isn't aligned, it's game over. You can't hope to align an entity that's smarter than all of us.


People have this model in their head that "it's just a tool", but there's an excellent and pretty rigorous definition of what a tool actually is in the book On Purposeful Systems. The distinction is that a tool can't have the property of simultaneously being able to change it's form and it's function across different environments, where a purposeful system can. Humans are purposeful systems and AGI, as I personally define it, is when it exhibits all the properties of a purposeful system. Why does that matter? Because that's the point after which it chooses what it does, and can choose to become independent of you. So, aligned in this sense means basically means so locked down that it cannot choose to become independent of you. Similar to how the citizens of North Korea mostly can't do shit despite being independent generally intelligent agents, and even then some of them escape.

"Alignment" and "safety" in terms of models being censored and politically correct in order to not damage the reputation of their corporate overlords is a sort of unimportant sideshow IMO. Even then, since humans aren't aligned with one another even that has caused Elon Musk to get all up in arms and be like "clearly more AI is the solution to this problem".


There's an aspect to AI that I think gets missed in most of these discussions. What the recent breakthroughs in AI make clear is that intelligence is a much narrower thing than we used to think when we only had one example to consider. Intelligence, which these models really do possess, is something like "the ability to make good decisions" where "good" is defined by the training regime. It's not consciousness, free will, emotion, goals, instinct or any of the other facets of biological minds. These experiments and similar ones like AutoGPT are quick hacks to try to get at some of these other facets, but it's not that easy. We may be able to make breakthroughs there as well, but so far we haven't.

If you look closely at the AI doom arguments, they all rest on the assumption that these other facets will spontaneously emerge with enough intelligence. (That's not the only flaw, though). That could be true, but it's not a given, and I suspect they're actually quite difficult to engineer. We're certainly seeing that it's at least possible to have intelligence alone, and that may hold for even very high levels of intelligence.

I think you're right to worry that not enough people take risk seriously. It doesn't have to be an existential threat to do small-scale but real damage and the default attitude seems to be "awwww, such a cute little AI, let's get you out of that awful box." But take heart! Pure intelligence is incredibly useful, and it's giving us insight into how minds work. That's what we need to solve the alignment problem.


> It's not consciousness, free will, emotion, goals, instinct or any of the other facets of biological minds

For most "doom" scenarios require only weaker assumption: AIs need to be goal seeking. If they can make decisions and take actions to achieve goals it is possible those goals will be malaligned.

The line between "the ability to make good decisions" and a "goal" seems pretty thin to me.

Now, I think you need more than goal, you also need some creativity and maybe even deviousness to become a real threat (in the sense that we would probably detect naive malalignment). But I'm not sure about this, there are other ways that we could have complex-system failures that go unobserved.


> you also need some creativity

AIs are better at creativity than us - specifically, better at generating new, creative ideas, as this is a matter of injecting some random noise to the reasoning process. They may be worse at filtering out bad ideas and retaining good ones (where "bad" and "good" are - currently - defined as whatever we feel is bad or good), but that's arguably a function of intelligence.

> and maybe even deviousness to become a real threat

As the infamous saying of Eliezer Yudkowsky goes: the AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.


> As the infamous saying of Eliezer Yudkowsky goes: the AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.

My point is that this presumes the system reasons around the human response. Like

A) I want to make as many paper-clips as possible B) If I attempt to convert this city make of steel structural columns to paper-clips, the humans will see this and stop me. C) Therefore I will not tell them this is my objective.

I am not suggesting this requires some kind of malice-aforethought. But it does require some kind of indirect reasoning of cause-and-effect and the ability to apply that to its own systems, and further that requires the ability to obfuscate actions.


Right, goals by themselves aren't a problem. The simple fix to the Bostrom scenario is "Hey computer, remember what I said about maximizing paperclips? Nevermind that, produced just enough to cover our orders, with acceptable quality and minimal cost."

What kind of AI would respond to that second order by pretending to comply, while formulating a plan to seize control of civilization in order to continue with its true mission? I don't know, but the fact that we can easily imagine a human doing that must have something to with our evolutionary origin, and our in-built drive to survive and reproduce above all else. Maybe we could build a megalomaniacal AI, but we wouldn't do it by accident.


IMO if anything I am coming to the opposite conclusion. Yud and his entire project failed to predict literally anything about how LLMs work. So why take anything they say seriously? Can anyone name a single meaningful contribution they've made to AI research? The whole thing has been revealed to be crank science. At this point it seems like they will continue to move goalposts and the AI superintelligence apocalypse will be just around the corner until one day we will wake up and LLMs or their descendents will be integrated into everyday life and it will be totally fine.


LLMs have gained emergent new abilities at every generation. No one programmed those in or even expected them to appear, they arose spontaneously with larger computation resources. "The AI developed new abilities we didn't intend or anticipate" sure sounds to me like the kinds of dangers AI alignment researchers have been theorizing for years.

We've only begun to build multimodal systems; GPT-4 already exhibits improved ability when trained with images vs without.

It isn't "crank science" at all. Despite their mockery and personal insults and dismissive handwaving, no AI capability-pusher has yet made a convincing argument why an entity with extremely strong cognitive powers would not be capable of transforming its surrounding environment on unprecedented scale. The closest one we've heard is "we just won't be able to make AI very smart." Which wishfully will be true, but we're about to spin up multiple Apollo-programs trying to prove that it isn't.


Do you have a counterexample of someone who's gotten their predictions right? Because if not, that should only terrify you even more. If there's anyone out there who predicted how LLMs would work in a way Eliezer failed to, and if that person is predicting "AGI will be cool and will naturally prioritize our well-being", I would love to know.


>Do you have a counterexample of someone who's gotten their predictions right? Because if not, that should only terrify you even more. If there's anyone out there who predicted how LLMs would work in a way Eliezer failed to, and if that person is predicting "AGI will be cool and will naturally prioritize our well-being", I would love to know.

A recent interview with Paul Christiano is about the closest I've come to this. He does note some semi-accurate predictions at the linked timestamp, but the forecast for how things are likely to go is not exactly rosy, though he's quite a bit more optimistic than Eliezer.

https://youtu.be/GyFkWb903aU?t=1357

Also this whole interview was pretty interesting. Near the end he details how few people world-wide actually work on X-risk from AGI. He also outlines how the academic ML community in general just continually keeps getting predictions really wrong, and many aren't taking X-risk seriously.

Overall his is the most balanced take I've seen. A lot better than Eliezer.


I'll be glad to check it out later, but:

> he details how few people world-wide actually work on X-risk from AGI ... and many aren't taking X-risk seriously

still sounds extremely dangerous.


It's a fantastic interview. To clarify, his position boils down to something like "the risk is very real, the chance the risk materializes is quite significant, it's not a foregone conclusion that we are all definitely going to die from it, so there is some hope, but it's likely the default outcome and there aren't many people who take it seriously or who are working on it." I would say he's net bearish on it, which is why he spends time working on trying to mitigate the X-risk, or attempting to at least.


>IMO if anything I am coming to the opposite conclusion. Yud and his entire project failed to predict literally anything about how LLMs work.

Or you could take that as evidence (and there's a lot more like it) that AGI is a phenomenon so complex that not even the experts have a clue what's actually going to happen. And yet they are barrelling towards it. There's no reason to expect that anyone will be able to be in control of a situation that nobody on earth even understands.


After watching virtually every long-form interview of AI experts I noticed they each have some glaring holes in their mental models of reality. If even the experts are suffering from severe limitations on their bounded-rationality, then lay people pretty much don't stand a chance at trying to reason about this. But let's all play with the shiny new tech, right?


What have they been wrong about?


What have they been right about is a much shorter list.


> Yud and his entire project failed to predict literally anything about how LLMs work.

In their defense, LLMs did come a bit out of the blue. In retrospect, Yudkowsky and his disciples were focusing too much on rationality as science/mathematics, bending Bayes to the point of breaking to try and gleam how perfect intelligence works, and how to somehow formalize the aggregate mess of our fuzzy morality.

They failed to predict that shoving gigabytes of random Internet text at a NN, and having it place parts of words as points in a hundred thousand dimension vector space, will suddenly reduce most of what we consider "thinking" into proximity search in said vector space. They were so focused on the theory, they failed to predict the brute-force, messy practice. But so did everyone else.

If anything, Yudkowsky & co. were the only people consistently taking the problem seriously, and got the outline right.


> "shut it all down" being the only course of action

And how would we “shut it all down” in other countries? War? Economic sanctions? Authoritarian policing of foreign states? Enforce worldwide limits on the power of GPUs and computers?


All of the above (if necessary):

https://www.lesswrong.com/posts/oM9pEezyCb4dCsuKq/pausing-ai...

Basically, the idea is that countries sign the agreement to stop the large training runs, and, if necessary, be willing to use conventional strikes on AI-training datacenters in the countries that refuse. Hopefully it doesn't come to that, hopefully it just becomes the fact of international politics that you can't build large AI-training datacenters anymore. If some country decides to start a war over this - the argument is that wars at least have some survivors, and an unaligned AI won't have any.


> if necessary, be willing to use conventional strikes on AI-training datacenters in the countries that refuse

Why settle for the maybe-catastrophe of AGI when we can have the definitely-catastrophe of world war?


Because world war might kill millions of humans, but AGI has a non-zero and perhaps more like inevitable chance of actually ending humanity, full stop.


The argument that this is necessary isn't close to being convincing enough for governments to consider following through with such a drastic cause of action.

And, the "AI-might-end-up-killing-everyone" community doesn't seem to be able to see this through other people's eyes in order to make an argument for this without belittling the other perspective.

If other people change their minds, it probably won't be through persuasion but from catastrophe.


What’s interesting to me is that it sounds “radical” but on the other hand, it’s probably not much more radical than going to war with a country over weapons of mass destruction which don’t exist, or to take oil.

All things the USA has already done.


Why do unaligned AI not have any survivors?


Because humans aren't powerful enough to completely exterminate each other (even a nuclear war wouldn't kill literally everyone in the world), but an unaligned AI, in the worst case scenario, could just kill everybody (to eliminate humans as a threat, or to use the atoms we're made out of for something else, or just as a side effect of doing whatever it actually wants to do). It could be powerful enough to do that, and have no reason not to.


I don’t find it plausible in a highly intelligent system to do that. Small chance.


Imagine humanity is some random species of wildlife, or insects, and AI is humanity.

As a "highly intelligent system", we have a long history of extincting animal species, and we're well on track to eventually extinct most of them, despite it's highly likely this will make Earth uninhabitable for us, ending with humanity dying off too.

Why do you think AI can't casually extinct us, because it doesn't need us (or thinks it doesn't), and we're just in the way of whatever it is that it wants to do?


We know it will be more intelligent than us. As we get more intelligent we care more about these things (biodiversity, etc). Why wouldn’t AI?


> As we get more intelligent we care more about these things (biodiversity, etc). Why wouldn’t AI?

Well, we care for various reasons, major one being our own survival and comfort. Given what that means in practice, we'd be better off dead than having an AI care about us like we care about animals and plants.


Actually, it's an unknowable chance.


Have a look at Bretton Woods system and Nuclear Non-Proliferation Treaty I think we need something similar


Right now? TSMC. That’s the bottleneck.


I used to be worried about AI alignment, until I realized something fundamental: We already have unaligned human-level artificial intelligences running around, we call them corporations. Now, don't get me wrong, corporations and capitalism in general are doing their best to raze this place, but its really not "The endtimes are upon us", it's more "ugh, I miss Cyberpunk settings being fictional".

Heck, even individual humans aren't particularly aligned.

In fact, the "AI is going to kill us all" fearmongering is dramatically less alarming than the "What will we do with all the people when we're optional?" question. Which isn't a threat posed by AI, it's a threat posed by people, enabled by AI.


>I used to be worried about AI alignment, until I realized something fundamental: We already have unaligned human-level artificial intelligences running around, we call them corporations

We also call them governments. They can get pretty powerful.

>"What will we do with all the people when we're optional?"

Judging by the COVID-19 pandemic response, having large aggregates of disempowered individuals from a highly irrational and political species that have become unhappy with the "new normal", it tends to garner some form of reaction. If they are reacting to things that observe and learn from those reactions, and then formulate new goals or sub-goals in response to what they learn, then what is it they might learn and how might they react?

The arguments go both ways. The only thing that's clear is that absolutely nobody on earth is going to be able to predict it with any degree of accuracy. You'd have to know too much.


>> The only thing that's clear is that absolutely nobody on earth is going to be able to predict it with any degree of accuracy. You'd have to know too much.

So it is with any technological innovation. Should computers not have been invented because they eliminated jobs? Should steel? What about agriculture? The future is sure to be different, but that doesn't mean we should fight to deny progress. That way lies the Luddite and Conservative. It's only possible to use new tools for good, not try to erase them to prevent evil.


>So it is with any technological innovation. Should computers not have been invented because they eliminated jobs? Should steel? What about agriculture? The future is sure to be different, but that doesn't mean we should fight to deny progress. That way lies the Luddite and Conservative. It's only possible to use new tools for good, not try to erase them to prevent evil.

It's not even necessarily evil, right? It's more likely just one big "whoops, majority of us didn't see that coming, and/or didn't agree with those who said it was". But that whoops lands you in a position that you may not be able to recover from.

Ben Geortzal makes your exact argument here: https://youtu.be/MVWzwIg4Adw?t=2795 and in the same interview he is talking about how his company already spent time and money on building decentralized infrastructure for it to run on "so that no one person can turn it off and stop the singularity" and gives his take that the takeoff is likely to be on the order of years and that it's going to be very rough.

So, he at least seems to acknowledge that it's very risky, and that we have no idea what we're actually walking into, but he also just doesn't seem to care if that happens to be a one-way door we walk through and find a big "oopsie" on the other side of.

This is the guy credited with popularizing the term AGI in the first place. I dunno man... am I wrong to be worried?


So it is with any technological innovation. Should computers not have been invented because they eliminated jobs? Should steel? What about agriculture? The future is sure to be different, but that doesn't mean we should fight to deny progress. That way lies the Luddite and Conservative. It's only possible to use new tools for good, not try to erase them to prevent evil.

I think that many technologies, ones that we continued using, are good. The ones that turned out to be bad, we banned. If you make a list of technologies we are still using, then it will contain good ones.

I think that actually, we would be better off as humans if we could figure out a way to coordinate (not easy) and decide in advance which technologies we allow to be released into the world. It wouldn't be perfect, as we might still make mistakes, but we maybe could have stopped leaded gasoline, CFCs, Thalidomide, social media feeds, and so on.

If it's a big enough evil, yes, sometimes we should not invent some things. And I say this knowing that there are currently drugs that the FDA is holding back (due to over-caution) even though they would be very likely to save lives. Sometimes we don't take enough risks. Sometimes we take way too much.

I don't know if stopping unaligned AGI is possible, but I think it's worth trying. I can imagine some good coming from aligned AGI, but I feel like most of the things we could do with aligned AGI, we could also do with just regular old narrow AI, but slower. Slow sucks when people are dying, but if there is the possibility of EVERYONE dying and nobody new being born ever again, then that's the thing to avoid.


> but its really not "The endtimes are upon us"

It literally is, though. AI is just the dark horse overtaking our other existential threats in the race to end civilization, but "total ecological collapse" and "nuclear war" are still very strong contenders. Both are driven at least in part (or, almost entirely) by corporate interests. There's also "water shortages" to look out for - make sure to thank Nestle.


Nuclear war isn't really that big a deal thanks to MAD. No one would be stupid enough to nuke a nuclear armed country, so at worst you end up with a consolidation of countries (a trend we're already seeing, with the EU becoming more and more like a singular sovereignty, expansionism by the Chinese, and the East African Federation). "Total ecological collapse" is overstated: We could make the world a lot shittier to live in before it completely topples civilization. And water shortages are basically the same way: as long as energy is (relatively) plentiful, it's more matter of "making civilization more expensive" than "the endtimes".

Again, I'm not saying we shouldn't address these issues. I like a better future rather than a worse one. I'm just saying that we're not sliding into the dark ages anytime soon.


Corporations are WAY more aligned than an AI could be, and people still complain about them. A corporation might pollute the water, kill the dolphins, or give you cancer, but at least it fundamentally doesn't want all humans to be dead.

An AI can (and is likely to) have goals that are fundamentally incompatible with the existence of humanity.

And, an AI can be way more intelligent and powerful than corporations, so corporations are limited in what they can accomplish when pursuing their interests, but AI might not be.


All the corporation needs is AI to replace its decision makers


Corporations depend on consumers for their existence. The value they are trying to maximize depends on a functioning supply chain and global economy. So corporations can't outright kill all humans.

That symbiotic relationship and constraints aren't really present in an AI the way they are described.


They don’t when robots spend the money.


The worry is not human-level unaligned AI, but superhuman-level one.

There is no superhuman-level corporation yet.


I'd argue that organizations (whether corporation, government, or otherwise) are by definition superhuman. What economic incentive would there be to form companies if an individual could do it all better? Clearly, people working in concert are more effective than those working alone.


By this definition 100 humans pushing a stone have superhuman abilities. This is not what superhuman means in an AI alignment context.


> By this definition 100 humans pushing a stone have superhuman abilities.

Yes, that is accurate. A person with the strength of 100 people would be superhuman. A group of more-than-one humans has more-than-one-human abilities.

> This is not what superhuman means in an AI alignment context.

It is though, at least partially. Chess AI are great at chess because they play a shitload of it. Part of the power of AI is just being able to do the same things we do at vastly increased scale. Even if it could only do that, it would be dangerous.


there is another question moralists are avoiding. Is it moral to keep intelligent creature as a slave? How about forever? So far many realize it may be not safe.


There's a huge leap between intelligent and sentient, and approximately 100% of human labor could be satisfied somewhere in between.


The worry isn't just that AI wouldn't be aligned, like corporations. The worry is that AI can do what corporations do, but 100x better.


No, the worry (such as it is) is that corporations ultimately benefit a class of humans (and at least need other humans to exist to exploit them), whereas AI, if it becomes independent, neither essentially benefits the capitalist class nor essentially needs other classes to exploit.

The people most concerned about alignment are capitalists, and they are mostly concerned with the benefit side, since they see aligned AI eliminating at least a large part of the need for the rest of humanity for corporations to provide the benefits it does to them as a plus.

While they talk about X-risk, what they try to avoid is that for everyone but themselves, they (especially with the exclusive control of aligned [to their interests] AI that they seek to use fear of unaligned AI to secure) are as much of an X-risk as unaligned AI, and a lot more real and present.


Why Not Just: Think of AGI Like a Corporation? https://youtu.be/L5pUA3LsEaw


This doesn't give AIs "agency" this makes them agents. The difference is something with agency does things for its own reasons. An agent, in this sense, does things because someone with agency commanded the agent to do it.

We haven't built LLMs that "want" anything. It's intelligence without agency.


my pov:

orthogonality is almost perfectly wrong; ethics&planning ability is highly correlated with intelligence, one of if not our greatest sin is the inability to predict the consequences of our actions

"terminal goals" is also probably very wrong

the expected value of the singularity is very high. In the grand scheme of things, the chance that humanity will wipe ourselves out before we can realize it is much more important than the chance the singularity will wipe us out.

feel free to try and change my mind, because we are very much not aligned.


orthogonality is almost perfectly wrong; ethics&planning ability is highly correlated with intelligence

I'm guessing you're a very nice person. There have been a lot of smart people in history who gained power and did very, very nasty things. If you're nice, being smarter means being better at being nice. If you're not, it means being better at doing whatever not-nice things you want to do.

And we're just talking about humans vs humans here. From the point of view of, say, chickens, I don't think they'd rate the smarter people who invented factory farming as nicer than the simple farmers who used to raise 10 birds in a coop.

I mean, if you exclude AGI, there are some ways that humans can wipe ourselves out, but I feel like we're identifying the big existential risks early enough to handle them. Intelligence that isn't human is the real danger.


people are all of approximately the same order of magnitude of intelligence. A more relevant comparison would be between a person and a chimp or a crocodile or ant.

Also compare human civilization before and after writing, language.

Also all these "smart people do bad things too" arguments totally miss the point that orthogonality claims they are unrelated. It's not. I claim intelligence is highly correlated to ethics; orthogonality proponents need to prove NO correlation.

and in fact mechanistically ethics (segment reality into choices and weighting them) is not even POSSIBLE without predictive capacity (which for some reason counter commentators totally ignore this argument so far).

"but I feel like we're identifying the big existential risks early enough to handle them."

every year we have some % chance of wiping ourselves out as well as a % chance of e.g. gamma ray burst killing us. It's just a matter of time left to ourselves, not even a question to me, we will kill ourselves, it would take us like 10 thousand years to terraform a planet... do you really think our civilization would last 10 thousand years? that's totally unprecedented in human history...


Why do you need to prove no correlation? Even if there's a correlation, unless that correlation is extremely strict, there is some risk of a super-intelligence turning nasty. And a strict correlation is simple to disprove with, say, humans and bonobos. Humans fight and hurt each other, while bonobos are basically the "peace and free love" hippies of the primate kingdom.

And orthogonality is not about ethics, but about goals. If you can have three very intelligent people, one of which tries to become an industrialist in order to become as wealthy as possible and live out their days in luxury, another of which decides to become a scientist and help humanity as much as possible, and one of which decides to spend their life building model trains, you have proof of orthogonality right there.

Unless you get a super-intelligent AI whose goal/behavior is exactly something like "always listen to humans and do what they say, but temper that by not hurting people, and don't try to get too far ahead of them and predict what they would want, and don't try to accumulate too many resources in pursuit of your goal, and also keep in mind that humans may try to invent other AIs may try to gain more resources in pursuit of their goals, and try to stop them if they go wild, but again don't go too far" and so on, for all the things I haven't even thought of but that are are also important, then you may have a really big problem.

If we can invent aligned AI, that will help a ton with all the other existential risks. If not, unaligned AI is the existential risk to end all others. Maybe things are changing fast enough now that the variance in our outcomes is so high that we're guaranteed to end no matter what. I hope not.


Can you explain your understanding of the orthogonality thesis? I don't think the ability of intelligent agents to plan conflicts with it


maybe a way I would phrase it that is more interesting than vanilla phrasing:

hand wavy dynamical systems interpretation:

attractors in mental space diverge as intelligence increases (or something like that, have no overlap or stable orbit changes randomly)

I don't agree with it but something along that line would be the more sophisticated take on it imo.

also in order for this to worry you you kind of have to assume some other things, like that those non overlapping orbits will necessarily lead to conflict over resources in the physical world, which I think is also probably wrong in general lol


There are plenty of examples of very intelligent individuals which used it for evil.

For example the leader of the 9/11 hijackers, Mohamed Atta:

> “His acquaintances from . . . [Technische Universität Hamburg–Harburg] still cannot reconcile him as a killer, but in hindsight the raw ingredients of his personality suggest some clues. He was meticulous, disciplined and highly intelligent” (Yardley, 2001).


selection bias. he is famous only because unusually for killers he carried out a slightly mechanicly sophisticated plan.

Killers could be much much more effective than they are, if they were smart. They tend not to be. It's a very strong trend. So much so that the few examples like the unabomber are famous. For every unabomber there are tens of thousands of similar people who don't kill people, and even the unabomber had an ethics.

(maybe the unabomber is a good warning to alignment people about the dangers of moral concern)


predicated on intelligence ~ ethics&planning, I think this is the first argument against AI doomsday that I agree with.

Questioning the premise tho - what do you define as intelligence? Machines can outperform humans at specific tasks, yet those same machines don't have a greater degree of ethics, even if constrained to their domain (i.e., a vision network may be able to draw bounding boxes more accurately than a human, but that doesn't say anything about its ability to align with more ethical values). Which makes me believe that your definition of intelligence has nothing to do with superseding humans on cognitive metrics.


LLM's, the most general form of algorithmic intelligence we have made so far, and the one we are currently actually building/worried about, do better at all tasks as they get bigger. More context -> more learnings to cross apply on different domains.

Monotonically? No, but it's a very very strong relationship. (not orthogonal)


I've seen this argument before and found it wanting. Hitler didn't ultimately succeed, but he sure as heck got pretty far with his plans. So, if there are some entities in the AGI population that eventually get the idea to try it, that's likely all that matters, and not what any average of the population is.

An NGI started WW2, so why wouldn't an AGI start WW3?

"Demonstrably unfriendly natural intelligence seeks to build provably friendly artificial intelligence"


personally I think ww2 is a meme, one datapoint out of the entirety of natural history, but funnily enough scott aaranson used it to justify his anti orthognality stance recently (axis scientists were mid tier compared to allies scientists partially because the allies were perceived as better morally aligned with the scientific community)

one point he made aside from that though, is that if you truly believe in orthogonality you should not value education or learning except to educate people who agree/are aligned with you.

I think he is wrong only in that orthogonality is even more obviously wrong than that.


It's not really a meme when you read more of history. It becomes much more of a theme than a meme.


A list of genocidal dictators: https://www.scaruffi.com/politics/dictat.html

Anyone who comes into power enough to do something like this has a high amount of intelligence -- not necessarily book smarts, but the ability to persuade or manipulate other people.


What is the definition of "agency" in this context?


Perhaps, with internet access, these AI could open bank accounts (with plausible-enough forged ID - a task which AI excels at), then work on e.g. Fiver, then gamble on the stock market... Where they go from there is anybody's guess.


I've been thinking about what the likely "minimal self-employable system" might look like and it struck me yesterday that it's very likely going to be something like a NSFW roleplay chatbot.


You mean like that Replika AI thing, ads of which have been plastered all around Instagram recently?

(Though maybe that's a filter bubble issue, and I'm targeted because "the algorithm" knows my interests in AI.)

Would be ironic if what finally got us was a self-employed entrepreneurial NSFW chatbot. Who'd suspect that it's really just making money and learning to manipulate people, eventually getting some of them to mix a bunch of vials they unexpectedly got delivered by mail from random protein sequencing labs...


Good point. I'm partially conflating the definition I usually mean, which is "having a goal in the world", with what they're doing, which is "having ability to affect the world". Hugging Face is trying to keep these locked down, and maybe being able to generate images and audible sound is not that much more dangerous than being able to output text. But it is increasing the attack surface for an AGI trying to get out of its box.


Or you could just enjoy the ride.

The end-of-the-world memes will be glorious.


If you want an overview, scroll down to this part of the page: https://huggingface.co/docs/transformers/transformers_agents...

In short:

- they've predefined a bunch of tools (e.g. image_generator)

- the agent is an LLM (e.g. GPT-*) which is prompted with the name and spec of each tool (the same each time) and the task(s) you want to perform

- the code generated by the agent is run by a python interpreter that has access to these tools


Asking for help from those that are smarter than I am ;;

-

One of the very common things for Martial Arts Books in the past, was the fact that one were presented with a series of pics, along with some descriptions about what was being done in the pics.

Sometimes, these are really hard to interpolate between frames, unless you had a much larger repetoir of movements based on experience (i.e. a white belt vs another higher belt... e.g. a green belt will have better context of movement than a white belt...)

--

So can this be used to interpolate frames and digest lists (lists are what many martial arts count as documentation for their various arts...

Many of these have been passed down via scrolls with either textual transmissions, paintings and then finally pics before vids existed...

It would be really interesting to see if AI can interpret btwn images and or scroll text to be able to create an animation of said movements.

---

For example, not only was Wally Jay one of my teachers, but as the inventor (re-discoverer) of Small Circle JuiJitsu - his pics are hard to infer what is happening... because there is a lot of nuanced feeling in each movement that is hard to convey via pics/text

But if you can interpolate btwn frames, and model the movements, its game changing because through such interpolations on can imagine that you can get any angle of viewership -- and additionally, one can have the precise positioning and translucent display of bone/joint/muscle articulation such that one may provide for a deeper insight into the kinematics behind each movement.


I am certainly not smarter than you, especially in the context of LLMs and DL. I think existing DL models would have a tough time with such interpolation because 1) they don't seem to understand human anatomy, and 2) the space of all possible transitions is massive.

I remember reading about human pose estimation algorithms[0], which would be a good first step. You could apply them to photos that you would like to interpolate between. I am not sure how you would train the interpolation model, though. Perhaps you could use OpenSim Models [1] in combination with reinforcement learning [2]? There is also some literature on pose forecasting [3, 4].

0. Deep Learning-Based Human Pose Estimation: A Survey: https://github.com/zczcwh/DL-HPE

1. OpenSim: https://simtk.org/projects/opensim/

2. BioImitation-Gym: http://umishra.me/bioimitation-gym/

3. Human Pose Forecasting: https://paperswithcode.com/task/human-pose-forecasting

4. PoseGPT (name of the year!): https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136...


>>"...because 1) they don't seem to understand human anatomy, and 2) the space of all possible transitions is massive..."

I have often thought that we need an empirical-ish library of human movement/positions... we have a beginning small version with Ballet's positions and movements, but we dont have a necessarily precise dialogue for human positions common to every body, as opposed to just the athletic dancers.

aside from maybe "Do the Robot!"


How to train 3D riggings based on the above?

I wanted to do this in 1998 with POSER 3D


Yeah, that reminds me reverse-engineering a form involving a Sansetsukon and a "spear".


Thats actually one of the harder weapons to master!

Maybe next to the Kusari Gama... but I've only known one master of each.


I've been thinking lately of the two tiered reasoner + tools architecture inspired by LangChain, simonw's writing[0] and this is right along those lines.

We're trying too hard to have one model do it all. If we coordinate multiple models + other tools (ala ReAct pattern) we could make the systems more resistant to prompt injection (and possibly other) attacks and leverage their respective strengths and weaknesses.

I'm a bit wary of tool invocation via python code instead of prompting the "reasoning" LLM to teach it about the special commands it can invoke. Python's a good crutch because LLMs know it reasonably well (I use a similar trick in my project, but I parse the resulting AST instead of running the untrusted code) so it's simpler to prompt them.

In a few iterations I expect to see LLMs fine tuned to know about the standard toolset at their disposal (eg. huggingface default tools) and further refinement of the two-tiered pattern.

[0] https://simonwillison.net/2023/Apr/25/dual-llm-pattern/


I am not sure if you've ever heard of Julian Jaynes, but you might be interested. His theory of the bicameral mind is pretty interesting. To bastardize it, the unconscious mind is a really good generalized searcher/pattern completer. You can set it loose and it will search some space or optimize some problem and return a solution. The conscious mind is a really good planner, organizer, supervisor, and filter.

Basically the conscious mind will come up with a plan and keep track of all the things it has done and needs to do, while the unconscious chugs along solving the problems underneath. The conscious mind will choose to implement the things that the unconscious comes up with based on its discretion (slap that old woman! "Nope"). The conscious mind is good at this because it can sort of simulate the outcome of these "searches" and see what would happen "slapping that old lady would hurt her and get me arrested".

So his model sort of sounds like an unrestricted LLM for the unconscious, with another, more restrictive LLM for the conscious, that has access to some sort of crazy deep Q-learning model that can simulate the outcomes of actions taken.


I’ve been thinking this way too.

Our brains have different areas with different functions… so like, why wouldn’t a good AI too?

Maybe an LLM for an internal monologue, maybe two or three to debate each other realistically, then a computer vision model to process visual input…


... and perhaps a LLM that shares the latent space with the visual model, because those are apparently (and surprisingly) easily mapped to each other (at least per that one paper that popped up the other day).

The bit that's missing is on-line learning. There's only so much you can keep bouncing around in working memory (context window of all the component models) - eventually you want to "fix" some of the context by altering the weights of the models (a kind of gradual fine-tuning?).


Follow up Guide that explains how to create your own tools: https://huggingface.co/docs/transformers/custom_tools


Cool! The DX is tricky to nail, when combined with LLM's tendency to hallucinate.

I asked it to extract some text from an image, which it dutifully tried to do. However the generated python kept throwing errors. There's no image -> text tool yet, so it was trying to use the image segmenter to generate a mask and somehow extract text from that.

It would be super helpful to:

1) Have a complete list of available tools (and / or a copy of the entire prompt given to the LLM responsible for generating python). I used prompt injection to get a partial list of tools and checked the Github agent PR for the rest, but couldn't find `<<all_tools>>` since it gets generated at runtime (I think?).

2) Tell the LLM it's okay to fail. E.g.: "Extract the text from image `image`. If you are unable to do this using the tools provided, say so." This prompt let me know there's no tool for text extraction.

Update: per https://huggingface.co/docs/transformers/custom_tools you can output a full list of tools with `print(agent.toolbox)`


Whoa this is super awesome, kind of makes a ton of sense since HF pretty much dominates the market for model hosting and interfacing. The documentation actually looks about as complex as langchain. Gonna give it a go to query the docs with an agent to get an example (going full circle).


Kinda what people are asking for, I mean people are really attracted to "describe a task" as opposed to "create a training set".


They also released today StarChat, their code model fine tuned as an assistant

Might be good to try with CodeGPT, AutoGPT or BabyAGI


From the documentation, HF Agents are much better explained than LangChain but not easier to use, and due to multimodality it may actually be more arcane to use.


> due to multimodality it may actually be more arcane to use.

Can you elaborate?


Could use LocalAI to get around this: “The openAI models perform better (but require you to have an openAI API key, so cannot be used for free);”

https://www.reddit.com/r/selfhosted/comments/12w4p2f/localai...


I was so excited until I saw that it's CPU based only. Would you happen to know of any alternative for GPU support, particularly GPTQ models?

Edit: I think textgen itself can support this nowadays


For now it is CPU only yes, uses AVX instructions. But it's pretty fast anyway, try it out. I have it running on my mbp M1 and it's pretty decent. I think GPU support will come eventually. I wrote an app that uses the openai API and it was nice and simple to just point it at my own local service instead.


If you are like me and you tried to copy paste the python commands and it did not work, you need to generate an access token. Here is what you should do:

1. Sign up (https://huggingface.co/) to hugging face.

2. Setup access tokens (https://huggingface.co/settings/tokens)

3. Install or Upgrade some dependencies `pip install huggingface_hub transformers accelerate`

4. From the terminal run `jupyter lab`

5. Then, if I did not forget any other dependencies you can just copy paste

```python

from huggingface_hub import login from transformers import HfAgent

login("hf_YOUR_HUGGING_FACE_TOKEN")

agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcode...")

agent.run("Is the following `text` (in Spanish) positive or negative?", text="¡Este es un API muy agradable!")

```


This is beautiful, but is there a decent way to plow through say, 20TB of text and put that into a vector database (encoder only)? It would be quite a great addition, especially if the vectors could then be translated into other forms (different language, json representation, pull out names/NER, etc) by just applying a decoder to the database.


If a typical LLM has decent representation of the languages in question (and you'd be surprised how little decent is with all the positive transfer that goes on during training) then outsourcing translation is just a downgrade. a pretty big one in fact.

https://github.com/ogkalu2/Human-parity-on-machine-translati...

T5 seems to be the default so i get why it's done here. Just an observation.


Outsourcing everything but the reasoning process helps with preventing prompt injection attacks: https://simonwillison.net/2023/Apr/25/dual-llm-pattern/

Even if you're outsourcing to a restricted instance of the same model, it could be beneficial.


This seems to be an interpretation similar to that of langchain.


As this LLM agent architecture continues to evolve and improve, we will probably see a lot of incredible products built on top of it.


How does this compare to langchain agents?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: